Re: [PATCH] i18n fix for gimple-ssa-sprintf.c (PR translation/79183)

2019-04-18 Thread Richard Biener
On Thu, 18 Apr 2019, Jakub Jelinek wrote:

> Hi!
> 
> This patch fixes the following messages, so that they are translatable even
> to languages that don't use the english
> Plural-Forms: nplurals=2; plural=n != 1;
> See 
> https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html#Plural-forms
> for more details.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, plus generated
> gcc.pot and eyeballed the changes.  Ok for trunk?

OK.

Richard.

> 2019-04-18  Jakub Jelinek  
> 
>   PR translation/79183
>   * gimple-ssa-sprintf.c (format_directive): Use inform_n instead of
>   inform where appropriate.
> 
> --- gcc/gimple-ssa-sprintf.c.jj   2019-04-10 09:26:49.476692760 +0200
> +++ gcc/gimple-ssa-sprintf.c  2019-04-17 21:37:51.535294586 +0200
> @@ -3016,12 +3016,10 @@ format_directive (const sprintf_dom_walk
>help the user figure out how big a buffer they need.  */
>  
> if (min == max)
> - inform (callloc,
> - (min == 1
> -  ? G_("%qE output %wu byte into a destination of size %wu")
> -  : G_("%qE output %wu bytes into a destination of size "
> -   "%wu")),
> - info.func, min, info.objsize);
> + inform_n (callloc, min,
> +   "%qE output %wu byte into a destination of size %wu",
> +   "%qE output %wu bytes into a destination of size %wu",
> +   info.func, min, info.objsize);
> else if (max < HOST_WIDE_INT_MAX)
>   inform (callloc,
>   "%qE output between %wu and %wu bytes into "
> @@ -3044,11 +3042,9 @@ format_directive (const sprintf_dom_walk
>of printf with no destination size just print the computed
>result.  */
> if (min == max)
> - inform (callloc,
> - (min == 1
> -  ? G_("%qE output %wu byte")
> -  : G_("%qE output %wu bytes")),
> - info.func, min);
> + inform_n (callloc, min,
> +   "%qE output %wu byte", "%qE output %wu bytes",
> +   info.func, min);
> else if (max < HOST_WIDE_INT_MAX)
>   inform (callloc,
>   "%qE output between %wu and %wu bytes",
> 
>   Jakub
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

[PATCH, committed] Update my email address

2019-04-18 Thread Iain Sandoe
.. belatedly, it slipped my mind.
Iain



main-patch.diff
Description: Binary data


Re: [PATCH, GCC, AARCH64] Add GNU note section with BTI and PAC.

2019-04-18 Thread James Greenhalgh
On Thu, Apr 04, 2019 at 05:01:06PM +0100, Sudakshina Das wrote:
> Hi Richard
> 
> On 03/04/2019 11:28, Richard Henderson wrote:
> > On 4/3/19 5:19 PM, Sudakshina Das wrote:
> >> +  /* PT_NOTE header: namesz, descsz, type.
> >> +   namesz = 4 ("GNU\0")
> >> +   descsz = 16 (Size of the program property array)
> >> +   type   = 5 (NT_GNU_PROPERTY_TYPE_0).  */
> >> +  assemble_align (POINTER_SIZE);
> >> +  assemble_integer (GEN_INT (4), 4, 32, 1);
> >> +  assemble_integer (GEN_INT (16), 4, 32, 1);
> > 
> > So, it's 16 only if POINTER_SIZE == 64.
> > 
> > I think ROUND_UP (12, POINTER_BYTES) is what you want here.
> >
> 
> 
> Ah yes. I have made that change now.

This is OK, but instead of:

> diff --git a/gcc/testsuite/gcc.target/aarch64/va_arg_1.c 
> b/gcc/testsuite/gcc.target/aarch64/va_arg_1.c
> index 
> e8e3cdac51350b545e5c2a644a3e1f4d1c37f88d..1fe92ff08935d4c6f08affcbd77ea91537030640
>  100644
> --- a/gcc/testsuite/gcc.target/aarch64/va_arg_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/va_arg_1.c
> @@ -4,7 +4,9 @@
>  int
>  f (int a, ...)
>  {
> -  /* { dg-final { scan-assembler-not "str" } } */
> +  /* Fails on aarch64*-*-linux* if configured with
> +--enable-standard-branch-protection because of the GNU NOTE section.  */
> +  /* { dg-final { scan-assembler-not "str" { target { ! aarch64*-*-linux* } 
> || { ! default_branch_protection } } } } */
>return a;
>  }

Can you just change the regex to check for str followed by a tab, or
something that looks else which looks like the instruction and doesn't
match against 'string'.

Thanks,
James


> 
> Thanks
> Sudi
> 
> > 
> > r~
> > 
> 


Re: [PATCH] avoid aarch64 ICE on large vectors (PR 89797)

2019-04-18 Thread Richard Earnshaw (lists)
On 18/04/2019 03:38, Martin Sebor wrote:
> The fix for pr89797 committed in r270326 was limited to targets
> with NUM_POLY_INT_COEFFS == 1 which I think is all but aarch64.
> The tests for the fix have been failing with an ICE on aarch64
> because it suffers from more or less the same problem but in
> its own target-specific code.  Attached is the patch I posted
> yesterday that fixes the ICE, successfully bootstrapped and
> regtested on x86_64-linux.  I also ran the dg.exp=*attr* and
> aarch64.exp tests with an aarch64-linux-elf cross-compiler.
> There are no ICEs but there are tons of errors in the latter
> tests because many (most?) either expect to be able to find
> libc headers or link executables (I have not built libc for
> aarch64).
> 
> I'm around tomorrow but then traveling the next two weeks (with
> no connectivity the first week) so I unfortunately won't be able
> to fix whatever this change might break until the week of May 6.
> 
> Jeff, if you have an aarch64 tester that could verify this patch
> tomorrow that would help give us some confidence.  Otherwise,
> another option to consider for the time being is to xfail
> the tests on aarch64.
> 
> Thanks
> Martin
> 
> gcc-89797.diff
> 
> PR middle-end/89797 - ICE on a vector_size (1LU << 33) int variable
> 
> gcc/ChangeLog:
> 
>   PR middle-end/89797
>   * tree.h (TYPE_VECTOR_SUBPARTS): Correct computation when
>   NUM_POLY_INT_COEFFS == 2.  Use HOST_WIDE_INT_1U.
>   * config/aarch64/aarch64.c (aarch64_simd_vector_alignment): Avoid
>   assuming type size fits in SHWI.
> 
> Index: gcc/tree.h
> ===
> --- gcc/tree.h(revision 270418)
> +++ gcc/tree.h(working copy)
> @@ -3735,13 +3735,13 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
>if (NUM_POLY_INT_COEFFS == 2)
>  {
>poly_uint64 res = 0;
> -  res.coeffs[0] = 1 << (precision & 0xff);
> +  res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff);

I'm not sure I understand this either before or after.  (precision &
0xff) gives a value in the range 0-255, which is way more than can be
accommodated in a left shift, even for a HWI.

>if (precision & 0x100)
> - res.coeffs[1] = 1 << (precision & 0xff);
> + res.coeffs[1] = HOST_WIDE_INT_1U << ((precision & 0x100) >> 16);

And this equally doesn't make sense >>16 will shift the masked value out
of the bottom of the result.

Also, if we set res.coeffs[1], don't we want refs.coefs[0] to be set to 0?

R.

>return res;
>  }
>else
> -return (unsigned HOST_WIDE_INT)1 << precision;
> +return HOST_WIDE_INT_1U << precision;
>  }
>  
>  /* Set the number of elements in VECTOR_TYPE NODE to SUBPARTS, which must
> Index: gcc/config/aarch64/aarch64.c
> ===
> --- gcc/config/aarch64/aarch64.c  (revision 270418)
> +++ gcc/config/aarch64/aarch64.c  (working copy)
> @@ -14924,7 +14924,10 @@ aarch64_simd_vector_alignment (const_tree type)
> be set for non-predicate vectors of booleans.  Modes are the most
> direct way we have of identifying real SVE predicate types.  */
>  return GET_MODE_CLASS (TYPE_MODE (type)) == MODE_VECTOR_BOOL ? 16 : 128;
> -  HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type));
> +  tree size = TYPE_SIZE (type);
> +  unsigned HOST_WIDE_INT align = 128;
> +  if (tree_fits_uhwi_p (size))
> +align = tree_to_uhwi (TYPE_SIZE (type));
>return MIN (align, 128);
>  }
>  
> 



Fix two ubsan failures (PR85164)

2019-04-18 Thread Richard Sandiford
Two fixes for UB when handling very large offsets.  The calculation in
force_int_to_mode would have been correct if signed integers used modulo
arithmetic, so just switch to unsigned types.  The calculation in
rtx_addr_can_trap_p_1 didn't handle overflow properly, so switch to
known_subrange_p instead (which is supposed to handle all cases).

Tested with bootstrap-ubsan on aarch64-linux-gnu and x86_64-linux-gnu.
OK to install?

Richard


2019-04-18  Richard Sandiford  

gcc/
PR middle-end/85164
* combine.c (force_int_to_mode): Cast the argument rather than
the result of known_alignment.
* rtlanal.c (rtx_addr_can_trap_p_1): Use known_subrange_p.

gcc/testsuite/
PR middle-end/85164
* gcc.dg/pr85164-1.c, gcc.dg/pr85164-2.c: New tests.

Index: gcc/combine.c
===
--- gcc/combine.c   2019-03-08 18:15:36.704740334 +
+++ gcc/combine.c   2019-04-18 10:11:01.984734102 +0100
@@ -8922,7 +8922,7 @@ force_int_to_mode (rtx x, scalar_int_mod
   /* If X is (minus C Y) where C's least set bit is larger than any bit
 in the mask, then we may replace with (neg Y).  */
   if (poly_int_rtx_p (XEXP (x, 0), &const_op0)
- && (unsigned HOST_WIDE_INT) known_alignment (const_op0) > mask)
+ && known_alignment (poly_uint64 (const_op0)) > mask)
{
  x = simplify_gen_unary (NEG, xmode, XEXP (x, 1), xmode);
  return force_to_mode (x, mode, mask, next_select);
Index: gcc/rtlanal.c
===
--- gcc/rtlanal.c   2019-03-08 18:14:26.721006369 +
+++ gcc/rtlanal.c   2019-04-18 10:11:01.984734102 +0100
@@ -521,7 +521,7 @@ rtx_addr_can_trap_p_1 (const_rtx x, poly
 
  return (!known_size_p (decl_size) || known_eq (decl_size, 0)
  ? maybe_ne (offset, 0)
- : maybe_gt (offset + size, decl_size));
+ : !known_subrange_p (offset, size, 0, decl_size));
 }
 
   return 0;
Index: gcc/testsuite/gcc.dg/pr85164-1.c
===
--- /dev/null   2019-03-08 11:40:14.606883727 +
+++ gcc/testsuite/gcc.dg/pr85164-1.c2019-04-18 10:11:01.984734102 +0100
@@ -0,0 +1,7 @@
+/* { dg-options "-O2 -w" } */
+a[];
+b;
+c() {
+  unsigned long d;
+  b = a[d - 1 >> 3];
+}
Index: gcc/testsuite/gcc.dg/pr85164-2.c
===
--- /dev/null   2019-03-08 11:40:14.606883727 +
+++ gcc/testsuite/gcc.dg/pr85164-2.c2019-04-18 10:11:01.984734102 +0100
@@ -0,0 +1,4 @@
+/* { dg-options "-O2 -w" } */
+int a;
+long b;
+void c() { b = -9223372036854775807L - 1 - a; }


Fix UB in int_const_binop

2019-04-18 Thread Richard Sandiford
When testing PR 85164, the baseline bootstrap-ubsan results had
a lot of failures from int_const_binop.  This is because with the
new overflow handling we can sometimes do:

  poly_res = res;

on an uninitialised res.

Tested with bootstrap-ubsan on aarch64-linux-gnu and x86_64-linux-gnu.
OK to install?  (This is a GCC 9 regression FWIW.)

Richard


2019-04-18  Richard Sandiford  

gcc/
* fold-const.c (int_const_binop): Return early on failure.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c2019-04-04 08:34:52.001938080 +0100
+++ gcc/fold-const.c2019-04-18 10:11:12.336697917 +0100
@@ -1173,7 +1173,6 @@ poly_int_binop (poly_wide_int &res, enum
 int_const_binop (enum tree_code code, const_tree arg1, const_tree arg2,
 int overflowable)
 {
-  bool success = false;
   poly_wide_int poly_res;
   tree type = TREE_TYPE (arg1);
   signop sign = TYPE_SIGN (type);
@@ -1183,17 +1182,18 @@ int_const_binop (enum tree_code code, co
 {
   wide_int warg1 = wi::to_wide (arg1), res;
   wide_int warg2 = wi::to_wide (arg2, TYPE_PRECISION (type));
-  success = wide_int_binop (res, code, warg1, warg2, sign, &overflow);
+  if (!wide_int_binop (res, code, warg1, warg2, sign, &overflow))
+   return NULL_TREE;
   poly_res = res;
 }
-  else if (poly_int_tree_p (arg1) && poly_int_tree_p (arg2))
-success = poly_int_binop (poly_res, code, arg1, arg2, sign, &overflow);
-  if (success)
-return force_fit_type (type, poly_res, overflowable,
-  (((sign == SIGNED || overflowable == -1)
-&& overflow)
-   | TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2)));
-  return NULL_TREE;
+  else if (!poly_int_tree_p (arg1)
+  || !poly_int_tree_p (arg2)
+  || !poly_int_binop (poly_res, code, arg1, arg2, sign, &overflow))
+return NULL_TREE;
+  return force_fit_type (type, poly_res, overflowable,
+(((sign == SIGNED || overflowable == -1)
+  && overflow)
+ | TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2)));
 }
 
 /* Return true if binary operation OP distributes over addition in operand


[PATCH] Fix PR90131, wrong-debug

2019-04-18 Thread Richard Biener


This fixes another case similar to the fixed PR89892, mergephi
via remove_forwarder_block_with_phi caused out-of-date debug
binds to become live and thus needs similar treatment as
remove_forwarder_block.  Previously it didn't even bother
to move debug stmts because the destination always has
multiple predecessors.  Now we have to move and reset them.

Fixed by factoring out a worker from remove_forwarder_block and
using that from remove_forwarder_block_with_phi as well.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-04-18  Richard Biener  

PR debug/90131
* tree-cfgcleanup.c (move_debug_stmts_from_forwarder): Split
out from ...
(remove_forwarder_block): ... here.
(remove_forwarder_block_with_phi): Also move debug stmts here.

* gcc.dg/guality/pr90131.c: New testcase.

Index: gcc/tree-cfgcleanup.c
===
--- gcc/tree-cfgcleanup.c   (revision 270437)
+++ gcc/tree-cfgcleanup.c   (working copy)
@@ -444,6 +444,45 @@ phi_alternatives_equal (basic_block dest
   return true;
 }
 
+/* Move debug stmts from the forwarder block SRC to DEST.  */
+
+static void
+move_debug_stmts_from_forwarder (basic_block src, basic_block dest)
+{
+  if (!MAY_HAVE_DEBUG_STMTS)
+return;
+
+  bool can_move_debug_stmts = single_pred_p (dest);
+  gimple_stmt_iterator gsi_to = gsi_after_labels (dest);
+  for (gimple_stmt_iterator gsi = gsi_after_labels (src); !gsi_end_p (gsi);)
+{
+  gimple *debug = gsi_stmt (gsi);
+  gcc_assert (is_gimple_debug (debug));
+  /* Move debug binds anyway, but not anything else like begin-stmt
+markers unless they are always valid at the destination.  */
+  if (can_move_debug_stmts
+ || gimple_debug_bind_p (debug))
+   {
+ gsi_move_before (&gsi, &gsi_to);
+ /* Reset debug-binds that are not always valid at the destination.
+Simply dropping them can cause earlier values to become live,
+generating wrong debug information.
+???  There are several things we could improve here.  For
+one we might be able to move stmts to the predecessor.
+For anther, if the debug stmt is immediately followed by a
+(debug) definition in the destination (on a post-dominated path?)
+we can elide it without any bad effects.  */
+ if (!can_move_debug_stmts)
+   {
+ gimple_debug_bind_reset_value (debug);
+ update_stmt (debug);
+   }
+   }
+  else
+   gsi_next (&gsi);
+}
+}
+
 /* Removes forwarder block BB.  Returns false if this failed.  */
 
 static bool
@@ -454,7 +493,6 @@ remove_forwarder_block (basic_block bb)
   gimple *stmt;
   edge_iterator ei;
   gimple_stmt_iterator gsi, gsi_to;
-  bool can_move_debug_stmts;
 
   /* We check for infinite loops already in tree_forwarder_block_p.
  However it may happen that the infinite loop is created
@@ -503,8 +541,6 @@ remove_forwarder_block (basic_block bb)
}
 }
 
-  can_move_debug_stmts = MAY_HAVE_DEBUG_STMTS && single_pred_p (dest);
-
   basic_block pred = NULL;
   if (single_pred_p (bb))
 pred = single_pred (bb);
@@ -566,40 +602,7 @@ remove_forwarder_block (basic_block bb)
 
   /* Move debug statements.  Reset them if the destination does not
  have a single predecessor.  */
-  if (!gsi_end_p (gsi))
-{
-  gsi_to = gsi_after_labels (dest);
-  do
-   {
- gimple *debug = gsi_stmt (gsi);
- gcc_assert (is_gimple_debug (debug));
- /* Move debug binds anyway, but not anything else
-like begin-stmt markers unless they are always
-valid at the destination.  */
- if (can_move_debug_stmts
- || gimple_debug_bind_p (debug))
-   {
- gsi_move_before (&gsi, &gsi_to);
- /* Reset debug-binds that are not always valid at the
-destination.  Simply dropping them can cause earlier
-values to become live, generating wrong debug information.
-???  There are several things we could improve here.  For
-one we might be able to move stmts to the predecessor.
-For anther, if the debug stmt is immediately followed
-by a (debug) definition in the destination (on a
-post-dominated path?) we can elide it without any bad
-effects.  */
- if (!can_move_debug_stmts)
-   {
- gimple_debug_bind_reset_value (debug);
- update_stmt (debug);
-   }
-   }
- else
-   gsi_next (&gsi);
-   }
-  while (!gsi_end_p (gsi));
-}
+  move_debug_stmts_from_forwarder (bb, dest);
 
   bitmap_set_bit (cfgcleanup_altered_bbs, dest->index);
 
@@ -1282,6 +1285,10 @@ remove_forwarder_block_with_phi (basic_b
   redirect_edge_var_map_cle

Re: Fix two ubsan failures (PR85164)

2019-04-18 Thread Richard Biener
On Thu, Apr 18, 2019 at 11:13 AM Richard Sandiford
 wrote:
>
> Two fixes for UB when handling very large offsets.  The calculation in
> force_int_to_mode would have been correct if signed integers used modulo
> arithmetic, so just switch to unsigned types.  The calculation in
> rtx_addr_can_trap_p_1 didn't handle overflow properly, so switch to
> known_subrange_p instead (which is supposed to handle all cases).
>
> Tested with bootstrap-ubsan on aarch64-linux-gnu and x86_64-linux-gnu.
> OK to install?

OK.

Richard.

> Richard
>
>
> 2019-04-18  Richard Sandiford  
>
> gcc/
> PR middle-end/85164
> * combine.c (force_int_to_mode): Cast the argument rather than
> the result of known_alignment.
> * rtlanal.c (rtx_addr_can_trap_p_1): Use known_subrange_p.
>
> gcc/testsuite/
> PR middle-end/85164
> * gcc.dg/pr85164-1.c, gcc.dg/pr85164-2.c: New tests.
>
> Index: gcc/combine.c
> ===
> --- gcc/combine.c   2019-03-08 18:15:36.704740334 +
> +++ gcc/combine.c   2019-04-18 10:11:01.984734102 +0100
> @@ -8922,7 +8922,7 @@ force_int_to_mode (rtx x, scalar_int_mod
>/* If X is (minus C Y) where C's least set bit is larger than any bit
>  in the mask, then we may replace with (neg Y).  */
>if (poly_int_rtx_p (XEXP (x, 0), &const_op0)
> - && (unsigned HOST_WIDE_INT) known_alignment (const_op0) > mask)
> + && known_alignment (poly_uint64 (const_op0)) > mask)
> {
>   x = simplify_gen_unary (NEG, xmode, XEXP (x, 1), xmode);
>   return force_to_mode (x, mode, mask, next_select);
> Index: gcc/rtlanal.c
> ===
> --- gcc/rtlanal.c   2019-03-08 18:14:26.721006369 +
> +++ gcc/rtlanal.c   2019-04-18 10:11:01.984734102 +0100
> @@ -521,7 +521,7 @@ rtx_addr_can_trap_p_1 (const_rtx x, poly
>
>   return (!known_size_p (decl_size) || known_eq (decl_size, 0)
>   ? maybe_ne (offset, 0)
> - : maybe_gt (offset + size, decl_size));
> + : !known_subrange_p (offset, size, 0, decl_size));
>  }
>
>return 0;
> Index: gcc/testsuite/gcc.dg/pr85164-1.c
> ===
> --- /dev/null   2019-03-08 11:40:14.606883727 +
> +++ gcc/testsuite/gcc.dg/pr85164-1.c2019-04-18 10:11:01.984734102 +0100
> @@ -0,0 +1,7 @@
> +/* { dg-options "-O2 -w" } */
> +a[];
> +b;
> +c() {
> +  unsigned long d;
> +  b = a[d - 1 >> 3];
> +}
> Index: gcc/testsuite/gcc.dg/pr85164-2.c
> ===
> --- /dev/null   2019-03-08 11:40:14.606883727 +
> +++ gcc/testsuite/gcc.dg/pr85164-2.c2019-04-18 10:11:01.984734102 +0100
> @@ -0,0 +1,4 @@
> +/* { dg-options "-O2 -w" } */
> +int a;
> +long b;
> +void c() { b = -9223372036854775807L - 1 - a; }


Re: Fix UB in int_const_binop

2019-04-18 Thread Richard Biener
On Thu, Apr 18, 2019 at 11:16 AM Richard Sandiford
 wrote:
>
> When testing PR 85164, the baseline bootstrap-ubsan results had
> a lot of failures from int_const_binop.  This is because with the
> new overflow handling we can sometimes do:
>
>   poly_res = res;
>
> on an uninitialised res.
>
> Tested with bootstrap-ubsan on aarch64-linux-gnu and x86_64-linux-gnu.
> OK to install?  (This is a GCC 9 regression FWIW.)

OK.

Richard.

> Richard
>
>
> 2019-04-18  Richard Sandiford  
>
> gcc/
> * fold-const.c (int_const_binop): Return early on failure.
>
> Index: gcc/fold-const.c
> ===
> --- gcc/fold-const.c2019-04-04 08:34:52.001938080 +0100
> +++ gcc/fold-const.c2019-04-18 10:11:12.336697917 +0100
> @@ -1173,7 +1173,6 @@ poly_int_binop (poly_wide_int &res, enum
>  int_const_binop (enum tree_code code, const_tree arg1, const_tree arg2,
>  int overflowable)
>  {
> -  bool success = false;
>poly_wide_int poly_res;
>tree type = TREE_TYPE (arg1);
>signop sign = TYPE_SIGN (type);
> @@ -1183,17 +1182,18 @@ int_const_binop (enum tree_code code, co
>  {
>wide_int warg1 = wi::to_wide (arg1), res;
>wide_int warg2 = wi::to_wide (arg2, TYPE_PRECISION (type));
> -  success = wide_int_binop (res, code, warg1, warg2, sign, &overflow);
> +  if (!wide_int_binop (res, code, warg1, warg2, sign, &overflow))
> +   return NULL_TREE;
>poly_res = res;
>  }
> -  else if (poly_int_tree_p (arg1) && poly_int_tree_p (arg2))
> -success = poly_int_binop (poly_res, code, arg1, arg2, sign, &overflow);
> -  if (success)
> -return force_fit_type (type, poly_res, overflowable,
> -  (((sign == SIGNED || overflowable == -1)
> -&& overflow)
> -   | TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2)));
> -  return NULL_TREE;
> +  else if (!poly_int_tree_p (arg1)
> +  || !poly_int_tree_p (arg2)
> +  || !poly_int_binop (poly_res, code, arg1, arg2, sign, &overflow))
> +return NULL_TREE;
> +  return force_fit_type (type, poly_res, overflowable,
> +(((sign == SIGNED || overflowable == -1)
> +  && overflow)
> + | TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2)));
>  }
>
>  /* Return true if binary operation OP distributes over addition in operand


[PATCH v2] Add error message for target_clones and AVX512 ISAs (PR target/89929).

2019-04-18 Thread Martin Liška
Hi.

I'm sending updated version of that patch. The patch rejects usage of AVX512 
ISAs (except AVX512F)
for target attribute for C++ multiversioning and for target_clone attribute.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
>From 6a7119cbdb908a2a7bd8017c64e084b5f20b3052 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 17 Apr 2019 14:01:21 +0200
Subject: [PATCH] Add error message for target_clones and AVX512 ISAs (PR
 target/89929).

gcc/ChangeLog:

2019-04-18  Martin Liska  

	PR target/89929
	* config/i386/i386.c (get_builtin_code_for_version): Provide new
	error for AVX512 ISAs.

gcc/testsuite/ChangeLog:

2019-04-18  Martin Liska  

	PR target/89929
	* g++.target/i386/mv28.C: New test.
	* gcc.target/i386/mvc14.c: New test.
---
 gcc/config/i386/i386.c| 27 +++
 gcc/testsuite/g++.target/i386/mv28.C  | 26 ++
 gcc/testsuite/gcc.target/i386/mvc14.c | 16 
 3 files changed, 69 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/i386/mv28.C
 create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 498a35d8aea..ed9a9c2e3f7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -31920,6 +31920,24 @@ get_builtin_code_for_version (tree decl, tree *predicate_list)
   {"avx512f", P_AVX512F}
 };
 
+  static const char * avx512_warning_isas[] = {
+  "avx5124fmaps",
+  "avx5124vnniw",
+  "avx512vpopcntdq",
+  "avx512vbmi2",
+  "avx512vnni",
+  "avx512bitalg",
+  "avx512vbmi",
+  "avx512ifma",
+  "avx512vl",
+  "avx512bw",
+  "avx512dq",
+  "avx512er",
+  "avx512pf",
+  "avx512cd",
+  "gfni",
+  "vpclmulqdq",
+  };
 
   static unsigned int NUM_FEATURES
 = sizeof (feature_list) / sizeof (struct _feature_list);
@@ -32140,6 +32158,15 @@ get_builtin_code_for_version (tree decl, tree *predicate_list)
 	}
   if (predicate_list && i == NUM_FEATURES)
 	{
+	  for (unsigned i = 0; i < ARRAY_SIZE (avx512_warning_isas); i++)
+	if (strcmp (token, avx512_warning_isas[i]) == 0)
+	  {
+		error_at (DECL_SOURCE_LOCATION (decl),
+			  "AVX512 ISA %qs is not supported in "
+			  "% attribute, use % syntax", token);
+		return 0;
+	  }
+
 	  error_at (DECL_SOURCE_LOCATION (decl),
 		"no dispatcher found for %s", token);
 	  return 0;
diff --git a/gcc/testsuite/g++.target/i386/mv28.C b/gcc/testsuite/g++.target/i386/mv28.C
new file mode 100644
index 000..a33efd9cf21
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/mv28.C
@@ -0,0 +1,26 @@
+/* { dg-do compile} */
+/* { dg-require-ifunc "" }  */
+
+void __attribute__ ((target("avx512vl"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx512bw"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx512dq"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx512cd"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx512er"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx512pf"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx512vbmi"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx512ifma"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx5124vnniw"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx5124fmaps"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx512vpopcntdq"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx512vbmi2"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("gfni"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("vpclmulqdq"))) foo () {} /* { dg-error "AVX512 ISA '\[^\n\r\]*' is not supported in 'target' attribute, use 'arch=' syntax" } */
+void __attribute__ ((target("avx512vnni"

[PATCH][stage1] Enhance target and target_clone error messages.

2019-04-18 Thread Martin Liška
Hi.

The patch distinguishes among target and target_clone attribute when
reporting for an error. I've also reworded the affected error messages
a bit.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed after stage1 opens?
Thanks,
Martin

---
 gcc/cgraphclones.c |  2 +-
 gcc/config/i386/i386-c.c   |  5 +-
 gcc/config/i386/i386-protos.h  |  4 +-
 gcc/config/i386/i386.c | 54 +-
 gcc/testsuite/gcc.target/i386/funcspec-4.c |  2 +-
 5 files changed, 39 insertions(+), 28 deletions(-)


diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index 15f7e119d18..fd867ecac91 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -1056,7 +1056,7 @@ cgraph_node::create_version_clone_with_body
   location_t saved_loc = input_location;
   tree v = TREE_VALUE (target_attributes);
   input_location = DECL_SOURCE_LOCATION (new_decl);
-  bool r = targetm.target_option.valid_attribute_p (new_decl, NULL, v, 0);
+  bool r = targetm.target_option.valid_attribute_p (new_decl, NULL, v, 1);
   input_location = saved_loc;
   if (!r)
 	return NULL;
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 5e7e46fcebe..50cac3b1a9f 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -586,8 +586,9 @@ ix86_pragma_target_parse (tree args, tree pop_target)
 }
   else
 {
-  cur_tree = ix86_valid_target_attribute_tree (args, &global_options,
-		   &global_options_set);
+  cur_tree = ix86_valid_target_attribute_tree (NULL_TREE, args,
+		   &global_options,
+		   &global_options_set, 0);
   if (!cur_tree || cur_tree == error_mark_node)
{
  cl_target_option_restore (&global_options,
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 83645e89a81..cfc4783205e 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -215,9 +215,9 @@ extern unsigned int ix86_minimum_alignment (tree, machine_mode,
 extern tree ix86_handle_shared_attribute (tree *, tree, tree, int, bool *);
 extern tree ix86_handle_selectany_attribute (tree *, tree, tree, int, bool *);
 extern int x86_field_alignment (tree, int);
-extern tree ix86_valid_target_attribute_tree (tree,
+extern tree ix86_valid_target_attribute_tree (tree, tree,
 	  struct gcc_options *,
-	  struct gcc_options *);
+	  struct gcc_options *, bool);
 extern unsigned int ix86_get_callcvt (const_tree);
 
 #endif
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ed9a9c2e3f7..aa8adc7b2aa 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -847,10 +847,11 @@ static void ix86_function_specific_post_stream_in (struct cl_target_option *);
 static void ix86_function_specific_print (FILE *, int,
 	  struct cl_target_option *);
 static bool ix86_valid_target_attribute_p (tree, tree, tree, int);
-static bool ix86_valid_target_attribute_inner_p (tree, char *[],
+static bool ix86_valid_target_attribute_inner_p (tree, tree, char *[],
 		 struct gcc_options *,
 		 struct gcc_options *,
-		 struct gcc_options *);
+		 struct gcc_options *,
+		 bool);
 static bool ix86_can_inline_p (tree, tree);
 static void ix86_set_current_function (tree);
 static unsigned int ix86_minimum_incoming_stack_boundary (bool);
@@ -5149,10 +5150,11 @@ ix86_function_specific_print (FILE *file, int indent,
over the list.  */
 
 static bool
-ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
+ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
  struct gcc_options *opts,
  struct gcc_options *opts_set,
- struct gcc_options *enum_opts_set)
+ struct gcc_options *enum_opts_set,
+ bool target_clone_attr)
 {
   char *next_optstr;
   bool ret = true;
@@ -5296,9 +5298,12 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
 IX86_ATTR_YES ("recip",
 		   OPT_mrecip,
 		   MASK_RECIP),
-
   };
 
+  location_t loc
+= fndecl == NULL ? UNKNOWN_LOCATION : DECL_SOURCE_LOCATION (fndecl);
+  const char *attr_name = target_clone_attr ? "target_clone" : "target";
+
   /* If this is a list, recurse to get the options.  */
   if (TREE_CODE (args) == TREE_LIST)
 {
@@ -5306,9 +5311,10 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
 
   for (; args; args = TREE_CHAIN (args))
 	if (TREE_VALUE (args)
-	&& !ix86_valid_target_attribute_inner_p (TREE_VALUE (args),
+	&& !ix86_valid_target_attribute_inner_p (fndecl, TREE_VALUE (args),
 		 p_strings, opts, opts_set,
-		 enum_opts_set))
+		 enum_opts_set,
+		 target_clone_attr))
 	  ret = false;
 
   return ret;
@@ -5316,7 +5322,7 @@ ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
 
   else if (TREE_CODE (args) != STRING_CST)
 {
-  error ("attrib

Re: [PATCH][stage1] Enhance target and target_clone error messages.

2019-04-18 Thread Martin Liška
On 4/18/19 1:09 PM, Martin Liška wrote:
> Hi.
> 
> The patch distinguishes among target and target_clone attribute when
> reporting for an error. I've also reworded the affected error messages
> a bit.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed after stage1 opens?
> Thanks,
> Martin
> 
> ---
>  gcc/cgraphclones.c |  2 +-
>  gcc/config/i386/i386-c.c   |  5 +-
>  gcc/config/i386/i386-protos.h  |  4 +-
>  gcc/config/i386/i386.c | 54 +-
>  gcc/testsuite/gcc.target/i386/funcspec-4.c |  2 +-
>  5 files changed, 39 insertions(+), 28 deletions(-)
> 
> 

I forgot to attach ChangeLog.

Martin
>From 8330ad8d25463ea2f596a617f3e133d2050b9ac8 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 17 Apr 2019 13:59:14 +0200
Subject: [PATCH] Enhance target and target_clone error messages.

gcc/ChangeLog:

2019-04-18  Martin Liska  

	* cgraphclones.c: Call valid_attribute_p with 1 for
	target_clone.
	* config/i386/i386-c.c (ix86_pragma_target_parse): Use 0 as
	it's for target attribute.
	* config/i386/i386-protos.h (ix86_valid_target_attribute_tree):
	Add new boolean argument.
	* config/i386/i386.c (ix86_valid_target_attribute_inner_p):
	Likewise.
	(ix86_valid_target_attribute_tree): Pass target_clone_attr
	to ix86_valid_target_attribute_inner_p.
	(ix86_valid_target_attribute_p): Pass flags argument to
	ix86_valid_target_attribute_inner_p.
	(get_builtin_code_for_version): Use 0 as it's target attribute.

gcc/testsuite/ChangeLog:

2019-04-18  Martin Liska  

	* gcc.target/i386/funcspec-4.c: Update scanned pattern.
---
 gcc/cgraphclones.c |  2 +-
 gcc/config/i386/i386-c.c   |  5 +-
 gcc/config/i386/i386-protos.h  |  4 +-
 gcc/config/i386/i386.c | 56 +-
 gcc/testsuite/gcc.target/i386/funcspec-4.c |  2 +-
 5 files changed, 41 insertions(+), 28 deletions(-)

diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index 15f7e119d18..fd867ecac91 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -1056,7 +1056,7 @@ cgraph_node::create_version_clone_with_body
   location_t saved_loc = input_location;
   tree v = TREE_VALUE (target_attributes);
   input_location = DECL_SOURCE_LOCATION (new_decl);
-  bool r = targetm.target_option.valid_attribute_p (new_decl, NULL, v, 0);
+  bool r = targetm.target_option.valid_attribute_p (new_decl, NULL, v, 1);
   input_location = saved_loc;
   if (!r)
 	return NULL;
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 5e7e46fcebe..50cac3b1a9f 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -586,8 +586,9 @@ ix86_pragma_target_parse (tree args, tree pop_target)
 }
   else
 {
-  cur_tree = ix86_valid_target_attribute_tree (args, &global_options,
-		   &global_options_set);
+  cur_tree = ix86_valid_target_attribute_tree (NULL_TREE, args,
+		   &global_options,
+		   &global_options_set, 0);
   if (!cur_tree || cur_tree == error_mark_node)
{
  cl_target_option_restore (&global_options,
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 83645e89a81..cfc4783205e 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -215,9 +215,9 @@ extern unsigned int ix86_minimum_alignment (tree, machine_mode,
 extern tree ix86_handle_shared_attribute (tree *, tree, tree, int, bool *);
 extern tree ix86_handle_selectany_attribute (tree *, tree, tree, int, bool *);
 extern int x86_field_alignment (tree, int);
-extern tree ix86_valid_target_attribute_tree (tree,
+extern tree ix86_valid_target_attribute_tree (tree, tree,
 	  struct gcc_options *,
-	  struct gcc_options *);
+	  struct gcc_options *, bool);
 extern unsigned int ix86_get_callcvt (const_tree);
 
 #endif
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ed9a9c2e3f7..eff22215e99 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -847,10 +847,11 @@ static void ix86_function_specific_post_stream_in (struct cl_target_option *);
 static void ix86_function_specific_print (FILE *, int,
 	  struct cl_target_option *);
 static bool ix86_valid_target_attribute_p (tree, tree, tree, int);
-static bool ix86_valid_target_attribute_inner_p (tree, char *[],
+static bool ix86_valid_target_attribute_inner_p (tree, tree, char *[],
 		 struct gcc_options *,
 		 struct gcc_options *,
-		 struct gcc_options *);
+		 struct gcc_options *,
+		 bool);
 static bool ix86_can_inline_p (tree, tree);
 static void ix86_set_current_function (tree);
 static unsigned int ix86_minimum_incoming_stack_boundary (bool);
@@ -5149,10 +5150,11 @@ ix86_function_specific_print (FILE *file, int indent,
over the list.  */
 
 static bool
-ix86_valid_target_attribute_inner_p (tree args, char *p_strings[],
+ix86

Re: [PATCH] avoid aarch64 ICE on large vectors (PR 89797)

2019-04-18 Thread Richard Earnshaw (lists)
On 18/04/2019 09:58, Richard Earnshaw (lists) wrote:
> On 18/04/2019 03:38, Martin Sebor wrote:
>> The fix for pr89797 committed in r270326 was limited to targets
>> with NUM_POLY_INT_COEFFS == 1 which I think is all but aarch64.
>> The tests for the fix have been failing with an ICE on aarch64
>> because it suffers from more or less the same problem but in
>> its own target-specific code.  Attached is the patch I posted
>> yesterday that fixes the ICE, successfully bootstrapped and
>> regtested on x86_64-linux.  I also ran the dg.exp=*attr* and
>> aarch64.exp tests with an aarch64-linux-elf cross-compiler.
>> There are no ICEs but there are tons of errors in the latter
>> tests because many (most?) either expect to be able to find
>> libc headers or link executables (I have not built libc for
>> aarch64).
>>
>> I'm around tomorrow but then traveling the next two weeks (with
>> no connectivity the first week) so I unfortunately won't be able
>> to fix whatever this change might break until the week of May 6.
>>
>> Jeff, if you have an aarch64 tester that could verify this patch
>> tomorrow that would help give us some confidence.  Otherwise,
>> another option to consider for the time being is to xfail
>> the tests on aarch64.
>>
>> Thanks
>> Martin
>>
>> gcc-89797.diff
>>
>> PR middle-end/89797 - ICE on a vector_size (1LU << 33) int variable
>>
>> gcc/ChangeLog:
>>
>>  PR middle-end/89797
>>  * tree.h (TYPE_VECTOR_SUBPARTS): Correct computation when
>>  NUM_POLY_INT_COEFFS == 2.  Use HOST_WIDE_INT_1U.
>>  * config/aarch64/aarch64.c (aarch64_simd_vector_alignment): Avoid
>>  assuming type size fits in SHWI.
>>
>> Index: gcc/tree.h
>> ===
>> --- gcc/tree.h   (revision 270418)
>> +++ gcc/tree.h   (working copy)
>> @@ -3735,13 +3735,13 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
>>if (NUM_POLY_INT_COEFFS == 2)
>>  {
>>poly_uint64 res = 0;
>> -  res.coeffs[0] = 1 << (precision & 0xff);
>> +  res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff);
> 
> I'm not sure I understand this either before or after.  (precision &
> 0xff) gives a value in the range 0-255, which is way more than can be
> accommodated in a left shift, even for a HWI.
> 
>>if (precision & 0x100)
>> -res.coeffs[1] = 1 << (precision & 0xff);
>> +res.coeffs[1] = HOST_WIDE_INT_1U << ((precision & 0x100) >> 16);
> 
> And this equally doesn't make sense >>16 will shift the masked value out
> of the bottom of the result.
> 
> Also, if we set res.coeffs[1], don't we want refs.coefs[0] to be set to 0?

Maybe I've completely misread this, but shouldn't the code be:

  if (NUM_POLY_INT_COEFFS == 2)
{
  poly_uint64 res = 0;
  if (precision < HOST_BITS_PER_WIDE_INT)
{
  res.coeffs[0] = UNSIGNED_HOST_WIDE_INT_1U << precision;
  res.coeffs[1] = 0;
}
  else
{
  res.coeffs[0] = 0;
  res.coeffs[1] = UNSIGNED_HOST_WIDE_INT_1U << (precision -
HOST_BITS_PER_WIDE_INT);
}
  return res;
}

R.

> 
> R.
> 
>>return res;
>>  }
>>else
>> -return (unsigned HOST_WIDE_INT)1 << precision;
>> +return HOST_WIDE_INT_1U << precision;
>>  }
>>  
>>  /* Set the number of elements in VECTOR_TYPE NODE to SUBPARTS, which must
>> Index: gcc/config/aarch64/aarch64.c
>> ===
>> --- gcc/config/aarch64/aarch64.c (revision 270418)
>> +++ gcc/config/aarch64/aarch64.c (working copy)
>> @@ -14924,7 +14924,10 @@ aarch64_simd_vector_alignment (const_tree type)
>> be set for non-predicate vectors of booleans.  Modes are the most
>> direct way we have of identifying real SVE predicate types.  */
>>  return GET_MODE_CLASS (TYPE_MODE (type)) == MODE_VECTOR_BOOL ? 16 : 128;
>> -  HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type));
>> +  tree size = TYPE_SIZE (type);
>> +  unsigned HOST_WIDE_INT align = 128;
>> +  if (tree_fits_uhwi_p (size))
>> +align = tree_to_uhwi (TYPE_SIZE (type));
>>return MIN (align, 128);
>>  }
>>  
>>
> 



Re: [PATCH] avoid aarch64 ICE on large vectors (PR 89797)

2019-04-18 Thread Richard Sandiford
"Richard Earnshaw (lists)"  writes:
> On 18/04/2019 03:38, Martin Sebor wrote:
>> The fix for pr89797 committed in r270326 was limited to targets
>> with NUM_POLY_INT_COEFFS == 1 which I think is all but aarch64.
>> The tests for the fix have been failing with an ICE on aarch64
>> because it suffers from more or less the same problem but in
>> its own target-specific code.  Attached is the patch I posted
>> yesterday that fixes the ICE, successfully bootstrapped and
>> regtested on x86_64-linux.  I also ran the dg.exp=*attr* and
>> aarch64.exp tests with an aarch64-linux-elf cross-compiler.
>> There are no ICEs but there are tons of errors in the latter
>> tests because many (most?) either expect to be able to find
>> libc headers or link executables (I have not built libc for
>> aarch64).
>> 
>> I'm around tomorrow but then traveling the next two weeks (with
>> no connectivity the first week) so I unfortunately won't be able
>> to fix whatever this change might break until the week of May 6.
>> 
>> Jeff, if you have an aarch64 tester that could verify this patch
>> tomorrow that would help give us some confidence.  Otherwise,
>> another option to consider for the time being is to xfail
>> the tests on aarch64.
>> 
>> Thanks
>> Martin
>> 
>> gcc-89797.diff
>> 
>> PR middle-end/89797 - ICE on a vector_size (1LU << 33) int variable
>> 
>> gcc/ChangeLog:
>> 
>>  PR middle-end/89797
>>  * tree.h (TYPE_VECTOR_SUBPARTS): Correct computation when
>>  NUM_POLY_INT_COEFFS == 2.  Use HOST_WIDE_INT_1U.
>>  * config/aarch64/aarch64.c (aarch64_simd_vector_alignment): Avoid
>>  assuming type size fits in SHWI.
>> 
>> Index: gcc/tree.h
>> ===
>> --- gcc/tree.h   (revision 270418)
>> +++ gcc/tree.h   (working copy)
>> @@ -3735,13 +3735,13 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
>>if (NUM_POLY_INT_COEFFS == 2)
>>  {
>>poly_uint64 res = 0;
>> -  res.coeffs[0] = 1 << (precision & 0xff);
>> +  res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff);
>
> I'm not sure I understand this either before or after.  (precision &
> 0xff) gives a value in the range 0-255, which is way more than can be
> accommodated in a left shift, even for a HWI.

For the record: the problem is that we have two coefficients that are
logically both in the range 1<<[0,63], needing 6 bits per coefficient
and 12 in total.  But precision is only 10 bits. :-)  Fortunately,
coeffs[1] is in practice either 0 or equal to coeffs[0], so we can encode
it in a single bit.

A 0x100/0xff split seemed easier to optimise than a 0x40/0x3f split.
E.g. 0xff can be loaded directly from memory as a byte.  There's not
going to be much in it either way though.

>>if (precision & 0x100)
>> -res.coeffs[1] = 1 << (precision & 0xff);
>> +res.coeffs[1] = HOST_WIDE_INT_1U << ((precision & 0x100) >> 16);
>
> And this equally doesn't make sense >>16 will shift the masked value out
> of the bottom of the result.

Yeah, we should keep the original shift amount and just do
s/1/HOST_WIDE_INT_1U/.

> Also, if we set res.coeffs[1], don't we want refs.coefs[0] to be set to 0?
>
> R.
>
>>return res;
>>  }
>>else
>> -return (unsigned HOST_WIDE_INT)1 << precision;
>> +return HOST_WIDE_INT_1U << precision;
>>  }
>>  
>>  /* Set the number of elements in VECTOR_TYPE NODE to SUBPARTS, which must
>> Index: gcc/config/aarch64/aarch64.c
>> ===
>> --- gcc/config/aarch64/aarch64.c (revision 270418)
>> +++ gcc/config/aarch64/aarch64.c (working copy)
>> @@ -14924,7 +14924,10 @@ aarch64_simd_vector_alignment (const_tree type)
>> be set for non-predicate vectors of booleans.  Modes are the most
>> direct way we have of identifying real SVE predicate types.  */
>>  return GET_MODE_CLASS (TYPE_MODE (type)) == MODE_VECTOR_BOOL ? 16 : 128;
>> -  HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type));
>> +  tree size = TYPE_SIZE (type);
>> +  unsigned HOST_WIDE_INT align = 128;
>> +  if (tree_fits_uhwi_p (size))
>> +align = tree_to_uhwi (TYPE_SIZE (type));
>>return MIN (align, 128);

I guess this is personal preference, sorry, but:

  return wi::umin (wi::to_wide (TYPE_SIZE (type)), 128).to_uhwi ();

seems more direct and obvious than:

  tree size = TYPE_SIZE (type);
  unsigned HOST_WIDE_INT align = 128;
  if (tree_fits_uhwi_p (size))
align = tree_to_uhwi (TYPE_SIZE (type));
  return MIN (align, 128);

OK with those changes, thanks.  I've bootrapped & regression-tested
it in that form on aarch64-linux-gnu.

Richard


Re: [PATCH] avoid aarch64 ICE on large vectors (PR 89797)

2019-04-18 Thread Christophe Lyon
On Thu, 18 Apr 2019 at 14:08, Richard Sandiford
 wrote:
>
> "Richard Earnshaw (lists)"  writes:
> > On 18/04/2019 03:38, Martin Sebor wrote:
> >> The fix for pr89797 committed in r270326 was limited to targets
> >> with NUM_POLY_INT_COEFFS == 1 which I think is all but aarch64.
> >> The tests for the fix have been failing with an ICE on aarch64
> >> because it suffers from more or less the same problem but in
> >> its own target-specific code.  Attached is the patch I posted
> >> yesterday that fixes the ICE, successfully bootstrapped and
> >> regtested on x86_64-linux.  I also ran the dg.exp=*attr* and
> >> aarch64.exp tests with an aarch64-linux-elf cross-compiler.
> >> There are no ICEs but there are tons of errors in the latter
> >> tests because many (most?) either expect to be able to find
> >> libc headers or link executables (I have not built libc for
> >> aarch64).
> >>
> >> I'm around tomorrow but then traveling the next two weeks (with
> >> no connectivity the first week) so I unfortunately won't be able
> >> to fix whatever this change might break until the week of May 6.
> >>
> >> Jeff, if you have an aarch64 tester that could verify this patch
> >> tomorrow that would help give us some confidence.  Otherwise,
> >> another option to consider for the time being is to xfail
> >> the tests on aarch64.
> >>
> >> Thanks
> >> Martin
> >>
> >> gcc-89797.diff
> >>
> >> PR middle-end/89797 - ICE on a vector_size (1LU << 33) int variable
> >>
> >> gcc/ChangeLog:
> >>
> >>  PR middle-end/89797
> >>  * tree.h (TYPE_VECTOR_SUBPARTS): Correct computation when
> >>  NUM_POLY_INT_COEFFS == 2.  Use HOST_WIDE_INT_1U.
> >>  * config/aarch64/aarch64.c (aarch64_simd_vector_alignment): Avoid
> >>  assuming type size fits in SHWI.
> >>
> >> Index: gcc/tree.h
> >> ===
> >> --- gcc/tree.h   (revision 270418)
> >> +++ gcc/tree.h   (working copy)
> >> @@ -3735,13 +3735,13 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
> >>if (NUM_POLY_INT_COEFFS == 2)
> >>  {
> >>poly_uint64 res = 0;
> >> -  res.coeffs[0] = 1 << (precision & 0xff);
> >> +  res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff);
> >
> > I'm not sure I understand this either before or after.  (precision &
> > 0xff) gives a value in the range 0-255, which is way more than can be
> > accommodated in a left shift, even for a HWI.
>
> For the record: the problem is that we have two coefficients that are
> logically both in the range 1<<[0,63], needing 6 bits per coefficient
> and 12 in total.  But precision is only 10 bits. :-)  Fortunately,
> coeffs[1] is in practice either 0 or equal to coeffs[0], so we can encode
> it in a single bit.
>
> A 0x100/0xff split seemed easier to optimise than a 0x40/0x3f split.
> E.g. 0xff can be loaded directly from memory as a byte.  There's not
> going to be much in it either way though.
>
> >>if (precision & 0x100)
> >> -res.coeffs[1] = 1 << (precision & 0xff);
> >> +res.coeffs[1] = HOST_WIDE_INT_1U << ((precision & 0x100) >> 16);
> >
> > And this equally doesn't make sense >>16 will shift the masked value out
> > of the bottom of the result.
>
> Yeah, we should keep the original shift amount and just do
> s/1/HOST_WIDE_INT_1U/.
>
> > Also, if we set res.coeffs[1], don't we want refs.coefs[0] to be set to 0?
> >
> > R.
> >
> >>return res;
> >>  }
> >>else
> >> -return (unsigned HOST_WIDE_INT)1 << precision;
> >> +return HOST_WIDE_INT_1U << precision;
> >>  }
> >>
> >>  /* Set the number of elements in VECTOR_TYPE NODE to SUBPARTS, which must
> >> Index: gcc/config/aarch64/aarch64.c
> >> ===
> >> --- gcc/config/aarch64/aarch64.c (revision 270418)
> >> +++ gcc/config/aarch64/aarch64.c (working copy)
> >> @@ -14924,7 +14924,10 @@ aarch64_simd_vector_alignment (const_tree type)
> >> be set for non-predicate vectors of booleans.  Modes are the most
> >> direct way we have of identifying real SVE predicate types.  */
> >>  return GET_MODE_CLASS (TYPE_MODE (type)) == MODE_VECTOR_BOOL ? 16 : 
> >> 128;
> >> -  HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type));
> >> +  tree size = TYPE_SIZE (type);
> >> +  unsigned HOST_WIDE_INT align = 128;
> >> +  if (tree_fits_uhwi_p (size))
> >> +align = tree_to_uhwi (TYPE_SIZE (type));
> >>return MIN (align, 128);
>
> I guess this is personal preference, sorry, but:
>
>   return wi::umin (wi::to_wide (TYPE_SIZE (type)), 128).to_uhwi ();
>
> seems more direct and obvious than:
>
>   tree size = TYPE_SIZE (type);
>   unsigned HOST_WIDE_INT align = 128;
>   if (tree_fits_uhwi_p (size))
> align = tree_to_uhwi (TYPE_SIZE (type));
>   return MIN (align, 128);
>
> OK with those changes, thanks.  I've bootrapped & regression-tested
> it in that form on aarch64-linux-gnu.
>

I'm late in the party, and I confirm that Martin's original patch
introduce

[PATCH] gdbhooks.py: Fix UnicodeDecodeErrors when printing trees with corrupt codes

2019-04-18 Thread Eugene Sharygin
Hi,

This silences UnicodeDecodeError Python exceptions raised when trying to
print a tree with a corrupt (or uninitialized) code in Gdb:

Python Exception  'utf-8' codec can't decode 
byte 0xb8 in position 0: invalid start byte:

With this patch, TreePrinter would print such trees generically like
this (in the format it uses for null trees):

 

Eugene

2018-04-18  Eugene Sharygin  

* gdbhooks.py: Fix UnicodeDecodeErrors when printing trees with
  corrupt codes.
---
 gcc/gdbhooks.py | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
index bbe7618e299..7b1a7be0002 100644
--- a/gcc/gdbhooks.py
+++ b/gcc/gdbhooks.py
@@ -229,7 +229,10 @@ class TreePrinter:
 val_code_name = val_tree_code_name[intptr(val_TREE_CODE)]
 #print(val_code_name.string())
 
-result = '<%s 0x%x' % (val_code_name.string(), intptr(self.gdbval))
+try:
+result = '<%s 0x%x' % (val_code_name.string(), intptr(self.gdbval))
+except:
+return '' % intptr(self.gdbval)
 if intptr(val_tclass) == tcc_declaration:
 tree_DECL_NAME = self.node.DECL_NAME()
 if tree_DECL_NAME.is_nonnull():
-- 
2.21.0



Re: [PATCH] S/390: Fix PR89952 incorrect CFI

2019-04-18 Thread Robin Dapp
Hi,

> +   Establish an ANTI dependency between r11 and r15 restores from FPRs
> +   to prevent the instructions scheduler from reordering them since
> +   this would break CFI.  No further handling in the sched_reorder
> +   hook is required since the r11 and r15 restore will never appear in
> +   the same ready list with that change.  */
[..]
> +  if (r11_restore == NULL || r15_restore == NULL)
> +return;
> +  add_dependence (r11_restore, r15_restore, REG_DEP_ANTI);
> +}

an anti dependency seems an odd choice because r15 would be restored
before r11 (according to the patch's changes in
s390_restore_gprs_from_fprs) and the CFI note for r11 reads
from/requires r15.  This would rather indicate a data dependency from
r15 to r11 but I understand that you don't want the scheduler to assume
there is a real dependency including effects on insn priority and latency.

I looked in haifa-sched.c and sched-ebb.c for methods to prevent
reordering two "special" instructions but did not find anything. There
is of course code to prevent scheduling stack-referencing insns across a
stack pointer restore but it doesn't seem to allow adding more
instructions.

I came across add_deps_for_risky_insns, though, which also uses anti
dependencies, apparently for a similar reason, so maybe that's the
normal and accepted way to do it.

Regards
 Robin



Re: [RFC] D support for S/390

2019-04-18 Thread Robin Dapp
Hi Rainer,

> I noticed you missed one piece of Iain's typeinfo.cc patch, btw.:
> 
> diff --git a/gcc/d/typeinfo.cc b/gcc/d/typeinfo.cc
> --- a/gcc/d/typeinfo.cc
> +++ b/gcc/d/typeinfo.cc
> @@ -886,7 +886,7 @@ public:
>   if (cd->isCOMinterface ())
> flags |= ClassFlags::isCOMclass;
>  
> - this->layout_field (build_integer_cst (flags));
> + this->layout_field (build_integer_cst (flags, d_uint_type));
>  
>   /* void *deallocator;
>  OffsetTypeInfo[] m_offTi;  (not implemented)

thanks for catching this.  I amended the patch and ran the D test suite
on S/390 and i386 one more time.

For S/390 the number of FAILs is unchanged

=== libphobos Summary ===

# of expected passes380
# of unexpected failures30

Some of the FAILs are still a little worrying, like tests for "-shared"
which I haven't looked at at all so far.  Still, it's better than the
>200 before.

On i386 I see no FAILs with the patch.

Regards
 Robin
diff --git a/gcc/d/dmd/constfold.c b/gcc/d/dmd/constfold.c
index ddd356bb966..cb58d4b4f5b 100644
--- a/gcc/d/dmd/constfold.c
+++ b/gcc/d/dmd/constfold.c
@@ -1752,14 +1752,17 @@ UnionExp Cat(Type *type, Expression *e1, Expression *e2)
 }
 else if (e1->op == TOKint64 && e2->op == TOKstring)
 {
-// Concatenate the strings
+// [w|d]?char ~ string --> string
+// We assume that we only ever prepend one char of the same type
+// (wchar,dchar) as the string's characters.
+
 StringExp *es2 = (StringExp *)e2;
 size_t len = 1 + es2->len;
 unsigned char sz = es2->sz;
 dinteger_t v = e1->toInteger();
 
 void *s = mem.xmalloc((len + 1) * sz);
-memcpy((char *)s, &v, sz);
+Port::valcpy((char *)s, v, sz);
 memcpy((char *)s + sz, es2->string, es2->len * sz);
 
 // Add terminating 0
diff --git a/gcc/d/typeinfo.cc b/gcc/d/typeinfo.cc
index dac66acdcd4..865fde2c863 100644
--- a/gcc/d/typeinfo.cc
+++ b/gcc/d/typeinfo.cc
@@ -830,7 +830,7 @@ public:
 	flags |= ClassFlags::noPointers;
 
 Lhaspointers:
-	this->layout_field (size_int (flags));
+	this->layout_field (build_integer_cst (flags, d_uint_type));
 
 	/* void *deallocator;  */
 	tree ddtor = (cd->aggDelete)
@@ -886,7 +886,7 @@ public:
 	if (cd->isCOMinterface ())
 	  flags |= ClassFlags::isCOMclass;
 
-	this->layout_field (size_int (flags));
+	this->layout_field (build_integer_cst (flags, d_uint_type));
 
 	/* void *deallocator;
 	   OffsetTypeInfo[] m_offTi;  (not implemented)
@@ -1019,7 +1019,7 @@ public:
 StructFlags::Type m_flags = 0;
 if (ti->hasPointers ())
   m_flags |= StructFlags::hasPointers;
-this->layout_field (size_int (m_flags));
+this->layout_field (build_integer_cst (m_flags, d_uint_type));
 
 /* void function(void*) xdtor;  */
 tree dtor = (sd->dtor) ? build_address (get_symbol_decl (sd->dtor))
@@ -1033,7 +1033,7 @@ public:
   this->layout_field (null_pointer_node);
 
 /* uint m_align;  */
-this->layout_field (size_int (ti->alignsize ()));
+this->layout_field (build_integer_cst (ti->alignsize (), d_uint_type));
 
 if (global.params.is64bit)
   {
@@ -1489,8 +1489,8 @@ create_typeinfo (Type *type, Module *mod)
   array_type_node, array_type_node,
   ptr_type_node, ptr_type_node,
   ptr_type_node, ptr_type_node,
-  size_type_node, ptr_type_node,
-  ptr_type_node, size_type_node,
+  d_uint_type, ptr_type_node,
+  ptr_type_node, d_uint_type,
   ptr_type_node, argtype, argtype, NULL);
 	}
 	  t->vtinfo = TypeInfoStructDeclaration::create (t);
diff --git a/gcc/testsuite/gdc.dg/runnable.d b/gcc/testsuite/gdc.dg/runnable.d
index e36a2585027..8d9a5868831 100644
--- a/gcc/testsuite/gdc.dg/runnable.d
+++ b/gcc/testsuite/gdc.dg/runnable.d
@@ -890,12 +890,17 @@ struct S186
 }
 }
 
+static if (size_t.sizeof == 8)
+ size_t checkval = 0x0202;
+static if (size_t.sizeof == 4)
+ size_t checkval = 0x0202;
+
 void check186(in S186 obj, byte fieldB)
 {
 assert(obj.fieldA == 2);
 assert(obj.fieldB == 0);
 assert(obj.fieldC == 0);
-assert(obj._complete == 2);
+assert(obj._complete == checkval);
 assert(fieldB == 0);
 }
 
@@ -907,7 +912,7 @@ void test186a(size_t val)
 assert(obj.fieldA == 2);
 assert(obj.fieldB == 0);
 assert(obj.fieldC == 0);
-assert(obj._complete == 2);
+assert(obj._complete == checkval);
 
 obj = S186(val);
 check186(obj, obj.fieldB);
@@ -915,12 +920,12 @@ void test186a(size_t val)
 assert(obj.fieldA == 2);
 assert(obj.fieldB == 0);
 assert(obj.fieldC == 0);
-assert(obj._complete == 2);
+assert(obj._complete == checkval);
 }
 
 void test186()
 {
-test186a(2);
+test186a(checkval);
 }
 
 /**/
diff --git a/gcc/testsuite/gdc.dg/simd.d b/gcc/testsuite/gdc.dg/simd.d
index 812b36649aa..7d0aa0168c0 100644
--- a/gcc/tes

[PATCH] PR translation/90118 Missing space between words

2019-04-18 Thread Christophe Lyon
Hi,

This patch adds the missing space before '%<' in
config/aarch64/aarch64.c and gcc/cp/call.c. It also updates the
check-internal-format-escaping.py checker to warn about such cases.

OK?

Christophe
diff --git a/contrib/check-internal-format-escaping.py 
b/contrib/check-internal-format-escaping.py
index aac4f9e..9c62586 100755
--- a/contrib/check-internal-format-escaping.py
+++ b/contrib/check-internal-format-escaping.py
@@ -58,6 +58,10 @@ for i, l in enumerate(lines):
 print('%s: %s' % (origin, text))
 if re.search("[^%]'", p):
 print('%s: %s' % (origin, text))
+# %< should not be preceded by a non-punctuation
+# %character.
+if re.search("[a-zA-Z0-9]%<", p):
+print('%s: %s' % (origin, text))
 j += 1
 
 origin = None
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9be7548..b66071f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11483,7 +11483,7 @@ aarch64_override_options_internal (struct gcc_options 
*opts)
   if (aarch64_stack_protector_guard == SSP_GLOBAL
   && opts->x_aarch64_stack_protector_guard_offset_str)
 {
-  error ("incompatible options %<-mstack-protector-guard=global%> and"
+  error ("incompatible options %<-mstack-protector-guard=global%> and "
 "%<-mstack-protector-guard-offset=%s%>",
 aarch64_stack_protector_guard_offset_str);
 }
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 9582345..8f3d019 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -3614,16 +3614,16 @@ print_z_candidate (location_t loc, const char *msgstr,
 {
   cloc = loc;
   if (candidate->num_convs == 3)
-   inform (cloc, "%s%<%D(%T, %T, %T)%> ", msg, fn,
+   inform (cloc, "%s %<%D(%T, %T, %T)%> ", msg, fn,
candidate->convs[0]->type,
candidate->convs[1]->type,
candidate->convs[2]->type);
   else if (candidate->num_convs == 2)
-   inform (cloc, "%s%<%D(%T, %T)%> ", msg, fn,
+   inform (cloc, "%s %<%D(%T, %T)%> ", msg, fn,
candidate->convs[0]->type,
candidate->convs[1]->type);
   else
-   inform (cloc, "%s%<%D(%T)%> ", msg, fn,
+   inform (cloc, "%s %<%D(%T)%> ", msg, fn,
candidate->convs[0]->type);
 }
   else if (TYPE_P (fn))
contrib/ChangeLog:

2019-04-18  Christophe Lyon  

PR translation/90118
* check-internal-format-escaping.py: Check that %< is not next to
a word.

gcc/ChangeLog:

2019-04-18  Christophe Lyon  

PR translation/90118
* config/aarch64/aarch64.c (aarch64_override_options_internal):
Add missing space before %<.

gcc/cp/ChangeLog:

2019-04-18  Christophe Lyon  

PR translation/90118
* call.c (print_z_candidate): Add missing space before %<.




Re: [PATCH] avoid aarch64 ICE on large vectors (PR 89797)

2019-04-18 Thread Martin Sebor

On 4/18/19 6:07 AM, Richard Sandiford wrote:

"Richard Earnshaw (lists)"  writes:

On 18/04/2019 03:38, Martin Sebor wrote:

The fix for pr89797 committed in r270326 was limited to targets
with NUM_POLY_INT_COEFFS == 1 which I think is all but aarch64.
The tests for the fix have been failing with an ICE on aarch64
because it suffers from more or less the same problem but in
its own target-specific code.  Attached is the patch I posted
yesterday that fixes the ICE, successfully bootstrapped and
regtested on x86_64-linux.  I also ran the dg.exp=*attr* and
aarch64.exp tests with an aarch64-linux-elf cross-compiler.
There are no ICEs but there are tons of errors in the latter
tests because many (most?) either expect to be able to find
libc headers or link executables (I have not built libc for
aarch64).

I'm around tomorrow but then traveling the next two weeks (with
no connectivity the first week) so I unfortunately won't be able
to fix whatever this change might break until the week of May 6.

Jeff, if you have an aarch64 tester that could verify this patch
tomorrow that would help give us some confidence.  Otherwise,
another option to consider for the time being is to xfail
the tests on aarch64.

Thanks
Martin

gcc-89797.diff

PR middle-end/89797 - ICE on a vector_size (1LU << 33) int variable

gcc/ChangeLog:

PR middle-end/89797
* tree.h (TYPE_VECTOR_SUBPARTS): Correct computation when
NUM_POLY_INT_COEFFS == 2.  Use HOST_WIDE_INT_1U.
* config/aarch64/aarch64.c (aarch64_simd_vector_alignment): Avoid
assuming type size fits in SHWI.

Index: gcc/tree.h
===
--- gcc/tree.h  (revision 270418)
+++ gcc/tree.h  (working copy)
@@ -3735,13 +3735,13 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
if (NUM_POLY_INT_COEFFS == 2)
  {
poly_uint64 res = 0;
-  res.coeffs[0] = 1 << (precision & 0xff);
+  res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff);


I'm not sure I understand this either before or after.  (precision &
0xff) gives a value in the range 0-255, which is way more than can be
accommodated in a left shift, even for a HWI.


For the record: the problem is that we have two coefficients that are
logically both in the range 1<<[0,63], needing 6 bits per coefficient
and 12 in total.  But precision is only 10 bits. :-)  Fortunately,
coeffs[1] is in practice either 0 or equal to coeffs[0], so we can encode
it in a single bit.

A 0x100/0xff split seemed easier to optimise than a 0x40/0x3f split.
E.g. 0xff can be loaded directly from memory as a byte.  There's not
going to be much in it either way though.


if (precision & 0x100)
-   res.coeffs[1] = 1 << (precision & 0xff);
+   res.coeffs[1] = HOST_WIDE_INT_1U << ((precision & 0x100) >> 16);


And this equally doesn't make sense >>16 will shift the masked value out
of the bottom of the result.


Yeah, we should keep the original shift amount and just do
s/1/HOST_WIDE_INT_1U/.


Sorry, I confused things with the 16 bit shift thinko (I meant
to shift by 8 bits).  But I don't understand how poly_int works
at all so the shift was just a wild guess thinking the purpose
of the split was to allow for precisions in excess of (1 << 63).

But I'm afraid I still don't understand when (precision & 0x100)
could ever hold.  I put in an abort in that branch and ran
the test suite but didn't manage to trigger an ICE.  Can you
give me an example when it might trigger?  (I'd also like to
add a comment to the function to explain it.)

The test case I'm fixing is:

 attribute ((vector_size (1LLU << 62))) char v;

Greater values are rejected with:

  error: ‘vector_size’ attribute argument value ‘9223372036854775808’ 
exceeds 9223372036854775807



Also, if we set res.coeffs[1], don't we want refs.coefs[0] to be set to 0?

R.


return res;
  }
else
-return (unsigned HOST_WIDE_INT)1 << precision;
+return HOST_WIDE_INT_1U << precision;
  }
  
  /* Set the number of elements in VECTOR_TYPE NODE to SUBPARTS, which must

Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c(revision 270418)
+++ gcc/config/aarch64/aarch64.c(working copy)
@@ -14924,7 +14924,10 @@ aarch64_simd_vector_alignment (const_tree type)
 be set for non-predicate vectors of booleans.  Modes are the most
 direct way we have of identifying real SVE predicate types.  */
  return GET_MODE_CLASS (TYPE_MODE (type)) == MODE_VECTOR_BOOL ? 16 : 128;
-  HOST_WIDE_INT align = tree_to_shwi (TYPE_SIZE (type));
+  tree size = TYPE_SIZE (type);
+  unsigned HOST_WIDE_INT align = 128;
+  if (tree_fits_uhwi_p (size))
+align = tree_to_uhwi (TYPE_SIZE (type));
return MIN (align, 128);


I guess this is personal preference, sorry, but:

   return wi::umin (wi::to_wide (TYPE_SIZE (type)), 128).to_uhwi ();

seems more direct and obvious than:

   tree size 

[PATCH][IRA] Fix PR87871: [9 Regression] testcases fail after r265398 on arm

2019-04-18 Thread Peter Bergner
PR87871 exposes a couple of problems.  One problem fixed here is that IRA
incorrectly adds a conflict in some cases where we have a simple register
copy between a pseudo reg and a hard reg.  This stopped us from assigning
the pseudo reg to that hard reg which we wanted to do in the testsuite test
case shown in the bugzilla.  The bug is due to an oversight in my r264897
commit that added support for not adding conflicts for simple register
copies.  That code correctly didn't add a conflict to CONFLICT_HARD_REGS
in make_object_dead(), but failed to do the same for TOTAL_CONFLICT_HARD_REGS.
The patch below fixes that oversight.

I have confirmed we now assign pseudo p116 to r0 in the ARM test case
as well as a similar assignment issue on POWER.

This passed bootstrap and regtesting with no regressions on powerpc64le-linux.
Ok for mainline?

Peter

PR rtl-optimization/87871
* ira-lives.c (make_object_dead): Don't add conflicts to
TOTAL_CONFLICT_HARD_REGS for register ignore_reg_for_conflicts.

Index: gcc/ira-lives.c
===
--- gcc/ira-lives.c (revision 270420)
+++ gcc/ira-lives.c (working copy)
@@ -163,7 +163,9 @@ static void
 make_object_dead (ira_object_t obj)
 {
   live_range_t lr;
+  int regno;
   int ignore_regno = -1;
+  int ignore_total_regno = -1;
   int end_regno = -1;
 
   sparseset_clear_bit (objects_live, OBJECT_CONFLICT_ID (obj));
@@ -174,16 +176,14 @@ make_object_dead (ira_object_t obj)
   && REGNO (ignore_reg_for_conflicts) < FIRST_PSEUDO_REGISTER)
 {
   end_regno = END_REGNO (ignore_reg_for_conflicts);
-  int src_regno = ignore_regno = REGNO (ignore_reg_for_conflicts);
+  ignore_regno = ignore_total_regno = REGNO (ignore_reg_for_conflicts);
 
-  while (src_regno < end_regno)
+  for (regno = ignore_regno; regno < end_regno; regno++)
{
- if (TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), src_regno))
-   {
- ignore_regno = end_regno = -1;
- break;
-   }
- src_regno++;
+ if (TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno))
+   ignore_regno = end_regno;
+ if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno))
+   ignore_total_regno = end_regno;
}
 }
 
@@ -192,8 +192,10 @@ make_object_dead (ira_object_t obj)
 
   /* If IGNORE_REG_FOR_CONFLICTS did not already conflict with OBJ, make
  sure it still doesn't.  */
-  for (; ignore_regno < end_regno; ignore_regno++)
-CLEAR_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), ignore_regno);
+  for (regno = ignore_regno; regno < end_regno; regno++)
+CLEAR_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);
+  for (regno = ignore_total_regno; regno < end_regno; regno++)
+CLEAR_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno);
 
   lr = OBJECT_LIVE_RANGES (obj);
   ira_assert (lr != NULL);



Re: [PATCH] avoid aarch64 ICE on large vectors (PR 89797)

2019-04-18 Thread Richard Sandiford
Martin Sebor  writes:
> On 4/18/19 6:07 AM, Richard Sandiford wrote:
>> "Richard Earnshaw (lists)"  writes:
>>> On 18/04/2019 03:38, Martin Sebor wrote:
 The fix for pr89797 committed in r270326 was limited to targets
 with NUM_POLY_INT_COEFFS == 1 which I think is all but aarch64.
 The tests for the fix have been failing with an ICE on aarch64
 because it suffers from more or less the same problem but in
 its own target-specific code.  Attached is the patch I posted
 yesterday that fixes the ICE, successfully bootstrapped and
 regtested on x86_64-linux.  I also ran the dg.exp=*attr* and
 aarch64.exp tests with an aarch64-linux-elf cross-compiler.
 There are no ICEs but there are tons of errors in the latter
 tests because many (most?) either expect to be able to find
 libc headers or link executables (I have not built libc for
 aarch64).

 I'm around tomorrow but then traveling the next two weeks (with
 no connectivity the first week) so I unfortunately won't be able
 to fix whatever this change might break until the week of May 6.

 Jeff, if you have an aarch64 tester that could verify this patch
 tomorrow that would help give us some confidence.  Otherwise,
 another option to consider for the time being is to xfail
 the tests on aarch64.

 Thanks
 Martin

 gcc-89797.diff

 PR middle-end/89797 - ICE on a vector_size (1LU << 33) int variable

 gcc/ChangeLog:

PR middle-end/89797
* tree.h (TYPE_VECTOR_SUBPARTS): Correct computation when
NUM_POLY_INT_COEFFS == 2.  Use HOST_WIDE_INT_1U.
* config/aarch64/aarch64.c (aarch64_simd_vector_alignment): Avoid
assuming type size fits in SHWI.

 Index: gcc/tree.h
 ===
 --- gcc/tree.h (revision 270418)
 +++ gcc/tree.h (working copy)
 @@ -3735,13 +3735,13 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
 if (NUM_POLY_INT_COEFFS == 2)
   {
 poly_uint64 res = 0;
 -  res.coeffs[0] = 1 << (precision & 0xff);
 +  res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff);
>>>
>>> I'm not sure I understand this either before or after.  (precision &
>>> 0xff) gives a value in the range 0-255, which is way more than can be
>>> accommodated in a left shift, even for a HWI.
>> 
>> For the record: the problem is that we have two coefficients that are
>> logically both in the range 1<<[0,63], needing 6 bits per coefficient
>> and 12 in total.  But precision is only 10 bits. :-)  Fortunately,
>> coeffs[1] is in practice either 0 or equal to coeffs[0], so we can encode
>> it in a single bit.
>> 
>> A 0x100/0xff split seemed easier to optimise than a 0x40/0x3f split.
>> E.g. 0xff can be loaded directly from memory as a byte.  There's not
>> going to be much in it either way though.
>> 
 if (precision & 0x100)
 -  res.coeffs[1] = 1 << (precision & 0xff);
 +  res.coeffs[1] = HOST_WIDE_INT_1U << ((precision & 0x100) >> 16);
>>>
>>> And this equally doesn't make sense >>16 will shift the masked value out
>>> of the bottom of the result.
>> 
>> Yeah, we should keep the original shift amount and just do
>> s/1/HOST_WIDE_INT_1U/.
>
> Sorry, I confused things with the 16 bit shift thinko (I meant
> to shift by 8 bits).  But I don't understand how poly_int works
> at all so the shift was just a wild guess thinking the purpose
> of the split was to allow for precisions in excess of (1 << 63).
>
> But I'm afraid I still don't understand when (precision & 0x100)
> could ever hold.  I put in an abort in that branch and ran
> the test suite but didn't manage to trigger an ICE.  Can you
> give me an example when it might trigger?  (I'd also like to
> add a comment to the function to explain it.)

The function is decoding the precision installed by:

inline void
SET_TYPE_VECTOR_SUBPARTS (tree node, poly_uint64 subparts)
{
  STATIC_ASSERT (NUM_POLY_INT_COEFFS <= 2);
  unsigned HOST_WIDE_INT coeff0 = subparts.coeffs[0];
  int index = exact_log2 (coeff0);
  gcc_assert (index >= 0);
  if (NUM_POLY_INT_COEFFS == 2)
{
  unsigned HOST_WIDE_INT coeff1 = subparts.coeffs[1];
  gcc_assert (coeff1 == 0 || coeff1 == coeff0);
  VECTOR_TYPE_CHECK (node)->type_common.precision
= index + (coeff1 != 0 ? 0x100 : 0);
}
  else
VECTOR_TYPE_CHECK (node)->type_common.precision = index;
}

Code protected by NUM_POLY_INT_COEFFS == 2 will only trigger on
AArch64 targets, so an x86 test run won't cover it.  The assert you
added would blow up on pretty much all of aarch64-sve.exp in an AArch64
test run though; see Christophe's results for what happened with the
original patch.

I certainly don't mind adding more commentary, but please keep that
separate from the PR fix, which is simply about changing "1 <<" to
"HOST_WIDE_INT_1U <<".  I think we went down a bit of a rabbit hole
with this enc

Re: [PATCH] PR translation/90118 Missing space between words

2019-04-18 Thread Richard Sandiford
Christophe Lyon  writes:
> Hi,
>
> This patch adds the missing space before '%<' in
> config/aarch64/aarch64.c and gcc/cp/call.c. It also updates the
> check-internal-format-escaping.py checker to warn about such cases.
>
> OK?
>
> Christophe
>
> diff --git a/contrib/check-internal-format-escaping.py 
> b/contrib/check-internal-format-escaping.py
> index aac4f9e..9c62586 100755
> --- a/contrib/check-internal-format-escaping.py
> +++ b/contrib/check-internal-format-escaping.py
> @@ -58,6 +58,10 @@ for i, l in enumerate(lines):
>  print('%s: %s' % (origin, text))
>  if re.search("[^%]'", p):
>  print('%s: %s' % (origin, text))
> +# %< should not be preceded by a non-punctuation
> +# %character.
> +if re.search("[a-zA-Z0-9]%<", p):
> +print('%s: %s' % (origin, text))
>  j += 1
>  
>  origin = None
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 9be7548..b66071f 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -11483,7 +11483,7 @@ aarch64_override_options_internal (struct gcc_options 
> *opts)
>if (aarch64_stack_protector_guard == SSP_GLOBAL
>&& opts->x_aarch64_stack_protector_guard_offset_str)
>  {
> -  error ("incompatible options %<-mstack-protector-guard=global%> and"
> +  error ("incompatible options %<-mstack-protector-guard=global%> and "
>"%<-mstack-protector-guard-offset=%s%>",
>aarch64_stack_protector_guard_offset_str);
>  }
> diff --git a/gcc/cp/call.c b/gcc/cp/call.c
> index 9582345..8f3d019 100644
> --- a/gcc/cp/call.c
> +++ b/gcc/cp/call.c
> @@ -3614,16 +3614,16 @@ print_z_candidate (location_t loc, const char *msgstr,
>  {
>cloc = loc;
>if (candidate->num_convs == 3)
> - inform (cloc, "%s%<%D(%T, %T, %T)%> ", msg, fn,
> + inform (cloc, "%s %<%D(%T, %T, %T)%> ", msg, fn,
>   candidate->convs[0]->type,
>   candidate->convs[1]->type,
>   candidate->convs[2]->type);
>else if (candidate->num_convs == 2)
> - inform (cloc, "%s%<%D(%T, %T)%> ", msg, fn,
> + inform (cloc, "%s %<%D(%T, %T)%> ", msg, fn,
>   candidate->convs[0]->type,
>   candidate->convs[1]->type);
>else
> - inform (cloc, "%s%<%D(%T)%> ", msg, fn,
> + inform (cloc, "%s %<%D(%T)%> ", msg, fn,
>   candidate->convs[0]->type);
>  }
>else if (TYPE_P (fn))

I don't think this is right, since msg already has a space where necessary:

  const char *msg = (msgstr == NULL
 ? ""
 : ACONCAT ((msgstr, " ", NULL)));

Adding something like "(^| )[^% ]*" to the start of the regexp might
avoid that, at the risk of letting through real problems.

The aarch64.c change is definitely OK though, thanks for the catch.

Richard


[C++ PATCH] PR c++/87554 - ICE with extern template and reference member.

2019-04-18 Thread Jason Merrill
The removed code ended up setting DECL_INITIAL to the INIT_EXPR returned by
split_nonconstant_init, which makes no sense.  This code was added back in
1996, so any rationale is long lost.

Tested x86_64-pc-linux-gnu, applying to trunk.

* decl.c (cp_finish_decl): Don't set DECL_INITIAL of external vars.
---
 gcc/cp/decl.c |  3 +-
 .../g++.dg/cpp0x/extern_template-5.C  | 36 +++
 gcc/cp/ChangeLog  |  5 +++
 3 files changed, 42 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/extern_template-5.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 67d9244c450..420d6d896ec 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -7354,8 +7354,7 @@ cp_finish_decl (tree decl, tree init, bool 
init_const_expr_p,
   && ! (DECL_LANG_SPECIFIC (decl)
 && DECL_NOT_REALLY_EXTERN (decl)))
{
- if (init)
-   DECL_INITIAL (decl) = init;
+ /* check_initializer will have done any constant initialization.  */
}
   /* A variable definition.  */
   else if (DECL_FUNCTION_SCOPE_P (decl) && !TREE_STATIC (decl))
diff --git a/gcc/testsuite/g++.dg/cpp0x/extern_template-5.C 
b/gcc/testsuite/g++.dg/cpp0x/extern_template-5.C
new file mode 100644
index 000..5d412489fb2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/extern_template-5.C
@@ -0,0 +1,36 @@
+// PR c++/87554
+// { dg-options "-O" }
+
+template < class a > class b {
+  static void c(a);
+  static a &create() { c(instance); return mya; }
+
+  static a mya;
+
+public:
+  static a d() { create(); return a(); }
+  static a &instance;
+};
+template < class a > a &b< a >::instance = create();
+class e;
+class f {
+public:
+  void operator()(int g) { h(g); }
+  template < class a > void h(a i) { p(j, i); }
+  e *j;
+};
+class e : public f {
+public:
+  e(int);
+};
+struct k {
+  int l;
+};
+template < class m, class a > void p(m, a) { b< k >::d(); }
+extern template class b< k >;
+int n;
+int o;
+void test() {
+  e out(o);
+  out(n);
+}
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index bfaf355ebb7..7bb464f8aa7 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,8 @@
+2019-04-18  Jason Merrill  
+
+   PR c++/87554 - ICE with extern template and reference member.
+   * decl.c (cp_finish_decl): Don't set DECL_INITIAL of external vars.
+
 2019-04-17  Jason Merrill  
 
PR c++/90047 - ICE with enable_if alias template.

base-commit: 1c51e7c89b58cb6a548790f988b080a84ea82914
-- 
2.20.1



New German PO file for 'gcc' (version 9.1-b20190414)

2019-04-18 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

https://translationproject.org/latest/gcc/de.po

(This file, 'gcc-9.1-b20190414.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Fix inliner ICE with flattening

2019-04-18 Thread Jan Hubicka
Hi,
the testcase (which I fialed to annotate correctly for testsuite)
triggers situation where we forget to update overall summary after
flattenin and later ICE in verification that estimates match.

Bootstrapped/regtested x86_64-linux, comitted.

Index: ChangeLog
===
--- ChangeLog   (revision 270444)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2019-04-18  Jan Hubicka  
+
+   PR ipa/85051
+   * ipa-inline.c (flatten_function): New parameter UPDATE.
+   (ipa_inline, early_inliner): Use it.
+
 2019-04-18  Richard Sandiford  
 
* fold-const.c (int_const_binop): Return early on failure.
Index: ipa-inline.c
===
--- ipa-inline.c(revision 270444)
+++ ipa-inline.c(working copy)
@@ -2134,7 +2134,7 @@ inline_small_functions (void)
at IPA inlining time.  */
 
 static void
-flatten_function (struct cgraph_node *node, bool early)
+flatten_function (struct cgraph_node *node, bool early, bool update)
 {
   struct cgraph_edge *e;
 
@@ -2164,7 +2164,7 @@ flatten_function (struct cgraph_node *no
 it in order to fully flatten the leaves.  */
   if (!e->inline_failed)
{
- flatten_function (callee, early);
+ flatten_function (callee, early, false);
  continue;
}
 
@@ -2204,14 +2204,15 @@ flatten_function (struct cgraph_node *no
   inline_call (e, true, NULL, NULL, false);
   if (e->callee != orig_callee)
orig_callee->aux = (void *) node;
-  flatten_function (e->callee, early);
+  flatten_function (e->callee, early, false);
   if (e->callee != orig_callee)
orig_callee->aux = NULL;
 }
 
   node->aux = NULL;
-  if (!node->global.inlined_to)
-ipa_update_overall_fn_summary (node);
+  if (update)
+ipa_update_overall_fn_summary (node->global.inlined_to
+  ? node->global.inlined_to : node);
 }
 
 /* Inline NODE to all callers.  Worker for cgraph_for_node_and_aliases.
@@ -2519,7 +2520,7 @@ ipa_inline (void)
 function.  */
   if (dump_file)
fprintf (dump_file, "Flattening %s\n", node->name ());
-  flatten_function (node, false);
+  flatten_function (node, false, true);
 }
 
   if (j < nnodes - 2)
@@ -2782,7 +2783,7 @@ early_inliner (function *fun)
   if (dump_enabled_p ())
dump_printf (MSG_OPTIMIZED_LOCATIONS,
 "Flattening %C\n", node);
-  flatten_function (node, true);
+  flatten_function (node, true, true);
   inlined = true;
 }
   else


Re: [C++ Patch/RFC] PR 89900 ("[9 Regression] ICE: Segmentation fault (in check_instantiated_arg)")

2019-04-18 Thread Jason Merrill
On Mon, Apr 15, 2019 at 1:08 PM Paolo Carlini  wrote:
>
> Hi,
>
> On 12/04/19 20:29, Jason Merrill wrote:
> > On 4/11/19 11:20 AM, Paolo Carlini wrote:
> >> Hi,
> >>
> >> over the last few days I spent some time on this regression, which at
> >> first seemed just a minor error-recovery issue, but then I noticed
> >> that very slightly tweeking the original testcase uncovered a pretty
> >> serious ICE on valid:
> >>
> >> template void
> >> fk (XE..., int/*SW*/);
> >>
> >> void
> >> w9 (void)
> >> {
> >>fk (0);
> >> }
> >>
> >> The regression has to do with the changes committed by Jason for
> >> c++/86932, in particular with the condition in coerce_template_parms:
> >>
> >> if (template_parameter_pack_p (TREE_VALUE (parm))
> >>&& (arg || !(complain & tf_partial))
> >>&& !(arg && ARGUMENT_PACK_P (arg)))
> >>
> >> which has the additional (arg || !complain & tf_partial)) false for
> >> the present testcase, thus the null arg is not changed into an empty
> >> pack, thus later  instantiate_template calls check_instantiated_args
> >> which finds it still null and crashes. Now, likely some additional
> >> analysis is in order, but for sure there is an important difference
> >> between the testcase which came with c++/86932 and the above:
> >> non-type vs type template parameter pack. It seems to me that the
> >> kind of problem fixed in c++/86932 cannot occur with type packs,
> >> because it boils down to a reference to a previous parm (full
> >> disclosure: the comments and logic in fixed_parameter_pack_p helped
> >> me a lot here). Thus I had the idea of simply restricting the scope
> >> of the new condition above by adding an || TREE_CODE (TREE_VALUE
> >> (parm)) == TYPE_DECL, which definitely leads to a clean testsuite and
> >> a proper behavior on the new testcases, AFAICS. I'm attaching what I
> >> tested on x86_64-linux.
> >
> > I think the important property here is that it's non-terminal, not
> > that it's a type pack.  We can't deduce anything for a non-terminal
> > pack, so we should go ahead and make an empty pack.
>
> I see.
>
> Then what about something bolder, like the below? Instead of fiddling
> with the details of coerce_template_parms - how it handles the explicit
> arguments - in fn_type_unification we deal with both parameter_pack ==
> true and false in the same way when targ == NULL_TREE, thus we set
> incomplete. Then, for the new testcases, since incomplete is true, there
> is no jump to the deduced label and type_unification_real takes care of
> making the empty pack - the same happens already when there are no
> explicit arguments. Tested x86_64-linux. I also checked quite a few
> other variants of the tests but nothing new, nothing interesting, showed
> up...

OK.

Jason


Re: [PATCH v2] Add error message for target_clones and AVX512 ISAs (PR target/89929).

2019-04-18 Thread H.J. Lu
On Thu, Apr 18, 2019 at 4:07 AM Martin Liška  wrote:
>
> Hi.
>
> I'm sending updated version of that patch. The patch rejects usage of AVX512 
> ISAs (except AVX512F)
> for target attribute for C++ multiversioning and for target_clone attribute.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin

Your patch doesn't handle cmov nor gfni properly, which aren't AVX512.
I prefer this patch.

-- 
H.J.
From 6089577be432bdd22fc89bf40e363c4af6876987 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 18 Apr 2019 09:49:58 -0700
Subject: [PATCH] x86: Update message for target_clones and unsupported ISAs

Before AVX512F, processors with the newer ISAs also support the older
ISAs, i.e., AVX2 processors also support AVX and SSE4, SSE4 processors
also support SSSE3, ...   After AVX512F, an AVX512XX processor may not
support AVX512YY.  It means AVX512XX features, except for AVX512F, can't
be used to decide priority in target_clones.

This patch updates error message for ISAs with P_ZERO priority.  It also
merges _feature_list into _isa_names_table and marks ISAs, which have
unknown priority, with P_ZERO so that we only need to update one place
to add a new ISA feature.

gcc/

2019-04-18  H.J. Lu  

	PR target/89929
	* config/i386/i386.c (feature_priority): Moved to file scope.
	(processor_features): Likewise.
	(processor_model): Likewise.
	(_arch_names_table): Likewise.
	(arch_names_table): Likewise.
	(_feature_list): Removed.
	(feature_list): Likewise.
	(_isa_names_table): Moved to file scope.  Add priority.
	(isa_names_table): Likewise.
	(get_builtin_code_for_version): Replace feature_list with
	isa_names_table.  Update error message for P_ZERO priority.

gcc/testsuite/

2019-04-18  Martin Liska  

	PR target/89929
	* g++.target/i386/mv28.C: New test.
	* gcc.target/i386/mvc14.c: New test.
---
 gcc/config/i386/i386.c| 490 --
 gcc/testsuite/g++.target/i386/mv28.C  |  26 ++
 gcc/testsuite/gcc.target/i386/mvc14.c |  16 +
 3 files changed, 274 insertions(+), 258 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/mv28.C
 create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 498a35d8aea..198af816f74 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -31832,6 +31832,229 @@ add_condition_to_bb (tree function_decl, tree version_decl,
   return bb3;
 }
 
+/* Priority of i386 features, greater value is higher priority.   This is
+   used to decide the order in which function dispatch must happen.  For
+   instance, a version specialized for SSE4.2 should be checked for dispatch
+   before a version for SSE3, as SSE4.2 implies SSE3.  */
+enum feature_priority
+{
+  P_ZERO = 0,
+  P_MMX,
+  P_SSE,
+  P_SSE2,
+  P_SSE3,
+  P_SSSE3,
+  P_PROC_SSSE3,
+  P_SSE4_A,
+  P_PROC_SSE4_A,
+  P_SSE4_1,
+  P_SSE4_2,
+  P_PROC_SSE4_2,
+  P_POPCNT,
+  P_AES,
+  P_PCLMUL,
+  P_AVX,
+  P_PROC_AVX,
+  P_BMI,
+  P_PROC_BMI,
+  P_FMA4,
+  P_XOP,
+  P_PROC_XOP,
+  P_FMA,
+  P_PROC_FMA,
+  P_BMI2,
+  P_AVX2,
+  P_PROC_AVX2,
+  P_AVX512F,
+  P_PROC_AVX512F
+};
+
+/* This is the order of bit-fields in __processor_features in cpuinfo.c */
+enum processor_features
+{
+  F_CMOV = 0,
+  F_MMX,
+  F_POPCNT,
+  F_SSE,
+  F_SSE2,
+  F_SSE3,
+  F_SSSE3,
+  F_SSE4_1,
+  F_SSE4_2,
+  F_AVX,
+  F_AVX2,
+  F_SSE4_A,
+  F_FMA4,
+  F_XOP,
+  F_FMA,
+  F_AVX512F,
+  F_BMI,
+  F_BMI2,
+  F_AES,
+  F_PCLMUL,
+  F_AVX512VL,
+  F_AVX512BW,
+  F_AVX512DQ,
+  F_AVX512CD,
+  F_AVX512ER,
+  F_AVX512PF,
+  F_AVX512VBMI,
+  F_AVX512IFMA,
+  F_AVX5124VNNIW,
+  F_AVX5124FMAPS,
+  F_AVX512VPOPCNTDQ,
+  F_AVX512VBMI2,
+  F_GFNI,
+  F_VPCLMULQDQ,
+  F_AVX512VNNI,
+  F_AVX512BITALG,
+  F_MAX
+};
+
+/* These are the values for vendor types and cpu types  and subtypes
+   in cpuinfo.c.  Cpu types and subtypes should be subtracted by
+   the corresponding start value.  */
+enum processor_model
+{
+  M_INTEL = 1,
+  M_AMD,
+  M_CPU_TYPE_START,
+  M_INTEL_BONNELL,
+  M_INTEL_CORE2,
+  M_INTEL_COREI7,
+  M_AMDFAM10H,
+  M_AMDFAM15H,
+  M_INTEL_SILVERMONT,
+  M_INTEL_KNL,
+  M_AMD_BTVER1,
+  M_AMD_BTVER2,
+  M_AMDFAM17H,
+  M_INTEL_KNM,
+  M_INTEL_GOLDMONT,
+  M_INTEL_GOLDMONT_PLUS,
+  M_INTEL_TREMONT,
+  M_CPU_SUBTYPE_START,
+  M_INTEL_COREI7_NEHALEM,
+  M_INTEL_COREI7_WESTMERE,
+  M_INTEL_COREI7_SANDYBRIDGE,
+  M_AMDFAM10H_BARCELONA,
+  M_AMDFAM10H_SHANGHAI,
+  M_AMDFAM10H_ISTANBUL,
+  M_AMDFAM15H_BDVER1,
+  M_AMDFAM15H_BDVER2,
+  M_AMDFAM15H_BDVER3,
+  M_AMDFAM15H_BDVER4,
+  M_AMDFAM17H_ZNVER1,
+  M_INTEL_COREI7_IVYBRIDGE,
+  M_INTEL_COREI7_HASWELL,
+  M_INTEL_COREI7_BROADWELL,
+  M_INTEL_COREI7_SKYLAKE,
+  M_INTEL_COREI7_SKYLAKE_AVX512,
+  M_INTEL_COREI7_CANNONLAKE,
+  M_INTEL_COREI7_ICELAKE_CLIENT,
+  M_INTEL_COREI7_ICELAKE_SERVER,
+  M_AMDFAM17H_ZNVER2,
+  M_INTEL_COREI7_CASCADELAKE
+};
+
+struct _arch_names_table
+{
+  const char *const name;
+  const enum processor_model model;

Re: [PATCH] avoid aarch64 ICE on large vectors (PR 89797)

2019-04-18 Thread Martin Sebor

On 4/18/19 9:59 AM, Richard Sandiford wrote:

Martin Sebor  writes:

On 4/18/19 6:07 AM, Richard Sandiford wrote:

"Richard Earnshaw (lists)"  writes:

On 18/04/2019 03:38, Martin Sebor wrote:

The fix for pr89797 committed in r270326 was limited to targets
with NUM_POLY_INT_COEFFS == 1 which I think is all but aarch64.
The tests for the fix have been failing with an ICE on aarch64
because it suffers from more or less the same problem but in
its own target-specific code.  Attached is the patch I posted
yesterday that fixes the ICE, successfully bootstrapped and
regtested on x86_64-linux.  I also ran the dg.exp=*attr* and
aarch64.exp tests with an aarch64-linux-elf cross-compiler.
There are no ICEs but there are tons of errors in the latter
tests because many (most?) either expect to be able to find
libc headers or link executables (I have not built libc for
aarch64).

I'm around tomorrow but then traveling the next two weeks (with
no connectivity the first week) so I unfortunately won't be able
to fix whatever this change might break until the week of May 6.

Jeff, if you have an aarch64 tester that could verify this patch
tomorrow that would help give us some confidence.  Otherwise,
another option to consider for the time being is to xfail
the tests on aarch64.

Thanks
Martin

gcc-89797.diff

PR middle-end/89797 - ICE on a vector_size (1LU << 33) int variable

gcc/ChangeLog:

PR middle-end/89797
* tree.h (TYPE_VECTOR_SUBPARTS): Correct computation when
NUM_POLY_INT_COEFFS == 2.  Use HOST_WIDE_INT_1U.
* config/aarch64/aarch64.c (aarch64_simd_vector_alignment): Avoid
assuming type size fits in SHWI.

Index: gcc/tree.h
===
--- gcc/tree.h  (revision 270418)
+++ gcc/tree.h  (working copy)
@@ -3735,13 +3735,13 @@ TYPE_VECTOR_SUBPARTS (const_tree node)
 if (NUM_POLY_INT_COEFFS == 2)
   {
 poly_uint64 res = 0;
-  res.coeffs[0] = 1 << (precision & 0xff);
+  res.coeffs[0] = HOST_WIDE_INT_1U << (precision & 0xff);


I'm not sure I understand this either before or after.  (precision &
0xff) gives a value in the range 0-255, which is way more than can be
accommodated in a left shift, even for a HWI.


For the record: the problem is that we have two coefficients that are
logically both in the range 1<<[0,63], needing 6 bits per coefficient
and 12 in total.  But precision is only 10 bits. :-)  Fortunately,
coeffs[1] is in practice either 0 or equal to coeffs[0], so we can encode
it in a single bit.

A 0x100/0xff split seemed easier to optimise than a 0x40/0x3f split.
E.g. 0xff can be loaded directly from memory as a byte.  There's not
going to be much in it either way though.


 if (precision & 0x100)
-   res.coeffs[1] = 1 << (precision & 0xff);
+   res.coeffs[1] = HOST_WIDE_INT_1U << ((precision & 0x100) >> 16);


And this equally doesn't make sense >>16 will shift the masked value out
of the bottom of the result.


Yeah, we should keep the original shift amount and just do
s/1/HOST_WIDE_INT_1U/.


Sorry, I confused things with the 16 bit shift thinko (I meant
to shift by 8 bits).  But I don't understand how poly_int works
at all so the shift was just a wild guess thinking the purpose
of the split was to allow for precisions in excess of (1 << 63).

But I'm afraid I still don't understand when (precision & 0x100)
could ever hold.  I put in an abort in that branch and ran
the test suite but didn't manage to trigger an ICE.  Can you
give me an example when it might trigger?  (I'd also like to
add a comment to the function to explain it.)


The function is decoding the precision installed by:

inline void
SET_TYPE_VECTOR_SUBPARTS (tree node, poly_uint64 subparts)
{
   STATIC_ASSERT (NUM_POLY_INT_COEFFS <= 2);
   unsigned HOST_WIDE_INT coeff0 = subparts.coeffs[0];
   int index = exact_log2 (coeff0);
   gcc_assert (index >= 0);
   if (NUM_POLY_INT_COEFFS == 2)
 {
   unsigned HOST_WIDE_INT coeff1 = subparts.coeffs[1];
   gcc_assert (coeff1 == 0 || coeff1 == coeff0);
   VECTOR_TYPE_CHECK (node)->type_common.precision
= index + (coeff1 != 0 ? 0x100 : 0);


Ugh.  It never occured to me that precision was being used like
this.  Even though the functions are one right after the other,
a brief comment would have helped avoid this misunderstanding.
Let me propose one separately.


 }
   else
 VECTOR_TYPE_CHECK (node)->type_common.precision = index;
}

Code protected by NUM_POLY_INT_COEFFS == 2 will only trigger on
AArch64 targets, so an x86 test run won't cover it.  The assert you
added would blow up on pretty much all of aarch64-sve.exp in an AArch64
test run though; see Christophe's results for what happened with the
original patch.


Ah!  aarch64-sve.exp explains it.  I only ran aarch64.exp.  There,
there's just one test that triggers an ICE (I missed it at first):
  aarch64/stack-check-prologue-16.c
But many of the aarch64-sve.exp do fail with the abort 

[PATCH v2] Use builtin sort instead of shell sort

2019-04-18 Thread Émeric Dupont
Some build environments and configuration options may lead to the make
variable PLUGIN_HEADERS being too long to be passed as parameters to the
shell `echo` command, leading to a "write error" message when making the
target install-plugin.

The following patch fixes this issue by using the [Make $(sort list)][1]
function instead to remove duplicates from the list of headers. There is
no functional change, the value assigned to the shell variable is the
same.

Tested in production on x86 and armv7 cross-compilation toolchains.
- The length of the headers variable goes from 8+ chars to 7500+

Tested with make bootstrap and make check on host-x86_64-pc-linux-gnu
- make bootstrap successful
- make check fails even before the patch is applied

WARNING: program timed out.
FAIL: libgomp.c/../libgomp.c-c++-common/cancel-parallel-1.c execution 
test
...
make[4]: *** [Makefile:479: check-DEJAGNU] Error 1

2019-04-15  Emeric Dupont  

* Makefile.in: Use builtin sort instead of shell sort

[1]: 
https://www.gnu.org/software/make/manual/html_node/Text-Functions.html#index-sorting-words

Signed-off-by: Emeric Dupont 
---
 gcc/Makefile.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d186d71c91e..3196e774a26 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3538,7 +3538,7 @@ install-plugin: installdirs lang.install-plugin 
s-header-vars install-gengtype
 # We keep the directory structure for files in config or c-family and .def
 # files. All other files are flattened to a single directory.
 $(mkinstalldirs) $(DESTDIR)$(plugin_includedir)
-headers=`echo $(PLUGIN_HEADERS) $$(cd $(srcdir); echo *.h *.def) | tr ' ' 
'\012' | sort -u`; \
+headers=$(sort $(PLUGIN_HEADERS) $$(cd $(srcdir); echo *.h *.def)); \
 srcdirstrip=`echo "$(srcdir)" | sed 's/[].[^$$\\*|]/&/g'`; \
 for file in $$headers; do \
   if [ -f $$file ] ; then \
--
2.21.0




TriaGnoSys GmbH, Registergericht: München HRB 141647, Vat.: DE 813396184 
Geschäftsführer: Núria Riera Díaz, Peter Lewalter



This email and any files transmitted with it are confidential & proprietary to 
Zodiac Inflight Innovations. This information is intended solely for the use of 
the individual or entity to which it is addressed. Access or transmittal of the 
information contained in this e-mail, in full or in part, to any other 
organization or persons is not authorized.


Re: Adding noexcept-specification on tuple constructors (LWG 2899)

2019-04-18 Thread Jonathan Wakely

On 16/04/19 17:59 +0100, Nina Dinka Ranns wrote:

On Tue, 16 Apr 2019 at 15:18, Jonathan Wakely  wrote:


On 16/04/19 14:08 +0100, Nina Dinka Ranns wrote:
>Tested on Linux-PPC64
>Adding noexcept-specification on tuple constructors (LWG 2899)

Thanks, Nina!

This looks great, although as I think Ville has explained we won't
commit it until the next stage 1, after the GCC 9 release.

ack



The changes look good, I just have some mostly-stylistic comments,
which are inline below ...


>2019-04-13 Nina Dinka Ranns 
>
>Adding noexcept-specification on tuple constructors (LWG 2899)
>* libstdc++-v3/include/std/tuple:
>(tuple()): Add noexcept-specification.
>(tuple(const _Elements&...)): Likewise
>(tuple(_UElements&&...)): Likewise
>(tuple(const tuple<_UElements...>&)): Likewise
>(tuple(tuple<_UElements...>&&)): Likewise
>(tuple(const _T1&, const _T2&)): Likewise
>(tuple(_U1&&, _U2&&)): Likewise
>(tuple(const tuple<_U1, _U2>&): Likewise
>(tuple(tuple<_U1, _U2>&&): Likewise
>(tuple(const pair<_U1, _U2>&): Likewise
>(tuple(pair<_U1, _U2>&&): Likewise
>
>

There should be no blank lines in the changelog entry here. A single
change should be recorded as a single block in the changelog, with no
blank lines within it.

ack. Do you need me to do anything about this or is it for future
reference only ?


For future reference. Whoever commits the patch can correct the
changelog.



>* libstdc++-v3/testsuite/20_util/tuple/cons/noexcept_specs.cc: New
>* libstdc++-v3/testsuite/20_util/tuple/cons/noexcept_specs2.cc: New
>* libstdc++-v3/testsuite/20_util/tuple/cons/noexcept_specs3.cc: New
>* libstdc++-v3/testsuite/20_util/tuple/cons/noexcept_specs4.cc: New
>* libstdc++-v3/testsuite/20_util/tuple/cons/noexcept_specs5.cc: New
>* libstdc++-v3/testsuite/20_util/tuple/cons/noexcept_specs6.cc: New

This is a lot of new test files for a small-ish QoI feature. Could
they be combined into one file?  Generally we do want one test file
per feature, but I think all of these are arguably testing one feature
(just on different constructors). The downside of lots of smaller
files is that we have to compile+assemble+link+run each one, which
adds several fork()s to launch a new process for each step. On some
platforms that can be quite slow.

I can do that, but there may be an issue. See below.




>@@ -624,6 +634,7 @@
>   && (sizeof...(_Elements) >= 1),
> bool>::type=true>
> constexpr tuple(_UElements&&... __elements)
>+
noexcept(__and_...>::value)

Can this be __nothrow_constructible<_UElements>() ?

It should have been that in the first place. Apologies. Fixed.




> : _Inherited(std::forward<_UElements>(__elements)...) { }
>
>   template@@ -635,6 +646,7 @@
>   && (sizeof...(_Elements) >= 1),
> bool>::type=false>
> explicit constexpr tuple(_UElements&&... __elements)
>+noexcept(__nothrow_constructible<_UElements&&...>())

The && here is redundant, though harmless.

is_constructible is exactly equivalent to is_constructible
because U means construction from an rvalue of type U and so does U&&.

It's fine to leave the && there though.

I'm happy to go either way. The only reason I used && form is because
it mimics the wording in the LWG resolution.


I suspect if STL had reviewed the wording in the resolution he'd have
asked for the && to be removed :-)



>@@ -966,6 +995,7 @@
> && !is_same<__remove_cvref_t<_U1>, allocator_arg_t>::value,
>   bool>::type = true>
> constexpr tuple(_U1&& __a1, _U2&& __a2)
>+noexcept(__nothrow_constructible<_U1&&,_U2&&>())

There should be a space after the comma here, and all the later
additions in the file.

ack. Fixed




>Index: libstdc++-v3/testsuite/20_util/tuple/cons/noexcept_specs.cc
>===
>--- libstdc++-v3/testsuite/20_util/tuple/cons/noexcept_specs.cc
(nonexistent)
>+++ libstdc++-v3/testsuite/20_util/tuple/cons/noexcept_specs.cc
(working copy)
>@@ -0,0 +1,191 @@
>+// { dg-options { -std=gnu++2a } }
>+// { dg-do run { target c++2a } }

This new file doesn't use std::is_nothrow_convertible so could just
use: { dg-do run { target c++11 } } and no dg-options.

For the other new tests that do use is_nothrow_convertible, I'm
already planning to add std::is_nothrow_convertible for our internal
use in C++11, so they could use that.

Alternatively, the test files themselves could define:

template
  struct is_nothrow_convertible
  : std::integral_constant && is_nothrow_constructible>
  { };

and then use that. That way we can test the exception specs are
correct in C++11 mode, the default C++14 mode, and C++17 mode.
Otherwise we're adding code that affects all those modes but only
testing it works correctly for the experimental C++2a mode.


There is a reason

Re: [PATCH] Improve implementation of parallel equal()

2019-04-18 Thread Jonathan Wakely

On 16/04/19 12:39 -0700, Thomas Rodgers wrote:


* include/pstl/algorithm_impl.h
(__internal::__brick_equal): use "4 iterator" version of
std::equal().
(__internal::__brick_equal): use simd for random access
iterators on unsequenced execution policies.
(__internal::__pattern_equal): add "4 iterator" version
(__internal::__pattern_equal): dispatch to simd __brick_equal
for vector-only execution policies.
(__internal::__pattern_equal): disptach to __parallel_or for


s/disptach/dispatch/ in the changelog




--- a/libstdc++-v3/include/pstl/glue_algorithm_impl.h
+++ b/libstdc++-v3/include/pstl/glue_algorithm_impl.h
@@ -757,7 +757,7 @@ 
__pstl::__internal::__enable_if_execution_policy<_ExecutionPolicy, bool>
equal(_ExecutionPolicy&& __exec, _ForwardIterator1 __first1, _ForwardIterator1 
__last1, _ForwardIterator2 __first2,
  _ForwardIterator2 __last2)
{
-return equal(std::forward<_ExecutionPolicy>(__exec), __first1, __last1, 
__first2,
+return equal(std::forward<_ExecutionPolicy>(__exec), __first1, __last1, 
__first2, __last2,


N.B. I don't think this should be an unqualified call, but that can be
fixed in a later patch.

OK for trunk, thanks.



[PATCH] tree-call-cdce: If !HONOR_NANS do not make code with NaNs (PR88055)

2019-04-18 Thread Segher Boessenkool
If we don't HONOR_NANS we should not try to use any unordered
comparison results.  Best case those will just be optimized away;
realistically, they ICE.  For example, the rs6000 backend has some
code that specifically checks we never do this.

This patch fixes it.  Bootstrapped and tested on powerpc64-linux
{-m32,-m64}.  Is this okay for trunk?


Segher


2019-04-18  Segher Boessenkool  

PR tree-optimization/88055
* tree-call-cdce.c (comparison_code_if_no_nans): New function.
(gen_one_condition): Use it if !HONOR_NANS.

---
 gcc/tree-call-cdce.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c
index 2b1e934..2e482b3 100644
--- a/gcc/tree-call-cdce.c
+++ b/gcc/tree-call-cdce.c
@@ -362,6 +362,40 @@ can_guard_call_p (gimple *call)
  || find_fallthru_edge (gimple_bb (call)->succs));
 }
 
+/* For a comparison code return the comparison code we should use if we don't
+   HONOR_NANS.  */
+
+static enum tree_code
+comparison_code_if_no_nans (tree_code code)
+{
+  switch (code)
+{
+case UNLT_EXPR:
+  return LT_EXPR;
+case UNGT_EXPR:
+  return GT_EXPR;
+case UNLE_EXPR:
+  return LE_EXPR;
+case UNGE_EXPR:
+  return GE_EXPR;
+case UNEQ_EXPR:
+  return EQ_EXPR;
+case LTGT_EXPR:
+  return NE_EXPR;
+
+case LT_EXPR:
+case GT_EXPR:
+case LE_EXPR:
+case GE_EXPR:
+case EQ_EXPR:
+case NE_EXPR:
+  return code;
+
+default:
+  gcc_unreachable ();
+}
+}
+
 /* A helper function to generate gimple statements for one bound
comparison, so that the built-in function is called whenever
TCODE  is *false*.  TEMP_NAME1/TEMP_NAME2 are names
@@ -378,6 +412,9 @@ gen_one_condition (tree arg, int lbub,
   vec conds,
unsigned *nconds)
 {
+  if (!HONOR_NANS (arg))
+tcode = comparison_code_if_no_nans (tcode);
+
   tree lbub_real_cst, lbub_cst, float_type;
   tree temp, tempn, tempc, tempcn;
   gassign *stmt1;
-- 
1.8.3.1



Re: [PATCH] Improve implementation of parallel equal()

2019-04-18 Thread Jonathan Wakely

On 18/04/19 21:43 +0100, Jonathan Wakely wrote:

--- a/libstdc++-v3/include/pstl/glue_algorithm_impl.h
+++ b/libstdc++-v3/include/pstl/glue_algorithm_impl.h
@@ -757,7 +757,7 @@ 
__pstl::__internal::__enable_if_execution_policy<_ExecutionPolicy, bool>
equal(_ExecutionPolicy&& __exec, _ForwardIterator1 __first1, _ForwardIterator1 
__last1, _ForwardIterator2 __first2,
 _ForwardIterator2 __last2)
{
-return equal(std::forward<_ExecutionPolicy>(__exec), __first1, __last1, 
__first2,
+return equal(std::forward<_ExecutionPolicy>(__exec), __first1, __last1, 
__first2, __last2,


N.B. I don't think this should be an unqualified call, but that can be
fixed in a later patch.


Also, should lines 677 and 699 in this file use std::forward for the
__exec parameter, not pass it as an lvalue?

I didn't look for other cases where it's not forwarded.



Re: [PATCH][IRA] Fix PR87871: [9 Regression] testcases fail after r265398 on arm

2019-04-18 Thread Vladimir Makarov



On 4/18/19 11:24 AM, Peter Bergner wrote:

PR87871 exposes a couple of problems.  One problem fixed here is that IRA
incorrectly adds a conflict in some cases where we have a simple register
copy between a pseudo reg and a hard reg.  This stopped us from assigning
the pseudo reg to that hard reg which we wanted to do in the testsuite test
case shown in the bugzilla.  The bug is due to an oversight in my r264897
commit that added support for not adding conflicts for simple register
copies.  That code correctly didn't add a conflict to CONFLICT_HARD_REGS
in make_object_dead(), but failed to do the same for TOTAL_CONFLICT_HARD_REGS.
The patch below fixes that oversight.

I have confirmed we now assign pseudo p116 to r0 in the ARM test case
as well as a similar assignment issue on POWER.

This passed bootstrap and regtesting with no regressions on powerpc64le-linux.
Ok for mainline?



Ok.  Thank you for fixing this, Peter.  Although I don't expect any 
problems with the patch.  Still please prepare to revert the patch if 
something goes wrong.




Peter

PR rtl-optimization/87871
* ira-lives.c (make_object_dead): Don't add conflicts to
TOTAL_CONFLICT_HARD_REGS for register ignore_reg_for_conflicts.

Index: gcc/ira-lives.c
===
--- gcc/ira-lives.c (revision 270420)
+++ gcc/ira-lives.c (working copy)
@@ -163,7 +163,9 @@ static void
  make_object_dead (ira_object_t obj)
  {
live_range_t lr;
+  int regno;
int ignore_regno = -1;
+  int ignore_total_regno = -1;
int end_regno = -1;
  
sparseset_clear_bit (objects_live, OBJECT_CONFLICT_ID (obj));

@@ -174,16 +176,14 @@ make_object_dead (ira_object_t obj)
&& REGNO (ignore_reg_for_conflicts) < FIRST_PSEUDO_REGISTER)
  {
end_regno = END_REGNO (ignore_reg_for_conflicts);
-  int src_regno = ignore_regno = REGNO (ignore_reg_for_conflicts);
+  ignore_regno = ignore_total_regno = REGNO (ignore_reg_for_conflicts);
  
-  while (src_regno < end_regno)

+  for (regno = ignore_regno; regno < end_regno; regno++)
{
- if (TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), src_regno))
-   {
- ignore_regno = end_regno = -1;
- break;
-   }
- src_regno++;
+ if (TEST_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno))
+   ignore_regno = end_regno;
+ if (TEST_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno))
+   ignore_total_regno = end_regno;
}
  }
  
@@ -192,8 +192,10 @@ make_object_dead (ira_object_t obj)
  
/* If IGNORE_REG_FOR_CONFLICTS did not already conflict with OBJ, make

   sure it still doesn't.  */
-  for (; ignore_regno < end_regno; ignore_regno++)
-CLEAR_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), ignore_regno);
+  for (regno = ignore_regno; regno < end_regno; regno++)
+CLEAR_HARD_REG_BIT (OBJECT_CONFLICT_HARD_REGS (obj), regno);
+  for (regno = ignore_total_regno; regno < end_regno; regno++)
+CLEAR_HARD_REG_BIT (OBJECT_TOTAL_CONFLICT_HARD_REGS (obj), regno);
  
lr = OBJECT_LIVE_RANGES (obj);

ira_assert (lr != NULL);



Re: [PATCH][IRA] Fix PR87871: [9 Regression] testcases fail after r265398 on arm

2019-04-18 Thread Peter Bergner
On 4/18/19 4:39 PM, Vladimir Makarov wrote:
> On 4/18/19 11:24 AM, Peter Bergner wrote:
>> I have confirmed we now assign pseudo p116 to r0 in the ARM test case
>> as well as a similar assignment issue on POWER.
>>
>> This passed bootstrap and regtesting with no regressions on 
>> powerpc64le-linux.
>> Ok for mainline?
> 
> 
> Ok.  Thank you for fixing this, Peter.

Ok, committed.  Thanks!



> Although I don't expect any problems with the patch.  Still please prepare
> to revert the patch if something goes wrong.

Will do.


Peter



[PATCH] Delegate PSTL configuration to pstl/pstl_config.h

2019-04-18 Thread Thomas Rodgers

 * include/bits/c++config: Remove explicit PSTL configuration
 macros and use definitions from .

>From 198662c6e2ee6b1a6b363c2a515c05ef1ca949bd Mon Sep 17 00:00:00 2001
From: Thomas Rodgers 
Date: Thu, 18 Apr 2019 16:55:40 -0700
Subject: [PATCH] Delegate PSTL configuration to pstl/pstl_config.h

	 * include/bits/c++config: Remove explicit PSTL configuration
	 macros and use definitions from .
---
 libstdc++-v3/include/bits/c++config | 82 +
 1 file changed, 1 insertion(+), 81 deletions(-)

diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index ef8ba96737b..5016f4853de 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -682,90 +682,10 @@ namespace std
 #  define __PSTL_USE_PAR_POLICIES 1
 # endif
 
-# if __PSTL_USE_PAR_POLICIES
-#  if !defined(__PSTL_PAR_BACKEND_TBB)
-#   define __PSTL_PAR_BACKEND_TBB 1
-#  endif
-# else
-#  undef __PSTL_PAR_BACKEND_TBB
-# endif
-
 # define __PSTL_ASSERT(_Condition) __glibcxx_assert(_Condition)
 # define __PSTL_ASSERT_MSG(_Condition, _Message) __glibcxx_assert(_Condition)
 
-
-# define __PSTL_PRAGMA(x) _Pragma (#x)
-
-# define __PSTL_STRING_AUX(x) #x
-# define __PSTL_STRING(x) __PSTL_STRING_AUX(x)
-# define __PSTL_STRING_CONCAT(x, y) x#y
-
-# define __PSTL_GCC_VERSION (__GNUC__ * 1 + __GNUC_MINOR__ * 100 + \
-			 __GNUC_PATCHLEVEL__)
-
-// Enable SIMD for compilers that support OpenMP 4.0
-# if (__PSTL_GCC_VERSION >= 40900)
-#  define __PSTL_PRAGMA_SIMD __PSTL_PRAGMA(omp simd)
-#define __PSTL_PRAGMA_DECLARE_SIMD __PSTL_PRAGMA(omp declare simd)
-#  define __PSTL_PRAGMA_SIMD_REDUCTION(PRM) __PSTL_PRAGMA(omp simd reduction(PRM))
-# else //no simd
-#  define __PSTL_PRAGMA_SIMD
-#  define __PSTL_PRAGMA_SIMD_REDUCTION(PRM)
-# endif //Enable SIMD
-
-#define __PSTL_PRAGMA_SIMD_SCAN(PRM)
-#define __PSTL_PRAGMA_SIMD_INCLUSIVE_SCAN(PRM)
-#define __PSTL_PRAGMA_SIMD_EXCLUSIVE_SCAN(PRM)
-
-#define __PSTL_PRAGMA_FORCEINLINE
-
-// Should be defined to 1 for environments with a vendor implementation
-// of C++17 execution policies
-// #define __PSTL_CPP17_EXECUTION_POLICIES_PRESENT (_MSC_VER >= 1912)
-// TODO define libstdc++ policies
-
-# define __PSTL_CPP14_2RANGE_MISMATCH_EQUAL_PRESENT \
-  (__cplusplus >= 201300L || __cpp_lib_robust_nonmodifying_seq_ops == 201304)
-# define __PSTL_CPP14_MAKE_REVERSE_ITERATOR_PRESENT \
-  (__cplusplus >= 201402L || __cpp_lib_make_reverse_iterator == 201402)
-# define __PSTL_CPP14_INTEGER_SEQUENCE_PRESENT (__cplusplus >= 201402L)
-# define __PSTL_CPP14_VARIABLE_TEMPLATES_PRESENT (__cplusplus >= 201402L)
-
-# if __PSTL_MONOTONIC_PRESENT
-#  define __PSTL_PRAGMA_SIMD_ORDERED_MONOTONIC(PRM) \
-  __PSTL_PRAGMA(omp ordered simd monotonic(PRM))
-#  define __PSTL_PRAGMA_SIMD_ORDERED_MONOTONIC_2ARGS(PRM1, PRM2) \
-  __PSTL_PRAGMA(omp ordered simd monotonic(PRM1, PRM2))
-# else
-#  define __PSTL_PRAGMA_SIMD_ORDERED_MONOTONIC(PRM)
-#  define __PSTL_PRAGMA_SIMD_ORDERED_MONOTONIC_2ARGS(PRM1, PRM2)
-# endif
-
-// Declaration of reduction functor, where
-// NAME - the name of the functor
-// OP - type of the callable object with the reduction operation
-// omp_in - refers to the local partial result
-// omp_out - refers to the final value of the combiner operator
-// omp_priv - refers to the private copy of the initial value
-// omp_orig - refers to the original variable to be reduced
-#define __PSTL_PRAGMA_DECLARE_REDUCTION(NAME, OP)  \
-__PSTL_PRAGMA(omp declare reduction(NAME : OP : omp_out(omp_in)) initializer(omp_priv = omp_orig))
-
-# define __PSTL_PRAGMA_VECTOR_UNALIGNED
-# define __PSTL_USE_NONTEMPORAL_STORES_IF_ALLOWED
-# define __PSTL_PRAGMA_LOCATION
-
-# define __PSTL_PRAGMA_MESSAGE_IMPL(x) \
-  __PSTL_PRAGMA(message(__PSTL_STRING_CONCAT(__PSTL_PRAGMA_LOCATION, x)))
-# define __PSTL_PRAGMA_MESSAGE_POLICIES(x) __PSTL_PRAGMA_MESSAGE_IMPL(x)
-
-//Too many warnings in output, switched off
-# define __PSTL_PRAGMA_MESSAGE(x)
-
-# if defined(__GLIBCXX__)
-#  define __PSTL_CPP11_STD_ROTATE_BROKEN \
-  (__PSTL_GCC_VERSION < 50100) //GCC 5.1 release
-# endif
+#include 
 
 #endif
 // End of prewritten config; the settings discovered at configure time follow.
-- 
2.20.1



Re: [PATCH][stage1] Enhance target and target_clone error messages.

2019-04-18 Thread Martin Sebor

On 4/18/19 5:09 AM, Martin Liška wrote:

Hi.

The patch distinguishes among target and target_clone attribute when
reporting for an error. I've also reworded the affected error messages
a bit.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed after stage1 opens?
Thanks,
Martin

---
  gcc/cgraphclones.c |  2 +-
  gcc/config/i386/i386-c.c   |  5 +-
  gcc/config/i386/i386-protos.h  |  4 +-
  gcc/config/i386/i386.c | 54 +-
  gcc/testsuite/gcc.target/i386/funcspec-4.c |  2 +-
  5 files changed, 39 insertions(+), 28 deletions(-)


Just a suggestion to also consider making the phrasing in the messages
consistent by adding "is" below

@@ -5316,7 +5322,7 @@ ix86_valid_target_attribute_inner_p (tree args, 
char *p_strings[],


   else if (TREE_CODE (args) != STRING_CST)
 {
-  error ("attribute % argument not a string");
+  error_at (loc, "attribute %qs argument not a string", attr_name);
   return false;

i.e., "is not a string" and changing the below

@@ -5382,7 +5386,8 @@ ix86_valid_target_attribute_inner_p (tree args, 
char *p_strings[],

   /* Process the option.  */
   if (opt == N_OPTS)
{
- error ("attribute(target(\"%s\")) is unknown", orig_p);
+ error_at (loc, "attribute value %qs is unknown in %qs attribute",
+   orig_p, attr_name);

to

  error_at (loc, "attribute %qs argument %qs is unknown",
attr_name, orig_p);

Martin


Re: [PATCH][C++] Improve compile-time by ordering expensive checks last

2019-04-18 Thread Jason Merrill
OK.

On Wed, Apr 17, 2019 at 6:44 AM Richard Biener  wrote:
>
> On Tue, 16 Apr 2019, Richard Biener wrote:
>
> >
> > Two cases from a -fsynax-only tramp3d callgrind profile.
>
> Amended by two others.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> OK?
>
> Thanks,
> Richard.
>
> 2019-04-17  Richard Biener  
>
> cp/
> * call.c (null_ptr_cst_p): Order checks according to expensiveness.
> (conversion_null_warnings): Likewise.
> * typeck.c (same_type_ignoring_top_level_qualifiers_p): Return
> early if type1 == type2.
>
> Index: gcc/cp/call.c
> ===
> --- gcc/cp/call.c   (revision 270407)
> +++ gcc/cp/call.c   (working copy)
> @@ -541,11 +541,11 @@ null_ptr_cst_p (tree t)
>STRIP_ANY_LOCATION_WRAPPER (t);
>
>/* Core issue 903 says only literal 0 is a null pointer constant.  */
> -  if (TREE_CODE (type) == INTEGER_TYPE
> - && !char_type_p (type)
> - && TREE_CODE (t) == INTEGER_CST
> +  if (TREE_CODE (t) == INTEGER_CST
> + && !TREE_OVERFLOW (t)
> + && TREE_CODE (type) == INTEGER_TYPE
>   && integer_zerop (t)
> - && !TREE_OVERFLOW (t))
> + && !char_type_p (type))
> return true;
>  }
>else if (CP_INTEGRAL_TYPE_P (type))
> @@ -6844,8 +6844,9 @@ static void
>  conversion_null_warnings (tree totype, tree expr, tree fn, int argnum)
>  {
>/* Issue warnings about peculiar, but valid, uses of NULL.  */
> -  if (null_node_p (expr) && TREE_CODE (totype) != BOOLEAN_TYPE
> -  && ARITHMETIC_TYPE_P (totype))
> +  if (TREE_CODE (totype) != BOOLEAN_TYPE
> +  && ARITHMETIC_TYPE_P (totype)
> +  && null_node_p (expr))
>  {
>location_t loc = get_location_for_expr_unwinding_for_system_header 
> (expr);
>if (fn)
> @@ -6882,8 +6883,8 @@ conversion_null_warnings (tree totype, t
>  }
>/* Handle zero as null pointer warnings for cases other
>   than EQ_EXPR and NE_EXPR */
> -  else if (null_ptr_cst_p (expr) &&
> -  (TYPE_PTR_OR_PTRMEM_P (totype) || NULLPTR_TYPE_P (totype)))
> +  else if ((TYPE_PTR_OR_PTRMEM_P (totype) || NULLPTR_TYPE_P (totype))
> +  && null_ptr_cst_p (expr))
>  {
>location_t loc = get_location_for_expr_unwinding_for_system_header 
> (expr);
>maybe_warn_zero_as_null_pointer_constant (expr, loc);
> Index: gcc/cp/typeck.c
> ===
> --- gcc/cp/typeck.c (revision 270407)
> +++ gcc/cp/typeck.c (working copy)
> @@ -1508,6 +1508,8 @@ same_type_ignoring_top_level_qualifiers_
>  {
>if (type1 == error_mark_node || type2 == error_mark_node)
>  return false;
> +  if (type1 == type2)
> +return true;
>
>type1 = cp_build_qualified_type (type1, TYPE_UNQUALIFIED);
>type2 = cp_build_qualified_type (type2, TYPE_UNQUALIFIED);


Re: [C/C++ PATCH] Further typedef duplicate decl fixes (PR c++/90108)

2019-04-18 Thread Jason Merrill
OK.

On Wed, Apr 17, 2019 at 11:25 PM Jakub Jelinek  wrote:
>
> Hi!
>
> As reported, the newly added testcase ICEs with --param ggc-min-heapsize=0.
> The problem is that while the remove type is not referenced by anything
> else, it is a distinct type created to hold the attributes, there is another
> type with TYPE_NAME equal to the newdecl we want to tree.  That one is
> created in common_handle_aligned_attribute:
> {
>   if ((flags & (int) ATTR_FLAG_TYPE_IN_PLACE))
> /* OK, modify the type in place.  */;
>   /* If we have a TYPE_DECL, then copy the type, so that we
>  don't accidentally modify a builtin type.  See pushdecl.  */
>   else if (decl && TREE_TYPE (decl) != error_mark_node
>&& DECL_ORIGINAL_TYPE (decl) == NULL_TREE)
> {
>   tree tt = TREE_TYPE (decl);
>   *type = build_variant_type_copy (*type);
>   DECL_ORIGINAL_TYPE (decl) = tt;
>   TYPE_NAME (*type) = decl;
>   TREE_USED (*type) = TREE_USED (decl);
>   TREE_TYPE (decl) = *type;
> }
>   else
> *type = build_variant_type_copy (*type);
> where we create a variant type and set the TYPE_NAME too.  I've tried to
> remove that else if ... and just do *type = build_variant_type_copy (*type);
> but that regressed some DWARF DW_AT_alignment tests.
>
> So, the following patch instead removes the remove type from the variants
> list if it is not a main variant (as before), otherwise tries to find in
> TYPE_MAIN_VARIANT (DECL_ORIGINAL_TYPE (newdecl)) variant list a type
> with TYPE_NAME equal to newdecl and remove that one.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2019-04-18  Jakub Jelinek  
>
> PR c++/90108
> * c-decl.c (merge_decls): If remove is main variant and
> DECL_ORIGINAL_TYPE is some other type, remove a DECL_ORIGINAL_TYPE
> variant that has newdecl as TYPE_NAME if any.
>
> * decl.c (duplicate_decls): If remove is main variant and
> DECL_ORIGINAL_TYPE is some other type, remove a DECL_ORIGINAL_TYPE
> variant that has newdecl as TYPE_NAME if any.
>
> * c-c++-common/pr90108.c: New test.
>
> --- gcc/c/c-decl.c.jj   2019-04-17 21:21:39.936133112 +0200
> +++ gcc/c/c-decl.c  2019-04-17 23:25:08.098936888 +0200
> @@ -2513,7 +2513,24 @@ merge_decls (tree newdecl, tree olddecl,
> {
>   tree remove = TREE_TYPE (newdecl);
>   if (TYPE_MAIN_VARIANT (remove) == remove)
> -   gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE);
> +   {
> + gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE);
> + /* If remove is the main variant, no need to remove that
> +from the list.  One of the DECL_ORIGINAL_TYPE
> +variants, e.g. created for aligned attribute, might still
> +refer to the newdecl TYPE_DECL though, so remove that one
> +in that case.  */
> + if (DECL_ORIGINAL_TYPE (newdecl)
> + && DECL_ORIGINAL_TYPE (newdecl) != remove)
> +   for (tree t = TYPE_MAIN_VARIANT (DECL_ORIGINAL_TYPE 
> (newdecl));
> +t; t = TYPE_MAIN_VARIANT (t))
> + if (TYPE_NAME (TYPE_NEXT_VARIANT (t)) == newdecl)
> +   {
> + TYPE_NEXT_VARIANT (t)
> +   = TYPE_NEXT_VARIANT (TYPE_NEXT_VARIANT (t));
> + break;
> +   }
> +   }
>   else
> for (tree t = TYPE_MAIN_VARIANT (remove); ;
>  t = TYPE_NEXT_VARIANT (t))
> --- gcc/cp/decl.c.jj2019-04-17 21:21:39.753136091 +0200
> +++ gcc/cp/decl.c   2019-04-17 23:27:13.995875527 +0200
> @@ -2133,7 +2133,24 @@ next_arg:;
> {
>   tree remove = TREE_TYPE (newdecl);
>   if (TYPE_MAIN_VARIANT (remove) == remove)
> -   gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE);
> +   {
> + gcc_assert (TYPE_NEXT_VARIANT (remove) == NULL_TREE);
> + /* If remove is the main variant, no need to remove that
> +from the list.  One of the DECL_ORIGINAL_TYPE
> +variants, e.g. created for aligned attribute, might still
> +refer to the newdecl TYPE_DECL though, so remove that one
> +in that case.  */
> + if (tree orig = DECL_ORIGINAL_TYPE (newdecl))
> +   if (orig != remove)
> + for (tree t = TYPE_MAIN_VARIANT (orig); t;
> +  t = TYPE_MAIN_VARIANT (t))
> +   if (TYPE_NAME (TYPE_NEXT_VARIANT (t)) == newdecl)
> + {
> +   TYPE_NEXT_VARIANT (t)
> + = TYPE_NEXT_VARIANT (TYPE_NEXT_VARIANT (t));
> +   break;
> + }
> +   

Re: [C/C++ PATCH] Fix promoted switch condition out of range diagnostics (PR c/89888, take 2)

2019-04-18 Thread Jason Merrill
On Tue, Apr 9, 2019 at 2:39 PM Jakub Jelinek  wrote:
> On Tue, Apr 09, 2019 at 09:06:33AM +0200, Jakub Jelinek wrote:
> > Alternatively, I believe we could remove from the patch the in-place
> > replacement of CASE_LABEL_EXPRs with LABEL_EXPRs if we want to remove,
> > just splay_tree_remove those, because by my reading of
> > preprocess_case_label_vec_for_gimple we also ignore the fully out of range
> > CASE_LABEL_EXPRs there, shall I tweak the patch and rebootstrap?
>
> Here is a version of the patch that doesn't overwrite those CASE_LABEL_EXPRs
> as gimplifier handles those, just removes them from the splay tree.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2019-04-09  Jakub Jelinek  
>
> PR c/89888
> * c-common.h (c_add_case_label): Remove orig_type and outside_range_p
> arguments.
> (c_do_switch_warnings): Remove outside_range_p argument.
> * c-common.c (check_case_bounds): Removed.
> (c_add_case_label): Remove orig_type and outside_range_p arguments.
> Don't call check_case_bounds.  Fold low_value as well as high_value.
> * c-warn.c (c_do_switch_warnings): Remove outside_range_p argument.
> Check for case labels outside of range of original type here and
> adjust them.
> c/
> * c-typeck.c (struct c_switch): Remove outside_range_p member.
> (c_start_case): Don't clear it.
> (do_case): Adjust c_add_case_label caller.
> (c_finish_case): Adjust c_do_switch_warnings caller.
> cp/
> * decl.c (struct cp_switch): Remove outside_range_p member.
> (push_switch): Don't clear it.
> (pop_switch): Adjust c_do_switch_warnings caller.
> (finish_case_label): Adjust c_add_case_label caller.

> +  node = splay_tree_predecessor (cases, (splay_tree_key) min_value);
...
> + if (CASE_HIGH ((tree) node->value)
> + && tree_int_cst_compare (CASE_HIGH ((tree) node->value),
> +  min_value) >= 0)
...
> + node = splay_tree_predecessor (cases,
> +(splay_tree_key) min_value);

> +  node = splay_tree_lookup (cases, (splay_tree_key) max_value);
> +  if (node == NULL)
> +   node = splay_tree_predecessor (cases, (splay_tree_key) max_value);
> +  /* Handle a single node that might partially overlap the range.  */
> +  if (node
> + && node->key
> + && CASE_HIGH ((tree) node->value)
> + && tree_int_cst_compare (CASE_HIGH ((tree) node->value),
> +  max_value) > 0)
...
> +  while ((node = splay_tree_successor (cases,
> +  (splay_tree_key) max_value))

Hmm, why are the code sections for dealing with the lower bound and
upper bound asymmetrical?  That is, why not start the upper bound
consideration with splay_tree_successor?

Jason