[C++ PATCH] Fix replace_placeholders (PR c++/83556)

2017-12-23 Thread Jakub Jelinek
Hi!

Recently I've changed replace_placeholders to walk trees without duplicates
to avoid compile time explosion with lots of nested SAVE_EXPRs.
The problem as the following testcase shows is that it also prevents
replacement of PLACEHOLDER_EXPRs we want to replace, if the same
PLACEHOLDER_EXPR appears multiple time in the expression.

The following patch changes the walk_tree not to use pset itself, but rather
stops walking children if we've seen some tree already.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/7.3?

2017-12-23  Jakub Jelinek  

PR c++/83556
* tree.c (replace_placeholders_r): Pass NULL as last argument to
cp_walk_tree instead of d->pset.  If non-TREE_CONSTANT and
non-PLACEHOLDER_EXPR tree has been seen already, set *walk_subtrees
to false and return.
(replace_placeholders): Pass NULL instead of &pset as last argument
to cp_walk_tree.

* g++.dg/cpp0x/pr83556.C: New test.

--- gcc/cp/tree.c.jj2017-12-15 16:10:37.0 +0100
+++ gcc/cp/tree.c   2017-12-22 23:24:16.720428548 +0100
@@ -3106,6 +3106,11 @@ replace_placeholders_r (tree* t, int* wa
   {
constructor_elt *ce;
vec *v = CONSTRUCTOR_ELTS (*t);
+   if (d->pset->add (*t))
+ {
+   *walk_subtrees = false;
+   return NULL_TREE;
+ }
for (unsigned i = 0; vec_safe_iterate (v, i, &ce); ++i)
  {
tree *valp = &ce->value;
@@ -3125,7 +3130,7 @@ replace_placeholders_r (tree* t, int* wa
  valp = &TARGET_EXPR_INITIAL (*valp);
  }
d->obj = subob;
-   cp_walk_tree (valp, replace_placeholders_r, data_, d->pset);
+   cp_walk_tree (valp, replace_placeholders_r, data_, NULL);
d->obj = obj;
  }
*walk_subtrees = false;
@@ -3133,6 +3138,8 @@ replace_placeholders_r (tree* t, int* wa
   }
 
 default:
+  if (d->pset->add (*t))
+   *walk_subtrees = false;
   break;
 }
 
@@ -3161,7 +3168,7 @@ replace_placeholders (tree exp, tree obj
   replace_placeholders_t data = { obj, false, &pset };
   if (TREE_CODE (exp) == TARGET_EXPR)
 tp = &TARGET_EXPR_INITIAL (exp);
-  cp_walk_tree (tp, replace_placeholders_r, &data, &pset);
+  cp_walk_tree (tp, replace_placeholders_r, &data, NULL);
   if (seen_p)
 *seen_p = data.seen;
   return exp;
--- gcc/testsuite/g++.dg/cpp0x/pr83556.C.jj 2017-12-22 23:30:10.771126002 
+0100
+++ gcc/testsuite/g++.dg/cpp0x/pr83556.C2017-12-22 23:29:21.0 
+0100
@@ -0,0 +1,28 @@
+// PR c++/83556
+// { dg-do run { target c++11 } }
+
+int
+foo ()
+{
+  return 1;
+}
+
+struct A
+{
+  int a = foo ();
+  int b = 1;
+  int c = a ? 1 * b : 2 * b;
+};
+
+struct B
+{
+  A d {};
+};
+
+int
+main ()
+{
+  B e {};
+  if (e.d.c != 1)
+__builtin_abort ();
+}

Jakub


[PATCH] Fix COND_EXPR folding with CASE_LABEL_EXPRs (PR c++/83553)

2017-12-23 Thread Jakub Jelinek
Hi!

The problem here is that for loops that have constant 0/false condition
the C++ FE wants to correctly emit if (0) { body; incr-expr; }
but doesn't just build3 (COND_EXPR, ...), but fold_build3, and COND_EXPR
folding with constant condition optimizes away the unused branch completely
if it doesn't contain any labels.  In this case it doesn't contain any
labels, but contains CASE_LABEL_EXPR that we should also take into account
(but can ignore it when during the walk we are inside a SWITCH_BODY).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/7.3?

2017-12-23  Jakub Jelinek  

PR c++/83553
* fold-const.c (struct contains_label_data): New type.
(contains_label_1): Return non-NULL even for CASE_LABEL_EXPR, unless
inside of a SWITCH_BODY seen during the walk.
(contains_label_p): Use walk_tree instead of
walk_tree_without_duplicates, prepare data for contains_label_1 and
provide own pset.

* c-c++-common/torture/pr83553.c: New test.

--- gcc/fold-const.c.jj 2017-12-21 09:43:17.0 +0100
+++ gcc/fold-const.c2017-12-23 00:22:54.447669504 +0100
@@ -11218,22 +11218,48 @@ fold_binary_loc (location_t loc, enum tr
 } /* switch (code) */
 }
 
+/* Used by contains_label_[p1].  */
+
+struct contains_label_data
+{
+  hash_set *pset;
+  bool inside_switch_p;
+};
+
 /* Callback for walk_tree, looking for LABEL_EXPR.  Return *TP if it is
-   a LABEL_EXPR; otherwise return NULL_TREE.  Do not check the subtrees
-   of GOTO_EXPR.  */
+   a LABEL_EXPR or CASE_LABEL_EXPR not inside of another SWITCH_EXPR; otherwise
+   return NULL_TREE.  Do not check the subtrees of GOTO_EXPR.  */
 
 static tree
-contains_label_1 (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
+contains_label_1 (tree *tp, int *walk_subtrees, void *data)
 {
+  contains_label_data *d = (contains_label_data *) data;
   switch (TREE_CODE (*tp))
 {
 case LABEL_EXPR:
   return *tp;
 
+case CASE_LABEL_EXPR:
+  if (!d->inside_switch_p)
+   return *tp;
+  return NULL_TREE;
+
+case SWITCH_EXPR:
+  if (!d->inside_switch_p)
+   {
+ if (walk_tree (&SWITCH_COND (*tp), contains_label_1, data, d->pset))
+   return *tp;
+ d->inside_switch_p = true;
+ if (walk_tree (&SWITCH_BODY (*tp), contains_label_1, data, d->pset))
+   return *tp;
+ d->inside_switch_p = false;
+ *walk_subtrees = 0;
+   }
+  return NULL_TREE;
+
 case GOTO_EXPR:
   *walk_subtrees = 0;
-
-  /* fall through */
+  return NULL_TREE;
 
 default:
   return NULL_TREE;
@@ -11246,8 +11272,9 @@ contains_label_1 (tree *tp, int *walk_su
 static bool
 contains_label_p (tree st)
 {
-  return
-   (walk_tree_without_duplicates (&st, contains_label_1 , NULL) != NULL_TREE);
+  hash_set pset;
+  contains_label_data data = { &pset, false };
+  return walk_tree (&st, contains_label_1, &data, &pset) != NULL_TREE;
 }
 
 /* Fold a ternary expression of code CODE and type TYPE with operands
--- gcc/testsuite/c-c++-common/torture/pr83553.c.jj 2017-12-23 
00:30:20.548180122 +0100
+++ gcc/testsuite/c-c++-common/torture/pr83553.c2017-12-23 
00:29:11.0 +0100
@@ -0,0 +1,29 @@
+/* PR c++/83553 */
+/* { dg-do run } */
+
+int a[3];
+
+int
+foo (int n)
+{
+  switch (n)
+{
+case 0:
+  for (n = 7, a[0]++; 0; a[2] = a[1] + 1)
+   {
+case 2:
+ a[1] = a[0] + 1;
+   }
+}
+  return n;
+}
+
+int
+main ()
+{
+  if (foo (0) != 7 || a[0] != 1 || a[1] || a[2])
+__builtin_abort ();
+  if (foo (2) != 2 || a[0] != 1 || a[1] != 2 || a[2] != 3)
+__builtin_abort ();
+  return 0;
+}

Jakub


Re: [PATCH] Fix COND_EXPR folding with CASE_LABEL_EXPRs (PR c++/83553)

2017-12-23 Thread Richard Biener
On December 23, 2017 9:33:03 AM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>The problem here is that for loops that have constant 0/false condition
>the C++ FE wants to correctly emit if (0) { body; incr-expr; }
>but doesn't just build3 (COND_EXPR, ...), but fold_build3, and
>COND_EXPR
>folding with constant condition optimizes away the unused branch
>completely
>if it doesn't contain any labels.  In this case it doesn't contain any
>labels, but contains CASE_LABEL_EXPR that we should also take into
>account
>(but can ignore it when during the walk we are inside a SWITCH_BODY).
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
>trunk/7.3?

OK. 

Richard. 

>2017-12-23  Jakub Jelinek  
>
>   PR c++/83553
>   * fold-const.c (struct contains_label_data): New type.
>   (contains_label_1): Return non-NULL even for CASE_LABEL_EXPR, unless
>   inside of a SWITCH_BODY seen during the walk.
>   (contains_label_p): Use walk_tree instead of
>   walk_tree_without_duplicates, prepare data for contains_label_1 and
>   provide own pset.
>
>   * c-c++-common/torture/pr83553.c: New test.
>
>--- gcc/fold-const.c.jj2017-12-21 09:43:17.0 +0100
>+++ gcc/fold-const.c   2017-12-23 00:22:54.447669504 +0100
>@@ -11218,22 +11218,48 @@ fold_binary_loc (location_t loc, enum tr
> } /* switch (code) */
> }
> 
>+/* Used by contains_label_[p1].  */
>+
>+struct contains_label_data
>+{
>+  hash_set *pset;
>+  bool inside_switch_p;
>+};
>+
>/* Callback for walk_tree, looking for LABEL_EXPR.  Return *TP if it is
>-   a LABEL_EXPR; otherwise return NULL_TREE.  Do not check the
>subtrees
>-   of GOTO_EXPR.  */
>+   a LABEL_EXPR or CASE_LABEL_EXPR not inside of another SWITCH_EXPR;
>otherwise
>+   return NULL_TREE.  Do not check the subtrees of GOTO_EXPR.  */
> 
> static tree
>-contains_label_1 (tree *tp, int *walk_subtrees, void *data
>ATTRIBUTE_UNUSED)
>+contains_label_1 (tree *tp, int *walk_subtrees, void *data)
> {
>+  contains_label_data *d = (contains_label_data *) data;
>   switch (TREE_CODE (*tp))
> {
> case LABEL_EXPR:
>   return *tp;
> 
>+case CASE_LABEL_EXPR:
>+  if (!d->inside_switch_p)
>+  return *tp;
>+  return NULL_TREE;
>+
>+case SWITCH_EXPR:
>+  if (!d->inside_switch_p)
>+  {
>+if (walk_tree (&SWITCH_COND (*tp), contains_label_1, data,
>d->pset))
>+  return *tp;
>+d->inside_switch_p = true;
>+if (walk_tree (&SWITCH_BODY (*tp), contains_label_1, data,
>d->pset))
>+  return *tp;
>+d->inside_switch_p = false;
>+*walk_subtrees = 0;
>+  }
>+  return NULL_TREE;
>+
> case GOTO_EXPR:
>   *walk_subtrees = 0;
>-
>-  /* fall through */
>+  return NULL_TREE;
> 
> default:
>   return NULL_TREE;
>@@ -11246,8 +11272,9 @@ contains_label_1 (tree *tp, int *walk_su
> static bool
> contains_label_p (tree st)
> {
>-  return
>-   (walk_tree_without_duplicates (&st, contains_label_1 , NULL) !=
>NULL_TREE);
>+  hash_set pset;
>+  contains_label_data data = { &pset, false };
>+  return walk_tree (&st, contains_label_1, &data, &pset) != NULL_TREE;
> }
> 
> /* Fold a ternary expression of code CODE and type TYPE with operands
>--- gcc/testsuite/c-c++-common/torture/pr83553.c.jj2017-12-23
>00:30:20.548180122 +0100
>+++ gcc/testsuite/c-c++-common/torture/pr83553.c   2017-12-23
>00:29:11.0 +0100
>@@ -0,0 +1,29 @@
>+/* PR c++/83553 */
>+/* { dg-do run } */
>+
>+int a[3];
>+
>+int
>+foo (int n)
>+{
>+  switch (n)
>+{
>+case 0:
>+  for (n = 7, a[0]++; 0; a[2] = a[1] + 1)
>+  {
>+case 2:
>+a[1] = a[0] + 1;
>+  }
>+}
>+  return n;
>+}
>+
>+int
>+main ()
>+{
>+  if (foo (0) != 7 || a[0] != 1 || a[1] || a[2])
>+__builtin_abort ();
>+  if (foo (2) != 2 || a[0] != 1 || a[1] != 2 || a[2] != 3)
>+__builtin_abort ();
>+  return 0;
>+}
>
>   Jakub



Re: [045/nnn] poly_int: REG_ARGS_SIZE

2017-12-23 Thread Richard Sandiford
Andreas Schwab  writes:
> This breaks gcc.dg/tls/opt-3.c, gcc.dg/tls/pr47715-3.c and
> gcc.dg/tls/struct-1.c on m68k:
>
> /daten/aranym/gcc/gcc-20171222/gcc/testsuite/gcc.dg/tls/opt-3.c:11:3:
> internal compiler error: in add_args_size_note, at rtlanal.c:2379
> 0xae7aa9 add_args_size_note(rtx_insn*, poly_int<1u, long>)
> ../../gcc/rtlanal.c:2379
> 0x7ea4ca fixup_args_size_notes(rtx_insn*, rtx_insn*, poly_int<1u, long>)
> ../../gcc/expr.c:4105
> 0x7f6a02 emit_single_push_insn
> ../../gcc/expr.c:4225
> 0x7fa412 emit_push_insn(rtx_def*, machine_mode, tree_node*, rtx_def*,
> unsigned int, int, rtx_def*, poly_int<1u, long>, rtx_def*, rtx_def*,
> int, rtx_def*, bool)
> ../../gcc/expr.c:4561
> 0x6b8976 store_one_arg
> ../../gcc/calls.c:5694
> 0x6c15b1 expand_call(tree_node*, rtx_def*, int)
> ../../gcc/calls.c:4030
> 0x7f0485 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> ../../gcc/expr.c:10927
> 0x6d6c97 expand_expr
> ../../gcc/expr.h:276
> 0x6d6c97 expand_call_stmt
> ../../gcc/cfgexpand.c:2690
> 0x6d6c97 expand_gimple_stmt_1
> ../../gcc/cfgexpand.c:3624
> 0x6d6c97 expand_gimple_stmt
> ../../gcc/cfgexpand.c:3790
> 0x6d8058 expand_gimple_tailcall
> ../../gcc/cfgexpand.c:3836
> 0x6d8058 expand_gimple_basic_block
> ../../gcc/cfgexpand.c:5774
> 0x6dd62e execute
> ../../gcc/cfgexpand.c:6403

Bah.  Looks like I need to update my scripts to use --enable-tls,
since I'd ended up with emultls for the m68k targets.

I think the assert is catching a pre-existing bug here.  If we pushed
a value that needs a call to something like __tls_get_addr, we ended
up with two different REG_ARGS_SIZE notes on the same instruction.

It seems to be OK for emit_single_push_insn to push something that
needs a call to __tls_get_addr:

  /* We have to allow non-call_pop patterns for the case
 of emit_single_push_insn of a TLS address.  */
  if (GET_CODE (pat) != PARALLEL)
return 0;

so I think the problem is in the way this is handled rather than the fact
that it occurs at all.

If we're pushing a value X that needs a call C to calculate, we'll
add REG_ARGS_SIZE notes to the pushes and pops for C as part of the
call sequence.  Then emit_single_push_insn calls fixup_args_size_notes
on the whole push sequence (the calculation of X, including C,
and the push of X itself).  This is where the double notes came from.
But emit_single_push_insn_1 adjusted stack_pointer_delta *before* the
push, so the notes added for C were relative to the situation after
the future push of X rather than before it.

Presumably this didn't matter in practice because the note added
second tended to trump the note added first.  But code is allowed to
walk REG_NOTES without having to disregard secondary notes.

This patch seems to fix it for me, but I'm not sure how best to test it.

Thanks,
Richard


2017-12-23  Richard Sandiford  

gcc/
* expr.c (fixup_args_size_notes): Check that any existing
REG_ARGS_SIZE notes are correct, and don't try to re-add them.
(emit_single_push_insn_1): Move stack_pointer_delta adjustment to...
(emit_single_push_insn): ...here.

Index: gcc/expr.c
===
--- gcc/expr.c  2017-12-23 09:29:20.226338285 +
+++ gcc/expr.c  2017-12-23 09:29:45.783339673 +
@@ -4089,6 +4089,14 @@ fixup_args_size_notes (rtx_insn *prev, r
   if (!NONDEBUG_INSN_P (insn))
continue;
 
+  /* We might have existing REG_ARGS_SIZE notes, e.g. when pushing
+a call argument containing a TLS address that itself requires
+a call to __tls_get_addr.  The handling of stack_pointer_delta
+in emit_single_push_insn is supposed to ensure that any such
+notes are already correct.  */
+  rtx note = find_reg_note (insn, REG_ARGS_SIZE, NULL_RTX);
+  gcc_assert (!note || known_eq (args_size, get_args_size (note)));
+
   poly_int64 this_delta = find_args_size_adjust (insn);
   if (known_eq (this_delta, 0))
{
@@ -4102,7 +4110,8 @@ fixup_args_size_notes (rtx_insn *prev, r
   if (known_eq (this_delta, HOST_WIDE_INT_MIN))
saw_unknown = true;
 
-  add_args_size_note (insn, args_size);
+  if (!note)
+   add_args_size_note (insn, args_size);
   if (STACK_GROWS_DOWNWARD)
this_delta = -poly_uint64 (this_delta);
 
@@ -4126,7 +4135,6 @@ emit_single_push_insn_1 (machine_mode mo
   rtx dest;
   enum insn_code icode;
 
-  stack_pointer_delta += PUSH_ROUNDING (GET_MODE_SIZE (mode));
   /* If there is push pattern, use it.  Otherwise try old way of throwing
  MEM representing push operation to move expander.  */
   icode = optab_handler (push_optab, mode);
@@ -4213,6 +4221,14 @@ emit_single_push_insn (machine_mode mode
 
   emit_single_push_insn_1 (mode, x, type);
 
+  /* Adjust stack_pointer_delta to describe the situati

Re: [PATCH] Change PRED_LOOP_EXIT from 85 to 92

2017-12-23 Thread Segher Boessenkool
On Fri, Dec 22, 2017 at 04:53:47PM -0600, David Esparza wrote:
> With a value of 85 GCC has a CPU performance degradation of 11%,
> reverting PRED_LOOP_EXIT to 92 this degradation disappear.
> Those values where measured by running c-ray ray-tracer that is a
> floating point benchmark that runs out of L1 cache.

Why is this single benchmark more important than everything else?

https://patchwork.ozlabs.org/patch/637073/


Segher


Re: [PATCH] Fix Bug 83237 - Values returned by std::poisson_distribution are not distributed correctly

2017-12-23 Thread Michele Pezzutti

I got confirmation from Luc.
He also added it to the errata file---the entries regarding p. 511, page 
6 of http://luc.devroye.org/errors.pdf


On 12/14/2017 11:11 AM, mp...@tiscali.it wrote:
If Luc's explicit green light will not arrive before it is decision 
time, Paolo's point 2- below is doable.


Il 13.12.2017 12:51 Jonathan Wakely ha scritto:


On 12/12/17 21:37 +0100, Paolo Carlini wrote:

Hi, On 12/12/2017 19:42, Michele Pezzutti wrote:
Hi. Yes, I looked at the text before submitting the patch. I 
contacted Devroye and he confirmed that another reader had also 
pointed out this bug but not the solution. I sent him my proposed 
patch, he will look into it (no idea when though).

Nice.
I would state that "comparison function for x = 1 is e^(1/78)" 
(which becomes 1/78 as the algorithm uses log-probabilities). I 
think the change is needed because otherwise, for that particular 
bin, the rejection probability is lower than it should be, 
resulting in a higher number of samples.
Ok. Ideally I would be much less nervous about committing the patch 
if we either 1- Had Luc's explicit green light; 2- Were able to 
*rigorously deduce* within the framework of the book why the change 
is needed. That said, the patch makes sense to me and so far holds 
up well in all my tests (I'm currently running a full make check). I 
would say, let's wait a week or so and then make the final decision. 
Jon, do you agree? Ideas about further testing? (eg, some code you 
are aware of stressing Poisson?)

No, I have nothing useful to add here, but I CC'd Ed on the PR as I'd
like his input.






[PATCH] PR rtl-optimization/83565: Fix 32-bit rotate on ia64

2017-12-23 Thread James Clarke
From: James Clarke 

On ia64, 32-bit rotates are implemented by copying the lower 32 bits of
a register into the upper half, then performing a right shift. However,
depending on the bit pattern in question, this can leave the upper 32
bits as non-zero, despite being only a 32-bit unsigned result. Therefore
add an extra zero_extract to mask these out.

gcc/
PR rtl-optimization/83565
* gcc/config/ia64/ia64.md ("*rotrsi3_internal"): Mask out higher 32
bits from the shift result.
("*rotlsi3_internal"): Likewise
---
 gcc/config/ia64/ia64.md |   10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
index b7cd52b..8198b54 100644
--- a/gcc/config/ia64/ia64.md
+++ b/gcc/config/ia64/ia64.md
@@ -3329,7 +3329,10 @@
(ior:DI (zero_extend:DI (match_dup 1))
(ashift:DI (zero_extend:DI (match_dup 1)) (const_int 32
(set (match_dup 3)
-   (lshiftrt:DI (match_dup 3) (match_dup 2)))]
+   (lshiftrt:DI (match_dup 3) (match_dup 2)))
+   (set (match_dup 3)
+   (zero_extract:DI (match_dup 3)
+   (const_int 32) (const_int 0)))]
   "operands[3] = gen_rtx_REG (DImode, REGNO (operands[0]));")
 
 (define_expand "rotlsi3"
@@ -3358,7 +3361,10 @@
(ior:DI (zero_extend:DI (match_dup 1))
(ashift:DI (zero_extend:DI (match_dup 1)) (const_int 32
(set (match_dup 3)
-   (lshiftrt:DI (match_dup 3) (match_dup 2)))]
+   (lshiftrt:DI (match_dup 3) (match_dup 2)))
+   (set (match_dup 3)
+   (zero_extract:DI (match_dup 3)
+   (const_int 32) (const_int 0)))]
 {
   operands[3] = gen_rtx_REG (DImode, REGNO (operands[0]));
   operands[2] = GEN_INT (32 - INTVAL (operands[2]));
-- 
1.7.10.4



[PATCH v2] PR rtl-optimization/83565: Fix 32-bit rotate on ia64

2017-12-23 Thread James Clarke
On ia64, 32-bit rotates are implemented by copying the lower 32 bits of
a register into the upper half, then performing a right shift. However,
depending on the bit pattern in question, this can leave the upper 32
bits as non-zero, despite being only a 32-bit unsigned result. Therefore
add an extra zero_extract to mask these out.

gcc/
PR rtl-optimization/83565
* gcc/config/ia64/ia64.md ("*rotrsi3_internal"): Mask out higher 32
bits from the shift result.
("*rotlsi3_internal"): Likewise
---

[Resent because git send-email messed about with the headers, adding a
second From: with a different email address, presumably since I ran git
send-email from a directory with .git/config containing a user.email
setting...]

 gcc/config/ia64/ia64.md |   10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
index b7cd52b..8198b54 100644
--- a/gcc/config/ia64/ia64.md
+++ b/gcc/config/ia64/ia64.md
@@ -3329,7 +3329,10 @@
(ior:DI (zero_extend:DI (match_dup 1))
(ashift:DI (zero_extend:DI (match_dup 1)) (const_int 32
(set (match_dup 3)
-   (lshiftrt:DI (match_dup 3) (match_dup 2)))]
+   (lshiftrt:DI (match_dup 3) (match_dup 2)))
+   (set (match_dup 3)
+   (zero_extract:DI (match_dup 3)
+   (const_int 32) (const_int 0)))]
   "operands[3] = gen_rtx_REG (DImode, REGNO (operands[0]));")

 (define_expand "rotlsi3"
@@ -3358,7 +3361,10 @@
(ior:DI (zero_extend:DI (match_dup 1))
(ashift:DI (zero_extend:DI (match_dup 1)) (const_int 32
(set (match_dup 3)
-   (lshiftrt:DI (match_dup 3) (match_dup 2)))]
+   (lshiftrt:DI (match_dup 3) (match_dup 2)))
+   (set (match_dup 3)
+   (zero_extract:DI (match_dup 3)
+   (const_int 32) (const_int 0)))]
 {
   operands[3] = gen_rtx_REG (DImode, REGNO (operands[0]));
   operands[2] = GEN_INT (32 - INTVAL (operands[2]));
--
1.7.10.4



Re: [PATCH v2] PR rtl-optimization/83565: Fix 32-bit rotate on ia64

2017-12-23 Thread Jim Wilson

On 12/23/2017 04:36 PM, James Clarke wrote:

PR rtl-optimization/83565
* gcc/config/ia64/ia64.md ("*rotrsi3_internal"): Mask out higher 32
bits from the shift result.
("*rotlsi3_internal"): Likewise


This doesn't look right to me.  On ia64, the upper 32-bits of a 32-bit 
value in a 64-bit register are garbage bits.  So there should be no need 
to clear them here after the operation.


Note for instance that lshrsi3 clears the upper 32-bits before shifting 
right, because they are garbage bits.  And note for instance that 
ashlsi3 just shifts left, and doesn't care that it is putting garbage in 
the upper 32-bits.


I think either nonzero bits is broken in the SUBREG case, or ia64 
perhaps should not be setting WORD_REGISTER_OPERATIONS.


Jim



Re: Fix Debug DR2354

2017-12-23 Thread Jonathan Wakely
On 19 December 2017 at 21:33, François Dumont wrote:
> Hi
>
>   I plan to apply attached patch tomorrow to fix those _GLIBCXX_DEBUG
> failures:
>
> FAIL: 23_containers/map/modifiers/insert/dr2354.cc (test for excess errors)
> FAIL: 23_containers/multimap/modifiers/insert/dr2354.cc (test for excess
> errors)
> FAIL: 23_containers/unordered_map/insert/dr2354.cc (test for excess errors)
> FAIL: 23_containers/unordered_multimap/insert/dr2354.cc (test for excess
> errors)
>
>   I am also adding tests checking additional DR 2354 insert overloads. I
> discovered I needed to add those only because I looked at the ChangeLog
> entry.
>
>   Tested under Linux x86_64 Debug mode.

Thanks, this also needs to be fixed on gcc-7-branch.