Re: [097/nnn] poly_int: alter_reg

2017-11-28 Thread Jeff Law
On 10/23/2017 11:39 AM, Richard Sandiford wrote:
> This patch makes alter_reg cope with polynomial mode sizes.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * reload1.c (spill_stack_slot_width): Change element type
>   from unsigned int to poly_uint64_pod.
>   (alter_reg): Treat mode sizes as polynomial.
OK.
Jeff


Re: [096/nnn] poly_int: reloading complex subregs

2017-11-28 Thread Jeff Law
On 10/23/2017 11:39 AM, Richard Sandiford wrote:
> This patch splits out a condition that is common to both push_reload
> and reload_inner_reg_of_subreg.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * reload.c (complex_word_subreg_p): New function.
>   (reload_inner_reg_of_subreg, push_reload): Use it.
OK.
jeff



Re: [095/nnn] poly_int: process_alt_operands

2017-11-28 Thread Jeff Law
On 10/23/2017 11:38 AM, Richard Sandiford wrote:
> This patch makes process_alt_operands check that the mode sizes
> are ordered, so that match_reload can validly treat them as subregs
> of one another.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * lra-constraints.c (process_alt_operands): Reject matched
>   operands whose sizes aren't ordered.
>   (match_reload): Refer to this check here.
OK.
jeff
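
As a rough illustration of what "ordered" means for polynomial sizes, here is a minimal sketch, assuming a size has the form c0 + c1 * N for some runtime value N >= 0. The struct and function names are hypothetical and are not GCC's poly_int API. Two sizes are ordered when one of them is known to be no larger than the other for every N.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a polynomial size: value = c0 + c1 * N,
   where N >= 0 is only known at run time.  */
struct poly_size_sketch
{
  long c0;
  long c1;
};

/* True if A <= B for every N >= 0.  Checking N = 0 gives c0,
   and letting N grow without bound gives c1.  */
bool
size_known_le (struct poly_size_sketch a, struct poly_size_sketch b)
{
  assert (a.c0 >= 0 && a.c1 >= 0 && b.c0 >= 0 && b.c1 >= 0);
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

/* True if the two sizes can be ordered independently of N.  */
bool
sizes_ordered_p (struct poly_size_sketch a, struct poly_size_sketch b)
{
  return size_known_le (a, b) || size_known_le (b, a);
}
```

For example, a fixed 16-byte size and a 16 * N byte size are not ordered, because the first is larger when N is 0 and smaller when N is 2.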



Re: [094/nnn] poly_int: expand_ifn_atomic_compare_exchange_into_call

2017-11-28 Thread Jeff Law
On 10/23/2017 11:37 AM, Richard Sandiford wrote:
> This patch makes the mode size assumptions in
> expand_ifn_atomic_compare_exchange_into_call a bit more
> explicit, so that a later patch can add a to_constant () call.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * builtins.c (expand_ifn_atomic_compare_exchange_into_call): Assert
>   that the mode size is in the set {1, 2, 4, 8, 16}.
OK.
jeff
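
For illustration only, the size assumption could be written as below. This is a sketch, not the code from the patch: both function names are hypothetical, and the idea of turning the size into a small index is an assumption about how such a size might be used.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if SIZE is one of 1, 2, 4, 8 or 16,
   i.e. a power of two no larger than 16.  */
bool
atomic_size_supported_p (unsigned int size)
{
  return size != 0 && (size & (size - 1)) == 0 && size <= 16;
}

/* Hypothetical helper: map a supported size to an index 0..4
   (the base-2 logarithm of the size).  */
int
atomic_size_index (unsigned int size)
{
  assert (atomic_size_supported_p (size));
  int index = 0;
  while (size > 1)
    {
      size >>= 1;
      index++;
    }
  return index;
}
```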


[PATCH] Fix PR83158

2017-11-28 Thread Richard Biener

Recent changes caused VRP to create [-2147483646, +INF] from
merging ~[0, 0] and [-2147483646, +INF].  This causes i386 specific
folding of lzcnt to no longer trigger.  The following extends the
existing special-casing of ~[0, 0] from pointer-like types to
pointer types and integer types > int (covering integral arguments).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2017-11-28  Richard Biener  

PR tree-optimization/83158
* tree-vrp.c (intersect_ranges): Prefer ~[0, 0] in a few more
cases.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 255173)
+++ gcc/tree-vrp.c  (working copy)
@@ -6021,11 +6021,14 @@ intersect_ranges (enum value_range_type
   && vrp_val_is_max (vr1max))
;
  /* Choose the anti-range if it is ~[0,0], that range is special
-enough to special case when vr1's range is relatively wide.  */
+enough to special case when vr1's range is relatively wide.
+At least for types bigger than int - this covers pointers
+and arguments to functions like ctz.  */
  else if (*vr0min == *vr0max
   && integer_zerop (*vr0min)
-  && (TYPE_PRECISION (TREE_TYPE (*vr0min))
-  == TYPE_PRECISION (ptr_type_node))
+  && ((TYPE_PRECISION (TREE_TYPE (*vr0min))
+   >= TYPE_PRECISION (integer_type_node))
+  || POINTER_TYPE_P (TREE_TYPE (*vr0min)))
   && TREE_CODE (vr1max) == INTEGER_CST
   && TREE_CODE (vr1min) == INTEGER_CST
   && (wi::clz (wi::to_wide (vr1max) - wi::to_wide (vr1min))
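
To see why the choice between the two operands matters, here is a small standalone sketch. It assumes a simplified model of a range or anti-range over long long values; the struct and helpers are hypothetical and do not mirror the value_range type in tree-vrp.c. The exact intersection of ~[0, 0] and [-2147483646, +INF] cannot be written as a single range or anti-range, so one operand has to be kept: keeping the range forgets that the value is nonzero, while keeping ~[0, 0] forgets the lower bound.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simplified value range: [min, max], or ~[min, max]
   when ANTI is set.  */
struct value_range_sketch
{
  long long min;
  long long max;
  bool anti;
};

/* True if V is a member of VR.  */
bool
vr_contains (struct value_range_sketch vr, long long v)
{
  assert (vr.min <= vr.max);
  bool inside = v >= vr.min && v <= vr.max;
  return vr.anti ? !inside : inside;
}

/* True if V is in the exact intersection of A and B.  */
bool
vr_in_both (struct value_range_sketch a, struct value_range_sketch b,
            long long v)
{
  return vr_contains (a, v) && vr_contains (b, v);
}
```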


Re: [093/nnn] poly_int: adjust_mems

2017-11-28 Thread Jeff Law
On 10/23/2017 11:37 AM, Richard Sandiford wrote:
> This patch makes the var-tracking.c handling of autoinc addresses
> cope with polynomial mode sizes.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * var-tracking.c (adjust_mems): Treat mode sizes as polynomial.
>   Use plus_constant instead of gen_rtx_PLUS.
OK.
jeff


Re: [091/nnn] poly_int: emit_single_push_insn_1

2017-11-28 Thread Jeff Law
On 10/23/2017 11:36 AM, Richard Sandiford wrote:
> This patch makes emit_single_push_insn_1 cope with polynomial mode sizes.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * expr.c (emit_single_push_insn_1): Treat mode sizes as polynomial.
>   Use plus_constant instead of gen_rtx_PLUS.
OK.
jeff



Re: [090/nnn] poly_int: set_inc_state

2017-11-28 Thread Jeff Law
On 10/23/2017 11:36 AM, Richard Sandiford wrote:
> This trivial patch makes auto-inc-dec.c:set_inc_state take a poly_int64.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * auto-inc-dec.c (set_inc_state): Take the mode size as a poly_int64
>   rather than an int.
OK.
jeff


Re: [089/nnn] poly_int: expand_expr_real_1

2017-11-28 Thread Jeff Law
On 10/23/2017 11:35 AM, Richard Sandiford wrote:
> This patch makes the VIEW_CONVERT_EXPR handling in expand_expr_real_1
> cope with polynomial type and mode sizes.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * expr.c (expand_expr_real_1): Use tree_to_poly_uint64
>   instead of int_size_in_bytes when handling VIEW_CONVERT_EXPRs
>   via stack temporaries.  Treat the mode size as polynomial too.
OK.
jeff


Re: [088/nnn] poly_int: expand_expr_real_2

2017-11-28 Thread Jeff Law
On 10/23/2017 11:35 AM, Richard Sandiford wrote:
> This patch makes expand_expr_real_2 cope with polynomial mode sizes
> when handling conversions involving a union type.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * expr.c (expand_expr_real_2): When handling conversions involving
>   unions, apply tree_to_poly_uint64 to the TYPE_SIZE rather than
>   multiplying int_size_in_bytes by BITS_PER_UNIT.  Treat GET_MODE_BITSIZE
>   as a poly_uint64 too.
OK.
jeff


Re: [PATCH, Makefile.in] refine selftest recipes to restore mingw bootstrap

2017-11-28 Thread Olivier Hainque

> On Nov 28, 2017, at 07:20 , Jeff Law  wrote:

>> Given that, I prefer Olivier's patch - it seems simpler to me.
> OK.  I can live with it.

Thanks for your reviews Jeff and David. Will commit.

Windows hosts are full of surprises. It shouldn't be hard to
revisit the approach if need be.

With Kind Regards,

Olivier



[C PATCH] Handle C SWITCH_EXPR in block_may_fallthru (PR sanitizer/81275)

2017-11-28 Thread Jakub Jelinek
Hi!

This is the C version of the switch block_may_fallthru handling.
Unlike C++ SWITCH_STMT, break; is represented in SWITCH_EXPR by a goto
to a label emitted after the SWITCH_EXPR, so either block_may_fallthru
finds such label (but then doesn't find the SWITCH_EXPR), or it
finds SWITCH_EXPR, in which case if the body doesn't fall through (e.g.
ends with a return stmt), then it may fall through only if it doesn't
cover all the cases.

This patch adds a bit that signals that, and computes whether all cases
are covered (either if default: is present, or by walking the splay tree).
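
The coverage walk can be sketched outside the compiler roughly as follows. This is a simplified model under stated assumptions: the case labels are given as a sorted array of non-overlapping [low, high] ranges instead of a splay tree, all names are hypothetical, and the final comparison against the type's maximum is an assumption about how the walk finishes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one case label; a single-value case
   has low == high.  */
struct case_range_sketch
{
  long long low;
  long long high;
};

/* Return true if the sorted, non-overlapping CASES cover every value
   in [type_min, type_max] without a gap.  */
bool
cases_cover_all_values (const struct case_range_sketch *cases, size_t n,
                        long long type_min, long long type_max)
{
  if (n == 0)
    return false;
  /* The first label must start at the minimum of the type.  */
  if (cases[0].low > type_min)
    return false;
  long long highest = cases[0].high;
  for (size_t i = 1; i < n; i++)
    {
      /* Input is assumed sorted and non-overlapping.  */
      assert (cases[i].low > highest);
      /* Each label must start right after the highest value so far.  */
      if (cases[i].low != highest + 1)
        return false;
      highest = cases[i].high;
    }
  /* Assumption: the last label must reach the maximum of the type.  */
  return highest == type_max;
}
```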

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-11-28  Jakub Jelinek  

PR sanitizer/81275
* tree.c (block_may_fallthru): Return false if SWITCH_ALL_CASES_P
is set on SWITCH_EXPR and !block_may_fallthru (SWITCH_BODY ()).
c/
* c-typeck.c (c_finish_case): Set SWITCH_ALL_CASES_P if
c_switch_covers_all_cases_p returns true.
c-family/
* c-common.c (c_switch_covers_all_cases_p_1,
c_switch_covers_all_cases_p): New functions.
* c-common.h (c_switch_covers_all_cases_p): Declare.
testsuite/
* c-c++-common/tsan/pr81275.c: New test.

--- gcc/tree.c.jj   2017-11-27 14:36:09.0 +0100
+++ gcc/tree.c  2017-11-27 15:11:15.715131528 +0100
@@ -12339,6 +12339,12 @@ block_may_fallthru (const_tree block)
   return false;
 
 case SWITCH_EXPR:
+  /* If there is a default: label or case labels cover all possible
+SWITCH_COND values, then the SWITCH_EXPR will transfer control
+to some case label in all cases and all we care is whether the
+SWITCH_BODY falls through.  */
+  if (SWITCH_ALL_CASES_P (stmt))
+   return block_may_fallthru (SWITCH_BODY (stmt));
   return true;
 
 case COND_EXPR:
--- gcc/tree.h.jj   2017-11-27 14:34:38.0 +0100
+++ gcc/tree.h  2017-11-27 15:08:23.510250289 +0100
@@ -1175,6 +1175,10 @@ extern void protected_set_expr_location
 /* SWITCH_EXPR accessors. These give access to the condition and body.  */
 #define SWITCH_COND(NODE)   TREE_OPERAND (SWITCH_EXPR_CHECK (NODE), 0)
 #define SWITCH_BODY(NODE)   TREE_OPERAND (SWITCH_EXPR_CHECK (NODE), 1)
+/* True if there are case labels for all possible values of SWITCH_COND, either
+   because there is a default: case label or because the case label ranges cover
+   all values.  */
+#define SWITCH_ALL_CASES_P(NODE) (SWITCH_EXPR_CHECK (NODE)->base.private_flag)
 
 /* CASE_LABEL_EXPR accessors. These give access to the high and low values
of a case label, respectively.  */
--- gcc/c/c-typeck.c.jj 2017-11-27 14:27:53.0 +0100
+++ gcc/c/c-typeck.c2017-11-27 16:00:03.180982468 +0100
@@ -10407,6 +10407,8 @@ c_finish_case (tree body, tree type)
type ? type : TREE_TYPE (cs->switch_expr),
SWITCH_COND (cs->switch_expr),
cs->bool_cond_p, cs->outside_range_p);
+  if (c_switch_covers_all_cases_p (cs->cases, TREE_TYPE (cs->switch_expr)))
+SWITCH_ALL_CASES_P (cs->switch_expr) = 1;
 
   /* Pop the stack.  */
   c_switch_stack = cs->next;
--- gcc/c-family/c-common.h.jj  2017-11-20 19:55:39.0 +0100
+++ gcc/c-family/c-common.h 2017-11-27 16:08:30.740827358 +0100
@@ -975,6 +975,7 @@ extern int case_compare (splay_tree_key,
 
 extern tree c_add_case_label (location_t, splay_tree, tree, tree, tree, tree,
  bool *);
+extern bool c_switch_covers_all_cases_p (splay_tree, tree);
 
 extern tree build_function_call (location_t, tree, tree);
 
--- gcc/c-family/c-common.c.jj  2017-11-21 14:56:50.0 +0100
+++ gcc/c-family/c-common.c 2017-11-27 16:09:52.555839861 +0100
@@ -4904,6 +4904,64 @@ c_add_case_label (location_t loc, splay_
   return error_mark_node;
 }
 
+/* Subroutine of c_switch_covers_all_cases_p, called via
+   splay_tree_foreach.  Return 1 if it doesn't cover all the cases.
+   ARGS[0] is initially NULL and after the first iteration is the
+   so far highest case label.  ARGS[1] is the minimum of SWITCH_COND's
+   type.  */
+
+static int
+c_switch_covers_all_cases_p_1 (splay_tree_node node, void *data)
+{
+  tree label = (tree) node->value;
+  tree *args = (tree *) data;
+
+  /* If there is a default case, we shouldn't have called this.  */
+  gcc_assert (CASE_LOW (label));
+
+  if (args[0] == NULL_TREE)
+{
+  if (wi::to_widest (args[1]) < wi::to_widest (CASE_LOW (label)))
+   return 1;
+}
+  else if (wi::add (wi::to_widest (args[0]), 1)
+  != wi::to_widest (CASE_LOW (label)))
+return 1;
+  if (CASE_HIGH (label))
+args[0] = CASE_HIGH (label);
+  else
+args[0] = CASE_LOW (label);
+  return 0;
+}
+
+/* Return true if switch with CASES and switch condition with type
+   covers all possible values in the case labels.  */
+
+bool
+c_switch_covers_all_cases_p (splay_tree cases, tree type)
+{
+  /* If there is default:, this is always the case.  */
+  splay_tree_n

[C++ PATCH] Avoid -Wreturn-type warnings if a switch has default label, no breaks inside of it, but is followed by a break (PR sanitizer/81275, take 2)

2017-11-28 Thread Jakub Jelinek
On Mon, Nov 27, 2017 at 02:01:05PM +0100, Jakub Jelinek wrote:
> You are right that I can remove the || SWITCH_STMT_BODY (stmt) == NULL_TREE,
> part, because then there wouldn't be any case labels in it either.

...

Here is an updated patch, on top of the C patch I've just posted:
http://gcc.gnu.org/ml/gcc-patches/2017-11/msg02372.html
(though that dependency could be easily removed if needed by dropping the
c_switch_covers_all_cases_p call and SWITCH_ALL_CASES_P setting from
SWITCH_STMT_ALL_CASES_P).
Note, looking for default is still needed, because in templates we do not
build the cases splay tree and therefore would never set
SWITCH_STMT_ALL_CASES_P.  Computing the cases splay tree is probably too
expensive, but default tracking is cheap.
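
As an illustration of the kind of source this is about, here is a minimal sketch written in plain C so that it matches the C patch as well. The function is hypothetical. The switch has a default label and every path through its body returns; the only break belongs to the nested for loop, not to the switch, which is why break statements seen inside an iteration body are tracked separately.

```c
#include <assert.h>

/* Hypothetical example: control never reaches the end of the
   function, although there is no return after the switch.  */
int
sum_until (int kind, int limit)
{
  assert (limit >= 0);
  switch (kind)
    {
    case 0:
      return 0;
    default:
      {
        int total = 0;
        for (int i = 1; ; i++)
          {
            if (total + i > limit)
              break;	/* Leaves the for loop, not the switch.  */
            total += i;
          }
        return total;
      }
    }
}
```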

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-11-28  Jakub Jelinek  

PR sanitizer/81275
* cp-tree.h (SWITCH_STMT_ALL_CASES_P): Define.
(SWITCH_STMT_NO_BREAK_P): Define.
(note_break_stmt, note_iteration_stmt_body_start,
note_iteration_stmt_body_end): Declare.
* decl.c (struct cp_switch): Add has_default_p, break_stmt_seen_p
and in_loop_body_p fields. 
(push_switch): Clear them.
(pop_switch): Set SWITCH_STMT_CANNOT_FALLTHRU_P if has_default_p
and !break_stmt_seen_p.  Assert in_loop_body_p is false.
(note_break_stmt, note_iteration_stmt_body_start,
note_iteration_stmt_body_end): New functions.
(finish_case_label): Set has_default_p when both low and high
are NULL_TREE.
* parser.c (cp_parser_iteration_statement): Use
note_iteration_stmt_body_start and note_iteration_stmt_body_end
around parsing iteration body.
* pt.c (tsubst_expr): Likewise.
* cp-objcp-common.c (cxx_block_may_fallthru): Return false for
SWITCH_STMT which contains no BREAK_STMTs, contains a default:
CASE_LABEL_EXPR and where SWITCH_STMT_BODY isn't empty and
can't fallthru.
* semantics.c (finish_break_stmt): Call note_break_stmt.
* cp-gimplify.c (genericize_switch_stmt): Copy SWITCH_STMT_ALL_CASES_P
bit to SWITCH_ALL_CASES_P.  Assert that if SWITCH_STMT_NO_BREAK_P then
the break label is not TREE_USED.

* g++.dg/warn/pr81275-1.C: New test.
* g++.dg/warn/pr81275-2.C: New test.
* g++.dg/warn/pr81275-3.C: New test.
* c-c++-common/tsan/pr81275.c: Skip for C++ and -O2.

--- gcc/cp/cp-tree.h.jj 2017-11-27 09:30:51.742077295 +0100
+++ gcc/cp/cp-tree.h2017-11-27 16:29:44.806456638 +0100
@@ -364,6 +364,7 @@ extern GTY(()) tree cp_global_trees[CPTI
   IF_STMT_CONSTEXPR_P (IF_STMT)
   TEMPLATE_TYPE_PARM_FOR_CLASS (TEMPLATE_TYPE_PARM)
   DECL_NAMESPACE_INLINE_P (in NAMESPACE_DECL)
+  SWITCH_STMT_ALL_CASES_P (in SWITCH_STMT)
1: IDENTIFIER_KIND_BIT_1 (in IDENTIFIER_NODE)
   TI_PENDING_TEMPLATE_FLAG.
   TEMPLATE_PARMS_FOR_INLINE.
@@ -395,6 +396,7 @@ extern GTY(()) tree cp_global_trees[CPTI
   AGGR_INIT_ZERO_FIRST (in AGGR_INIT_EXPR)
   CONSTRUCTOR_MUTABLE_POISON (in CONSTRUCTOR)
   OVL_HIDDEN_P (in OVERLOAD)
+  SWITCH_STMT_NO_BREAK_P (in SWITCH_STMT)
3: (TREE_REFERENCE_EXPR) (in NON_LVALUE_EXPR) (commented-out).
   ICS_BAD_FLAG (in _CONV)
   FN_TRY_BLOCK_P (in TRY_BLOCK)
@@ -4840,6 +4842,14 @@ more_aggr_init_expr_args_p (const aggr_i
 #define SWITCH_STMT_BODY(NODE) TREE_OPERAND (SWITCH_STMT_CHECK (NODE), 1)
 #define SWITCH_STMT_TYPE(NODE) TREE_OPERAND (SWITCH_STMT_CHECK (NODE), 2)
 #define SWITCH_STMT_SCOPE(NODE) TREE_OPERAND (SWITCH_STMT_CHECK (NODE), 3)
+/* True if there are case labels for all possible values of switch cond, either
+   because there is a default: case label or because the case label ranges cover
+   all values.  */
+#define SWITCH_STMT_ALL_CASES_P(NODE) \
+  TREE_LANG_FLAG_0 (SWITCH_STMT_CHECK (NODE))
+/* True if the body of a switch stmt contains no BREAK_STMTs.  */
+#define SWITCH_STMT_NO_BREAK_P(NODE) \
+  TREE_LANG_FLAG_2 (SWITCH_STMT_CHECK (NODE))
 
 /* STMT_EXPR accessor.  */
 #define STMT_EXPR_STMT(NODE)   TREE_OPERAND (STMT_EXPR_CHECK (NODE), 0)
@@ -6102,6 +6112,9 @@ enum cp_tree_node_structure_enum cp_tree
 extern void finish_scope   (void);
 extern void push_switch(tree);
 extern void pop_switch (void);
+extern void note_break_stmt(void);
+extern bool note_iteration_stmt_body_start (void);
+extern void note_iteration_stmt_body_end   (bool);
 extern tree make_lambda_name   (void);
 extern int decls_match (tree, tree);
 extern bool maybe_version_functions(tree, tree);
--- gcc/cp/decl.c.jj2017-11-27 09:30:51.744077271 +0100
+++ gcc/cp/decl.c   2017-11-27 16:32:01.308812566 +0100
@@ -3427,6 +3427,13 @@ struct cp_switch
   /* Remember whether there was a case value that is outside the
  range of th

[PATCH, committed] Add myself to MAINTAINERS

2017-11-28 Thread Koval, Julia
2017-11-28  Julia Koval  

    * MAINTAINERS (write after approval): Add myself.



Re: [PATCH] Fix PR80776

2017-11-28 Thread Richard Biener
On Mon, 27 Nov 2017, Jeff Law wrote:

> On 11/27/2017 06:39 AM, Richard Biener wrote:
> > 
> > The following avoids -Wformat-overflow false positives by teaching
> > EVRP the trick about __builtin_unreachable () "other" edges and
> > attaching range info to SSA names.  EVRP does a better job in keeping
> > ranges for every SSA name from conditional info (VRP "optimizes" its
> > costly ASSERT_EXPR insertion process).
> > 
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > 
> > This will also fix the testcase from PR83072 but it doesn't really
> > fix all cases I want to fix with a fix for it.  OTOH it might be
> > this is enough for stage3.
> > 
> > Richard.
> > 
> > 2017-11-27  Richard Biener  
> > 
> > PR tree-optimization/80776
> > * gimple-ssa-evrp-analyze.h (evrp_range_analyzer::set_ssa_range_info):
> > Declare.
> > * gimple-ssa-evrp-analyze.c (evrp_range_analyzer::set_ssa_range_info):
> > New function.
> > (evrp_range_analyzer::record_ranges_from_incoming_edges):
> > If the incoming edge is an effective fallthru because the other
> > edge only reaches a __builtin_unreachable () then record ranges
> > derived from the controlling condition in SSA info.
> > (evrp_range_analyzer::record_ranges_from_phis): Use set_ssa_range_info.
> > (evrp_range_analyzer::record_ranges_from_stmt): Likewise.
> > 
> > * gcc.dg/pr80776-1.c: New testcase.
> > * gcc.dg/pr80776-2.c: Likewise.
> So the thing to make sure of here is that the range information we
> reflect back into the SSA_NAME actually applies everywhere the SSA_NAME
> can appear.  ie, it's globally valid.
> 
> This means we can't reflect anything we derive from conditionals or
> things like a *p making the range non-null back to the SSA_NAME.
> 
> I'd be concerned about the change to record_ranges_from_incoming_edge.

It's basically a copy of what VRP does when removing range assertions.
I've added the correctness check I missed and also the trick with
setting nonzero bits.

This causes us to no longer handle the gcc.dg/pr80776-1.c case:

   :
  i_4 = somerandom ();
  if (i_4 < 0)
goto ; [INV]
  else
goto ; [INV]

   :
  __builtin_unreachable ();

   :
  i.0_1 = (unsigned int) i_4;
  if (i.0_1 > 99)
goto ; [INV]
  else
goto ; [INV]

   :
  __builtin_unreachable ();

   :
  _7 = __builtin___sprintf_chk (&number, 1, 7, "%d", i_4);

when trying to update the SSA range info for i_4 from the
if (i.0_1 > 99) we see the use in the dominating condition
and thus conclude we cannot update the SSA range info like we want.
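
For reference, source of roughly the following shape produces a dump like the one above. This is a sketch and not the actual gcc.dg/pr80776-1.c testcase: the buffer size and function name are made up, and __builtin_unreachable is a GCC builtin (also accepted by clang), so this is not portable C.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical buffer: room for "99" plus the terminating NUL.  */
char number[3];

/* The two __builtin_unreachable () calls tell the compiler that I is
   in [0, 99]; without that knowledge a "%d" directive could need up
   to 11 characters.  */
const char *
format_small (int i)
{
  if (i < 0)
    __builtin_unreachable ();
  if ((unsigned int) i > 99)
    __builtin_unreachable ();
  int written = sprintf (number, "%d", i);
  assert (written >= 1 && written <= 2);
  return number;
}
```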

ifcombine also doesn't merge the tests because I think it gets
confused by the __builtin_unreachable ().  ifcombine also runs after
VRP1 which gets rid of the __builtin_unreachable ()s.

I think for GCC 9 we may want to experiment with moving ifcombine
before VRP1 and handling if-chains with __builtin_unreachable ()s.

Anyway, re-testing below.

Richard.

2017-11-28  Richard Biener  

PR tree-optimization/80776
* gimple-ssa-evrp-analyze.h (evrp_range_analyzer::set_ssa_range_info):
Declare.
* gimple-ssa-evrp-analyze.c (evrp_range_analyzer::set_ssa_range_info):
New function.
(evrp_range_analyzer::record_ranges_from_incoming_edges):
If the incoming edge is an effective fallthru because the other
edge only reaches a __builtin_unreachable () then record ranges
derived from the controlling condition in SSA info.
(evrp_range_analyzer::record_ranges_from_phis): Use set_ssa_range_info.
(evrp_range_analyzer::record_ranges_from_stmt): Likewise.

* gcc.dg/pr80776-1.c: New testcase.
* gcc.dg/pr80776-2.c: Likewise.

Index: gcc/gimple-ssa-evrp-analyze.c
===
--- gcc/gimple-ssa-evrp-analyze.c   (revision 255172)
+++ gcc/gimple-ssa-evrp-analyze.c   (working copy)
@@ -91,6 +91,62 @@ evrp_range_analyzer::try_find_new_range
   return NULL;
 }
 
+/* For LHS record VR in the SSA info.  */
+void
+evrp_range_analyzer::set_ssa_range_info (tree lhs, value_range *vr)
+{
+  /* Set the SSA with the value range.  */
+  if (INTEGRAL_TYPE_P (TREE_TYPE (lhs)))
+{
+  if ((vr->type == VR_RANGE
+  || vr->type == VR_ANTI_RANGE)
+ && (TREE_CODE (vr->min) == INTEGER_CST)
+ && (TREE_CODE (vr->max) == INTEGER_CST))
+   set_range_info (lhs, vr->type,
+   wi::to_wide (vr->min),
+   wi::to_wide (vr->max));
+}
+  else if (POINTER_TYPE_P (TREE_TYPE (lhs))
+  && ((vr->type == VR_RANGE
+   && range_includes_zero_p (vr->min,
+ vr->max) == 0)
+  || (vr->type == VR_ANTI_RANGE
+  && range_includes_zero_p (vr->min,
+vr->max) == 1)))
+set_ptr_nonnull (lhs);
+}
+
+/* Return true if all uses of NAME are dominated by STMT or feed S

Re: [PATCH] Fix hot/cold partitioning with -gstabs{,+} (PR debug/81307)

2017-11-28 Thread Richard Biener
On Mon, 27 Nov 2017, Jim Wilson wrote:

> On 11/27/2017 12:21 AM, Richard Biener wrote:
> > Let's formally deprecate any non-DWARF debugging format support
> > in GCC 8 changes.html?
> 
> I think we need to be careful about that.  I've been looking at the stabs
> support after I finished removing the sdb/coff debug info support.  We have 3
> cpu targets that only support stabs (avr, pdp11, vax).  We have two OS targets
> that only support stabs, though in both cases the cpus do support dwarf and
> hence they could switch.  We have two embedded targets that support the
> --with-stabs configure option.  We have a few targets that might default to
> stabs if used with old obsolete assembler versions that probably don't work
> anymore anyways.  And we have AIX which continues to use XCOFF_DEBUG which is
> a variant of stabs, and shares some of the same code.  We also have vms which
> apparently uses a mixture of vms and dwarf2 debugging.  Does that count as a
> non-DWARF debugging format?
> 
> I think a good first step would be to emit a compile-time warning if gcc was
> configured to emit stabs by default.  E.g. maybe something like
> #if PREFERRED_DEBUGGING_TYPE == DBX_DEBUG
> #warning "dbx/stabs debug info support is being deprecated"
> #endif
> in a popular header file.  That gives maintainers a chance to fix their ports.
> I was planning to propose this after gcc-8, but we could maybe add it to
> gcc-8.  Then maybe in the next version this changes to a #error, so you can't
> build a gcc that emits stabs by default, but users can still get stabs with
> -gstabs.  We can then emit a run-time warning when a user uses -gstabs, and
> then later a run-time error when -gstabs is used.  That gives users a chance
> to migrate before it breaks.  Then we can remove -gstabs, careful to avoid
> breaking xcoff support, until IBM agrees in the futility of continuing to
> maintain it and lets us remove xcoff too.

I think we want to decommission -gstabs for all targets that do not
default to it (or variants of it) and adjust documentation to
reflect that -gstabs is unmaintained and deprecated.  We also want to
deprecate --with-stabs if there is an alternative, preferably emitting
a warning for this configure option in GCC 8 and removing support in GCC 9.

I think embedded folks stick to STABS because of its smaller size, not
because it is in any way useful.

Richard.


Re: [fortran] Add support for #pragma GCC unroll v3

2017-11-28 Thread Eric Botcazou
> The patch looks ok to me.

Thanks.

> For documentation, the gfortran manual has 2 sections:
> 
> 6.1 Extensions implemented in GNU Fortran
> 7.2 GNU Fortran Compiler Directives
> 
> 6.1 describes extension covering legacy code and vendor extensions.
> 7.2 describes other !$GCC directives.  Currently, the section is
> mainly calling conventions (CDECL, STDCALL, etc) and library
> macros (DLLEXPORT).  These should probably be in 7.2.1 and the
> UNROLL directive in 7.2.2.
> 
> I can help with the documentation (although it might take a weekend
> or two to get done), but need to know semantics.  Does the directive
> apply to only the immediately following loop?  Does it apply to all
> loops that follow the directive?

The former.  Here's the documentation for the C & C++ compilers:

`#pragma GCC unroll N'
 You can use this pragma to control how many times a loop should be
 unrolled.  It must be placed immediately before a `for', `while'
 or `do' loop or a `#pragma GCC ivdep', and applies only to the
 loop that follows.  N is an integer constant expression specifying
 the unrolling factor.  The values of 0 and 1 block any unrolling
 of the loop.

> What is the interaction of the directive with -funroll-loops and --param
> max-unroll-times=4?

It's independent and always prevails, i.e. it doesn't need -funroll-loops to
be effective, #pragma GCC unroll 0 will block unrolling despite -funroll-loops
and #pragma GCC unroll N wins over --param max-unroll-times=M.
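
A minimal C sketch of the pragma as documented above (the function names are made up; other compilers may ignore or warn about the pragma):

```c
#include <assert.h>
#include <stddef.h>

/* Ask for an unrolling factor of 4 on the loop that follows.  */
int
sum_unrolled (const int *a, int n)
{
  assert (a != NULL || n == 0);
  int sum = 0;
#pragma GCC unroll 4
  for (int i = 0; i < n; i++)
    sum += a[i];
  return sum;
}

/* A factor of 1 (or 0) blocks unrolling of the loop that follows.  */
int
sum_not_unrolled (const int *a, int n)
{
  assert (a != NULL || n == 0);
  int sum = 0;
#pragma GCC unroll 1
  for (int i = 0; i < n; i++)
    sum += a[i];
  return sum;
}
```

The pragma only affects code generation; both functions return the same result.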

-- 
Eric Botcazou


Re: [PATCH v2 1/4] [SPARC] Errata workaround for GRLIB-TN-0012

2017-11-28 Thread Eric Botcazou
> 2017-11-17  Daniel Cederman  
> 
>   * config/sparc/sparc.c (fpop_insn_p): New function.
>   (sparc_do_work_around_errata): Insert NOP instructions to
>   prevent sequences that could trigger the TN-0012 errata for
>   GR712RC.
>   (pass_work_around_errata::gate): Also test sparc_fix_gr712rc.
>   * config/sparc/sparc.md (fix_gr712rc): New attribute.
>   (in_branch_annul_delay): Prevent floating-point instructions
>   in delay slot of annulled integer branch.

Sorry, I should have been more explicit in my first reply, because:

> @@ -590,6 +594,26 @@
>  (const_string "true")
>   ] (const_string "false")))
> 
> +(define_attr "in_integer_branch_annul_delay" "false,true"
> +  (cond [(eq_attr "type"
> "uncond_branch,branch,cbcond,uncond_cbcond,call,sibcall,call_no_delay_slot,
> multi") +(const_string "false")
> +  (and (eq_attr "fix_gr712rc" "true")
> +   (eq_attr "type" "fp,fpcmp,fpmove,fpcmove,fpmul,
> +fpdivs,fpsqrts,fpdivd,fpsqrtd"))
> +(const_string "false")
> +  (and (eq_attr "fix_b2bst" "true") (eq_attr "type" "store,fpstore"))
> +(const_string "false")
> +  (and (eq_attr "fix_ut699" "true") (eq_attr "type" "load,sload"))
> +(const_string "false")
> +  (and (eq_attr "fix_ut699" "true")
> +   (and (eq_attr "type" "fpload,fp,fpmove,fpmul,fpdivs,fpsqrts")
> +(ior (eq_attr "fptype" "single")
> + (eq_attr "fptype_ut699" "single"
> +(const_string "false")
> +  (eq_attr "length" "1")
> +(const_string "true")
> + ] (const_string "false")))
> +
>  (define_delay (eq_attr "type" "call")
>[(eq_attr "in_call_delay" "true") (nil) (nil)])

is barely maintainable.  So let's go back to the original version and...

> @@ -602,6 +626,10 @@
>  (define_delay (eq_attr "type" "branch")
>[(eq_attr "in_branch_delay" "true") (nil) (eq_attr "in_branch_delay"
> "true")])
> 
> +(define_delay (and (eq_attr "type" "branch") (eq_attr "branch_type" "icc"))
> +  [(eq_attr "in_branch_delay" "true") (nil)
> +  (eq_attr "in_integer_branch_annul_delay" "true")])
> +
>  (define_delay (eq_attr "type" "uncond_branch")
>[(eq_attr "in_branch_delay" "true") (nil) (nil)])

...add (and (.) (not (eq_attr "branch_type" "icc")) to the first define_delay.

-- 
Eric Botcazou


Re: [PATCH v2 2/4] [SPARC] Errata workaround for GRLIB-TN-0011

2017-11-28 Thread Eric Botcazou
> 2017-11-17  Daniel Cederman  
> 
>   * config/sparc/sync.md (swapsi): 16-byte align if sparc_fix_gr712rc.
>   (atomic_compare_and_swap_leon3_1): Likewise.
>   (ldstub): Likewise.

OK for mainline and 7 branch, thanks.

-- 
Eric Botcazou


Re: [PATCH v2 3/4] [SPARC] Errata workaround for GRLIB-TN-0010

2017-11-28 Thread Eric Botcazou
> 2017-11-17  Daniel Cederman  
> 
>   * config/sparc/sparc.c (atomic_insn_p): New function.
>   (sparc_do_work_around_errata): Insert NOP instructions to
>   prevent sequences that could trigger the TN-0010 errata for
>   UT700.
>   * config/sparc/sync.md (atomic_compare_and_swap_leon3_1): Make
>   instruction referable in atomic_insn_p.

OK for mainline and 7 branch, thanks.

-- 
Eric Botcazou


Re: [PATCH v2 4/4] [SPARC] Errata workaround for GRLIB-TN-0013

2017-11-28 Thread Eric Botcazou
> 2017-11-17  Daniel Cederman  
> 
>   * config/sparc/sparc.c (fpop_reg_depend_p): New function.
>   (div_sqrt_insn_p): New function.
>   (sparc_do_work_around_errata): Insert NOP instructions to
>   prevent sequences that could trigger the TN-0013 errata for
>   certain LEON3 processors.
>   (pass_work_around_errata::gate): Also test sparc_fix_tn0013.
>   (sparc_option_override): Set sparc_fix_tn0013 appropriately.
>   * config/sparc/sparc.md (fix_tn0013): New attribute.
>   (in_branch_delay): Prevent div and sqrt in delay slot if fix_tn0013.
>   * config/sparc/sparc.opt (sparc_fix_tn0013): New variable.

OK for mainline and 7 branch modulo:

> +   /* Check if this is a problematic sequence.  */
> +   if (i > 1
> +   && fp_found >= 2
> +   && div_sqrt_insn_p (after))
> + {
> +   /* If is this is the short version of the problematic
> +  sequence we add two NOPs in a row to also prevent
> +  the long version.  */
> +   if (i == 2)
> + emit_insn_before (gen_nop (), next);
> +   insert_nop = true;
> +   break;
> + }

Superfluous "is".

-- 
Eric Botcazou


RE: [PATCH, Makefile.in] refine selftest recipes to restore mingw bootstrap

2017-11-28 Thread Tamar Christina
Sending reply to list.

> -Original Message-
> From: Tamar Christina
> Sent: Tuesday, November 28, 2017 10:02
> To: 'David Malcolm' ; Olivier Hainque
> ; Jeff Law 
> Cc: GCC Patches 
> Subject: RE: [PATCH, Makefile.in] refine selftest recipes to restore mingw
> bootstrap
> 
> > -Original Message-
> > From: David Malcolm [mailto:dmalc...@redhat.com]
> > Sent: Monday, November 27, 2017 22:16
> > To: Olivier Hainque ; Jeff Law 
> > Cc: GCC Patches ; Tamar Christina
> > 
> > Subject: Re: [PATCH, Makefile.in] refine selftest recipes to restore
> > mingw bootstrap
> >
> > On Mon, 2017-11-27 at 09:40 +0100, Olivier Hainque wrote:
> > > (typo in David's email address in the previous message, resending.
> > > sorry for the duplicates)
> > >
> > > Hi Jeff,
> > >
> > > (Thanks for your feedback)
> > >
> > > > On Nov 27, 2017, at 04:55 , Jeff Law  wrote:
> > > >
> > > > >   * Makefile.in (SELFTEST_FLAGS): Use nul instead of /dev/null
> > > > >   on mingw build hosts.
> > > >
> > > > Would it make more sense to set the output file to HOST_BIT_BUCKET
> > > > when -fselftest is active?
> > >
> > >
> > > Hmm, possibly. Interesting suggestion :)
> > >
> > > David, who introduced the framework, would know for sure - cc'ed.
> > >
> > > David, thoughts on this ?
> >
> > [CCing Tamar]
> >
> > The selftests currently assume both:
> > (a) reading a source file from /dev/null, and
> > (b) writing the generated asm to /dev/null
> >
> > I note that r241895 added "-o /dev/null" for the asm output to "Fix
> > the Windows native x86-64 build.":
> >
> > +2016-11-07  Tamar Christina  
> > +
> > +   PR driver/78196
> > +   * Makefile.in (SELFTEST_FLAGS): Added -o /dev/null.
> > +
> 
 Yeah, I chose this fix because /dev/null is converted to "nul" by the
 Cygwin/msys2 runtime. So when invoked on the shell it gets converted
 before GCC sees it.
 
 Nothing against using nul directly, though I am surprised that there's an
 environment that provides configure, automake and coreutils which doesn't
 convert /dev/null to nul.
 
Thanks,
Tamar
> 
> 
> > HOST_BIT_BUCKET seems to be a header thing, rather than a Makefile
> > thing; presumably there would need to be logic to handle both input
> > and output for the HOST_BIT_BUCKET approach.
> >
> > Given that, I prefer Olivier's patch - it seems simpler to me.
> >
> > Dave
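
For illustration, choosing the bit bucket per host inside a program, in the spirit of the HOST_BIT_BUCKET idea mentioned above, might look like the sketch below. The function names are hypothetical and the preprocessor test is an assumption about how a native Windows host would be detected; it is not taken from GCC's sources.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: name of the host's "discard everything" file.  */
const char *
bit_bucket_path (void)
{
#if defined (_WIN32) && !defined (__CYGWIN__)
  return "nul";		/* Native Windows hosts such as mingw.  */
#else
  return "/dev/null";	/* POSIX-like hosts.  */
#endif
}

/* Write TEXT to the bit bucket; return 0 on success, -1 on failure.  */
int
discard_text (const char *text)
{
  assert (text != NULL);
  FILE *sink = fopen (bit_bucket_path (), "w");
  if (sink == NULL)
    return -1;
  int status = (fputs (text, sink) == EOF) ? -1 : 0;
  if (fclose (sink) != 0)
    status = -1;
  return status;
}
```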


Re: [PATCH v2 4/4] [SPARC] Errata workaround for GRLIB-TN-0013

2017-11-28 Thread Eric Botcazou
> > +(and (eq_attr "fix_lost_divsqrt" "true")
> > + (eq_attr "type" "fpdivs,fpsqrts,fpdivd,fpsqrtd"))
> > +  (const_string "false")
> 
> These lines should also be added to the in_integer_branch_annul_delay
> attribute.

Let's not though and make the modification I suggested instead.

-- 
Eric Botcazou


Re: [PATCH 1/2] [SPARC] Prevent -mfix-ut699 from generating b2bst errata sequences

2017-11-28 Thread Eric Botcazou
> 2017-11-27  Martin Aberg  
> 
>   * config/sparc/sparc.md (divdf3_fix): Add NOP and adjust length
> to prevent b2bst errata sequence.
> (sqrtdf2_fix): Likewise.

OK for mainline and 7 branch, thanks.

-- 
Eric Botcazou


Re: [PATCH 2/2] [SPARC] Recognize the load when accessing the GOT

2017-11-28 Thread Eric Botcazou
> 2017-11-27  Daniel Cederman  
> 
>   * config/sparc/sparc.c (sparc_do_work_around_errata): Treat the
> movsi_pic_gotdata_op instruction as a load for the UT699 errata
> workaround.

OK for mainline, 7 and 6 branches, thanks.

-- 
Eric Botcazou


[committed] Add testcase for PR rtl-optimization/81020

2017-11-28 Thread Jakub Jelinek
Hi!

This testcase was fixed by Segher's r254875:
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00340.html
on the trunk, so I've committed it to trunk as obvious.
On release branches it still needs fixing.

2017-11-28  Jakub Jelinek  

PR rtl-optimization/81020
* gcc.dg/pr81020.c: New test.

--- gcc/testsuite/gcc.dg/pr81020.c.jj   2017-11-27 18:49:06.659907687 +0100
+++ gcc/testsuite/gcc.dg/pr81020.c  2017-11-27 18:49:00.431982443 +0100
@@ -0,0 +1,23 @@
+/* PR rtl-optimization/81020 */
+/* { dg-do run } */
+/* { dg-options "-O -fno-tree-bit-ccp -fno-tree-coalesce-vars -fno-tree-vrp" } */
+
+unsigned v = 4;
+
+unsigned long long __attribute__((noipa))
+foo (unsigned x)
+{
+  unsigned a = v;
+  a &= 1;
+  x |= 0 < a;
+  a >>= 31;
+  return x + a;
+}
+
+int
+main ()
+{
+  if (foo (2) != 2)
+__builtin_abort ();
+  return 0;
+}

Jakub


Re: [PATCH, Makefile.in] refine selftest recipes to restore mingw bootstrap

2017-11-28 Thread Olivier Hainque
Hello Tamar,

> On Nov 28, 2017, at 11:05 , Tamar Christina  wrote:
> 
> Yeah I choose this fix because /dev/null is converted to "nul" by the
> Cygwin/msys2 runtime. So when invoked On the shell it gets converted
> before GCC sees it.
> 
> Nothing against using nul directly, though I am surprised that there's an
> environment that provides Configure, automake and coreutils which doesn't
> convert /dev/null to nul.

My understanding is that it's not only a property of
the environment in which the tool gets executed, but also of
the tool itself.

For gcc in particular, ISTM it depends on the target for
which you configured.

Our toolchains are configured for x86_64-pc-mingw32

Maybe yours are configured for cygwin, not mingw (are they ?),
and then are linked with cygwin libraries which perform various
transformations.

Olivier




RE: [PATCH, Makefile.in] refine selftest recipes to restore mingw bootstrap

2017-11-28 Thread Tamar Christina
Hi Olivier,

> -Original Message-
> From: Olivier Hainque [mailto:hain...@adacore.com]
> Sent: Tuesday, November 28, 2017 10:40
> To: Tamar Christina 
> Cc: Olivier Hainque ; David Malcolm
> ; Jeff Law ; GCC Patches  patc...@gcc.gnu.org>; nd 
> Subject: Re: [PATCH, Makefile.in] refine selftest recipes to restore mingw
> bootstrap
> 
> Hello Tamar,
> 
> > On Nov 28, 2017, at 11:05 , Tamar Christina 
> wrote:
> >
> > Yeah I choose this fix because /dev/null is converted to "nul" by the
> > Cygwin/msys2 runtime. So when invoked On the shell it gets converted
> > before GCC sees it.
> >
> > Nothing against using nul directly, though I am surprised that there's
> > an environment that provides Configure, automake and coreutils which
> > doesn't convert /dev/null to nul.
> 
> My understanding is that it's not only a property of the environment in which
> the tool gets executed, but also of the tool itself.
> 
> For gcc in particular, ISTM it depends on the target for which you configured.

This is true, but in this case the tool should never have seen "/dev/null",
since the invocation of the command that uses SELFTEST_FLAGS should have
converted it. Unless you're using something like mingw32-make, which would
explain the difference.

> 
> Our toolchains are configured for x86_64-pc-mingw32
> 
> Maybe yours are configured for cygwin, not mingw (are they ?), and then are
> linked with cygwin libraries which perform various transformations.

No, mine are configured for x86_64-w64-mingw32 and are just native gcc builds.
They're just built using Cygwin-derived tools so as not to encounter certain
differences like this one.

I think using nul is fine if it makes your use case work; it shouldn't break
the mingw64 builds.

Thanks,
Tamar

> 
> Olivier
> 



Re: [PATCH] Fix ms-sysv.exp testsuite FAILs (PR c/83117)

2017-11-28 Thread Jakub Jelinek
On Mon, Nov 27, 2017 at 05:02:32PM -0600, Daniel Santos wrote:
> > --- gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/gen.cc.jj   2017-05-22 10:49:45.0 +0200
> > +++ gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/gen.cc  2017-11-27 11:57:14.889570915 +0100
> > @@ -392,7 +392,7 @@ static void make_do_tests_decl (const ve
> > continue;
> >  
> >   comma.reset ();
> > - out << "static __attribute__ ((ms_abi)) long (*const do_test_"
> > + out << "static __attribute__ ((ms_abi)) long (*do_test_"
> >   << (unaligned ? "u" : "")
> >   << (varargs ? "v" : "") << i << ") (";
> 
> I don't have a problem with removing const, it's only there for
> const-correctness and caution.  I just posted to the PR a bit ago and
> I'm curious if there is a better approach when using assembly stubs that
> are meant to be called in varying ways.  CV would work also, although
> there's no real need to refetch the address before each use.
> 
> If you don't have a better way to do this then please use this patch.

I've verified that the resulting *.optimized dump and the assembly are
practically identical with and without the patch. The only differences are
the SSA_NAME versions, the .LC and .LCFI constants in the assembly, and the
order in which cgraph emits the functions. I've committed the patch.

Using assembly stubs that are meant to be called in varying ways should
just be avoided in portable programs, you could e.g. in the generator
instead of all those:
extern __attribute__ ((ms_abi)) long do_test_aligned ();
extern __attribute__ ((ms_abi)) long do_test_unaligned ();
static __attribute__ ((ms_abi)) long (*do_test_1) (long a) = (void*)do_test_aligned;
static __attribute__ ((ms_abi)) long (*do_test_v1) (long a, ...) = (void*)do_test_aligned;
static __attribute__ ((ms_abi)) long (*do_test_u1) (long a) = (void*)do_test_unaligned;
static __attribute__ ((ms_abi)) long (*do_test_uv1) (long a, ...) = (void*)do_test_unaligned;
emit:
extern __attribute__ ((ms_abi)) long do_test_1 (long a);
asm (".text; do_test_1: jmp do_test_aligned; .previous");
extern __attribute__ ((ms_abi)) long do_test_v1 (long a, ...);
asm (".text; do_test_v1: jmp do_test_aligned; .previous");
extern __attribute__ ((ms_abi)) long do_test_u1 (long a);
asm (".text; do_test_u1: jmp do_test_unaligned; .previous");
extern __attribute__ ((ms_abi)) long do_test_uv1 (long a, ...);
asm (".text; do_test_uv1: jmp do_test_unaligned; .previous");
or something similar.

Jakub


Re: [PATCH] [pr#83069] Keep profile_count for bb under real_bb_freq_max

2017-11-28 Thread Siddhesh Poyarekar
On Friday 24 November 2017 05:36 PM, Siddhesh Poyarekar wrote:
> freq_max < 1, i.e. highest frequency among bbs in the function being
> higher than real_bb_freq_max means that the bb ends up with a profile
> count larger than real_bb_freq_max and then can go all the way up to
> and beyond profile_count::max_count.
> 
> Bootstrapped on aarch64, testsuite in progress.

Tests came out clean (no new regressions) on aarch64 and x86_64.  Ping?

Siddhesh

> 
>   * gcc/predict.c (estimate_bb_frequencies): Don't reset freq_max.
>   * gcc/testsuite/gcc.dg/pr83069.c: New test case.
> 
> ---
>  gcc/predict.c  |  2 --
>  gcc/testsuite/gcc.dg/pr83069.c | 15 +++
>  2 files changed, 15 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr83069.c
> 
> diff --git a/gcc/predict.c b/gcc/predict.c
> index 0f34956..ff9b5a9 100644
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -3613,8 +3613,6 @@ estimate_bb_frequencies (bool force)
> freq_max = BLOCK_INFO (bb)->frequency;
>  
>freq_max = real_bb_freq_max / freq_max;
> -  if (freq_max < 16)
> - freq_max = 16;
>profile_count ipa_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa ();
>cfun->cfg->count_max = profile_count::uninitialized ();
>FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), NULL, next_bb)
> diff --git a/gcc/testsuite/gcc.dg/pr83069.c b/gcc/testsuite/gcc.dg/pr83069.c
> new file mode 100644
> index 000..d43d78d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr83069.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1" } */
> +
> +void
> +foo (unsigned long *res, unsigned long in)
> +{
> +  for (unsigned long a = 0; a < 98; a++)
> +for (unsigned long b = 0; b < 98; b++)
> +  for (unsigned long c = 0; c < 98; c++)
> + for (unsigned long d = 0; d < 98; d++)
> +   for (unsigned long e = 0; e < 98; e++)
> + for (unsigned long f = 0; f < 98; f++)
> +   for (unsigned long g = 0; g < 98; g++)
> + *res += a * in;
> +}
> 
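
The effect of the removed clamp can be sketched numerically. This is only an illustration under stated assumptions: scaled_max_count is a hypothetical stand-in for the scaling step in estimate_bb_frequencies, plain double replaces sreal, and the cap value used for real_bb_freq_max is made up. The hottest block in the test case runs roughly 98^7 times relative to the entry block.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model of the scaling step: every block frequency is
// multiplied by scale = cap / max_freq so that the hottest block maps to cap.
// With clamp_to_16 set, scale is forced up to at least 16, which models the
// two lines the patch deletes.
inline double scaled_max_count(double cap, double max_freq, bool clamp_to_16) {
  assert(cap > 0.0 && max_freq > 0.0);
  double scale = cap / max_freq;
  if (clamp_to_16)
    scale = std::max(scale, 16.0);
  // Scaled value of the hottest block.
  return max_freq * scale;
}
```

When max_freq is far above cap, the unclamped scale is below 1 and the hottest block lands at about cap. With the clamp, the same block is scaled to 16 * max_freq, which is well beyond cap; that is the overflow path the patch description refers to.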


Re: [PR 82808] Use result types for arithmetic jump functions

2017-11-28 Thread Richard Biener
On Tue, Nov 28, 2017 at 12:35 AM, Martin Jambor  wrote:
> Hi,
>
> sorry for getting so late to this.  However...
>
> On Tue, Nov 14 2017, Richard Biener wrote:
>> On Tue, Nov 14, 2017 at 10:31 AM, Prathamesh Kulkarni
>>  wrote:
>>> On 3 November 2017 at 15:38, Richard Biener  
>>> wrote:
>>>> On Fri, Nov 3, 2017 at 6:15 AM, Prathamesh Kulkarni
>>>>  wrote:
>>>>> Hi Martin,
>>>>> As mentioned in PR, the issue here for propagating value of 'm' from
>>>>> f_c1 to foo() is that the jump function operation is FLOAT_EXPR, and
>>>>> the type of input param 'm' is int, so fold_unary() doesn't do the
>>>>> conversion to real_type. The attached patch fixes that by calling
>>>>> fold_convert if operation is FLOAT_EXPR / FIX_TRUNC_EXPR /
>>>>> CONVERT_EXPR and converts it to the type of corresponding parameter in
>>>>> callee.
>>>>>
>>>>> There are still two issues:
>>>>> a) Using NOP_EXPR for early_exit in ipa_get_jf_pass_through_result.
>>>>> I suppose we need to change to some other code to indicate that there
>>>>> is no operation ?
>>>>> b) Patch does not passing param_type from all callers.
>>>>> I suppose we could fix these incrementally ?
>>>>>
>>>>> Bootstrap+tested on x86_64-unknown-linux-gnu.
>>>>> OK for trunk ?
>>>>
>>>> This doesn't look like a well designed fix.  Both fold_unary and fold_binary
>>>> calls get a possibly bogus type and you single out only a few ops.
>>>>
>>>> Either _fully_ list a set of operations that are know to have matching
>>>> input/output types or always require param_type to be non-NULL.
>>>>
>>>> For a) simply remove the special-casing and merge it with CONVERT_EXPR
>>>> handling (however it will end up looking).
>>>>
>>>> Please don't use fold_convert, using fold_unary is fine.
>>> Hi Richard,
>>> Sorry for the delay. In the attached version, parm_type is made non
>>> NULL in ipa_get_jf_pass_through_result().
>>>
>>> ipa_get_jf_pass_through_result() is called from two
>>> places - propagate_vals_across_pass_through() and ipa_value_from_jfunc().
>>> However it appears ipa_value_from_jfunc() is called from multiple
>>> functions and it's
>>> hard to detect parm_type in the individual callers. So I have passed
>>> correct parm_type from propagate_vals_across_pass_through(), and kept
>>> the old behavior for ipa_value_from_jfunc().
>>>
>>> Would it be OK to fix that incrementally ?
>>
>> I don't think it is good to do that.  If we can't get the correct type
>> then we have
>
> ...I agree with Richi that it is better to fix all the potential type
> issues like in the patch below.  Because we now have the types almost
> everywhere - notable exceptions are K&R C programs and functions with
> variable number of parameters - we might just as well use them, and so I
> went over other uses of ipa_get_jf_pass_through_result and made them
> look for the appropriate type too.
>
> However, because there are cases where we just do not have the type at
> hand, we have to deal with it by only going forward if we know the
> operation at hand does not change type.  I have added a special
> predicate for this purpose to tree.c but I am opened to suggestions for
> better place or name or how/whether to integrate them to gimple
> verifier.
>
> Bootstrapped and tested on x86_64-linux.  If the new tree.c predicate is
> deemed OK, I'd like to propose to commit this to trunk (and then
> backport it to the gcc 7 branch).
>
> What do you think?

Ok with nits below...

> Martin
>
>
> 2017-11-27  Prathamesh Kulkarni  
> Martin Jambor  
>
> PR ipa/82808
> * tree.h (tree_operation_preserves_op1_type_p): Declare
> * tree.c (tree_operation_preserves_op1_type_p): New function.
> * ipa-prop.h (ipa_get_type): Allow i to be out of bounds.
> (ipa_value_from_jfunc): Adjust declaration.
> * ipa-cp.c (ipa_get_jf_pass_through_result): New parameter RES_TYPE.
> Use it as result type for arithmetics, unless it is NULL in which case
> be more conservative.
> (ipa_value_from_jfunc): New parameter PARM_TYPE, pass it to
> ipa_get_jf_pass_through_result.
> (propagate_vals_across_pass_through): Likewise.
> (propagate_scalar_across_jump_function): New parameter PARM_TYPE, pass
> is to propagate_vals_across_pass_through.
> (propagate_constants_across_call): Pass PARM_TYPE to
> propagate_scalar_across_jump_function.
> (find_more_scalar_values_for_callers_subset): Pass parameter type to
> ipa_value_from_jfunc.
> (cgraph_edge_brings_all_scalars_for_node): Likewise.
> * ipa-fnsummary.c (evaluate_properties_for_edge): Renamed parms_info
> to caller_parms_info, pass parameter type to ipa_value_from_jfunc.
> * ipa-prop.c (try_make_edge_direct_simple_call): New parameter
> target_type, pass it to ipa_value_from_jfunc.
> (update_indirect_edges_after_inlining): Pass parameter type to
> try_make_edge_direct_simple_call.
>
> testsuite/

Re: [RFA][PATCH] Use SCEV conditionally within vr-values and evrp range analysis - V2

2017-11-28 Thread Richard Biener
On Mon, Nov 27, 2017 at 5:43 PM, Jeff Law  wrote:
> On 11/23/2017 05:49 AM, Richard Biener wrote:
>> On Thu, Nov 23, 2017 at 1:16 AM, Jeff Law  wrote:
>>>
>>> Clients of the evrp range analysis may not have initialized the SCEV
>>> infrastructure, and in fact my not want to (DOM for example).
>>>
>>> Yet inside both vr-values.c and gimple-ssa-evrp-analyze.c we have calls
>>> into SCEV (that will fault/abort if SCEV is not properly initialized).
>>>
>>> This patch allows clients of vr-values.c and gimple-ssa-evrp-analyze.c
>>> to indicate if they want SCEV analysis.
>>>
>>> Bootstrapped and regression tested by itself as well as with the DOM
>>> patches to use EVRP analysis  (which test the "don't want SCEV path).
>>>
>>> OK for the trunk?
>>
>> There's also scev_initialized_p () which you could conveniently use.
> Yea, that worked fine and is (of course) much simpler.
>
> Bootstrapped and regression tested in isolation as well as on top of my
> ongoing work to remove jump threading from tree-vrp.c.
>
> OK for the trunk now?

Ok.

Richard.

> Jeff
>
>
> * gimple-ssa-evrp-analyze.c
> (evrp_range_analyzer::record_ranges_from_phis): Only use SCEV to
> refine ranges if scev_initialized_p returns true.
> * vr-values.c (vr_values::extract_range_from_phi_node): Likewise.
>
> diff --git a/gcc/gimple-ssa-evrp-analyze.c b/gcc/gimple-ssa-evrp-analyze.c
> index 68a2cdc..38fb0db 100644
> --- a/gcc/gimple-ssa-evrp-analyze.c
> +++ b/gcc/gimple-ssa-evrp-analyze.c
> @@ -176,7 +176,8 @@ evrp_range_analyzer::record_ranges_from_phis (basic_block bb)
>  to use VARYING for them.  But we can still resort to
>  SCEV for loop header PHIs.  */
>   struct loop *l;
> - if (interesting
> + if (scev_initialized_p ()
> + && interesting
>   && (l = loop_containing_stmt (phi))
>   && l->header == gimple_bb (phi))
>   vr_values.adjust_range_with_scev (&vr_result, l, phi, lhs);
> diff --git a/gcc/vr-values.c b/gcc/vr-values.c
> index 2d11861..e617556 100644
> --- a/gcc/vr-values.c
> +++ b/gcc/vr-values.c
> @@ -2935,7 +2935,8 @@ scev_check:
>   scev_check can be reached from two paths, one is a fall through from above
>   "varying" label, the other is direct goto from code block which tries to
>   avoid infinite simulation.  */
> -  if ((l = loop_containing_stmt (phi))
> +  if (scev_initialized_p ()
> +  && (l = loop_containing_stmt (phi))
>&& l->header == gimple_bb (phi))
>  adjust_range_with_scev (vr_result, l, phi, lhs);
>
>


Re: [RFC][PATCH] Extend DCE to remove unnecessary new/delete-pairs

2017-11-28 Thread Richard Biener
On Mon, Nov 27, 2017 at 5:58 PM, Jeff Law  wrote:
> On 11/27/2017 02:22 AM, Dominik Inführ wrote:
>> Thanks for all the reviews! I’ve revised the patch, the operator_delete_flag 
>> is now stored in tree_decl_with_vis (there already seem to be some 
>> FUNCTION_DECL-flags in there). I’ve also added the option -fallocation-dce 
>> to disable this optimization. It bootstraps and no regressions on aarch64 
>> and x86_64.
>>
>> The problem with this patch is what Marc noticed: it omits too many 
>> allocations. The C++ standard seems to only allow to omit "replaceable 
>> global allocation functions (18.6.1.1, 18.6.1.2)”. So e.g. no class-specific 
>> or user-defined allocations. I am not sure what’s the best way to implement 
>> this. Just checking the function declarations might not be enough and seems 
>> more like a hack. The better way seems to introduce a __builtin_operator_new 
>> like Marc mentioned. In which way would you implement this? Could you please 
>> give me some pointers here to look at?
> Just a nit.  Make sure to mention BZ 23383 in your ChangeLog entry.
> Like this:
>
> c++/23383
> * tree-core.h (blah blah): What changed.
>
>
> Jakub and Richi probably have a better understanding of the builtin
> mechanisms than I do.  I'll leave it for them to comment on how best to
> proceed there.

I don't see why a builtin is necessary; can the FE not see which one
is the global allocation function?
Anyway, there are no FE-specific builtins. Traditionally FEs have
used special keywords; you might want to search for RID_BUILTIN_LAUNDER,
for example.

Richard.

> jeff
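
The distinction raised in this thread can be shown with two small functions. This is a sketch only: global_pair, class_specific_pair, Counted and live_count are hypothetical names, and nothing here claims that any particular GCC version actually removes the first pair. The first function uses the replaceable global operator new, whose calls the standard permits an implementation to omit. The second uses a class-specific operator new with a visible side effect, which falls outside that permission.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Counter bumped by the class-specific allocator below.
inline int &live_count() {
  static int n = 0;
  return n;
}

struct Counted {
  int payload;

  // Class-specific allocation function: has an observable side effect.
  static void *operator new(std::size_t size) {
    void *p = std::malloc(size);
    if (!p)
      throw std::bad_alloc();
    ++live_count();
    return p;
  }

  static void operator delete(void *p) noexcept {
    if (p)
      --live_count();
    std::free(p);
  }
};

// Allocation through the replaceable global operator new; the result is
// never observed, so this is the kind of pair the patch wants DCE to drop.
inline int global_pair() {
  int *scratch = new int(7);
  delete scratch;
  return 42;
}

// Allocation through Counted::operator new; dropping this pair would change
// the value read from live_count().
inline int class_specific_pair() {
  Counted *c = new Counted;
  const int seen = live_count();
  delete c;
  return seen;
}
```

A check based only on "is this a call to some operator new" would treat both functions alike, which is why the thread asks how the front end can mark the global allocation function specifically.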


Re: [C++ PATCH] Avoid -Wreturn-type warnings if a switch has default label, no breaks inside of it, but is followed by a break (PR sanitizer/81275, take 2)

2017-11-28 Thread Nathan Sidwell

On 11/28/2017 03:53 AM, Jakub Jelinek wrote:

On Mon, Nov 27, 2017 at 02:01:05PM +0100, Jakub Jelinek wrote:

You are right that I can remove the || SWITCH_STMT_BODY (stmt) == NULL_TREE,
part, because then there wouldn't be any case labels in it either.


...

Here is an updated patch, on top of the C patch I've just posted:
http://gcc.gnu.org/ml/gcc-patches/2017-11/msg02372.html
(though that dependency could be easily removed if needed by dropping the
c_switch_covers_all_cases_p call and SWITCH_ALL_CASES_P setting from
SWITCH_STMT_ALL_CASES_P).
Note, looking for default is still needed, because in templates we do not
build the cases splay tree and therefore would never set
SWITCH_STMT_ALL_CASES_P.  Computing the cases splay tree is probably too
expensive, but default tracking is cheap.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-11-28  Jakub Jelinek  

PR sanitizer/81275
* cp-tree.h (SWITCH_STMT_ALL_CASES_P): Define.
(SWITCH_STMT_NO_BREAK_P): Define.
(note_break_stmt, note_iteration_stmt_body_start,
note_iteration_stmt_body_end): Declare.
* decl.c (struct cp_switch): Add has_default_p, break_stmt_seen_p
and in_loop_body_p fields.
(push_switch): Clear them.
(pop_switch): Set SWITCH_STMT_CANNOT_FALLTHRU_P if has_default_p
and !break_stmt_seen_p.  Assert in_loop_body_p is false.
(note_break_stmt, note_iteration_stmt_body_start,
note_iteration_stmt_body_end): New functions.
(finish_case_label): Set has_default_p when both low and high
are NULL_TREE.
* parser.c (cp_parser_iteration_statement): Use
note_iteration_stmt_body_start and note_iteration_stmt_body_end
around parsing iteration body.
* pt.c (tsubst_expr): Likewise.
* cp-objcp-common.c (cxx_block_may_fallthru): Return false for
SWITCH_STMT which contains no BREAK_STMTs, contains a default:
CASE_LABEL_EXPR and where SWITCH_STMT_BODY isn't empty and
can't fallthru.
* semantics.c (finish_break_stmt): Call note_break_stmt.
* cp-gimplify.c (genericize_switch_stmt): Copy SWITCH_STMT_ALL_CASES_P
bit to SWITCH_ALL_CASES_P.  Assert that if SWITCH_STMT_NO_BREAK_P then
the break label is not TREE_USED.


Ok.  one nit.


  #define SWITCH_STMT_BODY(NODE)    TREE_OPERAND (SWITCH_STMT_CHECK (NODE), 1)
  #define SWITCH_STMT_TYPE(NODE)    TREE_OPERAND (SWITCH_STMT_CHECK (NODE), 2)
  #define SWITCH_STMT_SCOPE(NODE)   TREE_OPERAND (SWITCH_STMT_CHECK (NODE), 3)
+/* True if there all case labels for all possible values of switch cond, either

s/all/are/ (first one, no trailing 'g' modifier :)


+   because there is a default: case label or because the case label ranges cover
+   all values.  */


nathan

--
Nathan Sidwell
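
For readers following the ChangeLog, here is a minimal sketch of the shape of code it describes: a switch that has a default label, contains no break statements, and whose body cannot fall through. The function name classify is hypothetical, and whether a given compiler version warns on it is not asserted here.

```cpp
// Every path through the switch returns, and the default label covers all
// remaining values, so control can never reach the closing brace of the
// function. A -Wreturn-type warning here would be a false positive.
inline int classify(int x) {
  switch (x) {
  case 0:
    return 10;
  case 1:
    return 20;
  default:
    return 30;
  }
}
```

Tracking "has a default" and "saw a break" during parsing is enough to decide this case without building the case splay tree, which matches the point made above about templates.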


Re: [PATCH 1/7]: SVE: Add CLOBBER_HIGH expression

2017-11-28 Thread Richard Biener
On Mon, Nov 27, 2017 at 6:29 PM, Jeff Law  wrote:
> On 11/23/2017 04:11 AM, Alan Hayward wrote:
>>
>>> On 22 Nov 2017, at 17:33, Jeff Law  wrote:
>>>
>>> On 11/22/2017 04:31 AM, Alan Hayward wrote:

>>>>> On 21 Nov 2017, at 03:13, Jeff Law  wrote:
>>
>>>
>>> You might also look at TARGET_HARD_REGNO_CALL_PART_CLOBBERED.  I'd
>>> totally forgotten about it.  And in fact it seems to come pretty close
>>> to what you need…
>>
>> Yes, some of the code is similar to the way
>> TARGET_HARD_REGNO_CALL_PART_CLOBBERED works. Both that code and the
>> CLOBBER expr code served as a starting point for writing the patch. The 
>> main difference
>> here, is that _PART_CLOBBERED is around all calls and is not tied to a 
>> specific Instruction,
>> it’s part of the calling abi. Whereas clobber_high is explicitly tied to 
>> an expression (tls_desc).
>> It meant there wasn’t really any opportunity to resume any existing code.
>>>>> Understood.  Though your first patch mentions that you're trying to
>>>>> describe partial preservation "around TLS calls". Presumably those are
>>>>> represented as normal insns, not call_insn.
>>>>>
>>>>> That brings me back to Richi's idea of exposing a set of the low subreg
>>>>> to itself using whatever mode is wide enough to cover the neon part of
>>>>> the register.
>>>>>
>>>>> That should tell the generic parts of the compiler that you're just
>>>>> clobbering the upper part and at least in theory you can implement in
>>>>> the aarch64 backend and the rest of the compiler should "just work"
>>>>> because that's the existing semantics of a subreg store.
>>>>>
>>>>> The only worry would be if a pass tried to get overly smart and
>>>>> considered that kind of set a nop -- but I think I'd argue that's simply
>>>>> wrong given the semantics of a partial store.
>>>>>
>>>>
>>>> So, the instead of using clobber_high(reg X), to use set(reg X, reg X).
>>>> It’s something we considered, and then dismissed.
>>>>
>>>> The problem then is you are now using SET semantics on those registers, and it
>>>> would make the register live around the function, which might not be the case.
>>>> Whereas clobber semantics will just make the register dead - which is exactly
>>>> what we want (but only conditionally).
>>> ?!?  A set of the subreg is the *exact* semantics you want.  It says the
>>> low part is preserved while the upper part is clobbered across the TLS
>>> insns.
>>>
>>> jeff
>>
>> Consider where the TLS call is inside a loop. The compiler would normally 
>> want
>> to hoist that out of the loop. By adding a set(x,x) into the parallel of the 
>> tls_desc we
>> are now making x live across the loop, x is dependant on the value from the 
>> previous
>> iteration, and the tls_desc can no longer be hoisted.
> Hmm.  I think I see the problem you're trying to point out.  Let me
> restate it and see if you agree.
>
> The low subreg set does clearly indicate the upper part of the SVE
> register is clobbered.  The problem is from a liveness standpoint the
> compiler is considering the low part live, even though it's a self-set.
>
> In fact, if that is the case, then a single TLS call (independent of a
> loop) would make the low part of the register globally live.  This
> should be testable.  Include one of these low part self sets on the
> existing TLS calls and compile a little test function and let's look at
> the liveness data.
>
>
> Now it could be the case that various local analysis could sub-optimally
> handle things.  You mention LICM.  I know our original LICM did have a
> problem in that if it saw a use of a hard reg in a loop without seeing a
> set of that hard reg it considered the register varying within the loop.
>  I have no idea if we carried that forward when the loop code was
> rewritten (when I looked at this it was circa 1992).
>
>
>>
>> Or consider a stream of code containing two tls_desc calls (ok, the compiler 
>> might
>> optimise one of the tls calls away, but this approach should be reusable for 
>> other exprs).
>> Between the two set(x,x)’s x is considered live so the register allocator 
>> can’t use that
>> register.
>> Given that we are applying this to all the neon registers, the register 
>> allocator now throws
>> an ICE because it can’t find any free hard neon registers to use.
> Given your statements it sounds like the liveness infrastructure is
> making those neon regs globally live when it sees the low part subreg
> self-set.  Let's confirm that one way or the other and see where it
> takes us.

Indeed, in (set (subreg:neon reg1) (subreg:neon reg1)) the lowpart of reg1
appears to be used, and is thus live, but liveness analysis can (and should)
simply ignore such sets.

> Jeff


Re: [PATCH][2/2] gimple-fold.c part for PR83141

2017-11-28 Thread Richard Biener
On Mon, 27 Nov 2017, Richard Biener wrote:

> 
> The following is the truly minimal fix for the middle-end issue
> with SRA and memcpy folding interaction.  I've tried more variants
> that "make sense" but as they all end up folding slightly more
> memcpy calls than before we run into optimization testcase
> regressions in places that look for __builtin_memcpy and do
> not deal with aggregate copies as MEM[..] = MEM[..];
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
> GCC 9 is the time we can try dealing with such fallout, it just
> doesn't feel like the correct time to do now...

Installed as follows, bootstrapped and tested on x86_64-unknown-linux-gnu.

Richard.

2017-11-28  Richard Biener  

PR middle-end/83141
* gimple-fold.c (gimple_fold_builtin_memory_op): For aggregate
copies generated from memcpy use a character array as reference
type.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 255196)
+++ gcc/gimple-fold.c   (working copy)
@@ -1039,8 +1039,24 @@ gimple_fold_builtin_memory_op (gimple_st
  gimple_set_vuse (new_stmt, gimple_vuse (stmt));
  gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT);
}
+ new_stmt = gimple_build_assign (destvar, srcvar);
+ goto set_vop_and_replace;
}
-  new_stmt = gimple_build_assign (destvar, srcvar);
+
+  /* We get an aggregate copy.  Use an unsigned char[] type to
+perform the copying to preserve padding and to avoid any issues
+with TREE_ADDRESSABLE types or float modes behavior on copying.  */
+  desttype = build_array_type_nelts (unsigned_char_type_node,
+tree_to_uhwi (len));
+  srctype = desttype;
+  if (src_align > TYPE_ALIGN (srctype))
+   srctype = build_aligned_type (srctype, src_align);
+  if (dest_align > TYPE_ALIGN (desttype))
+   desttype = build_aligned_type (desttype, dest_align);
+  new_stmt
+   = gimple_build_assign (fold_build2 (MEM_REF, desttype, dest, off0),
+  fold_build2 (MEM_REF, srctype, src, off0));
+set_vop_and_replace:
   gimple_set_vuse (new_stmt, gimple_vuse (stmt));
   gimple_set_vdef (new_stmt, gimple_vdef (stmt));
   if (gimple_vdef (new_stmt)
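
The comment in the patch says the copy is done through an unsigned char[] type to preserve padding. A source-level analogue of that idea, as a sketch only: Padded and copy_keeps_padding are hypothetical names, and this illustrates byte-wise copying in general rather than what gimple_fold_builtin_memory_op generates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// A struct that typically has padding between tag and value.
struct Padded {
  char tag;
  int value;
};

// Build an object image in a byte buffer with a known pattern in every
// byte (including any padding), copy it as a plain byte array, and check
// that the copy is identical byte for byte.
inline bool copy_keeps_padding() {
  unsigned char src[sizeof(Padded)];
  unsigned char dst[sizeof(Padded)];

  std::memset(src, 0xAA, sizeof src);  // padding bytes get a known pattern
  const char tag = 'x';
  const int value = 42;
  std::memcpy(src + offsetof(Padded, tag), &tag, sizeof tag);
  std::memcpy(src + offsetof(Padded, value), &value, sizeof value);

  std::memset(dst, 0, sizeof dst);
  std::memcpy(dst, src, sizeof src);   // the byte-array copy

  return std::memcmp(dst, src, sizeof src) == 0;
}
```

A member-wise struct assignment gives no such guarantee for the padding bytes, which is one reason a character-array reference type is the conservative choice for a copy that originated as memcpy.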


Re: [PATCH, Makefile.in] refine selftest recipes to restore mingw bootstrap

2017-11-28 Thread Olivier Hainque

> On Nov 28, 2017, at 11:56 , Tamar Christina  wrote:
>> For gcc in particular, ISTM it depends on the target for which you 
>> configured.
> 
> This is true, but In this case, the tool should have never seen "/dev/null". 
> Since the invocation
> Of the command that uses SELFTEST_FLAGS should have converted it. Unless 
> you're using
> Something like mingw32-make, which would explain the difference.

Oh, right. We are using

  GNU Make 3.82.90
  Built for i686-w64-mingw32

>> Maybe yours are configured for cygwin, not mingw (are they ?), and then are
>> linked with cygwin libraries which perform various transformations.
> 
> No, mine are configured for x86_64-w64-mingw32 and are just native gcc builds.
> It's just build using cygwin derived tools to not encounter certain 
> differences like this one.
> 
> I think using nul is fine if it makes your use-case work, it shouldn't break 
> the mingw64 builds.

OK. It does help our use case for sure.

Thanks for your feedback!

Olivier




Re: [PATCH v2 1/4] [SPARC] Errata workaround for GRLIB-TN-0012

2017-11-28 Thread Daniel Cederman

> ...add (and (.) (not (eq_attr "branch_type" "icc")) to the first define_delay.

Ah, OK, that makes more sense. I will submit an updated version. Thanks for
getting back so quickly.


Daniel C



[PATCH v3 1/4] [SPARC] Errata workaround for GRLIB-TN-0012

2017-11-28 Thread Daniel Cederman
This patch provides a workaround for the errata described in GRLIB-TN-0012.

If the workaround is enabled it will:

* Prevent any floating-point operation from being placed in the
  delay slot of an annulled integer branch.

* Place a NOP at the branch target of an integer branch if it is
  a floating-point operation or a floating-point branch.

It is applicable to GR712RC.

gcc/ChangeLog:

2017-11-17  Daniel Cederman  

* config/sparc/sparc.c (fpop_insn_p): New function.
(sparc_do_work_around_errata): Insert NOP instructions to
prevent sequences that could trigger the TN-0012 errata for
GR712RC.
(pass_work_around_errata::gate): Also test sparc_fix_gr712rc.
* config/sparc/sparc.md (fix_gr712rc): New attribute.
(in_branch_annul_delay): Prevent floating-point instructions
in delay slot of annulled integer branch.
---
 gcc/config/sparc/sparc.c  | 57 +++
 gcc/config/sparc/sparc.md | 21 -
 2 files changed, 73 insertions(+), 5 deletions(-)

diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 83ca1dc..c656b78 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -914,6 +914,31 @@ mem_ref (rtx x)
   return NULL_RTX;
 }
 
+/* True if INSN is a floating-point instruction.  */
+
+static bool
+fpop_insn_p (rtx_insn *insn)
+{
+  if (GET_CODE (PATTERN (insn)) != SET)
+return false;
+
+  switch (get_attr_type (insn))
+{
+case TYPE_FPMOVE:
+case TYPE_FPCMOVE:
+case TYPE_FP:
+case TYPE_FPCMP:
+case TYPE_FPMUL:
+case TYPE_FPDIVS:
+case TYPE_FPSQRTS:
+case TYPE_FPDIVD:
+case TYPE_FPSQRTD:
+  return true;
+default:
+  return false;
+}
+}
+
 /* We use a machine specific pass to enable workarounds for errata.
 
We need to have the (essentially) final form of the insn stream in order
@@ -939,11 +964,34 @@ sparc_do_work_around_errata (void)
 {
   bool insert_nop = false;
   rtx set;
+  rtx_insn *jump;
+  rtx_sequence *seq;
 
   /* Look into the instruction in a delay slot.  */
-  if (NONJUMP_INSN_P (insn))
-   if (rtx_sequence *seq = dyn_cast <rtx_sequence *> (PATTERN (insn)))
- insn = seq->insn (1);
+  if (NONJUMP_INSN_P (insn)
+ && (seq = dyn_cast <rtx_sequence *> (PATTERN (insn))))
+ {
+   jump = seq->insn (0);
+   insn = seq->insn (1);
+ }
+  else if (JUMP_P (insn))
+   jump = insn;
+  else
+   jump = NULL;
+
+  /* Place a NOP at the branch target of an integer branch if it is
+a floating-point operation or a floating-point branch.  */
+  if (sparc_fix_gr712rc
+ && jump != NULL_RTX
+ && get_attr_branch_type (jump) == BRANCH_TYPE_ICC)
+   {
+ rtx_insn *target = next_active_insn (JUMP_LABEL_AS_INSN (jump));
+ if (target
+ if (target
+ && (fpop_insn_p (target)
+ || ((JUMP_P (target)
+  && get_attr_branch_type (target) == BRANCH_TYPE_FCC))))
+   emit_insn_before (gen_nop (), target);
+   }
 
   /* Look for either of these two sequences:
 
@@ -1272,7 +1320,8 @@ public:
   /* opt_pass methods: */
   virtual bool gate (function *)
 {
-  return sparc_fix_at697f || sparc_fix_ut699 || sparc_fix_b2bst;
+  return sparc_fix_at697f || sparc_fix_ut699 || sparc_fix_b2bst
+ || sparc_fix_gr712rc;
 }
 
   virtual unsigned int execute (function *)
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 5e1f0b7..2a83bab 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -430,6 +430,10 @@
(symbol_ref "(sparc_fix_b2bst != 0
 ? FIX_B2BST_TRUE : FIX_B2BST_FALSE)"))
 
+(define_attr "fix_gr712rc" "false,true"
+   (symbol_ref "(sparc_fix_gr712rc != 0
+? FIX_GR712RC_TRUE : FIX_GR712RC_FALSE)"))
+
 ;; Length (in # of insns).
 ;; Beware that setting a length greater or equal to 3 for conditional branches
 ;; has a side-effect (see output_cbranch and output_v9branch).
@@ -590,6 +594,15 @@
   (const_string "true")
] (const_string "false")))
 
+(define_attr "in_integer_branch_annul_delay" "false,true"
+  (cond [(and (eq_attr "fix_gr712rc" "true")
+ (eq_attr "type" "fp,fpcmp,fpmove,fpcmove,fpmul,
+  fpdivs,fpsqrts,fpdivd,fpsqrtd"))
+  (const_string "false")
+(eq_attr "in_branch_delay" "true")
+  (const_string "true")
+   ] (const_string "false")))
+
 (define_delay (eq_attr "type" "call")
   [(eq_attr "in_call_delay" "true") (nil) (nil)])
 
@@ -599,9 +612,15 @@
 (define_delay (eq_attr "type" "return")
   [(eq_attr "in_return_delay" "true") (nil) (nil)])
 
-(define_delay (eq_attr "type" "branch")
+(define_delay (and (eq_attr "type" "branch")
+ (not (eq_attr "branch_type" "icc")))
   [(eq_attr "in_branch_delay" "true") (nil) (eq_attr "in_branch_delay" "true")])
 
+(define_delay (an

Re: [RFC][PATCH] Extend DCE to remove unnecessary new/delete-pairs

2017-11-28 Thread Jakub Jelinek
On Tue, Nov 28, 2017 at 12:52:12PM +0100, Richard Biener wrote:
> On Mon, Nov 27, 2017 at 5:58 PM, Jeff Law  wrote:
> > On 11/27/2017 02:22 AM, Dominik Inführ wrote:
> >> Thanks for all the reviews! I’ve revised the patch, the 
> >> operator_delete_flag is now stored in tree_decl_with_vis (there already 
> >> seem to be some FUNCTION_DECL-flags in there). I’ve also added the option 
> >> -fallocation-dce to disable this optimization. It bootstraps and no 
> >> regressions on aarch64 and x86_64.
> >>
> >> The problem with this patch is what Marc noticed: it omits too many 
> >> allocations. The C++ standard seems to allow omitting only "replaceable 
> >> global allocation functions (18.6.1.1, 18.6.1.2)". So e.g. no 
> >> class-specific or user-defined allocations. I am not sure what’s the best 
> >> way to implement this. Just checking the function declarations might not 
> >> be enough and seems more like a hack. The better way seems to introduce a 
> >> __builtin_operator_new like Marc mentioned. In which way would you 
> >> implement this? Could you please give me some pointers here to look at?
> > Just a nit.  Make sure to mention BZ 23383 in your ChangeLog entry.
> > Like this:
> >
> > c++/23383
> > * tree-core.h (blah blah): What changed.
> >
> >
> > Jakub and Richi probably have a better understanding of the builtin
> > mechanisms than I do.  I'll leave it for them to comment on how best to
> > proceed there.
> 
> I don't see why a builtin is necessary, can the FE not see which one
> is the global allocation function?

It should, and finding it by namespace and name is IMHO not a hack,
that is how the C++ standard specifies those.

> Anyways, there are no FE specific builtins, traditionally FEs have
> used special keywords - you might
> want to search for RID_BUILTIN_LAUNDER for example.

Those are there just for parsing them, there is nothing special needed
to parse the standard new/delete operators.
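
To make the distinction discussed above concrete, here is a small sketch
(not taken from the patch; Tracked, class_specific_calls and
run_new_delete_pairs are made-up names for illustration).  The first
new/delete pair goes through the replaceable global allocation functions,
which is the only kind the standard lets an implementation omit.  The
second pair uses a class-specific operator new with an observable side
effect, so a DCE pass must keep it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Counts calls to the class-specific allocator below (hypothetical name).
static int class_specific_calls = 0;

struct Tracked {
  int value = 0;

  // Class-specific allocation function: NOT a replaceable global
  // allocation function, so calls to it may not be elided.
  static void* operator new(std::size_t size) {
    ++class_specific_calls;
    if (void* p = std::malloc(size))
      return p;
    throw std::bad_alloc();
  }
  static void operator delete(void* p) noexcept { std::free(p); }
};

int run_new_delete_pairs() {
  // Uses ::operator new / ::operator delete: a candidate for removal.
  int* plain = new int(1);
  delete plain;

  // Uses Tracked::operator new / delete: must be preserved.
  Tracked* tracked = new Tracked;
  delete tracked;

  return class_specific_calls;
}
```

Whatever mechanism the front end uses to mark eligible calls, the counter
above has to end up incremented after the optimization has run.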

Jakub


Re: [PATCH] Implement std::to_address for C++2a

2017-11-28 Thread Jonathan Wakely

On 25/11/17 10:31 -0500, Glen Fernandes wrote:

(Just a minor update to the last patch to use is_function_v instead
of is_function::value)

Implement std::to_address for C++2a


Thanks, Glen, I've committed this to trunk, with one small change to
fix the copyright dates in the new test, to be just 2017.

Because my new hobby is finding uses for if-constexpr, I think we
could have used the detection idiom to do it in a single overload, but
I don't see any reason to prefer that over your implementation:

 template<typename _Ptr>
   using __ptr_traits_to_address
     = decltype(pointer_traits<_Ptr>::to_address(std::declval<_Ptr>()));

 template<typename _Ptr>
   constexpr auto
   __to_address(const _Ptr& __ptr) noexcept
   {
     struct __nonesuch;
     using type = __detected_or_t<__nonesuch, __ptr_traits_to_address, _Ptr>;
     if constexpr (is_same_v<type, __nonesuch>)
       return std::__to_address(__ptr.operator->());
     else
       return std::pointer_traits<_Ptr>::to_address(__ptr);
   }

However, more importantly, both this form and yours fail for the
following test case, in two ways:

#include <memory>

struct P {
 using F = void();

 F* operator->() const noexcept { return nullptr; }
};

int main()
{
 P p;
 std::to_address(p);
}

Firstly, instantiating pointer_traits<P> fails a static assertion
(outside the immediate context, so not SFINAE-able):

In file included from 
/home/jwakely/gcc/8/include/c++/8.0.0/bits/stl_iterator.h:66:0,
from 
/home/jwakely/gcc/8/include/c++/8.0.0/bits/stl_algobase.h:67,
from /home/jwakely/gcc/8/include/c++/8.0.0/memory:62,
from toaddr.cc:1:
/home/jwakely/gcc/8/include/c++/8.0.0/bits/ptr_traits.h: In instantiation of ‘struct 
std::pointer_traits’:
/home/jwakely/gcc/8/include/c++/8.0.0/type_traits:2364:62:   recursively required by substitution of 
‘template class _Op, class ... _Args> struct 
std::__detector<_Default, std::__void_t<_Op<_Args ...> >, _Op, _Args ...> [with _Default = 
std::__to_address(const _Ptr&) [with _Ptr = P]::__nonesuch; _Op = std::__ptr_traits_to_address; _Args = {P}]’
/home/jwakely/gcc/8/include/c++/8.0.0/type_traits:2364:62:   required by substitution of 
‘template class _Op, class ... _Args> using 
__detected_or_t = typename std::__detected_or<_Default, _Op, _Args ...>::type [with _Default = 
std::__to_address(const _Ptr&) [with _Ptr = P]::__nonesuch; _Op = std::__ptr_traits_to_address; 
_Args = {P}]’
/home/jwakely/gcc/8/include/c++/8.0.0/bits/ptr_traits.h:169:78:   required from 
‘constexpr auto std::__to_address(const _Ptr&) [with _Ptr = P]’
/home/jwakely/gcc/8/include/c++/8.0.0/bits/ptr_traits.h:200:31:   required from 
‘constexpr auto std::to_address(const _Ptr&) [with _Ptr = P]’
toaddr.cc:12:20:   required from here
/home/jwakely/gcc/8/include/c++/8.0.0/bits/ptr_traits.h:114:7: error: static 
assertion failed: pointer type defines element_type or is like SomePointer<T, Args>
  static_assert(!is_same<element_type, __undefined>::value,
  ^

I'm not sure if this is a bug in our std::pointer_traits, or if the
standard requires the specialization of std::pointer_traits to be
ill-formed (see [pointer.traits.types] p1). We have a problem if it
does require it, and either need to relax the requirements on
pointer_traits, or we need to alter the wording for to_address so that
it doesn't try to use pointer_traits when the specialization would be
ill-formed.

Secondly, if I remove that static_assert from <bits/ptr_traits.h> then
the test compiles, which is wrong, because it calls std::to_address on
a function pointer type. That should be ill-formed. The problem is
that the static_assert(!is_function_v<_Ptr>) is in std::to_address and
the implementation actually uses std::__to_address. So I think we want
the !is_function_v<_Ptr> check to be moved to the __to_address(_Ptr*)
overload.
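
A rough sketch of the arrangement suggested above, for illustration only:
the function-pointer check lives in the raw-pointer overload, so it fires
no matter which entry point reaches it.  This is not the libstdc++ code.
The namespace sketch and the Fancy type are made up, and the
pointer_traits<Ptr>::to_address branch is deliberately left out to keep
the example short.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Raw-pointer overload: the is_function check sits here, so every path
// that ends in a raw pointer is validated.
template<typename T>
constexpr T* to_address(T* p) noexcept {
  static_assert(!std::is_function_v<T>, "T must not be a function type");
  return p;
}

// Fancy-pointer overload: peel one layer with operator-> and recurse.
// Assumption: no pointer_traits customisation is consulted here.
template<typename Ptr>
constexpr auto to_address(const Ptr& p) noexcept {
  return sketch::to_address(p.operator->());
}

}  // namespace sketch

// Hypothetical fancy pointer used only to exercise the sketch.
struct Fancy {
  int* raw;
  int* operator->() const noexcept { return raw; }
};
```

With this layout, a type like P from the test case (whose operator->
returns a pointer to function) would hit the static_assert once the
recursion reaches the raw-pointer overload.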



[PATCH GCC]Rename and make remove_dead_inserted_code a simple dce interface

2017-11-28 Thread Bin Cheng
Hi,
This patch renames remove_dead_inserted_code to simple_dce_from_worklist,
moves it to tree-ssa-dce.c and makes it a simple public DCE interface.
Bootstrapped and tested along with loop interchange; the interchange pass
requires it.  Is it OK?
BTW, I will push this along with interchange to the branch
gcc.gnu.org/svn/gcc/branches/gimple-linterchange.
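
For readers unfamiliar with the technique, the worklist scheme can be
modelled on a toy IR roughly as follows.  This is only a sketch under
simplifying assumptions: Def and simple_dce are invented names, use counts
are stored explicitly instead of being derived from SSA immediate uses,
and the worklist is a std::set rather than a bitmap.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy stand-in for an SSA definition (hypothetical type).
struct Def {
  std::vector<int> operands;  // indices of the definitions this one uses
  int uses = 0;               // number of remaining users
  bool side_effects = false;  // statements with side effects are kept
  bool removed = false;
};

// Remove dead definitions reachable from the seed set; return the count.
int simple_dce(std::vector<Def>& defs, std::set<int> worklist) {
  int removed = 0;
  while (!worklist.empty()) {
    // Pop the lowest-numbered item, like bitmap_first_set_bit.
    int i = *worklist.begin();
    worklist.erase(worklist.begin());

    Def& d = defs[i];
    // Already removed, still in use, or not safe to delete.
    if (d.removed || d.uses != 0 || d.side_effects)
      continue;

    // Removing d drops one use of each operand, which may now be dead.
    for (int op : d.operands) {
      --defs[op].uses;
      worklist.insert(op);
    }
    d.removed = true;
    ++removed;
  }
  return removed;
}
```

As in the patch, operands are added to the worklist even when they were
not in the seed set, so the walk can remove more than the seeds themselves.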

Thanks,
bin
2017-11-27  Bin Cheng  

* tree-ssa-dce.c (simple_dce_from_worklist): Move and rename from
tree-ssa-pre.c::remove_dead_inserted_code.
* tree-ssa-dce.h: New file.
* tree-ssa-pre.c (tree-ssa-dce.h): Include new header file.
(remove_dead_inserted_code): Move and rename to function
tree-ssa-dce.c::simple_dce_from_worklist.
(pass_pre::execute): Update use.

diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index a5f0edf..227e55d 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -1723,3 +1723,56 @@ make_pass_cd_dce (gcc::context *ctxt)
 {
   return new pass_cd_dce (ctxt);
 }
+
+
+/* A cheap DCE interface starting from a seed set of possibly dead stmts.  */
+
+void
+simple_dce_from_worklist (bitmap seeds)
+{
+  /* ???  Re-use seeds as worklist not only as initial set.  This may end up
+ removing more code as well.  If we keep seeds unchanged we could restrict
+ new worklist elements to members of seed.  */
+  bitmap worklist = seeds;
+  while (! bitmap_empty_p (worklist))
+{
+  /* Pop item.  */
+  unsigned i = bitmap_first_set_bit (worklist);
+  bitmap_clear_bit (worklist, i);
+
+  tree def = ssa_name (i);
+  /* Removed by somebody else or still in use.  */
+  if (! def || ! has_zero_uses (def))
+   continue;
+
+  gimple *t = SSA_NAME_DEF_STMT (def);
+  if (gimple_has_side_effects (t))
+   continue;
+
+  /* Add uses to the worklist.  */
+  ssa_op_iter iter;
+  use_operand_p use_p;
+  FOR_EACH_PHI_OR_STMT_USE (use_p, t, iter, SSA_OP_USE)
+   {
+ tree use = USE_FROM_PTR (use_p);
+ if (TREE_CODE (use) == SSA_NAME
+ && ! SSA_NAME_IS_DEFAULT_DEF (use))
+   bitmap_set_bit (worklist, SSA_NAME_VERSION (use));
+   }
+
+  /* Remove stmt.  */
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "Removing dead stmt:");
+ print_gimple_stmt (dump_file, t, 0);
+   }
+  gimple_stmt_iterator gsi = gsi_for_stmt (t);
+  if (gimple_code (t) == GIMPLE_PHI)
+   remove_phi_node (&gsi, true);
+  else
+   {
+ gsi_remove (&gsi, true);
+ release_defs (t);
+   }
+}
+}
diff --git a/gcc/tree-ssa-dce.h b/gcc/tree-ssa-dce.h
new file mode 100644
index 000..2adb086
--- /dev/null
+++ b/gcc/tree-ssa-dce.h
@@ -0,0 +1,22 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef TREE_SSA_DCE_H
+#define TREE_SSA_DCE_H
+extern void simple_dce_from_worklist (bitmap);
+#endif
diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 281f100..c19d486 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dbgcnt.h"
 #include "domwalk.h"
 #include "tree-ssa-propagate.h"
+#include "tree-ssa-dce.h"
 #include "tree-cfgcleanup.h"
 #include "alias.h"
 
@@ -4038,64 +4039,6 @@ compute_avail (void)
   free (worklist);
 }
 
-/* Cheap DCE of a known set of possibly dead stmts.
-
-   Because we don't follow exactly the standard PRE algorithm, and decide not
-   to insert PHI nodes sometimes, and because value numbering of casts isn't
-   perfect, we sometimes end up inserting dead code.   This simple DCE-like
-   pass removes any insertions we made that weren't actually used.  */
-
-static void
-remove_dead_inserted_code (void)
-{
-  /* ???  Re-use inserted_exprs as worklist not only as initial set.
- This may end up removing non-inserted code as well.  If we
- keep inserted_exprs unchanged we could restrict new worklist
- elements to members of inserted_exprs.  */
-  bitmap worklist = inserted_exprs;
-  while (! bitmap_empty_p (worklist))
-{
-  /* Pop item.  */
-  unsigned i = bitmap_first_set_bit (worklist);
-  bitmap_clear_bit (worklist, i);
-
-  tree def = ssa_name (i);
-  /* Removed by somebody else or still in use.  */
-  if (! def || ! has_zero_uses (def))
-   co

Re: [PATCH] Fix PR80776

2017-11-28 Thread Jeff Law
On 11/28/2017 02:14 AM, Richard Biener wrote:
> On Mon, 27 Nov 2017, Jeff Law wrote:
> 
>> On 11/27/2017 06:39 AM, Richard Biener wrote:
>>>
>>> The following avoids -Wformat-overflow false positives by teaching
>>> EVRP the trick about __builtin_unreachable () "other" edges and
>>> attaching range info to SSA names.  EVRP does a better job in keeping
>>> ranges for every SSA name from conditional info (VRP "optimizes" its
>>> costly ASSERT_EXPR insertion process).
>>>
>>> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>>>
>>> This will also fix the testcase from PR83072 but it doesn't really
>>> fix all cases I want to fix with a fix for it.  OTOH it might be
>>> this is enough for stage3.
>>>
>>> Richard.
>>>
>>> 2017-11-27  Richard Biener  
>>>
>>> PR tree-optimization/80776
>>> * gimple-ssa-evrp-analyze.h (evrp_range_analyzer::set_ssa_range_info):
>>> Declare.
>>> * gimple-ssa-evrp-analyze.c (evrp_range_analyzer::set_ssa_range_info):
>>> New function.
>>> (evrp_range_analyzer::record_ranges_from_incoming_edges):
>>> If the incoming edge is an effective fallthru because the other
>>> edge only reaches a __builtin_unreachable () then record ranges
>>> derived from the controlling condition in SSA info.
>>> (evrp_range_analyzer::record_ranges_from_phis): Use set_ssa_range_info.
>>> (evrp_range_analyzer::record_ranges_from_stmt): Likewise.
>>>
>>> * gcc.dg/pr80776-1.c: New testcase.
>>> * gcc.dg/pr80776-2.c: Likewise.
>> So the thing to make sure of here is that the range information we
>> reflect back into the SSA_NAME actually applies everywhere the SSA_NAME
>> can appear.  ie, it's globally valid.
>>
>> This means we can't reflect anything we derive from conditionals or
>> things like a *p making the range non-null back to the SSA_NAME.
>>
>> I'd be concerned about the change to record_ranges_from_incoming_edge.
> 
> It's basically a copy of what VRP does when removing range assertions.
> I've added the correctness check I missed and also the trick with
> setting nonzero bits.
> 
> This causes us to no longer handle the gcc.dg/pr80776-1.c case:
> 
>:
>   i_4 = somerandom ();
>   if (i_4 < 0)
> goto ; [INV]
>   else
> goto ; [INV]
> 
>:
>   __builtin_unreachable ();
> 
>:
>   i.0_1 = (unsigned int) i_4;
>   if (i.0_1 > 99)
> goto ; [INV]
>   else
> goto ; [INV]
> 
>:
>   __builtin_unreachable ();
> 
>:
>   _7 = __builtin___sprintf_chk (&number, 1, 7, "%d", i_4);
> 
> when trying to update the SSA range info for i_4 from the
> if (i.0_1 > 99) we see the use in the dominating condition
> and thus conclude we cannot update the SSA range info like we want.
So the unreachables add a twist.  If we assume the unreachables are in
fact unreachable, then ISTM we can use the conditionals to derive
refined ranges on the other edge.  The implications of the unreachable
didn't really sink in until just now.
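
For reference, the source-level shape being discussed looks roughly like
the sketch below.  It is not the actual gcc.dg/pr80776-1.c testcase:
format_small is a made-up name, snprintf is used instead of the checked
sprintf builtin, and __builtin_unreachable is a GCC/Clang extension.
Each condition guards an unreachable block, so the fall-through edge
carries the range [0, 99] for i.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Callers promise 0 <= i <= 99; violating that is undefined behaviour.
int format_small(char (&buf)[7], int i) {
  if (i < 0)
    __builtin_unreachable();   // "other" edge: i >= 0 holds afterwards
  if (static_cast<unsigned>(i) > 99)
    __builtin_unreachable();   // "other" edge: i <= 99 holds afterwards
  // With the range [0, 99] recorded for i, at most two digits plus the
  // terminating NUL are written, which fits comfortably in buf.
  return std::snprintf(buf, sizeof buf, "%d", i);
}
```

The difficulty described above is that the second test is done on the
unsigned copy of i, while the range has to be attached to i itself.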

> 
> ifcombine also doesn't merge the tests because I think it gets
> confused by the __builtin_unreachable ().  ifcombine also runs after
> VRP1 which gets rid of the __builtin_unreachable ()s.
Yea.  As you know I was just poking at a case yesterday where ifcombine
worked sub-optimally and its placement was inconvenient...  But given
the particular case I was looking at it made more sense to simplify the
IL prior to ifcombine.

> 
> I think for GCC 9 we may want to experiment with moving ifcombine
> before VRP1 and handling if-chains with __builtin_unreachable ()s.
Worth investigation.

Jeff


Re: [PATCH 1/7]: SVE: Add CLOBBER_HIGH expression

2017-11-28 Thread Jeff Law
On 11/28/2017 04:55 AM, Richard Biener wrote:

>>> Or consider a stream of code containing two tls_desc calls (ok, the 
>>> compiler might
>>> optimise one of the tls calls away, but this approach should be reusable 
>>> for other exprs).
>>> Between the two set(x,x)’s x is considered live so the register allocator 
>>> can’t use that
>>> register.
>>> Given that we are applying this to all the neon registers, the register 
>>> allocator now throws
>>> an ICE because it can’t find any free hard neon registers to use.
>> Given your statements it sounds like the liveness infrastructure is
>> making those neon regs globally live when it sees the low part subreg
>> self-set.  Let's confirm that one way or the other and see where it
>> takes us.
> 
> Indeed in (set (subreg:neon reg1) (subreg:neon reg1)) it appears that
> the lowpart of reg1
> is used and thus it is live but liveness analysis can (and should)
> simply ignore such sets.
My suggestion was going to be to peek a bit at the life analysis code if
indeed my suspicion was true.

Jeff


[PATCH] Fix PR80846, change vectorizer reduction epilogue (on x86)

2017-11-28 Thread Richard Biener

The following adds a new target hook, targetm.vectorize.split_reduction,
which allows the target to specify a preferred mode to perform the
final reduction on using either vector shifts or scalar extractions.
Up to that mode the vector reduction result is reduced by combining
lowparts and highparts recursively.  This avoids lane-crossing operations
when doing AVX256 on Zen and Bulldozer and also speeds up things on
Haswell (I verified ~20% speedup on Broadwell).

Thus the patch implements the target hook on x86 to _always_ prefer
SSE modes for the final reduction.
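
As a scalar model of the scheme (a sketch only, with the made-up name
reduce_by_halving and a fixed 16-lane vector of int standing in for a
512-bit register): each step adds the high half of the live lanes onto the
low half.  The first steps correspond to the lowpart/highpart combining
that gets down to the preferred mode, the remaining ones to the in-register
shifts.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

// Sum all lanes by repeatedly folding the high half onto the low half.
int reduce_by_halving(std::array<int, 16> v) {
  for (std::size_t width = v.size() / 2; width >= 1; width /= 2)
    for (std::size_t i = 0; i < width; ++i)
      v[i] += v[i + width];    // low half += high half
  return v[0];
}
```

For 16 lanes this takes four folding steps (8, 4, 2, 1), matching the four
vpaddd instructions in the new AVX512 epilogue shown below.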

For the testcase in the bugzilla

int sumint(const int arr[]) {
arr = __builtin_assume_aligned(arr, 64);
int sum=0;
for (int i=0 ; i<1024 ; i++)
  sum+=arr[i];
return sum;
}

this changes -O3 -mavx512f code from

sumint:
.LFB0:
        .cfi_startproc
        vpxord  %zmm0, %zmm0, %zmm0
        leaq    4096(%rdi), %rax
        .p2align 4,,10
        .p2align 3
.L2:
        vpaddd  (%rdi), %zmm0, %zmm0
        addq    $64, %rdi
        cmpq    %rdi, %rax
        jne     .L2
        vpxord  %zmm1, %zmm1, %zmm1
        vshufi32x4      $78, %zmm1, %zmm0, %zmm2
        vpaddd  %zmm2, %zmm0, %zmm0
        vmovdqa64       .LC0(%rip), %zmm2
        vpermi2d        %zmm1, %zmm0, %zmm2
        vpaddd  %zmm2, %zmm0, %zmm0
        vmovdqa64       .LC1(%rip), %zmm2
        vpermi2d        %zmm1, %zmm0, %zmm2
        vpaddd  %zmm2, %zmm0, %zmm0
        vmovdqa64       .LC2(%rip), %zmm2
        vpermi2d        %zmm1, %zmm0, %zmm2
        vpaddd  %zmm2, %zmm0, %zmm0
        vmovd   %xmm0, %eax

to

sumint:
.LFB0:
        .cfi_startproc
        vpxord  %zmm0, %zmm0, %zmm0
        leaq    4096(%rdi), %rax
        .p2align 4,,10
        .p2align 3
.L2:
        vpaddd  (%rdi), %zmm0, %zmm0
        addq    $64, %rdi
        cmpq    %rdi, %rax
        jne     .L2
        vextracti64x4   $0x1, %zmm0, %ymm1
        vpaddd  %ymm0, %ymm1, %ymm1
        vmovdqa %xmm1, %xmm0
        vextracti128    $1, %ymm1, %xmm1
        vpaddd  %xmm1, %xmm0, %xmm0
        vpsrldq $8, %xmm0, %xmm1
        vpaddd  %xmm1, %xmm0, %xmm0
        vpsrldq $4, %xmm0, %xmm1
        vpaddd  %xmm1, %xmm0, %xmm0
        vmovd   %xmm0, %eax

and for -O3 -mavx2 from

sumint:
.LFB0:
        .cfi_startproc
        vpxor   %xmm0, %xmm0, %xmm0
        leaq    4096(%rdi), %rax
        .p2align 4,,10
        .p2align 3
.L2:
        vpaddd  (%rdi), %ymm0, %ymm0
        addq    $32, %rdi
        cmpq    %rdi, %rax
        jne     .L2
        vpxor   %xmm1, %xmm1, %xmm1
        vperm2i128      $33, %ymm1, %ymm0, %ymm2
        vpaddd  %ymm2, %ymm0, %ymm0
        vperm2i128      $33, %ymm1, %ymm0, %ymm2
        vpalignr        $8, %ymm0, %ymm2, %ymm2
        vpaddd  %ymm2, %ymm0, %ymm0
        vperm2i128      $33, %ymm1, %ymm0, %ymm1
        vpalignr        $4, %ymm0, %ymm1, %ymm1
        vpaddd  %ymm1, %ymm0, %ymm0
        vmovd   %xmm0, %eax

to

sumint:
.LFB0:
        .cfi_startproc
        vpxor   %xmm0, %xmm0, %xmm0
        leaq    4096(%rdi), %rax
        .p2align 4,,10
        .p2align 3
.L2:
        vpaddd  (%rdi), %ymm0, %ymm0
        addq    $32, %rdi
        cmpq    %rdi, %rax
        jne     .L2
        vmovdqa %xmm0, %xmm1
        vextracti128    $1, %ymm0, %xmm0
        vpaddd  %xmm0, %xmm1, %xmm0
        vpsrldq $8, %xmm0, %xmm1
        vpaddd  %xmm1, %xmm0, %xmm0
        vpsrldq $4, %xmm0, %xmm1
        vpaddd  %xmm1, %xmm0, %xmm0
        vmovd   %xmm0, %eax
        vzeroupper
        ret

which besides being faster is also smaller (fewer prefixes).

SPEC 2k6 results on Haswell (thus AVX2) are neutral.  As it merely
affects reduction vectorization epilogues I didn't expect big effects
except for loops that do not run much (more likely with AVX512).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Ok for trunk?

The PR mentions some more tricks to optimize the sequence but
those look like backend only optimizations.

Thanks,
Richard.

2017-11-28  Richard Biener  

PR tree-optimization/80846
* target.def (split_reduction): New target hook.
* targhooks.c (default_split_reduction): New function.
* targhooks.h (default_split_reduction): Declare.
* tree-vect-loop.c (vect_create_epilog_for_reduction): If the
target requests first reduce vectors by combining low and high
parts.
* tree-vect-stmts.c (vect_gen_perm_mask_any): Adjust.
(get_vectype_for_scalar_type_and_size): Export.
* tree-vectorizer.h (get_vectype_for_scalar_type_and_size): Declare.

* doc/tm.texi.in (TARGET_VECTORIZE_SPLIT_REDUCTION): Document.
* doc/tm.texi: Regenerate.

i386/
* config/i386/i386.c (ix86_split_reduction): Implement
TARGET_VECTORIZE_SPLIT_REDUCTION.

* gcc.target/i386/pr80846-1.c: New testcase.
* gcc.target/i386/pr80846-2.c: Likewise.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c   

Re: [PATCH] Fix hot/cold partitioning with -gstabs{,+} (PR debug/81307)

2017-11-28 Thread Jeff Law
On 11/27/2017 06:07 PM, Jim Wilson wrote:
> On 11/27/2017 01:54 PM, Jim Wilson wrote:
>> On 11/27/2017 12:21 AM, Richard Biener wrote:
>>> Let's formally deprecate any non-DWARF debugging format support
>>> in GCC 8 changes.html?
>>
>>   We have two OS targets that only support stabs, though in both cases
>> the cpus do support dwarf and hence they could switch.
> 
> After prodding my memory a bit, this is m68k-openbsd and *-lynxos.
I'm pretty sure the OpenBSD guys have moved to dwarf2 by default at some
point.  Certainly appears that way looking at my OpenBSD 6.2 VM.

In the case of LynxOS, I haven't used it in about 25 years.  I have no
clue if they've moved away from stabs yet.  However, I don't think we
should let that stand in the way of moving towards deprecation.

My biggest worry WRT deprecation of stabs is AIX.

jeff


[arm-embedded] [PATCH, GCC/LTO, ping] Fix PR69866: LTO with def for weak alias in regular object file

2017-11-28 Thread Thomas Preudhomme

Hi,

We have decided to apply the forwarded patch to the embedded-7-branch to fix an 
ICE when doing partial LTO with weak symbols.


ChangeLog entry is as follows:

2017-11-28  Thomas Preud'homme  

Backport from mainline
2017-06-15  Jan Hubicka  
Thomas Preud'homme  

PR lto/69866
* lto-symtab.c (lto_symtab_merge_symbols): Drop useless definitions
that resolved externally.

Backport from mainline
2017-06-15  Thomas Preud'homme  

PR lto/69866
* gcc.dg/lto/pr69866_0.c: New test.
* gcc.dg/lto/pr69866_1.c: Likewise.


Best regards,

Thomas
--- Begin Message ---
Hi,
I am testing the following. Let me know if it works for you.

Honza

Index: lto/lto-symtab.c
===
--- lto/lto-symtab.c(revision 249213)
+++ lto/lto-symtab.c(working copy)
@@ -952,6 +952,42 @@
  if (tgt)
node->resolve_alias (tgt, true);
}
+ /* If the symbol was preempted outside IR, see if we want to get rid
+of the definition.  */
+ if (node->analyzed
+ && !DECL_EXTERNAL (node->decl)
+ && (node->resolution == LDPR_PREEMPTED_REG
+ || node->resolution == LDPR_RESOLVED_IR
+ || node->resolution == LDPR_RESOLVED_EXEC
+ || node->resolution == LDPR_RESOLVED_DYN))
+   {
+ DECL_EXTERNAL (node->decl) = 1;
+ /* If alias to local symbol was preempted by external definition,
+we know it is not pointing to the local symbol.  Remove it.  */
+ if (node->alias
+ && !node->weakref
+ && !node->transparent_alias
+ && node->get_alias_target ()->binds_to_current_def_p ())
+   {
+ node->alias = false;
+ node->remove_all_references ();
+ node->definition = false;
+ node->analyzed = false;
+ node->cpp_implicit_alias = false;
+   }
+ else if (!node->alias
+  && node->definition
+  && node->get_availability () <= AVAIL_INTERPOSABLE)
+   {
+   if ((cnode = dyn_cast <cgraph_node *> (node)) != NULL)
+   cnode->reset ();
+ else
+   {
+ node->analyzed = node->definition = false;
+ node->remove_all_references ();
+   }
+   }
+   }
 
  if (!(cnode = dyn_cast <cgraph_node *> (node))
  || !cnode->clone_of
--- End Message ---


Re: C++ PATCH to primary_template_instantiation_p

2017-11-28 Thread Maxim Kuvyrkov

> On Nov 28, 2017, at 12:29 AM, Jason Merrill  wrote:
> 
> All the uses of primary_template_instantiation_p actually want to
> query whether the entity in question is a specialization of the
> template, not whether it's an instantiation or explicit
> specialization.
> 
> Tested x86_64-pc-linux-gnu, applying to trunk.
> 

Hi Jason,

I get the following failure with the new test on x86_64-linux-gnu and 
aarch64-linux-gnu:

> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C
> @@ -0,0 +1,27 @@
> +// PR c++/46831
> +// { dg-do compile { target c++11 } }
> +// { dg-options "" }
> +
> +struct B { };
> +struct D : B { };
> +struct A {
> +  template <typename T = void> operator D&(); // { dg-message "template conversion" }
> +  operator long();
> +};
> +
> +template <> A::operator D&();

"Template conversion" warning is triggered on this line, rather than above.

> +
> +void f(long);
> +void f(B&);
> +
> +struct A2 {
> +  template <typename T = void> operator B&();
> +};
> +
> +void f2(const B&);
> +
> +int main() {
> +  f(A());
> +  f2(A2());
> +  f2(A());   // { dg-error "" }
> +}
> 

Would you please take a look?

===
spawn -ignore SIGHUP 
/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/_build/builds/x86_64-unknown-linux-gnu/x86_64-unknown-linux-gnu/gcc.git~master-stage2/gcc/testsuite/g++5/../../xg++
 
-B/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/_build/builds/x86_64-unknown-linux-gnu/x86_64-unknown-linux-gnu/gcc.git~master-stage2/gcc/testsuite/g++5/../../
 
/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C
 -fno-diagnostics-show-caret -fdiagnostics-color=never -nostdinc++ 
-I/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/_build/builds/x86_64-unknown-linux-gnu/x86_64-unknown-linux-gnu/gcc.git~master-stage2/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu
 
-I/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/_build/builds/x86_64-unknown-linux-gnu/x86_64-unknown-linux-gnu/gcc.git~master-stage2/x86_64-unknown-linux-gnu/libstdc++-v3/include
 
-I/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/libstdc++-v3/libsupc++
 
-I/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/libstdc++-v3/include/backward
 
-I/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/libstdc++-v3/testsuite/util
 -fmessage-length=0 -std=gnu++11 -S -o fntmpdefarg2a.s
/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C:
 In function 'int main()':
/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C:26:6:
 error: invalid user-defined conversion from 'A' to 'const B&' [-fpermissive]
/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C:12:13:
 note: candidate is: 'A::operator D&() [with T = void]' 
/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C:12:13:
 note:   conversion from return type 'D&' of template conversion function 
specialization to 'const B&' is not an exact match
/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C:21:6:
 note:   initializing argument 1 of 'void f2(const B&)'
compiler exited with status 1
FAIL: g++.dg/cpp0x/fntmpdefarg2a.C  -std=gnu++11  (test for warnings, line 8)
PASS: g++.dg/cpp0x/fntmpdefarg2a.C  -std=gnu++11  (test for errors, line 26)
PASS: g++.dg/cpp0x/fntmpdefarg2a.C  -std=gnu++11 (test for excess errors)
===

Regards,

--
Maxim Kuvyrkov
www.linaro.org





Re: [committed] Add testcase for PR rtl-optimization/81020

2017-11-28 Thread Christophe Lyon
Hi,

On 28 November 2017 at 11:24, Jakub Jelinek  wrote:
> Hi!
>
> This testcase was fixed by Segher's r254875:
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00340.html
> on the trunk, so I've committed it to trunk as obvious.
> On release branches it still needs fixing.
>
> 2017-11-28  Jakub Jelinek  
>
> PR rtl-optimization/81020
> * gcc.dg/pr81020.c: New test.
>

I'm seeing:
FAIL: gcc.dg/pr81020.c execution test
on aarch64

Is it expected?


> --- gcc/testsuite/gcc.dg/pr81020.c.jj   2017-11-27 18:49:06.659907687 +0100
> +++ gcc/testsuite/gcc.dg/pr81020.c  2017-11-27 18:49:00.431982443 +0100
> @@ -0,0 +1,23 @@
> +/* PR rtl-optimization/81020 */
> +/* { dg-do run } */
> +/* { dg-options "-O -fno-tree-bit-ccp -fno-tree-coalesce-vars -fno-tree-vrp" } */
> +
> +unsigned v = 4;
> +
> +unsigned long long __attribute__((noipa))
> +foo (unsigned x)
> +{
> +  unsigned a = v;
> +  a &= 1;
> +  x |= 0 < a;
> +  a >>= 31;
> +  return x + a;
> +}
> +
> +int
> +main ()
> +{
> +  if (foo (2) != 2)
> +__builtin_abort ();
> +  return 0;
> +}
>
> Jakub


Re: [PATCH GCC][V2]A simple implementation of loop interchange

2017-11-28 Thread David Malcolm
On Tue, 2017-11-28 at 15:26 +, Bin Cheng wrote:
> Hi,
> This is updated patch with review comments resolved.  Some
> explanation embedded below.
> 
> On Mon, Nov 20, 2017 at 2:46 PM, Richard Biener wrote:
> > On Thu, Nov 16, 2017 at 4:18 PM, Bin.Cheng 
> > wrote:
> > > On Tue, Oct 24, 2017 at 3:30 PM, Michael Matz 
> > > wrote:
> > > > Hello,
> > > > 
> > > > On Fri, 22 Sep 2017, Bin.Cheng wrote:
> > > > 
> > > > > This is updated patch for loop interchange with review
> > > > > suggestions
> > > > > resolved.  Changes are:
> > > > >   1) It does more light weight checks like rectangle loop
> > > > > nest check
> > > > > earlier than before.
> > > > >   2) It checks profitability of interchange before data
> > > > > dependence computation.
> > > > >   3) It calls find_data_references_in_loop only once for a
> > > > > loop nest now.
> > > > >   4) Data dependence is open-computed so that we can skip
> > > > > instantly at
> > > > > unknown dependence.
> > > > >   5) It improves code generation in mapping induction
> > > > > variables for
> > > > > loop nest, as well as
> > > > >  adding a simple dead code elimination pass.
> > > > >   6) It changes magic constants into parameters.
> > > > 
> > > > So I have a couple comments/questions.  Something stylistic:
> > > 
> > > Hi Michael,
> > > Thanks for reviewing.
> > > 
> > > > 
> > > > > +class loop_cand
> > > > > +{
> > > > > +public:
> > > > > ...
> > > > > +  friend class tree_loop_interchange;
> > > > > +private:
> > > > 
> > > > Just make this all public (and hence a struct, not class).
> > > > No need for friends in file local classes.
> > > 
> > > Done.
> > > 
> > > > 
> > > > > +single_use_in_loop (tree var, struct loop *loop)
> > > > > ...
> > > > > +  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
> > > > > +{
> > > > > +  stmt = USE_STMT (use_p);
> > > > > ...
> > > > > +  basic_block bb = gimple_bb (stmt);
> > > > > +  gcc_assert (bb != NULL);
> > > > 
> > > > This pattern reoccurs often in your patch: you check for a bb
> > > > associated
> > > > for a USE_STMT.  Uses of SSA names always occur in basic
> > > > blocks, no need
> > > > for checking.
> > > 
> > > Done.
> > > 
> > > > 
> > > > Then, something about your handling of simple reductions:
> > > > 
> > > > > +void
> > > > > +loop_cand::classify_simple_reduction (reduction_p re)
> > > > > +{
> > > > > ...
> > > > > +  /* Require memory references in producer and consumer are
> > > > > the same so
> > > > > + that we can undo reduction during interchange.  */
> > > > > +  if (re->init_ref && !operand_equal_p (re->init_ref, re-
> > > > > >fini_ref, 0))
> > > > > +return;
> > > > 
> > > > Where is it checked that the undoing transformation is legal
> > > > also
> > > > from a data dep point of view?  Think code like this:
> > > > 
> > > >sum = X[i];
> > > >for (j ...)
> > > >  sum += X[j];
> > > >X[i] = sum;
> > > > 
> > > > Moving the store into the inner loop isn't always correct and I
> > > > don't seem
> > > > to find where the above situation is rejected.
> > > 
> > > Yeah.  for the old patch, it's possible to have such loop wrongly
> > > interchanged;
> > > in practice, it's hard to create an example.  The pass will give
> > > up
> > > when computing
> > > data dep between references in inner/outer loops.  In this
> > > updated
> > > patch, it's fixed
> > > by giving up if there is any dependence between references of
> > > inner/outer loops.
> > > 
> > > > 
> > > > Maybe I'm confused because I also don't see where you even can
> > > > get into
> > > > the above situation (though I do see testcases about
> > > > this).  The thing is,
> > > > > for a 2d loop nest to contain something like the above
> > > > reduction it can't
> > > > be perfect:
> > > > 
> > > >for (j) {
> > > >  int sum = X[j];  // 1
> > > >  for (i)
> > > >sum += Y[j][i];
> > > >  X[j] = sum;  // 2
> > > >}
> > > > 
> > > > But you do check for perfectness in
> > > > proper_loop_form_for_interchange and
> > > > prepare_perfect_loop_nest, so either you can't get into the
> > > > situation or
> > > > the checking can't be complete, or you define the above to be
> > > > perfect
> > > > nevertheless (probably because the load and store are in outer
> > > > loop
> > > > header/exit blocks?).  The latter would mean that you accept
> > > > also other
> > > > code in header/footer of loops from a pure CFG perspective, so
> > > > where is it
> > > > checked that that other code (which aren't simple reductions)
> > > > isn't
> > > > harmful to the transformation?
> > > 
> > > Yes, I used the name perfect loop nest, but the pass can handle
> > > special form
> > > imperfect loop nest for the simple reduction.  I added comments
> > > describing
> > > this before function prepare_perfect_loop_nest.
> > > 
> > > > 
> > > > Then, the data dependence part of the new pass:
> > > > 
> > > > > +bool
> > > > > +tree_loop_interchange::valid_data_depe

Re: [PATCH GCC][V2]A simple implementation of loop interchange

2017-11-28 Thread Bin.Cheng
On Tue, Nov 28, 2017 at 4:00 PM, David Malcolm  wrote:
> On Tue, 2017-11-28 at 15:26 +, Bin Cheng wrote:
>> Hi,
>> This is updated patch with review comments resolved.  Some
>> explanation embedded below.
>>
>> On Mon, Nov 20, 2017 at 2:46 PM, Richard Biener wrote:
>> > On Thu, Nov 16, 2017 at 4:18 PM, Bin.Cheng 
>> > wrote:
>> > > On Tue, Oct 24, 2017 at 3:30 PM, Michael Matz 
>> > > wrote:
>> > > > Hello,
>> > > >
>> > > > On Fri, 22 Sep 2017, Bin.Cheng wrote:
>> > > >
>> > > > > This is updated patch for loop interchange with review
>> > > > > suggestions
>> > > > > resolved.  Changes are:
>> > > > >   1) It does more light weight checks like rectangle loop
>> > > > > nest check
>> > > > > earlier than before.
>> > > > >   2) It checks profitability of interchange before data
>> > > > > dependence computation.
>> > > > >   3) It calls find_data_references_in_loop only once for a
>> > > > > loop nest now.
>> > > > >   4) Data dependence is open-computed so that we can skip
>> > > > > instantly at
>> > > > > unknown dependence.
>> > > > >   5) It improves code generation in mapping induction
>> > > > > variables for
>> > > > > loop nest, as well as
>> > > > >  adding a simple dead code elimination pass.
>> > > > >   6) It changes magic constants into parameters.
>> > > >
>> > > > So I have a couple comments/questions.  Something stylistic:
>> > >
>> > > Hi Michael,
>> > > Thanks for reviewing.
>> > >
>> > > >
>> > > > > +class loop_cand
>> > > > > +{
>> > > > > +public:
>> > > > > ...
>> > > > > +  friend class tree_loop_interchange;
>> > > > > +private:
>> > > >
>> > > > Just make this all public (and hence a struct, not class).
>> > > > No need for friends in file local classes.
>> > >
>> > > Done.
>> > >
>> > > >
>> > > > > +single_use_in_loop (tree var, struct loop *loop)
>> > > > > ...
>> > > > > +  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
>> > > > > +{
>> > > > > +  stmt = USE_STMT (use_p);
>> > > > > ...
>> > > > > +  basic_block bb = gimple_bb (stmt);
>> > > > > +  gcc_assert (bb != NULL);
>> > > >
>> > > > This pattern reoccurs often in your patch: you check for a bb
>> > > > associated
>> > > > with a USE_STMT.  Uses of SSA names always occur in basic
>> > > > blocks, no need
>> > > > for checking.
>> > >
>> > > Done.
>> > >
>> > > >
>> > > > Then, something about your handling of simple reductions:
>> > > >
>> > > > > +void
>> > > > > +loop_cand::classify_simple_reduction (reduction_p re)
>> > > > > +{
>> > > > > ...
>> > > > > +  /* Require memory references in producer and consumer are the same so
>> > > > > + that we can undo reduction during interchange.  */
>> > > > > +  if (re->init_ref && !operand_equal_p (re->init_ref, re->fini_ref, 0))
>> > > > > +return;
>> > > >
>> > > > Where is it checked that the undoing transformation is legal
>> > > > also
>> > > > from a data dep point of view?  Think code like this:
>> > > >
>> > > >sum = X[i];
>> > > >for (j ...)
>> > > >  sum += X[j];
>> > > >X[i] = sum;
>> > > >
>> > > > Moving the store into the inner loop isn't always correct and I
>> > > > don't seem
>> > > > to find where the above situation is rejected.
>> > >
>> > > Yeah.  For the old patch, it's possible to have such a loop wrongly
>> > > interchanged;
>> > > in practice, it's hard to create an example.  The pass will give
>> > > up
>> > > when computing
>> > > data dep between references in inner/outer loops.  In this
>> > > updated
>> > > patch, it's fixed
>> > > by giving up if there is any dependence between references of
>> > > inner/outer loops.
>> > >
>> > > >
>> > > > Maybe I'm confused because I also don't see where you even can
>> > > > get into
>> > > > the above situation (though I do see testcases about
>> > > > this).  The thing is,
>> > > > for a 2d loop nest to contain something like the above
>> > > > reduction it can't
>> > > > be perfect:
>> > > >
>> > > >for (j) {
>> > > >  int sum = X[j];  // 1
>> > > >  for (i)
>> > > >sum += Y[j][i];
>> > > >  X[j] = sum;  // 2
>> > > >}
>> > > >
>> > > > But you do check for perfectness in
>> > > > proper_loop_form_for_interchange and
>> > > > prepare_perfect_loop_nest, so either you can't get into the
>> > > > situation or
>> > > > the checking can't be complete, or you define the above to be
>> > > > perfect
>> > > > nevertheless (probably because the load and store are in outer
>> > > > loop
>> > > > header/exit blocks?).  The latter would mean that you accept
>> > > > also other
>> > > > code in header/footer of loops from a pure CFG perspective, so
>> > > > where is it
>> > > > checked that that other code (which aren't simple reductions)
>> > > > isn't
>> > > > harmful to the transformation?
>> > >
>> > > Yes, I used the name perfect loop nest, but the pass can handle
>> > > special form
>> > > imperfect loop nest for the simple reduction.  I added comments
>> > > describing
>> > > this before functi

Re: [092/nnn] poly_int: PUSH_ROUNDING

2017-11-28 Thread Jeff Law
On 10/23/2017 11:37 AM, Richard Sandiford wrote:
> PUSH_ROUNDING is difficult to convert to a hook since there is still
> a lot of conditional code based on it.  It isn't clear that a direct
> conversion with checks for null hooks is the right thing to do.
> 
> Rather than untangle that, this patch converts all implementations
> that do something to out-of-line functions that have the same
> interface as a hook would have.  This should at least help towards
> any future hook conversion.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * config/cr16/cr16-protos.h (cr16_push_rounding): Declare.
>   * config/cr16/cr16.h (PUSH_ROUNDING): Move implementation to...
>   * config/cr16/cr16.c (cr16_push_rounding): ...this new function.
>   * config/h8300/h8300-protos.h (h8300_push_rounding): Declare.
>   * config/h8300/h8300.h (PUSH_ROUNDING): Move implementation to...
>   * config/h8300/h8300.c (h8300_push_rounding): ...this new function.
>   * config/i386/i386-protos.h (ix86_push_rounding): Declare.
>   * config/i386/i386.h (PUSH_ROUNDING): Move implementation to...
>   * config/i386/i386.c (ix86_push_rounding): ...this new function.
>   * config/m32c/m32c-protos.h (m32c_push_rounding): Take and return
>   a poly_int64.
>   * config/m32c/m32c.c (m32c_push_rounding): Likewise.
>   * config/m68k/m68k-protos.h (m68k_push_rounding): Declare.
>   * config/m68k/m68k.h (PUSH_ROUNDING): Move implementation to...
>   * config/m68k/m68k.c (m68k_push_rounding): ...this new function.
>   * config/pdp11/pdp11-protos.h (pdp11_push_rounding): Declare.
>   * config/pdp11/pdp11.h (PUSH_ROUNDING): Move implementation to...
>   * config/pdp11/pdp11.c (pdp11_push_rounding): ...this new function.
>   * config/stormy16/stormy16-protos.h (xstormy16_push_rounding): Declare.
>   * config/stormy16/stormy16.h (PUSH_ROUNDING): Move implementation to...
>   * config/stormy16/stormy16.c (xstormy16_push_rounding): ...this new
>   function.
>   * expr.c (emit_move_resolve_push): Treat the input and result
>   of PUSH_ROUNDING as a poly_int64.
>   (emit_move_complex_push, emit_single_push_insn_1): Likewise.
>   (emit_push_insn): Likewise.
>   * lra-eliminations.c (mark_not_eliminable): Likewise.
>   * recog.c (push_operand): Likewise.
>   * reload1.c (elimination_effects): Likewise.
>   * rtlanal.c (nonzero_bits1): Likewise.
>   * calls.c (store_one_arg): Likewise.  Require the padding to be
>   known at compile time.
OK.

I so wish PUSH_ROUNDING wasn't needed and that folks could at least keep
their processors consistent (I'm looking at the coldfire designers :(.
For a tale of woe, see BZ68467.

Jeff


Re: [087/nnn] poly_int: subreg_get_info

2017-11-28 Thread Jeff Law
On 10/23/2017 11:35 AM, Richard Sandiford wrote:
> This patch makes subreg_get_info handle polynomial sizes.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * rtlanal.c (subreg_get_info): Handle polynomial mode sizes.
OK.
Jeff


Re: [RFA][PATCH] Stack clash protection 07/08 -- V4 (aarch64 bits)

2017-11-28 Thread Rich Felker
On Mon, Nov 27, 2017 at 03:48:41PM +, Szabolcs Nagy wrote:
> On 28/10/17 05:08, Jeff Law wrote:
> > On 10/13/2017 02:26 PM, Wilco Dijkstra wrote:
> >> For larger frames the first oddity is that there are now 2 separate params
> >> controlling how probes are generated:
> >>
> >> stack-clash-protection-guard-size (default 12, but set to 16 on AArch64)
> >> stack-clash-protection-probe-interval (default 12)
> >>
> >> I don't see how this makes sense. These values are closely related, so if
> >> one is different from the other, probing becomes ineffective/incorrect. 
> >> For example we generate code that trivially bypasses the guard despite
> >> all the probing:
> > My hope would be that we simply don't ever use the params.  They were
> > > done as much for *you* to experiment with as anything.  I'd happily just
> > delete them as there's essentially no guard rails to ensure their values
> > are sane.
> 
> so is there a consensus now that 64k guard size is what
> gcc stack probing will assume?

I would very much prefer the compiler never assume more than 1 guard
page is present. I understand the performance argument, but I think
it's mostly in artificial scenarios where it makes a difference:

If a function has 4k and actually uses it, you're almost certainly
looking at >4k cycles, and a few cycles to probe the guard page are
irrelevant (dominated by the actual work).

If a function has >4k of local data and code paths that can easily be
determined don't use it (e.g. the big data is in a conditional block),
the compiler should shrink-wrap them anyway in order to avoid
pathological blowing away of the stack like what LLVM does all over
the place (hoisting stack allocations up as far as it can). The probe
can then likewise be moved with the shrinkwrapping.

The only remaining case is functions which have >4k of local data
that's mostly unused, but no easy way to determine that it's not used,
or where which small part is actually used is data-dependent. This is
the case where the probe is mildly costly, but it seems very unlikely
to be a common case in real-world usage.

> if so i can propose a patch to glibc to actually have
> that much guard by default in threads.. (i think it
> makes sense on all 64bit targets to have bigger guard
> and a consensus here would help making that change)

I don't object to making musl libc default to >64k (I'd prefer 68k to
avoid preserving alignment mod a large power of 2) guard size on
64-bit archs (on 32-bit, including ILP32, though, it's prohibitive
because it exhausts virtual address space; this may affect aarch64's
ILP32 model). It doesn't have any significant cost and it's useful
hardening. But I think it would be unfortunate if smaller guard sizes,
which applications can request/set, were left unsafe due to the
compiler making a tradeoff for performance that doesn't actually get
you any measurable real-world performance benefits.
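
To make the concern concrete, here is a toy model (not GCC's probing algorithm; the function name, the probe placement and the numbers are all assumptions for illustration). Offsets are bytes below the incoming stack pointer, the frame is touched once per probe interval and once at its far end, and the question is whether any touch lands in the guard region:

```cpp
#include <cassert>
#include <cstddef>

// Toy model only: probes are placed at every multiple of `interval` that
// fits in the frame, plus one at the far end of the frame.  Returns true
// if at least one probe lands inside the guard region
// [guard_start, guard_start + guard_size).
bool some_probe_hits_guard(std::size_t frame, std::size_t interval,
                           std::size_t guard_start, std::size_t guard_size)
{
    assert(interval > 0);
    auto in_guard = [&](std::size_t off) {
        return off >= guard_start && off - guard_start < guard_size;
    };
    for (std::size_t off = interval; off <= frame; off += interval)
        if (in_guard(off))
            return true;
    return in_guard(frame);
}
```

In this model a 4k interval cannot step over a 4k guard, whereas a 64k interval (i.e. code generated on the assumption of a 64k guard) can step over a 4k guard that an application asked for.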

Rich


Re: [085/nnn] poly_int: expand_vector_ubsan_overflow

2017-11-28 Thread Jeff Law
On 10/23/2017 11:34 AM, Richard Sandiford wrote:
> This patch makes expand_vector_ubsan_overflow cope with a polynomial
> number of elements.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * internal-fn.c (expand_vector_ubsan_overflow): Handle polynomial
>   numbers of elements.
OK
jeff


Re: [084/nnn] poly_int: folding BIT_FIELD_REFs on vectors

2017-11-28 Thread Jeff Law
On 10/23/2017 11:33 AM, Richard Sandiford wrote:
> This patch makes the:
> 
>   (BIT_FIELD_REF CONSTRUCTOR@0 @1 @2)
> 
> folder cope with polynomial numbers of elements.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * match.pd: Cope with polynomial numbers of vector elements.
Argh.  It took me a moment of wondering why it didn't look like C code.
It's match.pd :-)

OK.

jeff



Re: [083/nnn] poly_int: fold_indirect_ref_1

2017-11-28 Thread Jeff Law
On 10/23/2017 11:33 AM, Richard Sandiford wrote:
> This patch makes fold_indirect_ref_1 handle polynomial offsets in
> a POINTER_PLUS_EXPR.  The specific reason for doing this now is
> to handle:
> 
> (tree_to_uhwi (part_width) / BITS_PER_UNIT
>  * TYPE_VECTOR_SUBPARTS (op00type));
> 
> when TYPE_VECTOR_SUBPARTS becomes a poly_int.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * fold-const.c (fold_indirect_ref_1): Handle polynomial offsets
>   in a POINTER_PLUS_EXPR.
OK.
jeff


Re: [082/nnn] poly_int: omp-simd-clone.c

2017-11-28 Thread Jeff Law
On 10/23/2017 11:33 AM, Richard Sandiford wrote:
> This patch adds a wrapper around TYPE_VECTOR_SUBPARTS for omp-simd-clone.c.
> Supporting SIMD clones for variable-length vectors is post GCC8 work.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * omp-simd-clone.c (simd_clone_subparts): New function.
>   (simd_clone_init_simd_arrays): Use it instead of TYPE_VECTOR_SUBPARTS.
>   (ipa_simd_modify_function_body): Likewise.
OK.
jeff


Re: [078/nnn] poly_int: two-operation SLP

2017-11-28 Thread Jeff Law
On 10/23/2017 11:31 AM, Richard Sandiford wrote:
> This patch makes two-operation SLP handle but reject variable-length
> vectors.  Adding support for this is a post-GCC8 thing.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-slp.c (vect_build_slp_tree_1): Handle polynomial
>   numbers of units.
>   (vect_schedule_slp_instance): Likewise.
OK.
jeff


Re: [077/nnn] poly_int: vect_get_constant_vectors

2017-11-28 Thread Jeff Law
On 10/23/2017 11:31 AM, Richard Sandiford wrote:
> For now, vect_get_constant_vectors can only cope with constant-length
> vectors, although a patch after the main SVE submission relaxes this.
> This patch adds an appropriate guard for variable-length vectors.
> The TYPE_VECTOR_SUBPARTS use in vect_get_constant_vectors will then
> have a to_constant call when TYPE_VECTOR_SUBPARTS becomes a poly_int.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-slp.c (vect_get_and_check_slp_defs): Reject
>   constant and extern definitions for variable-length vectors.
>   (vect_get_constant_vectors): Note that the number of units
>   is known to be constant.
OK.
jeff

ps.  Sorry about the strange ordering of acks.  I'm trying to work
through the simple stuff and come back to the larger patches.  The only
way to eat an elephant is a bite at a time...


Re: [076/nnn] poly_int: vectorizable_conversion

2017-11-28 Thread Jeff Law
On 10/23/2017 11:30 AM, Richard Sandiford wrote:
> This patch makes vectorizable_conversion cope with variable-length
> vectors.  We already require the number of elements in one vector
> to be a multiple of the number of elements in the other vector,
> so the patch uses that to choose between widening and narrowing.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-stmts.c (vectorizable_conversion): Treat the number
>   of units as polynomial.  Choose between WIDE and NARROW based
>   on multiple_p.
If I'm reading this right, if nunits_in < nunits_out, but the latter is
not a multiple of the former, we'll choose WIDEN, which is the opposite
of what we'd do before this patch.  Was that intentional?
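
To illustrate the case I mean, here is a sketch with plain unsigned integers standing in for the poly_int unit counts.  Both function bodies are my assumption about the shape of the old and new tests, not a copy of the code in tree-vect-stmts.c, and the names are made up:

```cpp
#include <cassert>

enum class modifier { NONE, NARROW, WIDEN };

// Assumed shape of the old test: a plain ordering comparison.
modifier choose_by_ordering(unsigned nunits_in, unsigned nunits_out)
{
    assert(nunits_in > 0 && nunits_out > 0);
    if (nunits_in < nunits_out)
        return modifier::NARROW;
    if (nunits_in == nunits_out)
        return modifier::NONE;
    return modifier::WIDEN;
}

// Assumed shape of the new test: NARROW only when nunits_out is a
// multiple of nunits_in, otherwise fall through to WIDEN.
modifier choose_by_multiple(unsigned nunits_in, unsigned nunits_out)
{
    assert(nunits_in > 0 && nunits_out > 0);
    if (nunits_in == nunits_out)
        return modifier::NONE;
    if (nunits_out % nunits_in == 0)
        return modifier::NARROW;
    return modifier::WIDEN;
}
```

With nunits_in == 4 and nunits_out == 6 the two disagree (NARROW versus WIDEN).  If, as the description says, one count is already required to be a multiple of the other, the two agree on every input that can actually reach this point.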


jeff


Re: [075/nnn] poly_int: vectorizable_simd_clone_call

2017-11-28 Thread Jeff Law
On 10/23/2017 11:30 AM, Richard Sandiford wrote:
> This patch makes vectorizable_simd_clone_call cope with variable-length
> vectors.  For now we don't support SIMD clones for variable-length
> vectors; this will be post GCC 8 material.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-stmts.c (simd_clone_subparts): New function.
>   (vectorizable_simd_clone_call): Use it instead of TYPE_VECTOR_SUBPARTS.
OK.
jeff


Re: [074/nnn] poly_int: vectorizable_call

2017-11-28 Thread Jeff Law
On 10/23/2017 11:30 AM, Richard Sandiford wrote:
> This patch makes vectorizable_call handle variable-length vectors.
> The only substantial change is to use build_index_vector for
> IFN_GOMP_SIMD_LANE; this makes no functional difference for
> fixed-length vectors.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-stmts.c (vectorizable_call): Treat the number of
>   vectors as polynomial.  Use build_index_vector for
>   IFN_GOMP_SIMD_LANE.
OK.
jeff


Re: [072/nnn] poly_int: vectorizable_live_operation

2017-11-28 Thread Jeff Law
On 10/23/2017 11:29 AM, Richard Sandiford wrote:
> This patch makes vectorizable_live_operation cope with variable-length
> vectors.  For now we just handle cases in which we can tell at compile
> time which vector contains the final result.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-loop.c (vectorizable_live_operation): Treat the number
>   of units as polynomial.  Punt if we can't tell at compile time
>   which vector contains the final result.
OK.
jeff


Re: [069/nnn] poly_int: vector_alignment_reachable_p

2017-11-28 Thread Jeff Law
On 10/23/2017 11:28 AM, Richard Sandiford wrote:
> This patch makes vector_alignment_reachable_p cope with variable-length
> vectors.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-data-refs.c (vector_alignment_reachable_p): Treat the
>   number of units as polynomial.

OK
jeff


Re: [067/nnn] poly_int: get_mask_mode

2017-11-28 Thread Jeff Law
On 10/23/2017 11:27 AM, Richard Sandiford wrote:
> This patch makes TARGET_GET_MASK_MODE take polynomial nunits and
> vector_size arguments.  The gcc_assert in default_get_mask_mode
> is now handled by the exact_div call in vector_element_size.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * target.def (get_mask_mode): Take the number of units and length
>   as poly_uint64s rather than unsigned ints.
>   * targhooks.h (default_get_mask_mode): Update accordingly.
>   * targhooks.c (default_get_mask_mode): Likewise.
>   * config/i386/i386.c (ix86_get_mask_mode): Likewise.
>   * doc/tm.texi: Regenerate.
>
OK.
jeff


Re: [061/nnn] poly_int: compute_data_ref_alignment

2017-11-28 Thread Jeff Law
On 10/23/2017 11:25 AM, Richard Sandiford wrote:
> This patch makes vect_compute_data_ref_alignment treat DR_INIT as a
> poly_int and handles cases in which the calculated misalignment might
> not be constant.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-vect-data-refs.c (vect_compute_data_ref_alignment):
>   Treat drb->init as a poly_int.  Fail if its misalignment wrt
>   vector_alignment isn't known.
OK.
jeff


Re: [058/nnn] poly_int: get_binfo_at_offset

2017-11-28 Thread Jeff Law
On 10/23/2017 11:24 AM, Richard Sandiford wrote:
> This patch changes the offset parameter to get_binfo_at_offset
> from HOST_WIDE_INT to poly_int64.  This function probably doesn't
> need to handle polynomial offsets in practice, but it's easy
> to do and avoids forcing the caller to check first.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree.h (get_binfo_at_offset): Take the offset as a poly_int64
>   rather than a HOST_WIDE_INT.
>   * tree.c (get_binfo_at_offset): Likewise.
>
OK.
jeff


Re: [057/nnn] poly_int: build_ref_for_offset

2017-11-28 Thread Jeff Law
On 10/23/2017 11:23 AM, Richard Sandiford wrote:
> This patch changes the offset parameter to build_ref_for_offset
> from HOST_WIDE_INT to poly_int64.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * ipa-prop.h (build_ref_for_offset): Take the offset as a poly_int64
>   rather than a HOST_WIDE_INT.
>   * tree-sra.c (build_ref_for_offset): Likewise.
> 
OK
jeff



Re: [055/nnn] poly_int: find_bswap_or_nop_load

2017-11-28 Thread Jeff Law
On 10/23/2017 11:23 AM, Richard Sandiford wrote:
> This patch handles polynomial offsets in find_bswap_or_nop_load,
> which could be useful for constant-sized data at a variable offset.
> It is needed for a later patch to compile.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-ssa-math-opts.c (find_bswap_or_nop_load): Track polynomial
>   offsets for MEM_REFs.
OK.
jeff


Re: [054/nnn] poly_int: adjust_ptr_info_misalignment

2017-11-28 Thread Jeff Law
On 10/23/2017 11:22 AM, Richard Sandiford wrote:
> This patch makes adjust_ptr_info_misalignment take the adjustment
> as a poly_uint64 rather than an unsigned int.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-ssanames.h (adjust_ptr_info_misalignment): Take the increment
>   as a poly_uint64 rather than an unsigned int.
>   * tree-ssanames.c (adjust_ptr_info_misalignment): Likewise.
> 
OK.
jeff


Re: [053/nnn] poly_int: decode_addr_const

2017-11-28 Thread Jeff Law
On 10/23/2017 11:22 AM, Richard Sandiford wrote:
> This patch makes the varasm-local addr_const track polynomial offsets.
> I'm not sure how useful this is, but it was easier to convert than not.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * varasm.c (addr_const::offset): Change from HOST_WIDE_INT
>   to poly_int64.
>   (decode_addr_const): Update accordingly.
> 
OK
jeff


Re: [050/nnn] poly_int: reload<->ira interface

2017-11-28 Thread Jeff Law
On 10/23/2017 11:21 AM, Richard Sandiford wrote:
> This patch uses poly_int64 for:
> 
> - ira_reuse_stack_slot
> - ira_mark_new_stack_slot
> - ira_spilled_reg_stack_slot::width
> 
> all of which are part of the IRA/reload interface.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * ira-int.h (ira_spilled_reg_stack_slot::width): Change from
>   an unsigned int to a poly_uint64.
>   * ira.h (ira_reuse_stack_slot, ira_mark_new_stack_slot): Take the
>   sizes as poly_uint64s rather than unsigned ints.
>   * ira-color.c (ira_reuse_stack_slot, ira_mark_new_stack_slot):
>   Likewise.
OK
Jeff





Re: [049/nnn] poly_int: emit_inc

2017-11-28 Thread Jeff Law
On 10/23/2017 11:20 AM, Richard Sandiford wrote:
> This patch changes the LRA emit_inc routine so that it takes
> a poly_int64 rather than an int.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * lra-constraints.c (emit_inc): Change inc_amount from an int
>   to a poly_int64.
OK.
jeff


Re: [036/nnn] poly_int: get_object_alignment_2

2017-11-28 Thread Jeff Law
On 10/23/2017 11:14 AM, Richard Sandiford wrote:
> This patch makes get_object_alignment_2 track polynomial offsets
> and sizes.  The real work is done by get_inner_reference, but we
> then need to handle the alignment correctly.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * builtins.c (get_object_alignment_2): Track polynomial offsets
>   and sizes.  Update the alignment handling.
OK.
jeff


Re: [PATCH] Implement std::to_address for C++2a

2017-11-28 Thread Glen Fernandes
On Tue, Nov 28, 2017 at 9:24 AM, Jonathan Wakely wrote:
> Thanks, Glen, I've committed this to trunk, with one small change to
> fix the copyright dates in the new test, to be just 2017.

Thanks!

> Because my new hobby is finding uses for if-constexpr, I think we
> could have used the detection idiom to do it in a single overload, but
> I don't see any reason to prefer that over your implementation:

I was thinking about using if-constexpr with std::is_detected_v but
wondered if it wouldn't be appropriate to use the latter until it
transitions from TS to IS. (But now that you've pointed it out, I
guess an implementation detail like __detected_or_t can live on
forever, even if the detection idiom facilities do not get adopted).

> However, more importantly, both this form and yours fail for the
> following test case, in two ways:
>
> struct P {
>  using F = void();
>
>  F* operator->() const noexcept { return nullptr; }
> };
>
> I'm not sure if this is a bug in our std::pointer_traits, or if the
> standard requires the specialization of std::pointer_traits to be
> ill-formed (see [pointer.traits.types] p1). We have a problem if it
> does require it, and either need to relax the requirements on
> pointer_traits, or we need to alter the wording for to_address so that
> it doesn't try to use pointer_traits when the specialization would be
> ill-formed.

Could both be avoided? That is: I don't know if we need to relax it,
or make to_address tolerate it, if the intent is to require the user
to make P a valid pointer-like type  such that pointer_traits is
not ill-formed (by 1. providing an element_type member or 2.
specializing pointer_traits, since P is not a template template).
Current implementations of __to_address or __to_raw_pointer that are
in use by our library facilities already
have this requirement implicitly (those that use typename
pointer_traits::element_type* as the return type, instead of C++14
auto), so users working with non-raw pointers would already be doing 1
or 2.

> Secondly, if I remove that static_assert from  then
> the test compiles, which is wrong, because it calls std::to_address on
> a function pointer type. That should be ill-formed. The problem is
> that the static_assert(!is_function_v<_Ptr>) is in std::to_address and
> the implementation actually uses std::__to_address. So I think we want
> the !is_function_v<_Ptr> check to be moved to the __to_address(_Ptr*)
> overload.

Ah, yes. I'll move the static_assert into that overload (enabled in
C++2a or higher mode, since it uses is_function_v).
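
Roughly what I have in mind, as a C++17 sketch rather than the actual libstdc++ code (the sketch namespace, fancy_ptr and to_address_demo are made-up names, and this version goes through operator->() only, without pointer_traits):

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Raw-pointer overload.  The function-type check lives here, so every
// caller that ends up at a raw pointer is checked, not just callers of
// the public entry point.
template <typename T>
constexpr T* to_address(T* p) noexcept
{
    static_assert(!std::is_function_v<T>, "function pointers are not allowed");
    return p;
}

// Pointer-like overload: unwrap through operator->() until a raw
// pointer is reached.
template <typename Ptr>
constexpr auto to_address(const Ptr& p) noexcept
{
    return sketch::to_address(p.operator->());
}

} // namespace sketch

// A minimal pointer-like type that provides element_type (option 1).
template <typename T>
struct fancy_ptr {
    using element_type = T;
    T* raw;
    T* operator->() const noexcept { return raw; }
};

// Usage: the raw pointer and the wrapper yield the same address.
inline void to_address_demo()
{
    int x = 0;
    fancy_ptr<int> fp{&x};
    assert(sketch::to_address(fp) == &x);
    assert(sketch::to_address(&x) == &x);
}
```

With the check in the raw-pointer overload, a type like P above whose operator->() returns a function pointer should be rejected at the inner call.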

Glen


Re: [C PATCH] Handle C SWITCH_EXPR in block_may_fallthru (PR sanitizer/81275)

2017-11-28 Thread Jeff Law
On 11/28/2017 01:49 AM, Jakub Jelinek wrote:
> Hi!
> 
> This is the C version of the switch block_may_fallthru handling.
> Unlike C++ SWITCH_STMT, break; is represented in SWITCH_EXPR by a goto
> to a label emitted after the SWITCH_EXPR, so either block_may_fallthru
> finds such label (but then doesn't find the SWITCH_EXPR), or it
> finds SWITCH_EXPR, in which case if the body doesn't fall through (e.g.
> ends with a return stmt), then it may fall through only if it doesn't
> cover all the cases.
> 
> This patch adds a bit that signals that, and computes whether all cases
> are covered (either if default: is present, or by walking the splay tree).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2017-11-28  Jakub Jelinek  
> 
>   PR sanitizer/81275
>   * tree.c (block_may_fallthru): Return false if SWITCH_ALL_CASES_P
>   is set on SWITCH_EXPR and !block_may_fallthru (SWITCH_BODY ()).
> c/
>   * c-typeck.c (c_finish_case): Set SWITCH_ALL_CASES_P if
>   c_switch_covers_all_cases_p returns true.
> c-family/
>   * c-common.c (c_switch_covers_all_cases_p_1,
>   c_switch_covers_all_cases_p): New functions.
>   * c-common.h (c_switch_covers_all_cases_p): Declare.
> testsuite/
>   * c-c++-common/tsan/pr81275.c: New test.
OK.
jeff
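
ps.  For anyone following along, the "all cases covered" test can be pictured as a walk over the sorted case ranges.  This is only a sketch of the idea with made-up names and a vector in place of the splay tree, not the code in c-common.c:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the splay-tree walk.  `cases` holds inclusive
// [low, high] case ranges, sorted by low bound and non-overlapping.
bool covers_all_cases(
    const std::vector<std::pair<std::int64_t, std::int64_t>>& cases,
    std::int64_t type_min, std::int64_t type_max, bool has_default)
{
    assert(type_min <= type_max);
    if (has_default)
        return true;
    std::int64_t next = type_min;   // smallest value not yet covered
    for (const auto& [low, high] : cases) {
        if (low > next)
            return false;           // gap before this range
        if (high >= type_max)
            return true;            // reached the top of the type
        next = high + 1;
    }
    return false;
}
```

A switch whose body never falls through can then fall through as a whole only when this returns false.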


Re: [033/nnn] poly_int: pointer_may_wrap_p

2017-11-28 Thread Jeff Law
On 10/23/2017 11:13 AM, Richard Sandiford wrote:
> This patch changes the bitpos argument to pointer_may_wrap_p from
> HOST_WIDE_INT to poly_int64.  A later patch makes the callers track
> polynomial offsets.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * fold-const.c (pointer_may_wrap_p): Take the offset as a
>   poly_int64 rather than a HOST_WIDE_INT.
OK.
jeff


Re: [032/nnn] poly_int: symbolic_number

2017-11-28 Thread Jeff Law
On 10/23/2017 11:12 AM, Richard Sandiford wrote:
> This patch changes symbolic_number::bytepos from a HOST_WIDE_INT
> to a poly_int64.  perform_symbolic_merge can cope with symbolic
> offsets as long as the difference between the two offsets is
> constant.  (This could happen for a constant-sized field that
> occurs at a variable offset, for example.)
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-ssa-math-opts.c (symbolic_number::bytepos): Change from
>   HOST_WIDE_INT to poly_int64.
>   (perform_symbolic_merge): Update accordingly.
OK.
jeff
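
ps.  The "difference is constant" condition is easy to picture with a toy two-coefficient type.  This is a sketch with made-up names, not the real poly_int64 interface:

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a two-coefficient poly_int64: value = c0 + c1 * x,
// where x is a runtime quantity that is not known at compile time.
struct toy_poly {
    std::int64_t c0;
    std::int64_t c1;
};

// If a - b does not depend on x, store the difference in *diff and
// return true; otherwise leave *diff alone and return false.
bool constant_difference(toy_poly a, toy_poly b, std::int64_t* diff)
{
    assert(diff != nullptr);
    if (a.c1 != b.c1)
        return false;
    *diff = a.c0 - b.c0;
    return true;
}
```

Two byte positions that share the same variable part (say 8 + 16x and 4 + 16x) are a constant 4 bytes apart, so they could still be merged; 8 + 16x and a plain 4 could not.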


Re: [028/nnn] poly_int: ipa_parm_adjustment

2017-11-28 Thread Jeff Law
On 10/23/2017 11:11 AM, Richard Sandiford wrote:
> This patch changes the type of ipa_parm_adjustment::offset from
> HOST_WIDE_INT to poly_int64 and updates uses accordingly.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * ipa-prop.h (ipa_parm_adjustment::offset): Change from
>   HOST_WIDE_INT to poly_int64_pod.
>   * ipa-prop.c (ipa_modify_call_arguments): Track polynomial
>   parameter offsets.
OK.
jeff


Re: [026/nnn] poly_int: operand_subword

2017-11-28 Thread Jeff Law
On 10/23/2017 11:10 AM, Richard Sandiford wrote:
> This patch makes operand_subword and operand_subword_force take
> polynomial offsets.  This is a fairly old-school interface and
> these days should only be used when splitting multiword operations
> into word operations.  It still doesn't hurt to support polynomial
> offsets and it helps make callers easier to write.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * rtl.h (operand_subword, operand_subword_force): Take the offset
>   as a poly_uint64 rather than an unsigned int.
>   * emit-rtl.c (operand_subword, operand_subword_force): Likewise.
OK.
jeff


Re: [034/nnn] poly_int: get_inner_reference_aff

2017-11-28 Thread Jeff Law
On 10/23/2017 11:13 AM, Richard Sandiford wrote:
> This patch makes get_inner_reference_aff return the size as a
> poly_widest_int rather than a widest_int.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * tree-affine.h (get_inner_reference_aff): Return the size as a
>   poly_widest_int.
>   * tree-affine.c (get_inner_reference_aff): Likewise.
>   * tree-data-ref.c (dr_may_alias_p): Update accordingly.
>   * tree-ssa-loop-im.c (mem_refs_may_alias_p): Likewise.
> 
OK.
jeff


Re: [039/nnn] poly_int: pass_store_merging::execute

2017-11-28 Thread Jeff Law
On 10/23/2017 11:17 AM, Richard Sandiford wrote:
> This patch makes pass_store_merging::execute track polynomial sizes
> and offsets.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * gimple-ssa-store-merging.c (pass_store_merging::execute): Track
>   polynomial sizes and offsets.
OK.  Though I wouldn't be surprised if this needs revamping after
Jakub's work in this space.

It wasn't clear why you moved some of the code that computes "invalid"
relative to where we test it, but I don't see any problem with that
movement of code.

jeff



Re: [046/nnn] poly_int: instantiate_virtual_regs

2017-11-28 Thread Jeff Law
On 10/23/2017 11:19 AM, Richard Sandiford wrote:
> This patch makes the instantiate virtual regs pass track offsets
> as poly_ints.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * function.c (in_arg_offset, var_offset, dynamic_offset)
>   (out_arg_offset, cfa_offset): Change from int to poly_int64.
>   (instantiate_new_reg): Return the new offset as a poly_int64_pod
>   rather than a HOST_WIDE_INT.
>   (instantiate_virtual_regs_in_rtx): Track polynomial offsets.
>   (instantiate_virtual_regs_in_insn): Likewise.
OK.
jeff


Re: [092/nnn] poly_int: PUSH_ROUNDING

2017-11-28 Thread Richard Sandiford
Jeff Law  writes:
> On 10/23/2017 11:37 AM, Richard Sandiford wrote:
>> PUSH_ROUNDING is difficult to convert to a hook since there is still
>> a lot of conditional code based on it.  It isn't clear that a direct
>> conversion with checks for null hooks is the right thing to do.
>> 
>> Rather than untangle that, this patch converts all implementations
>> that do something to out-of-line functions that have the same
>> interface as a hook would have.  This should at least help towards
>> any future hook conversion.
>> 
>> 
>> 2017-10-23  Richard Sandiford  
>>  Alan Hayward  
>>  David Sherwood  
>> 
>> gcc/
>>  * config/cr16/cr16-protos.h (cr16_push_rounding): Declare.
>>  * config/cr16/cr16.h (PUSH_ROUNDING): Move implementation to...
>>  * config/cr16/cr16.c (cr16_push_rounding): ...this new function.
>>  * config/h8300/h8300-protos.h (h8300_push_rounding): Declare.
>>  * config/h8300/h8300.h (PUSH_ROUNDING): Move implementation to...
>>  * config/h8300/h8300.c (h8300_push_rounding): ...this new function.
>>  * config/i386/i386-protos.h (ix86_push_rounding): Declare.
>>  * config/i386/i386.h (PUSH_ROUNDING): Move implementation to...
>>  * config/i386/i386.c (ix86_push_rounding): ...this new function.
>>  * config/m32c/m32c-protos.h (m32c_push_rounding): Take and return
>>  a poly_int64.
>>  * config/m32c/m32c.c (m32c_push_rounding): Likewise.
>>  * config/m68k/m68k-protos.h (m68k_push_rounding): Declare.
>>  * config/m68k/m68k.h (PUSH_ROUNDING): Move implementation to...
>>  * config/m68k/m68k.c (m68k_push_rounding): ...this new function.
>>  * config/pdp11/pdp11-protos.h (pdp11_push_rounding): Declare.
>>  * config/pdp11/pdp11.h (PUSH_ROUNDING): Move implementation to...
>>  * config/pdp11/pdp11.c (pdp11_push_rounding): ...this new function.
>>  * config/stormy16/stormy16-protos.h (xstormy16_push_rounding): Declare.
>>  * config/stormy16/stormy16.h (PUSH_ROUNDING): Move implementation to...
>>  * config/stormy16/stormy16.c (xstormy16_push_rounding): ...this new
>>  function.
>>  * expr.c (emit_move_resolve_push): Treat the input and result
>>  of PUSH_ROUNDING as a poly_int64.
>>  (emit_move_complex_push, emit_single_push_insn_1): Likewise.
>>  (emit_push_insn): Likewise.
>>  * lra-eliminations.c (mark_not_eliminable): Likewise.
>>  * recog.c (push_operand): Likewise.
>>  * reload1.c (elimination_effects): Likewise.
>>  * rtlanal.c (nonzero_bits1): Likewise.
>>  * calls.c (store_one_arg): Likewise.  Require the padding to be
>>  known at compile time.
> OK.
>
> I so wish PUSH_ROUNDING wasn't needed and that folks could at least keep
> their processors consistent (I'm looking at the coldfire designers :(.
> For a tale of woe, see BZ68467.

Ouch.  Is this also fallout from having different code for libcalls
and normal calls?  That always seemed like an accident waiting to
happen, but I don't remember seeing cases where it caused actual ABI
breakage before.

Thanks as ever for the reviews :-)

Richard


Re: [PATCH][i386,AVX] Enable VBMI2 support [4/7]

2017-11-28 Thread Kirill Yukhin
Hello Julia,
On 24 Oct 09:08, Koval, Julia wrote:
> Hi,
> This patch enables VPSHLD instruction. The doc for isaset and instruction: 
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> 
> Ok for trunk?
Your patch is OK for trunk. I've checked it in.

--
Thanks, K


Re: [PATCH][i386,AVX] Enable VBMI2 support [5/7]

2017-11-28 Thread Kirill Yukhin
Hello Julia,
On 24 Oct 10:05, Koval, Julia wrote:
> Attached the patch
> 
> > -Original Message-
> > From: Koval, Julia
> > Sent: Tuesday, October 24, 2017 12:01 PM
> > To: GCC Patches 
> > Cc: Kirill Yukhin 
> > Subject: [PATCH][i386,AVX] Enable VBMI2 support [5/7]
> > 
> > Hi,
> > This patch enables VPSHRD instruction. The doc for isaset and instruction:
> > https://software.intel.com/sites/default/files/managed/c5/15/architecture-
> > instruction-set-extensions-programming-reference.pdf
> > 
> > Ok for trunk?
Your patch is OK for trunk. I've checked it in.

--
Thanks, K


Re: [092/nnn] poly_int: PUSH_ROUNDING

2017-11-28 Thread Jeff Law
On 11/28/2017 11:00 AM, Richard Sandiford wrote:
> Jeff Law  writes:

>>
>> I so wish PUSH_ROUNDING wasn't needed and that folks could at least keep
>> their processors consistent (I'm looking at the coldfire designers :(.
>> For a tale of woe, see BZ68467.
> 
> Ouch.  Is this also fallout from having different code for libcalls
> and normal calls?  That always seemed like an accident waiting to
> happen, but I don't remember seeing cases where it caused actual ABI
> breakage before.
Yup.  Essentially the caller uses a libcall interface where promotions
are not occurring, but there's no way to describe that at the source
level to the implementation of the libcall and the implementation thus
expects the usual argument promotions.  At least that's how it looked
when I started poking a bit.  At that point, I had to stop as I couldn't
justify the time to dig further for an m68k issue...


> 
> Thanks as ever for the reviews :-)
You're welcome.  Still lots to do, but at least some progress whittling
it down.

jeff


Re: [076/nnn] poly_int: vectorizable_conversion

2017-11-28 Thread Richard Sandiford
Jeff Law  writes:
> On 10/23/2017 11:30 AM, Richard Sandiford wrote:
>> This patch makes vectorizable_conversion cope with variable-length
>> vectors.  We already require the number of elements in one vector
>> to be a multiple of the number of elements in the other vector,
>> so the patch uses that to choose between widening and narrowing.
>> 
>> 
>> 2017-10-23  Richard Sandiford  
>>  Alan Hayward  
>>  David Sherwood  
>> 
>> gcc/
>>  * tree-vect-stmts.c (vectorizable_conversion): Treat the number
>>  of units as polynomial.  Choose between WIDE and NARROW based
>>  on multiple_p.
> If I'm reading this right, if nunits_in < nunits_out, but the latter is
> not a multiple of the former, we'll choose WIDEN, which is the opposite
> of what we'd do before this patch.  Was that intentional?

That case isn't possible, so we'd assert:

  if (must_eq (nunits_out, nunits_in))
    modifier = NONE;
  else if (multiple_p (nunits_out, nunits_in))
    modifier = NARROW;
  else
    {
      gcc_checking_assert (multiple_p (nunits_in, nunits_out));
      modifier = WIDEN;
    }

We already implicitly rely on this, since we either widen one full
vector to N full vectors or narrow N full vectors to one vector.

Structurally this is enforced by all vectors having the same number of
bytes (current_vector_size) and the number of vector elements being a
power of 2 (or in the case of poly_int, a power of 2 times a runtime
invariant, but that's good enough, since the runtime invariant is the same
in both cases).
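
As a rough illustration of that selection, here is a minimal sketch using
plain unsigned integers in place of poly_int values.  The names Modifier,
is_multiple and choose_modifier are hypothetical and are not part of the
patch; is_multiple only approximates what multiple_p does for constant
sizes.

```cpp
#include <cassert>

// Hypothetical stand-in for the NONE/NARROW/WIDEN values in the snippet.
enum class Modifier { NONE, NARROW, WIDEN };

// Plain-integer analogue of multiple_p (assumption: constant element
// counts only): true if A is a whole multiple of B.
static bool is_multiple (unsigned a, unsigned b)
{
  return b != 0 && a % b == 0;
}

// Choose the conversion kind from the element counts of the output and
// input vector types.
static Modifier choose_modifier (unsigned nunits_out, unsigned nunits_in)
{
  if (nunits_out == nunits_in)
    return Modifier::NONE;
  if (is_multiple (nunits_out, nunits_in))
    return Modifier::NARROW;  // more, narrower elements in the output
  // With equal-sized vectors and power-of-2 element counts, the only
  // remaining possibility is that the input count is the multiple.
  assert (is_multiple (nunits_in, nunits_out));
  return Modifier::WIDEN;
}
```

For example, converting 4 x 32-bit elements to 8 x 16-bit elements gives
choose_modifier (8, 4), which selects NARROW in this sketch.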

Thanks,
Richard


Re: C++ PATCH to primary_template_instantiation_p

2017-11-28 Thread Jason Merrill
Fixed, thanks.

On Tue, Nov 28, 2017 at 10:49 AM, Maxim Kuvyrkov
 wrote:
>
>> On Nov 28, 2017, at 12:29 AM, Jason Merrill  wrote:
>>
>> All the uses of primary_template_instantiation_p actually want to
>> query whether the entity in question is a specialization of the
>> template, not whether it's an instantiation or explicit
>> specialization.
>>
>> Tested x86_64-pc-linux-gnu, applying to trunk.
>> 
>
> Hi Jason,
>
> I get the following failure with the new test on x86_64-linux-gnu and 
> aarch64-linux-gnu:
>
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C
>> @@ -0,0 +1,27 @@
>> +// PR c++/46831
>> +// { dg-do compile { target c++11 } }
>> +// { dg-options "" }
>> +
>> +struct B { };
>> +struct D : B { };
>> +struct A {
>> +  template operator D&(); // { dg-message "template 
>> conversion" }
>> +  operator long();
>> +};
>> +
>> +template <> A::operator D&();
>
> "Template conversion" warning is triggered on this line, rather than above.
>
>> +
>> +void f(long);
>> +void f(B&);
>> +
>> +struct A2 {
>> +  template operator B&();
>> +};
>> +
>> +void f2(const B&);
>> +
>> +int main() {
>> +  f(A());
>> +  f2(A2());
>> +  f2(A());   // { dg-error "" }
>> +}
>>
>
> Would you please take a look?
>
> ===
> spawn -ignore SIGHUP 
> /home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/_build/builds/x86_64-unknown-linux-gnu/x86_64-unknown-linux-gnu/gcc.git~master-stage2/gcc/testsuite/g++5/../../xg++
>  
> -B/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/_build/builds/x86_64-unknown-linux-gnu/x86_64-unknown-linux-gnu/gcc.git~master-stage2/gcc/testsuite/g++5/../../
>  
> /home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C
>  -fno-diagnostics-show-caret -fdiagnostics-color=never -nostdinc++ 
> -I/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/_build/builds/x86_64-unknown-linux-gnu/x86_64-unknown-linux-gnu/gcc.git~master-stage2/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu
>  
> -I/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/_build/builds/x86_64-unknown-linux-gnu/x86_64-unknown-linux-gnu/gcc.git~master-stage2/x86_64-unknown-linux-gnu/libstdc++-v3/include
>  
> -I/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/libstdc++-v3/libsupc++
>  
> -I/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/libstdc++-v3/include/backward
>  
> -I/home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/libstdc++-v3/testsuite/util
>  -fmessage-length=0 -std=gnu++11 -S -o fntmpdefarg2a.s
> /home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C:
>  In function 'int main()':
> /home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C:26:6:
>  error: invalid user-defined conversion from 'A' to 'const B&' [-fpermissive]
> /home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C:12:13:
>  note: candidate is: 'A::operator D&() [with T = void]' 
> /home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C:12:13:
>  note:   conversion from return type 'D&' of template conversion function 
> specialization to 'const B&' is not an exact match
> /home/tcwg-buildslave/workspace/tcwg-buildfarm/tcwg-x86_64-build/snapshots/gcc.git~master/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg2a.C:21:6:
>  note:   initializing argument 1 of 'void f2(const B&)'
> compiler exited with status 1
> FAIL: g++.dg/cpp0x/fntmpdefarg2a.C  -std=gnu++11  (test for warnings, line 8)
> PASS: g++.dg/cpp0x/fntmpdefarg2a.C  -std=gnu++11  (test for errors, line 26)
> PASS: g++.dg/cpp0x/fntmpdefarg2a.C  -std=gnu++11 (test for excess errors)
> ===
>
> Regards,
>
> --
> Maxim Kuvyrkov
> www.linaro.org
>
>
>


Re: [PATCH][AArch64] Fix ICE due to store_pair_lanes

2017-11-28 Thread James Greenhalgh
On Mon, Nov 27, 2017 at 03:20:29PM +0000, Wilco Dijkstra wrote:
> The recently added store_pair_lanes causes ICEs in output_operand.
> This is due to aarch64_classify_address treating it like a 128-bit STR
> rather than a STP. The valid immediate offsets don't fully overlap,
> causing it to return false.  Eg. offset 264 is a valid 8-byte STP offset
> but not a valid 16-byte STR offset since it isn't a multiple of 16.
> 
> The original instruction isn't passed in the printing code, so the context
> is unclear.  The solution is to add a new operand formatting specifier
> which is used for LDP/STP instructions like this.  This, like the Uml
> constraint that applies to store_pair_lanes, uses PARALLEL when calling
> aarch64_classify_address so that it knows it is an STP.
> Also add the 'z' specifier for future use by load/store pair instructions.
> 
> Passes regress, OK for commit? 

OK. But...

> +  if (aarch64_classify_address (&addr, x, mode, op, true))

This interface is not nice, resulting in...

> +/* Print address 'x' of a LDP/STP with mode 'mode'.  */
> +static void
> +aarch64_print_ldpstp_address (FILE *f, machine_mode mode, rtx x)
> +{
> +  aarch64_print_address_internal (f, mode, x, PARALLEL);
> +}
> +
> +/* Print address 'x' of a memory access with mode 'mode'.  */
> +static void
> +aarch64_print_operand_address (FILE *f, machine_mode mode, rtx x)
> +{
> +  aarch64_print_address_internal (f, mode, x, MEM);
> +}

These, which I *really* dislike.

Ideas on how to clean up this interface would be appreciated.

Thanks,
James



[PATCH] complex type canonicalization

2017-11-28 Thread Nathan Sidwell
This patch fixes PR 83187, a C++ ICE with alias sets.  As Jakub observed 
in the bug, we ended up with a type whose main variant's canonical 
type's main variant was not itself.  That should be an invariant.


In build_complex_type, we create a probe type:

  tree probe = make_node (COMPLEX_TYPE);
  TREE_TYPE (probe) = TYPE_MAIN_VARIANT (component_type);

notice setting TREE_TYPE to main_variant of the component type.  We then 
insert this into the canonical type hash (or find a match).


I think my type hashing clean up exposed an underlying problem, when I 
changed:

 hstate.add_object (TYPE_HASH (component_type));
to
 hashval_t hash = type_hash_canon_hash (t);

because that's abstractly better.  However, build_complex_type then 
proceeds to do two strange things.


1) regardless of whether it added a new type to the hash table, it 
checks for t == CANONICAL (t) and if equal goes to recreate the 
canonical type.


2) but it checks the canonicalness of component_type, not 
TYPE_MAIN_VARIANT (component_type), which is what the newly created 
complex's TREE_TYPE was set to.


#2 is where I think things are getting confused.  Anyway, this patch 
simply moves all of the post-insertion setting after a check to see 
whether the insertion inserted the new type, (rather than find an 
existing variant).  It also consistently uses TREE_TYPE (new_type), when 
looking at the component type in that context.
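
The probe-then-fill pattern can be sketched outside GCC roughly as
follows.  This is only an illustration under stated assumptions: Type,
TypeTable and build_complex are hypothetical miniatures rather than
GCC's tree, type_hash_canon or build_complex_type, and a std::string key
stands in for the real type hash.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical miniature type node; not GCC's tree.
struct Type
{
  std::string component;  // stands in for TREE_TYPE of the complex type
  int fill_count = 0;     // how many times post-insertion setup ran
};

// Hypothetical canonical-type table keyed on the component name.
class TypeTable
{
public:
  // Return the node already registered for PROBE's key, or register
  // PROBE itself and return it.
  Type *canon (Type *probe)
  {
    return m_map.emplace (probe->component, probe).first->second;
  }

private:
  std::unordered_map<std::string, Type *> m_map;
};

// Build (or find) the complex type for COMPONENT.  POOL owns every node,
// including probes that turned out to be duplicates.
Type *build_complex (TypeTable &table,
                     std::vector<std::unique_ptr<Type>> &pool,
                     const std::string &component)
{
  pool.push_back (std::make_unique<Type> ());
  Type *probe = pool.back ().get ();
  probe->component = component;

  Type *t = table.canon (probe);
  if (t == probe)
    {
      // The probe was inserted, so this is a new type: do the one-time
      // setup here and nowhere else.
      assert (t->fill_count == 0);
      ++t->fill_count;
    }
  return t;
}
```

Calling build_complex twice with the same component returns the same
node, and its one-time setup runs only on the first call.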


booted on x86_64-linux-gnu.  no C or C++ regressions.  ok?

nathan
--
Nathan Sidwell
2017-11-28  Nathan Sidwell  

	PR c++/83187
	* tree.c (build_complex_type): Fix canonicalization.  Only fill in
	type if it is new.

	PR c++/83187
	* g++.dg/opt/pr83187.C: New.

Index: testsuite/g++.dg/opt/pr83187.C
===
--- testsuite/g++.dg/opt/pr83187.C	(revision 0)
+++ testsuite/g++.dg/opt/pr83187.C	(working copy)
@@ -0,0 +1,32 @@
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-O1 -Wno-pedantic" }
+// PR c++/83187 ICE in get_alias_set due to canonical type confusion.
+
+extern "C" {
+  double cos (double);
+  double sin (double);
+}
+
+template  class COMPLEX;
+
+template <>
+struct COMPLEX
+{
+  COMPLEX(double r, double i);
+
+  __complex__ mem;
+};
+
+COMPLEX::COMPLEX (double r, double i)
+  : mem {r, i} {}
+
+typedef double dbl_t;
+
+dbl_t var;
+
+void foo (COMPLEX *ptr)
+{
+  const dbl_t unused = var;
+
+  *ptr = COMPLEX (cos (var), sin (var));
+}
Index: tree.c
===
--- tree.c	(revision 255202)
+++ tree.c	(working copy)
@@ -8077,65 +8077,66 @@ build_offset_type (tree basetype, tree t
 tree
 build_complex_type (tree component_type, bool named)
 {
-  tree t;
-
   gcc_assert (INTEGRAL_TYPE_P (component_type)
 	  || SCALAR_FLOAT_TYPE_P (component_type)
 	  || FIXED_POINT_TYPE_P (component_type));
 
   /* Make a node of the sort we want.  */
-  t = make_node (COMPLEX_TYPE);
+  tree probe = make_node (COMPLEX_TYPE);
 
-  TREE_TYPE (t) = TYPE_MAIN_VARIANT (component_type);
+  TREE_TYPE (probe) = TYPE_MAIN_VARIANT (component_type);
 
   /* If we already have such a type, use the old one.  */
-  hashval_t hash = type_hash_canon_hash (t);
-  t = type_hash_canon (hash, t);
-
-  if (!COMPLETE_TYPE_P (t))
-layout_type (t);
+  hashval_t hash = type_hash_canon_hash (probe);
+  tree t = type_hash_canon (hash, probe);
 
-  if (TYPE_CANONICAL (t) == t)
+  if (t == probe)
 {
-  if (TYPE_STRUCTURAL_EQUALITY_P (component_type))
+  /* We created a new type.  The hash insertion will have laid
+	 out the type.  We need to check the canonicalization and
+	 maybe set the name.  */
+  gcc_checking_assert (COMPLETE_TYPE_P (t)
+			   && !TYPE_NAME (t)
+			   && TYPE_CANONICAL (t) == t);
+
+  if (TYPE_STRUCTURAL_EQUALITY_P (TREE_TYPE (t)))
 	SET_TYPE_STRUCTURAL_EQUALITY (t);
-  else if (TYPE_CANONICAL (component_type) != component_type)
+  else if (TYPE_CANONICAL (TREE_TYPE (t)) != TREE_TYPE (t))
 	TYPE_CANONICAL (t)
-	  = build_complex_type (TYPE_CANONICAL (component_type), named);
-}
+	  = build_complex_type (TYPE_CANONICAL (TREE_TYPE (t)), named);
 
-  /* We need to create a name, since complex is a fundamental type.  */
-  if (!TYPE_NAME (t) && named)
-{
-  const char *name;
-  if (component_type == char_type_node)
-	name = "complex char";
-  else if (component_type == signed_char_type_node)
-	name = "complex signed char";
-  else if (component_type == unsigned_char_type_node)
-	name = "complex unsigned char";
-  else if (component_type == short_integer_type_node)
-	name = "complex short int";
-  else if (component_type == short_unsigned_type_node)
-	name = "complex short unsigned int";
-  else if (component_type == integer_type_node)
-	name = "complex int";
-  else if (component_type == unsigned_type_node)
-	name = "complex unsigned int";
-  else if (component_type == long_integer_type_node)
-	name = "complex long int";

Re: [PATCH] Fix hot/cold partitioning with -gstabs{,+} (PR debug/81307)

2017-11-28 Thread Mike Stump
On Nov 27, 2017, at 5:07 PM, Jim Wilson  wrote:
> There is also darwin9 support that apparently no one really cares about 
> anymore.

I'm fine with removing stabs support from the compiler.

Re: [PR 82808] Use result types for arithmetic jump functions

2017-11-28 Thread Martin Jambor
Hi,

On Tue, Nov 28 2017, Richard Biener wrote:
> On Tue, Nov 28, 2017 at 12:35 AM, Martin Jambor  wrote:

...

>> index 7efd644fb27..2a25c657f8b 100644
>> --- a/gcc/tree.c
>> +++ b/gcc/tree.c
>> @@ -13893,6 +13893,52 @@ arg_size_in_bytes (const_tree type)
>>return TYPE_EMPTY_P (type) ? size_zero_node : size_in_bytes (type);
>>  }
>>
>> +/* Return true if unary or binary operation specified with CODE has to have 
>> the
>> +   same result type as its first operand.  */
>> +
>> +bool
>> +tree_operation_preserves_op1_type_p (tree_code code)
>
> expr_type_first_operand_type_p ()

Done.

>
>> +{
>> +  gcc_checking_assert (TREE_CODE_CLASS (code) == tcc_unary
>> +  || TREE_CODE_CLASS (code) == tcc_binary);
>
> Drop this assert, there are at least tcc_expression codes we might want
> to handle in the future, the default: return false should be a good fallback.
> Adjust the function comment accordingly.

Done, this is what I have just committed.  I will prepare a conservative
fix for gcc 7 with only the expr_type_first_operand_type_p part.

Thanks a lot,

Martin

[PR 82808] Use proper result types for arithmetic jump functions

2017-11-28  Prathamesh Kulkarni  
Martin Jambor  

PR ipa/82808
* tree.h (expr_type_first_operand_type_p): Declare.
* tree.c (expr_type_first_operand_type_p): New function.
* ipa-prop.h (ipa_get_type): Allow i to be out of bounds.
(ipa_value_from_jfunc): Adjust declaration.
* ipa-cp.c (ipa_get_jf_pass_through_result): New parameter RES_TYPE.
Use it as result type for arithmetics, unless it is NULL in which case
be more conservative.
(ipa_value_from_jfunc): New parameter PARM_TYPE, pass it to
ipa_get_jf_pass_through_result.
(propagate_vals_across_pass_through): Likewise.
(propagate_scalar_across_jump_function): New parameter PARM_TYPE, pass
it to propagate_vals_across_pass_through.
(propagate_constants_across_call): Pass PARM_TYPE to
propagate_scalar_across_jump_function.
(find_more_scalar_values_for_callers_subset): Pass parameter type to
ipa_value_from_jfunc.
(cgraph_edge_brings_all_scalars_for_node): Likewise.
* ipa-fnsummary.c (evaluate_properties_for_edge): Renamed parms_info
to caller_parms_info, pass parameter type to ipa_value_from_jfunc.
* ipa-prop.c (try_make_edge_direct_simple_call): New parameter
target_type, pass it to ipa_value_from_jfunc.
(update_indirect_edges_after_inlining): Pass parameter type to
try_make_edge_direct_simple_call.

testsuite/
* gcc.dg/ipa/pr82808.c: New test.
---
 gcc/ipa-cp.c   | 67 +++---
 gcc/ipa-fnsummary.c| 14 
 gcc/ipa-prop.c | 21 
 gcc/ipa-prop.h |  5 +--
 gcc/testsuite/gcc.dg/ipa/pr82808.c | 27 +++
 gcc/tree.c | 44 +
 gcc/tree.h |  1 +
 7 files changed, 138 insertions(+), 41 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr82808.c

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 05228d0582d..aa9e300d378 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -1220,33 +1220,38 @@ initialize_node_lattices (struct cgraph_node *node)
 }
 
 /* Return the result of a (possibly arithmetic) pass through jump function
-   JFUNC on the constant value INPUT.  Return NULL_TREE if that cannot be
+   JFUNC on the constant value INPUT.  RES_TYPE is the type of the parameter
+   to which the result is passed.  Return NULL_TREE if that cannot be
determined or be considered an interprocedural invariant.  */
 
 static tree
-ipa_get_jf_pass_through_result (struct ipa_jump_func *jfunc, tree input)
+ipa_get_jf_pass_through_result (struct ipa_jump_func *jfunc, tree input,
+   tree res_type)
 {
-  tree restype, res;
+  tree res;
 
   if (ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR)
 return input;
   if (!is_gimple_ip_invariant (input))
 return NULL_TREE;
 
-  if (TREE_CODE_CLASS (ipa_get_jf_pass_through_operation (jfunc))
-  == tcc_unary)
-res = fold_unary (ipa_get_jf_pass_through_operation (jfunc),
- TREE_TYPE (input), input);
-  else
+  tree_code opcode = ipa_get_jf_pass_through_operation (jfunc);
+  if (!res_type)
 {
-  if (TREE_CODE_CLASS (ipa_get_jf_pass_through_operation (jfunc))
- == tcc_comparison)
-   restype = boolean_type_node;
+  if (TREE_CODE_CLASS (opcode) == tcc_comparison)
+   res_type = boolean_type_node;
+  else if (expr_type_first_operand_type_p (opcode))
+   res_type = TREE_TYPE (input);
   else
-   restype = TREE_TYPE (input);
-  res = fold_binary (ipa_get_jf_pass_through_operation (jfunc), restype,
-input, ipa_get_jf_pass_through_operand (jfunc));
+   return N

Re: [PATCH] Fix ms-sysv.exp testsuite FAILs (PR c/83117)

2017-11-28 Thread Daniel Santos


On 11/28/2017 05:22 AM, Jakub Jelinek wrote:
> On Mon, Nov 27, 2017 at 05:02:32PM -0600, Daniel Santos wrote:
>>> --- gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/gen.cc.jj   2017-05-22 
>>> 10:49:45.0 +0200
>>> +++ gcc/testsuite/gcc.target/x86_64/abi/ms-sysv/gen.cc  2017-11-27 
>>> 11:57:14.889570915 +0100
>>> @@ -392,7 +392,7 @@ static void make_do_tests_decl (const ve
>>> continue;
>>>  
>>>   comma.reset ();
>>> - out << "static __attribute__ ((ms_abi)) long (*const do_test_"
>>> + out << "static __attribute__ ((ms_abi)) long (*do_test_"
>>>   << (unaligned ? "u" : "")
>>>   << (varargs ? "v" : "") << i << ") (";
>> I don't have a problem with removing const, it's only there for
>> const-correctness and caution.  I just posted to the PR a bit ago and
>> I'm curious if there is a better approach when using assembly stubs that
>> are meant to be called in varying ways.  CV would work also, although
>> there's no real need to refetch the address before each use.
>>
>> If you don't have a better way to do this then please use this patch.
> I've verified the resulting *.optimized dump as well as assembly is
> practically identical without/with the patch, only differences are in
> SSA_NAME versions, in assembly the .LC and .LCFI constants are
> different but otherwise it is the same - the functions are emitted in
> different orders by cgraph and committed the patch.
>
> Using assembly stubs that are meant to be called in varying ways should
> just be avoided in portable programs, you could e.g. in the generator
> instead of all those:
> extern __attribute__ ((ms_abi)) long do_test_aligned ();
> extern __attribute__ ((ms_abi)) long do_test_unaligned ();
> static __attribute__ ((ms_abi)) long (*do_test_1) (long a) = 
> (void*)do_test_aligned;
> static __attribute__ ((ms_abi)) long (*do_test_v1) (long a, ...) = 
> (void*)do_test_aligned;
> static __attribute__ ((ms_abi)) long (*do_test_u1) (long a) = 
> (void*)do_test_unaligned;
> static __attribute__ ((ms_abi)) long (*do_test_uv1) (long a, ...) = 
> (void*)do_test_unaligned;
> emit:
> extern __attribute__ ((ms_abi)) long do_test_1 (long a);
> asm (".text; do_test_1: jmp do_test_aligned; .previous");
> extern __attribute__ ((ms_abi)) long do_test_v1 (long a, ...);
> asm (".text; do_test_v1: jmp do_test_aligned; .previous");
> extern __attribute__ ((ms_abi)) long do_test_u1 (long a);
> asm (".text; do_test_u1: jmp do_test_unaligned; .previous");
> extern __attribute__ ((ms_abi)) long do_test_uv1 (long a, ...);
> asm (".text; do_test_uv1: jmp do_test_unaligned; .previous");
> or something similar.
>
>   Jakub

Ah hah! That would indeed work. Thanks for the tip.  I have some
improvements to make to this set of tests, mostly tests triggered by
GCC_TEST_RUN_EXPENSIVE, but perhaps I can make this modification as
well.  Come to think of it, attribute naked might work too.

Thanks,
Daniel


[committed] Reject fix-it hints for various awkward boundary cases (PR c/82050)

2017-11-28 Thread David Malcolm
PR c/82050 reports a failed assertion deep within diagnostic_show_locus's
code for printing fix-it hints.

The root cause is a fix-it hint suggesting a textual replacement,
where the affected column numbers straddle the LINE_MAP_MAX_COLUMN_NUMBER
boundary, so that the start of the range has a column number, but the
end of the range doesn't.

The fix is to verify that the column numbers are sane when adding fix-it
hints to a rich_location, rejecting fix-it hints where they are not.
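
A minimal sketch of the kind of check involved, assuming a simplified
model in which any column at or beyond the limit is not tracked and
reads back as 0.  ExpLoc, expand and fixit_columns_ok are hypothetical
names for illustration only; the real check lives in
rich_location::maybe_add_fixit and works on expanded_location values.

```cpp
// Hypothetical simplified expanded location.
struct ExpLoc
{
  int line;
  int column;
};

// Same value as LINE_MAP_MAX_COLUMN_NUMBER in the patch.
const int MAX_COLUMN = 1 << 12;

// Assumption of this sketch: columns that are too large to be tracked
// are reported as 0.
ExpLoc expand (int line, int column)
{
  return ExpLoc{line, column < MAX_COLUMN ? column : 0};
}

// Return true if the columns of a fix-it range are in a sane order,
// i.e. the start column does not exceed the column just past the end.
bool fixit_columns_ok (const ExpLoc &start, const ExpLoc &next)
{
  return start.column <= next.column;
}
```

With this model, a range starting at column 4090 and ending at column
4100 expands to columns 4090 and 0, so fixit_columns_ok rejects it.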

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Committed to trunk as r255214.

libcpp/ChangeLog:
PR c/82050
* include/line-map.h (LINE_MAP_MAX_COLUMN_NUMBER): Move here.
* line-map.c (LINE_MAP_MAX_COLUMN_NUMBER): ...from here.
(rich_location::maybe_add_fixit): Reject fix-it hints in which
the start column exceeds the next column.
---
 libcpp/include/line-map.h |  5 +
 libcpp/line-map.c | 13 -
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 8b7e5dc..1151484 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -280,6 +280,11 @@ enum lc_reason
worked example in libcpp/location-example.txt.  */
 typedef unsigned int source_location;
 
+/* Do not track column numbers higher than this one.  As a result, the
+   range of column_bits is [12, 18] (or 0 if column numbers are
+   disabled).  */
+const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 12);
+
 /* Do not pack ranges if locations get higher than this.
If you change this, update:
  gcc.dg/plugin/location-overflow-test-*.c.  */
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 0e5804b..ac621e9 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -26,11 +26,6 @@ along with this program; see the file COPYING3.  If not see
 #include "internal.h"
 #include "hashtab.h"
 
-/* Do not track column numbers higher than this one.  As a result, the
-   range of column_bits is [12, 18] (or 0 if column numbers are
-   disabled).  */
-const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 12);
-
 /* Highest possible source location encoded within an ordinary or
macro map.  */
 const source_location LINE_MAP_MAX_SOURCE_LOCATION = 0x70000000;
@@ -2352,6 +2347,14 @@ rich_location::maybe_add_fixit (source_location start,
   stop_supporting_fixits ();
   return;
 }
+  /* The columns must be in the correct order.  This can fail if the
+ endpoints straddle the boundary for which the linemap can represent
+ columns (PR c/82050).  */
+  if (exploc_start.column > exploc_next_loc.column)
+{
+  stop_supporting_fixits ();
+  return;
+}
 
   const char *newline = strchr (new_content, '\n');
   if (newline)
-- 
1.8.5.3



[PATCH] Expensive selftests: torture testing for fix-it boundary conditions (PR c/82050)

2017-11-28 Thread David Malcolm
This patch adds selftest coverage for the fix for PR c/82050 (r255214).

The selftest iterates over various "interesting" column and line-width
values to try to shake out bugs in the fix-it printing routines, a kind
of "torture" selftest.

Unfortunately this selftest is noticeably slower than the other selftests;
adding it to diagnostic-show-locus.c led to:
  -fself-test: 40218 pass(es) in 0.172000 seconds
slowing down to:
  -fself-test: 97315 pass(es) in 6.109000 seconds
for an unoptimized build (e.g. when hacking with --disable-bootstrap).

Given that this affects the compile-edit-test cycle of the "gcc"
subdirectory, this felt like an unacceptable amount of overhead to add.

I attempted to optimize the test by reducing the amount of coverage, but
the test seems useful, and there seems to be a valid role for "torture"
selftests.

Hence this patch adds a:
  gcc.dg/plugin/expensive_selftests_plugin.c
with the responsibility for running "expensive" selftests, and adds the
expensive test there.  The patch moves a small amount of code from
selftest::run_tests into a helper class so that the plugin can print
a useful summary line (to reassure us that the tests are actually being
run).
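
The helper is essentially a scoped timer: the ctor records a start time
and the dtor prints the summary.  A standalone sketch of that idea
follows.  ScopedRunner, pass_count and run_expensive_tests are
hypothetical names, and std::chrono::steady_clock is swapped in for
get_run_time, so this measures wall-clock time rather than what the
patch measures.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical global pass counter, standing in for selftest's num_passes.
static int pass_count = 0;

// Records the start time on construction and prints a summary line to
// stderr on destruction.
class ScopedRunner
{
public:
  explicit ScopedRunner (const char *name)
    : m_name (name), m_start (std::chrono::steady_clock::now ())
  {
  }

  ~ScopedRunner ()
  {
    auto now = std::chrono::steady_clock::now ();
    long long usec
      = std::chrono::duration_cast<std::chrono::microseconds>
          (now - m_start).count ();
    std::fprintf (stderr, "%s: %i pass(es) in %lld.%06lld seconds\n",
                  m_name, pass_count, usec / 1000000, usec % 1000000);
  }

  ScopedRunner (const ScopedRunner &) = delete;
  ScopedRunner &operator= (const ScopedRunner &) = delete;

private:
  const char *m_name;
  std::chrono::steady_clock::time_point m_start;
};

// Run a batch of placeholder checks; the summary is printed when R goes
// out of scope at the end of the function.
void run_expensive_tests ()
{
  ScopedRunner r ("expensive_selftests_sketch");
  for (int i = 0; i < 1000; ++i)
    ++pass_count;
}
```

Because the summary comes from the dtor, any caller (the -fself-test
entry point or a plugin) gets the same report just by declaring a runner
at the top of its test function.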

With that, the compile-edit-test cycle of the "gcc" subdir is unaffected;
the plugin takes:
  expensive_selftests_plugin: 26641 pass(es) in 3.127000 seconds
which seems reasonable within the much longer time taken by "make check"
(I optimized some of the overhead away, hence the reduction from 6 seconds
above down to 3 seconds).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
PR c/82050
* selftest-run-tests.c (selftest::run_tests): Move start/finish code
to...
* selftest.c (selftest::test_runner::test_runner): New ctor.
(selftest::test_runner::~test_runner): New dtor.
* selftest.h (class selftest::test_runner): New class.

gcc/testsuite/ChangeLog:
PR c/82050
* gcc.dg/plugin/expensive-selftests-1.c: New file.
* gcc.dg/plugin/expensive_selftests_plugin.c: New file.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
---
 gcc/selftest-run-tests.c   |  11 +-
 gcc/selftest.c |  22 +++
 gcc/selftest.h |  14 ++
 .../gcc.dg/plugin/expensive-selftests-1.c  |   3 +
 .../gcc.dg/plugin/expensive_selftests_plugin.c | 175 +
 gcc/testsuite/gcc.dg/plugin/plugin.exp |   2 +
 6 files changed, 218 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/expensive-selftests-1.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/expensive_selftests_plugin.c

diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
index 6030d3b..f539c66 100644
--- a/gcc/selftest-run-tests.c
+++ b/gcc/selftest-run-tests.c
@@ -46,7 +46,7 @@ selftest::run_tests ()
  option-handling.  */
   path_to_selftest_files = flag_self_test;
 
-  long start_time = get_run_time ();
+  test_runner r ("-fself-test");
 
   /* Run all the tests, in hand-coded order of (approximate) dependencies:
  run the tests for lowest-level code first.  */
@@ -106,14 +106,7 @@ selftest::run_tests ()
  failed to be finalized can be detected by valgrind.  */
   forcibly_ggc_collect ();
 
-  /* Finished running tests.  */
-  long finish_time = get_run_time ();
-  long elapsed_time = finish_time - start_time;
-
-  fprintf (stderr,
-  "-fself-test: %i pass(es) in %ld.%06ld seconds\n",
-  num_passes,
-  elapsed_time / 1000000, elapsed_time % 1000000);
+  /* Finished running tests; the test_runner dtor will print a summary.  */
 }
 
 #endif /* #if CHECKING_P */
diff --git a/gcc/selftest.c b/gcc/selftest.c
index b41b9f5..ca84bfa 100644
--- a/gcc/selftest.c
+++ b/gcc/selftest.c
@@ -213,6 +213,28 @@ locate_file (const char *name)
   return concat (path_to_selftest_files, "/", name, NULL);
 }
 
+/* selftest::test_runner's ctor.  */
+
+test_runner::test_runner (const char *name)
+: m_name (name),
+  m_start_time (get_run_time ())
+{
+}
+
+/* selftest::test_runner's dtor.  Print a summary line to stderr.  */
+
+test_runner::~test_runner ()
+{
+  /* Finished running tests.  */
+  long finish_time = get_run_time ();
+  long elapsed_time = finish_time - m_start_time;
+
+  fprintf (stderr,
+  "%s: %i pass(es) in %ld.%06ld seconds\n",
+  m_name, num_passes,
+  elapsed_time / 1000000, elapsed_time % 1000000);
+}
+
 /* Selftests for libiberty.  */
 
 /* Verify that xstrndup generates EXPECTED when called on SRC and N.  */
diff --git a/gcc/selftest.h b/gcc/selftest.h
index cdad939..f282055 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -168,6 +168,20 @@ extern char *locate_file (const char *path);
 
 extern const char *path_to_selftest_files;
 
+/* selftest::test_runner is an implementation detail of selftest::run_tests,
+   exposed here to allow plugins to run th

[PATCH, gcc-7] Add riscv musl support.

2017-11-28 Thread Jim Wilson
This adds MUSL support to the riscv port in gcc-7, as we had a request for it.
Tested with a glibc toolchain build to verify it doesn't break anything, and a
musl gcc build to verify that the dynamic linker names are right for each -mabi
option value.  Committed.
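
For readers checking the dynamic linker names by hand, the following is a
small sketch of how the spec strings in the patch below combine.  It is not
part of the patch: musl_abi_suffix and musl_dynamic_linker are hypothetical
helper names, and the xlen argument ("32" or "64") is an assumption standing
in for whatever XLEN_SPEC expands to.

```cpp
#include <cassert>
#include <string>

// Mirrors the MUSL_ABI_SUFFIX spec: soft-float ABIs map to "-sf",
// single-float ABIs to "-sp", and double-float ABIs to no suffix.
std::string musl_abi_suffix (const std::string &mabi)
{
  if (mabi == "ilp32" || mabi == "lp64")
    return "-sf";
  if (mabi == "ilp32f" || mabi == "lp64f")
    return "-sp";
  return "";  // ilp32d and lp64d
}

// Mirrors MUSL_DYNAMIC_LINKER.  XLEN is assumed to be "32" or "64".
std::string musl_dynamic_linker (const std::string &xlen,
                                 const std::string &mabi)
{
  return "/lib/ld-musl-riscv" + xlen + musl_abi_suffix (mabi) + ".so.1";
}
```

Under those assumptions, -mabi=lp64d on a 64-bit target gives
/lib/ld-musl-riscv64.so.1 and -mabi=ilp32 on a 32-bit target gives
/lib/ld-musl-riscv32-sf.so.1.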

gcc/
Backport from mainline
2017-11-07  Michael Clark  

* config/riscv/linux.h (MUSL_ABI_SUFFIX): New define.
(MUSL_DYNAMIC_LINKER): Likewise.
---
 gcc/config/riscv/linux.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/gcc/config/riscv/linux.h b/gcc/config/riscv/linux.h
index ecf424d..6c7e3c4 100644
--- a/gcc/config/riscv/linux.h
+++ b/gcc/config/riscv/linux.h
@@ -24,6 +24,17 @@ along with GCC; see the file COPYING3.  If not see
 
 #define GLIBC_DYNAMIC_LINKER "/lib/ld-linux-riscv" XLEN_SPEC "-" ABI_SPEC ".so.1"
 
+#define MUSL_ABI_SUFFIX \
+  "%{mabi=ilp32:-sf}" \
+  "%{mabi=ilp32f:-sp}" \
+  "%{mabi=ilp32d:}" \
+  "%{mabi=lp64:-sf}" \
+  "%{mabi=lp64f:-sp}" \
+  "%{mabi=lp64d:}" \
+
+#undef MUSL_DYNAMIC_LINKER
+#define MUSL_DYNAMIC_LINKER "/lib/ld-musl-riscv" XLEN_SPEC MUSL_ABI_SUFFIX ".so.1"
+
 /* Because RISC-V only has word-sized atomics, it requries libatomic where
others do not.  So link libatomic by default, as needed.  */
 #undef LIB_SPEC
-- 
2.7.4



Re: [024/nnn] poly_int: ira subreg liveness tracking

2017-11-28 Thread Jeff Law
On 10/23/2017 11:09 AM, Richard Sandiford wrote:
> Normally the IRA-reload interface tries to track the liveness of
> individual bytes of an allocno if the allocno is sometimes written
> to as a SUBREG.  This isn't possible for variable-sized allocnos,
> but it doesn't matter because targets with variable-sized registers
> should use LRA instead.
> 
> This patch adds a get_subreg_tracking_sizes function for deciding
> whether it is possible to model a partial read or write.  Later
> patches make it return false if anything is variable.
> 
> 
> 2017-10-23  Richard Sandiford  
>   Alan Hayward  
>   David Sherwood  
> 
> gcc/
>   * ira.c (get_subreg_tracking_sizes): New function.
>   (init_live_subregs): Take an integer size rather than a register.
>   (build_insn_chain): Use get_subreg_tracking_sizes.  Update calls
>   to init_live_subregs.
OK.

Note this is starting to get close to the discussion around CLOBBER_HIGH
vs using a self set with a low subreg that we're having with Alan on
another thread in that liveness tracking of subregs of SVE regs could
potentially use some improvements.

When I quickly looked at the subreg handling in the df infrastructure my
first thought was that it might need some updating for SVE.  I can't
immediately recall any bits for poly_int/SVE in the patches to date.  Have
you dug in there at all for the poly_int/SVE work?

Jeff



Re: [PATCH] Implement std::to_address for C++2a

2017-11-28 Thread Jonathan Wakely

On 28/11/17 12:30 -0500, Glen Fernandes wrote:

On Tue, Nov 28, 2017 at 9:24 AM, Jonathan Wakely wrote:

Thanks, Glen, I've committed this to trunk, with one small change to
fix the copyright dates in the new test, to be just 2017.


Thanks!


Because my new hobby is finding uses for if-constexpr, I think we
could have used the detection idiom to do it in a single overload, but
I don't see any reason to prefer that over your implementation:


I was thinking about using if-constexpr with std::is_detected_v but
wondered if it wouldn't be appropriate to use the latter until it
transitions from TS to IS. (But now that you've pointed it out, I
guess an implementation detail like __detected_or_t can live on
forever, even if the detection idiom facilities do not get adopted).


However, more importantly, both this form and yours fail for the
following test case, in two ways:

struct P {
 using F = void();

 F* operator->() const noexcept { return nullptr; }
};

I'm not sure if this is a bug in our std::pointer_traits, or if the
standard requires the specialization std::pointer_traits<P> to be
ill-formed (see [pointer.traits.types] p1). We have a problem if it
does require it, and either need to relax the requirements on
pointer_traits, or we need to alter the wording for to_address so that
it doesn't try to use pointer_traits when the specialization would be
ill-formed.


Could both be avoided? That is: I don't know if we need to relax it,
or make to_address tolerate it, if the intent is to require the user
to make P a valid pointer-like type such that pointer_traits<P> is
not ill-formed (by 1. providing an element_type member or 2.
specializing pointer_traits<P>, since P is not a template). Current
implementations of __to_address or __to_raw_pointer that are in use by
our library facilities already have this requirement implicitly (those
that use typename pointer_traits<Ptr>::element_type* as the return type,
instead of C++14 auto), so users working with non-raw pointers would
already be doing 1 or 2.
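
To make the two options above concrete, here is a minimal sketch of each.
The types P1 and P2 are hypothetical examples, not taken from the patch or
the testsuite; both are non-template pointer-like types wrapping an int*.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>

// Option 1: the pointer-like type provides element_type itself, so the
// primary std::pointer_traits template can be instantiated for it.
struct P1 {
  using element_type = int;
  int* p;
  int* operator->() const noexcept { return p; }
};

// Option 2: no element_type member and not a template specialization,
// so the user specializes std::pointer_traits for the type instead.
struct P2 {
  int* p;
  int* operator->() const noexcept { return p; }
};

namespace std {
template<>
struct pointer_traits<P2> {
  using pointer = P2;
  using element_type = int;
  using difference_type = std::ptrdiff_t;
};
}

// Either way, pointer_traits<...>::element_type is now well-formed.
static_assert(std::is_same<std::pointer_traits<P1>::element_type, int>::value,
              "option 1 names the element type via a member");
static_assert(std::is_same<std::pointer_traits<P2>::element_type, int>::value,
              "option 2 names the element type via a specialization");
```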


OK, that seems reasonable. In that case I think adding a note to the
standard might be useful, to clarify that for the first overload, even
if pointer_traits<P>::to_address(p) is not well-formed, the
specialization pointer_traits<P> must be well-formed (because the
check for to_address(p) can trigger errors outside the immediate
context).

So my test should be changed to have element_type (or specialize
pointer_traits), like so:

#include <memory>

struct P {
 using element_type = void();

 element_type* operator->() const noexcept { return nullptr; }
};

int main()
{
 P p;
 std::to_address(p);
}

That compiles, and the bug is that it should fail the static assertion.


Secondly, if I remove that static_assert from  then
the test compiles, which is wrong, because it calls std::to_address on
a function pointer type. That should be ill-formed. The problem is
that the static_assert(!is_function_v<_Ptr>) is in std::to_address and
the implementation actually uses std::__to_address. So I think we want
the !is_function_v<_Ptr> check to be moved to the __to_address(_Ptr*)
overload.
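
A minimal sketch of that arrangement, using hypothetical names in a sketch
namespace rather than the real libstdc++ internals: the function-type check
lives in the raw-pointer overload, so every caller that ends up there gets
the diagnostic, not only the public entry point.  It uses
is_function<T>::value rather than is_function_v so that it does not depend
on C++17.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Raw-pointer overload: the check sits here, so a function pointer is
// rejected no matter which wrapper forwarded to this overload.
template<typename T>
T* to_address_impl (T* p) noexcept
{
  static_assert (!std::is_function<T>::value,
                 "to_address must not be used with function pointers");
  return p;
}

// Pointer-like overload: unwrap via operator-> and recurse.  Partial
// ordering prefers the T* overload above for raw pointers.
template<typename Ptr>
auto to_address_impl (const Ptr& p) noexcept
{
  return to_address_impl (p.operator-> ());
}

// Hypothetical pointer-like type used only to exercise the sketch.
struct fancy_int_ptr {
  int* p;
  int* operator->() const noexcept { return p; }
};

} // namespace sketch
```

With this layout, passing a type like the P above (whose operator-> returns
a function pointer) would instantiate the raw-pointer overload with a
function type and trip the static_assert.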


Ah, yes. I'll move the static_assert into that overload (enabled in
C++2a or higher mode, since it uses is_function_v).


Great, thanks.

(Using is_function<_Tp>::value as in your original patch would allow
the assertion to be done for all modes that define __to_address)


