Re: [PATCH, PR70700] Fix ICE in dump_pred_graph

2016-05-02 Thread Richard Biener
On Mon, 2 May 2016, Tom de Vries wrote:

> Hi,
> 
> this patch fixes PR70700, an ICE in tree-ssa-structalias.c:dump_pred_graph for
> the test-case contained in the patch.
> 
> In the constraint graph, a node representing a variable varinfo_t var is
> represented as the corresponding var->id, ranging from 1 to FIRST_REF_NODE -
> 1.
> 
> A node representing a DEREF of a varinfo_t var is represented as the
> corresponding var->id + FIRST_REF_NODE, ranging from FIRST_REF_NODE + 1 to
> LAST_REF_NODE.
> 
> So, for a DEREF node, we need to subtract FIRST_REF_NODE to find the
> corresponding variable. This logic is missing in a print statement in
> dump_pred_graph (which is triggered with TDF_GRAPH), which causes the ICE.
> 
> This patch fixes the ICE by subtracting FIRST_REF_NODE from the node number
> of a DEREF node to find the varinfo, and prints it as a DEREF node (by adding
> an '*' prefix).
> 
> Bootstrapped and reg-tested on x86_64. Extracted graphs from ealias dump and
> verified that valid pdfs were produced.
> 
> OK for trunk?

Ok.

Richard.

> Thanks,
> - Tom
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
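
The node numbering described in the quoted mail can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code in tree-ssa-structalias.c: the value of FIRST_REF_NODE is made up (in GCC it depends on the number of varinfos), and node_to_var_id and format_node are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fixed value, for illustration only.  */
enum { FIRST_REF_NODE = 100, LAST_REF_NODE = 2 * FIRST_REF_NODE - 1 };

/* Map constraint-graph node NODE back to a variable id.  Set *IS_DEREF
   to 1 when NODE stands for a DEREF of that variable, else to 0.  */
static unsigned
node_to_var_id (unsigned node, int *is_deref)
{
  assert (node >= 1 && node <= LAST_REF_NODE && node != FIRST_REF_NODE);
  *is_deref = node > FIRST_REF_NODE;
  return *is_deref ? node - FIRST_REF_NODE : node;
}

/* Format NODE as "v<id>", or "*v<id>" for a DEREF node, into BUF.  */
static int
format_node (unsigned node, char *buf, size_t len)
{
  int is_deref;
  unsigned id = node_to_var_id (node, &is_deref);
  return snprintf (buf, len, "%sv%u", is_deref ? "*" : "", id);
}
```

Indexing a per-variable table with the raw node number of a DEREF node, without the subtraction, reads past the valid ids, which is the kind of mistake the patch addresses in the dump code.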


Fix PR rtl-optimization/70886

2016-05-02 Thread Eric Botcazou
This is the bootstrap comparison failure on IA-64 after -frename-registers was 
enabled at -O2, but the issue has nothing to do with the pass.  It comes from 
the speculation support in the scheduler, namely from estimate_dep_weak:

/* Estimate the weakness of dependence between MEM1 and MEM2.  */
dw_t
estimate_dep_weak (rtx mem1, rtx mem2)
{
  rtx r1, r2;

  if (mem1 == mem2)
    /* MEMs are the same - don't speculate.  */
    return MIN_DEP_WEAK;

  r1 = XEXP (mem1, 0);
  r2 = XEXP (mem2, 0);

  if (r1 == r2
      || (REG_P (r1) && REG_P (r2)
          && REGNO (r1) == REGNO (r2)))
    /* Again, MEMs are the same.  */
    return MIN_DEP_WEAK;

The pointer comparison is not stable for VALUEs when cselib is used (this is 
the business of canonical cselib values).  I tried rtx_equal_for_cselib_p here 
but this doesn't work because there are dangling VALUEs during scheduling 
(VALUEs whose associated cselib value has been reclaimed but are still referenced 
as addresses of MEMs on lists).  Hence the attached patch, which canonicalizes 
the cselib values manually and fixes the comparison failure.

Bootstrapped/regtested on IA-64/Linux.  The patch also contains an unrelated 
micro-optimization for rtx_equal_for_cselib_p.  Thoughts?
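
To illustrate why comparing raw pointers is not enough once values can be equivalent to each other, here is a toy sketch. It does not reproduce GCC's cselib data structures or canonical_cselib_val; struct toy_val, toy_canonical and toy_same_address are hypothetical names, and the simple "follow the canon link" rule is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a value in an equivalence class.  CANON points
   towards the canonical member, or is NULL if this one is canonical.  */
struct toy_val
{
  int uid;
  struct toy_val *canon;
};

/* Return the canonical representative of V's equivalence class.  */
static struct toy_val *
toy_canonical (struct toy_val *v)
{
  assert (v != NULL);
  while (v->canon != NULL)
    v = v->canon;
  return v;
}

/* Return nonzero if A and B denote the same address, comparing the
   canonical representatives instead of the raw pointers.  */
static int
toy_same_address (struct toy_val *a, struct toy_val *b)
{
  return toy_canonical (a) == toy_canonical (b);
}
```

Two distinct values that share a canonical representative compare equal here even though their own pointers differ, so the answer no longer depends on which member of the class happened to be recorded.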


2016-05-02  Eric Botcazou  

PR rtl-optimization/70886
* sched-deps.c (estimate_dep_weak): Canonicalize cselib values.

* cselib.h (rtx_equal_for_cselib_1): Declare.
(rtx_equal_for_cselib_p): New inline function.
* cselib.c (rtx_equal_for_cselib_p): Delete.
(rtx_equal_for_cselib_1): Make public.

-- 
Eric Botcazou
Index: cselib.c
===
--- cselib.c	(revision 235678)
+++ cselib.c	(working copy)
@@ -49,7 +49,6 @@ static void unchain_one_value (cselib_va
 static void unchain_one_elt_list (struct elt_list **);
 static void unchain_one_elt_loc_list (struct elt_loc_list **);
 static void remove_useless_values (void);
-static int rtx_equal_for_cselib_1 (rtx, rtx, machine_mode);
 static unsigned int cselib_hash_rtx (rtx, int, machine_mode);
 static cselib_val *new_cselib_val (unsigned int, machine_mode, rtx);
 static void add_mem_for_addr (cselib_val *, cselib_val *, rtx);
@@ -788,15 +787,6 @@ cselib_reg_set_mode (const_rtx x)
   return GET_MODE (REG_VALUES (REGNO (x))->elt->val_rtx);
 }
 
-/* Return nonzero if we can prove that X and Y contain the same value, taking
-   our gathered information into account.  */
-
-int
-rtx_equal_for_cselib_p (rtx x, rtx y)
-{
-  return rtx_equal_for_cselib_1 (x, y, VOIDmode);
-}
-
 /* If x is a PLUS or an autoinc operation, expand the operation,
storing the offset, if any, in *OFF.  */
 
@@ -843,7 +833,7 @@ autoinc_split (rtx x, rtx *off, machine_
addressing modes.  If X and Y are not (known to be) part of
addresses, MEMMODE should be VOIDmode.  */
 
-static int
+int
 rtx_equal_for_cselib_1 (rtx x, rtx y, machine_mode memmode)
 {
   enum rtx_code code;
Index: cselib.h
===
--- cselib.h	(revision 235678)
+++ cselib.h	(working copy)
@@ -82,7 +82,7 @@ extern void cselib_finish (void);
 extern void cselib_process_insn (rtx_insn *);
 extern bool fp_setter_insn (rtx_insn *);
 extern machine_mode cselib_reg_set_mode (const_rtx);
-extern int rtx_equal_for_cselib_p (rtx, rtx);
+extern int rtx_equal_for_cselib_1 (rtx, rtx, machine_mode);
 extern int references_value_p (const_rtx, int);
 extern rtx cselib_expand_value_rtx (rtx, bitmap, int);
 typedef rtx (*cselib_expand_callback)(rtx, bitmap, int, void *);
@@ -125,4 +125,16 @@ canonical_cselib_val (cselib_val *val)
   return canon;
 }
 
+/* Return nonzero if we can prove that X and Y contain the same value, taking
+   our gathered information into account.  */
+
+static inline int
+rtx_equal_for_cselib_p (rtx x, rtx y)
+{
+  if (x == y)
+return 1;
+
+  return rtx_equal_for_cselib_1 (x, y, VOIDmode);
+}
+
 #endif /* GCC_CSELIB_H */
Index: sched-deps.c
===
--- sched-deps.c	(revision 235678)
+++ sched-deps.c	(working copy)
@@ -4182,22 +4182,29 @@ finish_deps_global (void)
 dw_t
 estimate_dep_weak (rtx mem1, rtx mem2)
 {
-  rtx r1, r2;
-
   if (mem1 == mem2)
 /* MEMs are the same - don't speculate.  */
 return MIN_DEP_WEAK;
 
-  r1 = XEXP (mem1, 0);
-  r2 = XEXP (mem2, 0);
+  rtx r1 = XEXP (mem1, 0);
+  rtx r2 = XEXP (mem2, 0);
+
+  if (sched_deps_info->use_cselib)
+{
+  /* We cannot call rtx_equal_for_cselib_p because the VALUEs might be
+	 dangling at this point, since we never preserve them.  Instead we
+	 canonicalize manually to get stable VALUEs out of hashing.  */
+  if (GET_CODE (r1) == VALUE && CSELIB_VAL_PTR (r1))
+	r1 = canonical_cselib_val (CSELIB_VAL_PTR (r1))->val_rtx;
+  if (GET_CODE (r2) == VALUE && CSELIB_VAL_PTR (r2))
+	r2 = canonical_cselib_val (CSELIB_VAL_PTR (r2))->val_rtx;
+}
 
   if (r1 == r2
-  || (REG_P (r1) && REG_P (r

Re: [Openacc] Adjust automatic loop partitioning

2016-05-02 Thread Jakub Jelinek
On Fri, Apr 29, 2016 at 10:00:43AM -0400, Nathan Sidwell wrote:
> Jakub,
> currently automatic loop partitioning assigns from the innermost loop
> outwards -- that was the simplest thing to implement.  A better algorithm is
> to assign the outermost loop to the outermost available axis, and then
> assign from the innermost loop outwards.   That way we (generally) get gang
> partitioning on the outermost loop.  Just inside that we'll get
> non-partitioned loops if the nest is too deep, and the two innermost nested
> loops will get worker and vector partitioning.
> 
> This patch has been on the gomp4 branch for a while.  ok for trunk?
> 
> nathan

> 2016-04-29  Nathan Sidwell  
> 
>   gcc/
>   * omp-low.c (struct oacc_loop): Add 'inner' field.
>   (new_oacc_loop_raw): Initialize it to zero.
>   (oacc_loop_fixed_partitions): Initialize it.
>   (oacc_loop_auto_partitions): Partition outermost loop to outermost
>   available partitioning.
> 
>   gcc/testsuite/
>   * c-c++-common/goacc/loop-auto-1.c: Adjust expected warnings.
> 
>   libgomp/
>   * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Adjust
>   expected partitioning.

Ok.

Jakub
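
The assignment order described in the quoted mail can be sketched as follows. This is a simplified model, not the omp-low.c implementation: it assumes a single linear loop nest with all three axes free, and enum axis and assign_axes are hypothetical names.

```c
#include <assert.h>

/* Partitioning axes, outermost first.  AXIS_NONE marks a loop that
   stays unpartitioned.  */
enum axis { AXIS_NONE = -1, AXIS_GANG, AXIS_WORKER, AXIS_VECTOR };

/* Assign axes to a nest of DEPTH loops, where OUT[0] is the outermost
   loop.  The outermost loop takes the outermost axis, then the
   remaining axes are handed out from the innermost loop outwards.  */
static void
assign_axes (enum axis *out, int depth)
{
  assert (depth >= 0);
  for (int i = 0; i < depth; i++)
    out[i] = AXIS_NONE;
  if (depth == 0)
    return;

  out[0] = AXIS_GANG;
  int next = AXIS_VECTOR;
  for (int i = depth - 1; i > 0 && next > AXIS_GANG; i--)
    out[i] = (enum axis) next--;
}
```

For a nest of depth four this yields gang, none, worker, vector, which matches the description: gang on the outermost loop, an unpartitioned loop just inside it, and worker and vector on the two innermost loops.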


[Ada] Minor cleanup 1/2

2016-05-02 Thread Eric Botcazou
This is a small refactoring of the handling of ranges of values.

Tested on x86_64-suse-linux, applied on the mainline.


2016-05-02  Eric Botcazou  

* gcc-interface/trans.c (Range_to_gnu): New static function.
(Raise_Error_to_gnu) : Call it to translate the range.
(gnat_to_gnu) : Likewise.

-- 
Eric Botcazou
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 235678)
+++ gcc-interface/trans.c	(working copy)
@@ -5439,6 +5439,38 @@ build_noreturn_cond (tree cond)
   return build1 (NOP_EXPR, boolean_type_node, t);
 }
 
+/* Subroutine of gnat_to_gnu to translate GNAT_RANGE, a node representing a
+   range of values, into GNU_LOW and GNU_HIGH bounds.  */
+
+static void
+Range_to_gnu (Node_Id gnat_range, tree *gnu_low, tree *gnu_high)
+{
+  /* GNAT_RANGE is either an N_Range or an identifier denoting a subtype.  */
+  switch (Nkind (gnat_range))
+{
+case N_Range:
+  *gnu_low = gnat_to_gnu (Low_Bound (gnat_range));
+  *gnu_high = gnat_to_gnu (High_Bound (gnat_range));
+  break;
+
+case N_Expanded_Name:
+case N_Identifier:
+  {
+	tree gnu_range_type = get_unpadded_type (Entity (gnat_range));
+	tree gnu_range_base_type = get_base_type (gnu_range_type);
+
+	*gnu_low
+	  = convert (gnu_range_base_type, TYPE_MIN_VALUE (gnu_range_type));
+	*gnu_high
+	  = convert (gnu_range_base_type, TYPE_MAX_VALUE (gnu_range_type));
+  }
+  break;
+
+default:
+  gcc_unreachable ();
+}
+}
+
 /* Subroutine of gnat_to_gnu to translate GNAT_NODE, an N_Raise_xxx_Error,
to a GCC tree and return it.  GNU_RESULT_TYPE_P is a pointer to where
we should place the result type.  */
@@ -5469,7 +5501,7 @@ Raise_Error_to_gnu (Node_Id gnat_node, t
 case CE_Invalid_Data:
   if (Present (gnat_cond) && Nkind (gnat_cond) == N_Op_Not)
 	{
-	  Node_Id gnat_range, gnat_index, gnat_type;
+	  Node_Id gnat_index, gnat_type;
 	  tree gnu_type, gnu_index, gnu_low_bound, gnu_high_bound, disp;
 	  bool neg_p;
 	  struct loop_info_d *loop;
@@ -5477,10 +5509,8 @@ Raise_Error_to_gnu (Node_Id gnat_node, t
 	  switch (Nkind (Right_Opnd (gnat_cond)))
 	{
 	case N_In:
-	  gnat_range = Right_Opnd (Right_Opnd (gnat_cond));
-	  gcc_assert (Nkind (gnat_range) == N_Range);
-	  gnu_low_bound = gnat_to_gnu (Low_Bound (gnat_range));
-	  gnu_high_bound = gnat_to_gnu (High_Bound (gnat_range));
+	  Range_to_gnu (Right_Opnd (Right_Opnd (gnat_cond)),
+			&gnu_low_bound, &gnu_high_bound);
 	  break;
 
 	case N_Op_Ge:
@@ -6458,30 +6488,9 @@ gnat_to_gnu (Node_Id gnat_node)
 case N_Not_In:
   {
 	tree gnu_obj = gnat_to_gnu (Left_Opnd (gnat_node));
-	Node_Id gnat_range = Right_Opnd (gnat_node);
 	tree gnu_low, gnu_high;
 
-	/* GNAT_RANGE is either an N_Range node or an identifier denoting a
-	   subtype.  */
-	if (Nkind (gnat_range) == N_Range)
-	  {
-	gnu_low = gnat_to_gnu (Low_Bound (gnat_range));
-	gnu_high = gnat_to_gnu (High_Bound (gnat_range));
-	  }
-	else if (Nkind (gnat_range) == N_Identifier
-		 || Nkind (gnat_range) == N_Expanded_Name)
-	  {
-	tree gnu_range_type = get_unpadded_type (Entity (gnat_range));
-	tree gnu_range_base_type = get_base_type (gnu_range_type);
-
-	gnu_low
-	  = convert (gnu_range_base_type, TYPE_MIN_VALUE (gnu_range_type));
-	gnu_high
-	  = convert (gnu_range_base_type, TYPE_MAX_VALUE (gnu_range_type));
-	  }
-	else
-	  gcc_unreachable ();
-
+	Range_to_gnu (Right_Opnd (gnat_node), &gnu_low, &gnu_high);
 	gnu_result_type = get_unpadded_type (Etype (gnat_node));
 
 	tree gnu_op_type = maybe_character_type (TREE_TYPE (gnu_obj));


[Ada] Minor cleanup 2/2

2016-05-02 Thread Eric Botcazou
This consistently passes NULL_TREE as operand #2 of COMPONENT_REF and #3 of 
ARRAY_REF/ARRAY_RANGE_REF in gigi.  There is no functional change since we 
never build them with these operands in the first place.

Tested on x86_64-suse-linux, applied on the mainline.


2016-05-02  Eric Botcazou  

* gcc-interface/decl.c (elaborate_reference_1): Do not bother about
operand #2 for COMPONENT_REF.
* gcc-interface/utils2.c (gnat_save_expr): Likewise.
(gnat_protect_expr): Likewise.
(gnat_stabilize_reference_1): Likewise.
(gnat_rewrite_reference): Don't bother about operand #3 for ARRAY_REF
(get_inner_constant_reference): Likewise.
(gnat_invariant_expr): Likewise.
* gcc-interface/trans.c (fold_constant_decl_in_expr): Likewise.

-- 
Eric Botcazou
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 235678)
+++ gcc-interface/decl.c	(working copy)
@@ -6656,7 +6656,7 @@ elaborate_reference_1 (tree ref, void *d
   && TYPE_IS_FAT_POINTER_P (TREE_TYPE (TREE_OPERAND (ref, 0))))
 return build3 (COMPONENT_REF, TREE_TYPE (ref),
 		   elaborate_reference_1 (TREE_OPERAND (ref, 0), data),
-		   TREE_OPERAND (ref, 1), TREE_OPERAND (ref, 2));
+		   TREE_OPERAND (ref, 1), NULL_TREE);
 
   sprintf (suffix, "EXP%d", ++er->n);
   return
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 235699)
+++ gcc-interface/trans.c	(working copy)
@@ -955,14 +955,21 @@ fold_constant_decl_in_expr (tree exp)
 
   return DECL_INITIAL (exp);
 
-case BIT_FIELD_REF:
 case COMPONENT_REF:
   op0 = fold_constant_decl_in_expr (TREE_OPERAND (exp, 0));
   if (op0 == TREE_OPERAND (exp, 0))
 	return exp;
 
-  return fold_build3 (code, TREE_TYPE (exp), op0, TREE_OPERAND (exp, 1),
-			  TREE_OPERAND (exp, 2));
+  return fold_build3 (COMPONENT_REF, TREE_TYPE (exp), op0,
+			  TREE_OPERAND (exp, 1), NULL_TREE);
+
+case BIT_FIELD_REF:
+  op0 = fold_constant_decl_in_expr (TREE_OPERAND (exp, 0));
+  if (op0 == TREE_OPERAND (exp, 0))
+	return exp;
+
+  return fold_build3 (BIT_FIELD_REF, TREE_TYPE (exp), op0,
+			  TREE_OPERAND (exp, 1), TREE_OPERAND (exp, 2));
 
 case ARRAY_REF:
 case ARRAY_RANGE_REF:
@@ -974,7 +981,7 @@ fold_constant_decl_in_expr (tree exp)
 	return exp;
 
   return fold (build4 (code, TREE_TYPE (exp), op0, TREE_OPERAND (exp, 1),
-			   TREE_OPERAND (exp, 2), TREE_OPERAND (exp, 3)));
+			   TREE_OPERAND (exp, 2), NULL_TREE));
 
 case REALPART_EXPR:
 case IMAGPART_EXPR:
Index: gcc-interface/utils2.c
===
--- gcc-interface/utils2.c	(revision 235678)
+++ gcc-interface/utils2.c	(working copy)
@@ -2510,7 +2510,7 @@ gnat_save_expr (tree exp)
   if (code == COMPONENT_REF
   && TYPE_IS_FAT_POINTER_P (TREE_TYPE (TREE_OPERAND (exp, 0))))
 return build3 (code, type, gnat_save_expr (TREE_OPERAND (exp, 0)),
-		   TREE_OPERAND (exp, 1), TREE_OPERAND (exp, 2));
+		   TREE_OPERAND (exp, 1), NULL_TREE);
 
   return save_expr (exp);
 }
@@ -2562,7 +2562,7 @@ gnat_protect_expr (tree exp)
   if (code == COMPONENT_REF
   && TYPE_IS_FAT_POINTER_P (TREE_TYPE (TREE_OPERAND (exp, 0))))
 return build3 (code, type, gnat_protect_expr (TREE_OPERAND (exp, 0)),
-		   TREE_OPERAND (exp, 1), TREE_OPERAND (exp, 2));
+		   TREE_OPERAND (exp, 1), NULL_TREE);
 
   /* If this is a fat pointer or a scalar, just make a SAVE_EXPR.  Likewise
  for a CALL_EXPR as large objects are returned via invisible reference
@@ -2610,7 +2610,7 @@ gnat_stabilize_reference_1 (tree e, void
 	result
 	  = build3 (code, type,
 		gnat_stabilize_reference_1 (TREE_OPERAND (e, 0), data),
-		TREE_OPERAND (e, 1), TREE_OPERAND (e, 2));
+		TREE_OPERAND (e, 1), NULL_TREE);
   /* If the expression has side-effects, then encase it in a SAVE_EXPR
 	 so that it will only be evaluated once.  */
   /* The tcc_reference and tcc_comparison classes could be handled as
@@ -2718,7 +2718,7 @@ gnat_rewrite_reference (tree ref, rewrit
 		  gnat_rewrite_reference (TREE_OPERAND (ref, 0), func, data,
 	  init),
 		  func (TREE_OPERAND (ref, 1), data),
-		  TREE_OPERAND (ref, 2), TREE_OPERAND (ref, 3));
+		  TREE_OPERAND (ref, 2), NULL_TREE);
   break;
 
 case COMPOUND_EXPR:
@@ -2796,9 +2796,6 @@ get_inner_constant_reference (tree exp)
 	  break;
 
 	case COMPONENT_REF:
-	  if (TREE_OPERAND (exp, 2))
-	return NULL_TREE;
-
 	  if (!TREE_CONSTANT (DECL_FIELD_OFFSET (TREE_OPERAND (exp, 1))))
 	return NULL_TREE;
 	  break;
@@ -2806,7 +2803,7 @@ get_inner_constant_reference (tree exp)
 	case ARRAY_REF:
 	case ARRAY_RANGE_REF:
 	  {
-	if (TREE_OPERAND (exp, 2) || TREE_OPERAND (exp, 3))
+	if (TREE_OPERAND (exp, 2))
 	  return NULL_TREE;
 
 	tree array_type = TREE_TYPE (TREE_OPERAND (exp, 0));
@@ -2934,16 +2931,1

Re: [PATCH, i386, PR target/70799, 1/2] Support constants in STV pass (DImode)

2016-05-02 Thread Uros Bizjak
On Fri, Apr 29, 2016 at 5:48 PM, H.J. Lu  wrote:
> On Fri, Apr 29, 2016 at 8:42 AM, Ilya Enkovich  wrote:
>> Hi,
>>
>> As the first part of PR70799 fix I'd like to add constants support for
>> DI-STV pass (which is also related to PR70763).  This patch adds CONST_INT
>> support as insn operands and extends cost model accordingly.
>>
>> Bootstrapped and regtested on x86_64-unknown-linux-gnu{-m32}.  OK for trunk?
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2016-04-29  Ilya Enkovich  
>>
>> PR target/70799
>> PR target/70763
>> * config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Allow
>> integer constants.
>> (dimode_scalar_chain::vector_const_cost): New.
>> (dimode_scalar_chain::compute_convert_gain): Handle constants.
>> (dimode_scalar_chain::convert_op): Likewise.
>> (dimode_scalar_chain::convert_insn): Likewise.
>
> I think we should fix STV regression first:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70321
>
> In 64-bit, STV is run before CSE2 pass.  Can we do that for 32-bit?

64bit STV only handles constants ATM, so it is not that susceptible to
where the pass is inserted. However, I think that we *can* also move
32bit STV pass before CSE2 pass. If I'm not missing some details here,
the combine will then combine V2DI instructions instead of DImode
insns, and this should be irrelevant as far as combine is concerned.

But the above patch is orthogonal to the STV-introduced regression;
that regression depends on the pass insertion point.

Uros.


[SH][committed] Remove *negnegt, *movtt patterns

2016-05-02 Thread Oleg Endo
Hi,

The *negnegt, *movtt patterns seem to have no effect anymore.  Removing
them doesn't show any changes in the CSiBE set and the known related
test cases in the testsuite pass.

Tested on sh-elf with

make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,
-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

Committed as r235704.

Cheers,
Oleg

gcc/ChangeLog:
* config/sh/sh.md (*negnegt, *movtt): Remove.

diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
index 997088c..9b43542 100644
--- a/gcc/config/sh/sh.md
+++ b/gcc/config/sh/sh.md
@@ -8344,16 +8344,6 @@
 gcc_unreachable ();
 })
 
-;; The *negnegt pattern helps the combine pass to figure out how to fold 
-;; an explicit double T bit negation.
-(define_insn_and_split "*negnegt"
-  [(set (reg:SI T_REG)
-	(eq:SI (match_operand 0 "negt_reg_operand" "") (const_int 0)))]
-  "TARGET_SH1"
-  "#"
-  ""
-  [(const_int 0)])
-
 ;; Store (negated) T bit as all zeros or ones in a reg.
 ;;	subc	Rn,Rn	! Rn = Rn - Rn - T; T = T
 ;;	not	Rn,Rn	! Rn = 0 - Rn
@@ -8378,15 +8368,6 @@
 }
   [(set_attr "type" "arith")])
 
-;; The *movtt pattern eliminates redundant T bit to T bit moves / tests.
-(define_insn_and_split "*movtt"
-  [(set (reg:SI T_REG)
-	(eq:SI (match_operand 0 "t_reg_operand" "") (const_int 1)))]
-  "TARGET_SH1"
-  "#"
-  ""
-  [(const_int 0)])
-
 ;; Invert the T bit.
 ;; On SH2A we can use the nott insn.  On anything else this must be done with
 ;; multiple insns like:


Re: Move "X +- C1 CMP C2 to X CMP C2 -+ C1" to match.pd

2016-05-02 Thread Richard Biener
On Fri, Apr 29, 2016 at 7:36 PM, Marc Glisse  wrote:
> On Fri, 29 Apr 2016, Richard Biener wrote:
>
>> Another option is to move the enum declaration from flag-types.h to
>> coretypes.h.  I think I like that best.
>
>
> This works.

Ok.

Thanks,
Richard.
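
As background for the transformation named in the subject: rewriting X + C1 CMP C2 as X CMP C2 - C1 is only safe when X + C1 cannot wrap. The enum comment in the quoted patch gives folding (x + 1 > 1) to (x > 0) as an example that relies on signed overflow being undefined. A small sketch with unsigned arithmetic, where wrapping is defined, shows the two forms disagreeing; cmp_before and cmp_after are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* The comparison as written: x + 1 > 1.  For unsigned x the addition
   wraps to 0 when x == UINT_MAX.  */
static int
cmp_before (unsigned x)
{
  return x + 1u > 1u;
}

/* The comparison after moving the constant across: x > 0.  */
static int
cmp_after (unsigned x)
{
  return x > 0u;
}
```

The two functions agree for every input except UINT_MAX, where cmp_before is false and cmp_after is true, so the rewrite would be wrong for wrapping types.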

> 2016-05-02  Marc Glisse  
>
> gcc/
> * flag-types.h (enum warn_strict_overflow_code): Move ...
> * coretypes.h: ... here.
> * fold-const.h (fold_overflow_warning): Declare.
> * fold-const.c (fold_overflow_warning): Make non-static.
> (fold_comparison): Move the transformation of X +- C1 CMP C2
> into X CMP C2 -+ C1 ...
> * match.pd: ... here.
> * gimple-fold.c (fold_stmt_1): Protect with
> fold_defer_overflow_warnings.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/20040305-1.c: Adjust.
>
> --
> Marc Glisse
>
> Index: gcc/coretypes.h
> ===
> --- gcc/coretypes.h (revision 235644)
> +++ gcc/coretypes.h (working copy)
> @@ -215,20 +215,44 @@ enum optimization_type {
>  /* Possible initialization status of a variable.   When requested
> by the user, this information is tracked and recorded in the DWARF
> debug information, along with the variable's location.  */
>  enum var_init_status
>  {
>VAR_INIT_STATUS_UNKNOWN,
>VAR_INIT_STATUS_UNINITIALIZED,
>VAR_INIT_STATUS_INITIALIZED
>  };
>
> +/* Names for the different levels of -Wstrict-overflow=N.  The numeric
> +   values here correspond to N.  */
> +enum warn_strict_overflow_code
> +{
> +  /* Overflow warning that should be issued with -Wall: a questionable
> + construct that is easy to avoid even when using macros.  Example:
> + folding (x + CONSTANT > x) to 1.  */
> +  WARN_STRICT_OVERFLOW_ALL = 1,
> +  /* Overflow warning about folding a comparison to a constant because
> + of undefined signed overflow, other than cases covered by
> + WARN_STRICT_OVERFLOW_ALL.  Example: folding (abs (x) >= 0) to 1
> + (this is false when x == INT_MIN).  */
> +  WARN_STRICT_OVERFLOW_CONDITIONAL = 2,
> +  /* Overflow warning about changes to comparisons other than folding
> + them to a constant.  Example: folding (x + 1 > 1) to (x > 0).  */
> +  WARN_STRICT_OVERFLOW_COMPARISON = 3,
> +  /* Overflow warnings not covered by the above cases.  Example:
> + folding ((x * 10) / 5) to (x * 2).  */
> +  WARN_STRICT_OVERFLOW_MISC = 4,
> +  /* Overflow warnings about reducing magnitude of constants in
> + comparison.  Example: folding (x + 2 > y) to (x + 1 >= y).  */
> +  WARN_STRICT_OVERFLOW_MAGNITUDE = 5
> +};
> +
>  /* The type of an alias set.  Code currently assumes that variables of
> this type can take the values 0 (the alias set which aliases
> everything) and -1 (sometimes indicating that the alias set is
> unknown, sometimes indicating a memory barrier) and -2 (indicating
> that the alias set should be set to a unique value but has not been
> set yet).  */
>  typedef int alias_set_type;
>
>  struct edge_def;
>  typedef struct edge_def *edge;
> Index: gcc/flag-types.h
> ===
> --- gcc/flag-types.h(revision 235644)
> +++ gcc/flag-types.h(working copy)
> @@ -171,44 +171,20 @@ enum stack_check_type
>/* Check the stack and rely on the target configuration files to
>   check the static frame of functions, i.e. use the generic
>   mechanism only for dynamic stack allocations.  */
>STATIC_BUILTIN_STACK_CHECK,
>
>/* Check the stack and entirely rely on the target configuration
>   files, i.e. do not use the generic mechanism at all.  */
>FULL_BUILTIN_STACK_CHECK
>  };
>
> -/* Names for the different levels of -Wstrict-overflow=N.  The numeric
> -   values here correspond to N.  */
> -enum warn_strict_overflow_code
> -{
> -  /* Overflow warning that should be issued with -Wall: a questionable
> - construct that is easy to avoid even when using macros.  Example:
> - folding (x + CONSTANT > x) to 1.  */
> -  WARN_STRICT_OVERFLOW_ALL = 1,
> -  /* Overflow warning about folding a comparison to a constant because
> - of undefined signed overflow, other than cases covered by
> - WARN_STRICT_OVERFLOW_ALL.  Example: folding (abs (x) >= 0) to 1
> - (this is false when x == INT_MIN).  */
> -  WARN_STRICT_OVERFLOW_CONDITIONAL = 2,
> -  /* Overflow warning about changes to comparisons other than folding
> - them to a constant.  Example: folding (x + 1 > 1) to (x > 0).  */
> -  WARN_STRICT_OVERFLOW_COMPARISON = 3,
> -  /* Overflow warnings not covered by the above cases.  Example:
> - folding ((x * 10) / 5) to (x * 2).  */
> -  WARN_STRICT_OVERFLOW_MISC = 4,
> -  /* Overflow warnings about reducing magnitude of constants in
> - comparison.  Example: folding (x + 2 > y) to (x + 1 >= y).  */
> -  WARN_STRICT_OVERFLOW_MAGNITUDE = 5
> -};
> -
>  /* Floating-point contraction mode.  */
>  enum fp_contract_mode

Re: [PATCH] Fix spec-options.c test case

2016-05-02 Thread Dominik Vogt
On Sun, May 01, 2016 at 07:52:40AM +, Bernd Edlinger wrote:
> I took a closer look at this test case, and I found that, apart from
> triggering a dejagnu bug, it is also wrong.  I have tested with
> a cross-compiler for target=sh-elf and found that the test case
> actually FAILs because the foo.specs uses "cppruntime" which
> is only referenced in gcc/config/sh/superh.h, but sh/superh.h
> is only included for target sh*-superh-elf, see gcc/config.gcc.
> 
> This means that it can only pass for target=sh-superh-elf.
> 
> The attached patch fixes the testcase and makes it run always,
> so that it no longer triggers the dejagnu bug.

Looks like a viable solution.  I'd add a comment about the bug
though.

> -/* { dg-do compile } */
> -/* { dg-do run { target sh*-*-* } } */
> +/* { dg-do run } */
> +/* { dg-shouldfail "" { ! sh*-superh-elf } } */

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: Canonicalize X u< X to UNORDERED_EXPR

2016-05-02 Thread Richard Biener
On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse  wrote:
> Hello,
>
> this case seemed to be missing in the various X cmp X transformations. It
> does not change the generated code in the testcase.
>
> The missing :c is rather trivial. I can commit it separately if you prefer.

I think it's not missing.  Commutating the first one is enough to eventually
make the @1s match up.  I think you should get a diagnostic on a duplicate
pattern when adding another :c (hmm, no, it's indeed "different" patterns but
still redundant).

> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

Ok for the new pattern.

Thanks,
Richard.

> 2016-05-02  Marc Glisse  
>
> gcc/
> * match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
> (X u< X, X u> X): New transformations
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/unord.c: New testcase.
>
> --
> Marc Glisse
> Index: trunk/gcc/match.pd
> ===
> --- trunk/gcc/match.pd  (revision 235654)
> +++ trunk/gcc/match.pd  (working copy)
> @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>@0)
>   /* (~x | y) & x -> x & y */
>   /* (~x & y) | x -> x | y */
>   (simplify
>(bitop:c (rbitop:c (bit_not @0) @1) @0)
>(bitop @0 @1)))
>
>  /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
>  (for bitop (bit_and bit_ior bit_xor)
>   (simplify
> -  (bitop (bit_and:c @0 @1) (bit_and @2 @1))
> +  (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
>(bit_and (bitop @0 @2) @1)))
>
>  /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
>  (simplify
>(bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
>(bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>
>  /* Combine successive equal operations with constants.  */
>  (for bitop (bit_and bit_ior bit_xor)
>   (simplify
> @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (simplify
>(cmp @0 @0)
>(if (cmp != NE_EXPR
> || ! FLOAT_TYPE_P (TREE_TYPE (@0))
> || ! HONOR_NANS (@0))
> { constant_boolean_node (false, type); })))
>  (for cmp (unle unge uneq)
>   (simplify
>(cmp @0 @0)
>{ constant_boolean_node (true, type); }))
> +(for cmp (unlt ungt)
> + (simplify
> +  (cmp @0 @0)
> +  (unordered @0 @0)))
>  (simplify
>   (ltgt @0 @0)
>   (if (!flag_trapping_math)
>{ constant_boolean_node (false, type); }))
>
>  /* Fold ~X op ~Y as Y op X.  */
>  (for cmp (simple_comparison)
>   (simplify
>(cmp (bit_not@2 @0) (bit_not@3 @1))
>(if (single_use (@2) && single_use (@3))
> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
> ===
> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0)
> +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy)
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +int f(double a){double b=a;return !__builtin_islessequal(a,b);}
> +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
> +
> +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */
>
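
The identity behind the new pattern can be checked in plain C with the <math.h> comparison macros: "unordered or less than" applied to a value and itself can only be true through the unordered part, that is when the value is a NaN. The function names below are hypothetical; the macros are standard C99.

```c
#include <assert.h>
#include <math.h>

/* X u< X: unordered or less than, written as !(x >= x).  */
static int
unlt_self (double x)
{
  return !isgreaterequal (x, x);
}

/* X u> X: unordered or greater than, written as !(x <= x).  */
static int
ungt_self (double x)
{
  return !islessequal (x, x);
}

/* What both should reduce to: x unordered with itself.  */
static int
unordered_self (double x)
{
  return isunordered (x, x) != 0;
}
```

This mirrors the two functions in the quoted testcase, which negate __builtin_islessequal and __builtin_isgreaterequal on equal operands.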


Re: [PATCH] Fix PR tree-optimization/51513

2016-05-02 Thread Richard Biener
On Fri, Apr 29, 2016 at 2:09 PM, Peter Bergner  wrote:
> On Fri, 2016-04-29 at 11:56 +0200, Richard Biener wrote:
>> Your testcase passes '2' where it passes just fine.  If I pass 3 as which
>> I indeed get an abort () but you can't reasonably expect it to return 13 
>> then.
>
> Bah, I added an extra case and didn't change the argument.  :-(
> Let me fix that and then dig into the current behavior.
>
>
>
>> So I fail to see the actual bug you are fixing and I wonder why you do stuff
>> at the GIMPLE level when we only remove the unreachable blocks at RTL
>> level CFG cleanup.  Iff then the "fix" should be there.
>
> I actually started out trying to fix the problem in rtl first, but
> ran into multiple problems, which at the time made it seem like
> fixing this at the GIMPLE level was a better solution.
>
>
>
>> But as said, the behavior is expected - in fact the jump-table code should
>> be optimized for an unreachable default case to simply omit the range
>> check!  That would be a better fix (also avoiding the wild branch).
>
> I know I've seen the wild branch due to normal case statements having
> __builtin_unreachable() too, so it's not just a default case problem.
> That said, I'll have a look to see whether we can fix unreachable
> normal case statements too.  Thanks.

Again, the wild jump is not a bug but at most a missed optimization
(to remove it).

Richard.

> Peter
>
>


Re: Support <, <=, > and >= for offset_int and widest_int

2016-05-02 Thread Richard Biener
On Fri, Apr 29, 2016 at 2:26 PM, Richard Sandiford
 wrote:
> offset_int and widest_int are supposed to be at least one bit wider
> than all the values they need to represent, with the extra bits
> being signs.  Thus offset_int is effectively int128_t and widest_int
> is effectively intNNN_t, for target-dependent NNN.
>
> Because the types are signed, there's not really any need to specify
> a sign for operations like comparison.  I think things would be clearer
> if we supported <, <=, > and >= for them (but not for wide_int, which
> doesn't have a sign).
>
> Tested on x86_64-linux-gnu and aarch64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.
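
The note in the quoted wide-int.h comment, that an unsigned comparison such as wi::leu_p (a, b) can serve as shorthand for "a >= 0 && a <= b", can be illustrated with ordinary fixed-width integers. This sketch uses int64_t as a stand-in for offset_int, which is an assumption made for the example; the function names are hypothetical, and the equivalence holds only when b is nonnegative.

```c
#include <assert.h>
#include <stdint.h>

/* Range check with two signed comparisons.  */
static int
in_range_signed (int64_t a, int64_t b)
{
  return a >= 0 && a <= b;
}

/* Same check with one unsigned comparison.  A negative A converts to a
   very large unsigned value and therefore fails the test.  Requires
   B >= 0.  */
static int
in_range_unsigned (int64_t a, int64_t b)
{
  assert (b >= 0);
  return (uint64_t) a <= (uint64_t) b;
}
```

Both functions return the same result for any a as long as b is nonnegative, which is why the unsigned form remains useful even for logically signed types.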

> Thanks,
> Richard
>
>
> gcc/
> * wide-int.h: Update offset_int and widest_int documentation.
> (WI_SIGNED_BINARY_PREDICATE_RESULT): New macro.
> (wi::binary_traits): Allow ordered comparisons between offset_int and
> offset_int, between widest_int and widest_int, and between either
> of these types and basic C types.
> (operator <, <=, >, >=): Define for the same combinations.
> * tree.h (tree_int_cst_lt): Use comparison operators instead
> of wi:: comparisons.
> (tree_int_cst_le): Likewise.
> * gimple-fold.c (fold_array_ctor_reference): Likewise.
> (fold_nonarray_ctor_reference): Likewise.
> * gimple-ssa-strength-reduction.c (record_increment): Likewise.
> * tree-affine.c (aff_comb_cannot_overlap_p): Likewise.
> * tree-parloops.c (try_transform_to_exit_first_loop_alt): Likewise.
> * tree-sra.c (completely_scalarize): Likewise.
> * tree-ssa-alias.c (stmt_kills_ref_p): Likewise.
> * tree-ssa-reassoc.c (extract_bit_test_mask): Likewise.
> * tree-vrp.c (extract_range_from_binary_expr_1): Likewise.
> (check_for_binary_op_overflow): Likewise.
> (search_for_addr_array): Likewise.
> * ubsan.c (ubsan_expand_objsize_ifn): Likewise.
>
> Index: gcc/wide-int.h
> ===
> --- gcc/wide-int.h
> +++ gcc/wide-int.h
> @@ -53,22 +53,26 @@ along with GCC; see the file COPYING3.  If not see
>   multiply, division, shifts, comparisons, and operations that need
>   overflow detected), the signedness must be specified separately.
>
> - 2) offset_int.  This is a fixed size representation that is
> - guaranteed to be large enough to compute any bit or byte sized
> - address calculation on the target.  Currently the value is 64 + 4
> - bits rounded up to the next number even multiple of
> - HOST_BITS_PER_WIDE_INT (but this can be changed when the first
> - port needs more than 64 bits for the size of a pointer).
> -
> - This flavor can be used for all address math on the target.  In
> - this representation, the values are sign or zero extended based
> - on their input types to the internal precision.  All math is done
> - in this precision and then the values are truncated to fit in the
> - result type.  Unlike most gimple or rtl intermediate code, it is
> - not useful to perform the address arithmetic at the same
> - precision in which the operands are represented because there has
> - been no effort by the front ends to convert most addressing
> - arithmetic to canonical types.
> + 2) offset_int.  This is a fixed-precision integer that can hold
> + any address offset, measured in either bits or bytes, with at
> + least one extra sign bit.  At the moment the maximum address
> + size GCC supports is 64 bits.  With 8-bit bytes and an extra
> + sign bit, offset_int therefore needs to have at least 68 bits
> + of precision.  We round this up to 128 bits for efficiency.
> + Values of type T are converted to this precision by sign- or
> + zero-extending them based on the signedness of T.
> +
> + The extra sign bit means that offset_int is effectively a signed
> + 128-bit integer, i.e. it behaves like int128_t.
> +
> + Since the values are logically signed, there is no need to
> + distinguish between signed and unsigned operations.  Sign-sensitive
> + comparison operators <, <=, > and >= are therefore supported.
> +
> + [ Note that, even though offset_int is effectively int128_t,
> +   it can still be useful to use unsigned comparisons like
> +   wi::leu_p (a, b) as a more efficient short-hand for
> +   "a >= 0 && a <= b". ]
>
>   3) widest_int.  This representation is an approximation of
>   infinite precision math.  However, it is not really infinite
> @@ -76,9 +80,9 @@ along with GCC; see the file COPYING3.  If not see
>   precision math where the precision is 4 times the size of the
>   largest integer that the target port can represent.
>
> - widest_int is supposed to be wider than any number that it needs to
> - store, meaning that there is always at least one leading sign bit.
> - All widest_int value
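
The bracketed note in the new offset_int comment can be illustrated outside
of GCC. The following is only a minimal sketch: it uses plain int64_t in
place of offset_int, the function names are made up for the illustration,
and it assumes b is non-negative, which is what the short-hand relies on.

```cpp
#include <cassert>
#include <cstdint>

// The range check as one would naturally write it on signed values.
inline bool in_range_signed (int64_t a, int64_t b)
{
  return a >= 0 && a <= b;
}

// The unsigned short-hand: a negative A reinterpreted as unsigned is
// larger than every non-negative B, so a single comparison is enough.
// Assumption of this sketch: B is non-negative.
inline bool in_range_unsigned (int64_t a, int64_t b)
{
  assert (b >= 0);
  return static_cast<uint64_t> (a) <= static_cast<uint64_t> (b);
}
```

Both functions agree for every a as long as b >= 0; the unsigned form just
folds the "a >= 0" test into the comparison.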

Re: [PATCH, i386, PR target/70799, 1/2] Support constants in STV pass (DImode)

2016-05-02 Thread Uros Bizjak
On Fri, Apr 29, 2016 at 5:42 PM, Ilya Enkovich  wrote:
> Hi,
>
> As the first part of PR70799 fix I'd like to add constants support for
> DI-STV pass (which is also related to PR70763).  This patch adds CONST_INT
> support as insn operands and extends cost model accordingly.
>
> Bootstrapped and regtested on x86_64-unknown-linux-gnu{-m32}.  OK for trunk?
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2016-04-29  Ilya Enkovich  
>
> PR target/70799
> PR target/70763
> * config/i386/i386.c (dimode_scalar_to_vector_candidate_p): Allow
> integer constants.
> (dimode_scalar_chain::vector_const_cost): New.
> (dimode_scalar_chain::compute_convert_gain): Handle constants.
> (dimode_scalar_chain::convert_op): Likewise.
> (dimode_scalar_chain::convert_insn): Likewise.
>
> gcc/testsuite/
>
> 2016-04-29  Ilya Enkovich  
>
> * gcc.target/i386/pr70799-1.c: New test.
>> @@ -3639,6 +3675,22 @@ dimode_scalar_chain::convert_op (rtx *op, rtx_insn 
>> *insn)
>   }
>*op = gen_rtx_SUBREG (V2DImode, *op, 0);
>  }
> +  else if (CONST_INT_P (*op))
> +{
> +  rtx cst = const0_rtx;
> +  rtx tmp = gen_rtx_SUBREG (V2DImode, gen_reg_rtx (DImode), 0);
> +
> +  /* Prefer all ones vector in case of -1.  */
> +  if (constm1_operand (*op, GET_MODE (*op)))
> +   cst = *op;
> +  cst = gen_rtx_CONST_VECTOR (V2DImode, gen_rtvec (2, *op, cst));

It took me a while to decipher the above functionality ;)

Why not just:

  else if (CONST_INT_P (*op))
{
  rtx tmp = gen_rtx_SUBREG (V2DImode, gen_reg_rtx (DImode), 0);
  rtx vec;

  /* Prefer all ones vector in case of -1.  */
  if (constm1_operand (*op, GET_MODE (*op)))
vec = CONSTM1_RTX (V2DImode);
  else
vec = gen_rtx_CONST_VECTOR (V2DImode,
gen_rtvec (2, *op, const0_rtx));

  if (!standard_sse_constant_p (vec, V2DImode))
vec = validize_mem (force_const_mem (V2DImode, vec));

  emit_insn_before (gen_move_insn (tmp, vec), insn);
  *op = tmp;
}

Comparing this part to timode_scalar_chain::convert_insn, there is a
NONDEBUG_INSN_P check. Do you need one in the above code? Also, TImode
pass handles REG_EQUAL and REG_EQUIV notes. Does dimode pass also need
this functionality?

Uros.


Re: Add a wi::to_wide helper function

2016-05-02 Thread Richard Biener
On Fri, Apr 29, 2016 at 2:32 PM, Richard Sandiford
 wrote:
> As Richard says, we ought to have a convenient way of converting
> an INTEGER_CST to a wide_int of a particular precision without
> having to extract the sign of the INTEGER_CST's type each time.
> This patch adds a wi::to_wide helper for that, alongside the
> existing wi::to_offset and wi::to_widest.
>
> Tested on x86_64-linux-gnu and aarch64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> * tree.h (wi::to_wide): New function.
> * expr.c (expand_expr_real_1): Use wi::to_wide.
> * fold-const.c (int_const_binop_1): Likewise.
> (extract_muldiv_1): Likewise.
>
> gcc/c-family/
> * c-common.c (shorten_compare): Use wi::to_wide.
>
> Index: gcc/tree.h
> ===
> --- gcc/tree.h
> +++ gcc/tree.h
> @@ -5211,6 +5211,8 @@ namespace wi
>to_widest (const_tree);
>
>generic_wide_int  > to_offset 
> (const_tree);
> +
> +  wide_int to_wide (const_tree, unsigned int);
>  }
>
>  inline unsigned int
> @@ -5240,6 +5242,16 @@ wi::to_offset (const_tree t)
>return t;
>  }
>
> +/* Convert INTEGER_CST T to a wide_int of precision PREC, extending or
> +   truncating as necessary.  When extending, use sign extension if T's
> +   type is signed and zero extension if T's type is unsigned.  */
> +
> +inline wide_int
> +wi::to_wide (const_tree t, unsigned int prec)
> +{
> +  return wide_int::from (t, prec, TYPE_SIGN (TREE_TYPE (t)));
> +}
> +
>  template 
>  inline wi::extended_tree ::extended_tree (const_tree t)
>: m_t (t)
> Index: gcc/expr.c
> ===
> --- gcc/expr.c
> +++ gcc/expr.c
> @@ -9729,10 +9729,9 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode 
> tmode,
>   GET_MODE_PRECISION (TYPE_MODE (type)), we need to extend from
>   the former to the latter according to the signedness of the
>   type. */
> -  temp = immed_wide_int_const (wide_int::from
> +  temp = immed_wide_int_const (wi::to_wide
>(exp,
> -   GET_MODE_PRECISION (TYPE_MODE (type)),
> -   TYPE_SIGN (type)),
> +   GET_MODE_PRECISION (TYPE_MODE (type))),
>TYPE_MODE (type));
>return temp;
>
> Index: gcc/fold-const.c
> ===
> --- gcc/fold-const.c
> +++ gcc/fold-const.c
> @@ -963,8 +963,7 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, 
> const_tree parg2,
>signop sign = TYPE_SIGN (type);
>bool overflow = false;
>
> -  wide_int arg2 = wide_int::from (parg2, TYPE_PRECISION (type),
> - TYPE_SIGN (TREE_TYPE (parg2)));
> +  wide_int arg2 = wi::to_wide (parg2, TYPE_PRECISION (type));
>
>switch (code)
>  {
> @@ -6394,10 +6393,8 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
> tree wide_type,
>   bool overflow_mul_p;
>   signop sign = TYPE_SIGN (ctype);
>   unsigned prec = TYPE_PRECISION (ctype);
> - wide_int mul = wi::mul (wide_int::from (op1, prec,
> - TYPE_SIGN (TREE_TYPE 
> (op1))),
> - wide_int::from (c, prec,
> - TYPE_SIGN (TREE_TYPE (c))),
> + wide_int mul = wi::mul (wi::to_wide (op1, prec),
> + wi::to_wide (c, prec),
>   sign, &overflow_mul_p);
>   overflow_p = TREE_OVERFLOW (c) | TREE_OVERFLOW (op1);
>   if (overflow_mul_p
> Index: gcc/c-family/c-common.c
> ===
> --- gcc/c-family/c-common.c
> +++ gcc/c-family/c-common.c
> @@ -4012,10 +4012,9 @@ shorten_compare (location_t loc, tree *op0_ptr, tree 
> *op1_ptr,
>   /* Convert primop1 to target type, but do not introduce
>  additional overflow.  We know primop1 is an int_cst.  */
>   primop1 = force_fit_type (*restype_ptr,
> -   wide_int::from
> - (primop1,
> -  TYPE_PRECISION (*restype_ptr),
> -  TYPE_SIGN (TREE_TYPE (primop1))),
> +   wi::to_wide
> +(primop1,
> + TYPE_PRECISION (*restype_ptr)),
> 0, TREE_OVERFLOW (primop1));
> }
>if (type != *restype_ptr)
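
For readers without a GCC tree at hand, the extension rule described in the
new wi::to_wide comment can be sketched on plain 64-bit integers. This is
not the GCC implementation: int_cst and to_wide_sketch are hypothetical
names, and precisions are limited to 1..64 bits for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an INTEGER_CST: the value lives in the low
// PREC bits of BITS, and IS_SIGNED is the signedness of its type.
struct int_cst
{
  uint64_t bits;
  unsigned prec;
  bool is_signed;
};

// All-ones mask covering the low PREC bits (1 <= PREC <= 64).
inline uint64_t low_mask (unsigned prec)
{
  return prec == 64 ? ~uint64_t (0) : (uint64_t (1) << prec) - 1;
}

// Convert T to PREC bits: sign-extend if T's type is signed,
// zero-extend if it is unsigned, truncate if PREC is narrower.
inline uint64_t to_wide_sketch (const int_cst &t, unsigned prec)
{
  assert (t.prec >= 1 && t.prec <= 64 && prec >= 1 && prec <= 64);
  uint64_t v = t.bits & low_mask (t.prec);
  if (t.is_signed && ((v >> (t.prec - 1)) & 1))
    v |= ~low_mask (t.prec);   // copy the sign bit into the upper bits
  return v & low_mask (prec);  // keep only the requested precision
}
```

The caller only passes the target precision; the signedness comes from the
constant itself, which is the convenience the patch is after.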


Re: Simplify cst_and_fits_in_hwi

2016-05-02 Thread Richard Biener
On Fri, Apr 29, 2016 at 2:34 PM, Richard Sandiford
 wrote:
> While looking at the use of cst_and_fits_in_hwi in tree-ssa-loop-ivopts.c,
> I had difficulty working out what the function actually tests.  The
> final NUNITS check seems redundant, since it asks about the number of
> HWIs in the _unextended_ constant.  We've already checked that the
> unextended constant has no more than HOST_BITS_PER_WIDE_INT bits, so the
> length must be 1.
>
> I think this was my fault, sorry.
>
> Tested on x86_64-linux-gnu and aarch64-linux-gnu.  OK to install?

Ok.

Richard.

> Thanks,
> Richard
>
>
> gcc/
> * tree.c (cst_and_fits_in_hwi): Simplify.
>
> Index: gcc/tree.c
> ===
> --- gcc/tree.c
> +++ gcc/tree.c
> @@ -1675,13 +1675,8 @@ build_low_bits_mask (tree type, unsigned bits)
>  bool
>  cst_and_fits_in_hwi (const_tree x)
>  {
> -  if (TREE_CODE (x) != INTEGER_CST)
> -return false;
> -
> -  if (TYPE_PRECISION (TREE_TYPE (x)) > HOST_BITS_PER_WIDE_INT)
> -return false;
> -
> -  return TREE_INT_CST_NUNITS (x) == 1;
> +  return (TREE_CODE (x) == INTEGER_CST
> + && TYPE_PRECISION (TREE_TYPE (x)) <= HOST_BITS_PER_WIDE_INT);
>  }
>
>  /* Build a newly constructed VECTOR_CST node of length LEN.  */


Re: Support << and >> for offset_int and widest_int

2016-05-02 Thread Richard Biener
On Fri, Apr 29, 2016 at 2:30 PM, Richard Sandiford
 wrote:
> Following on from the comparison patch, I think it makes sense to
> support << and >> for offset_int (int128_t) and widest_int (intNNN_t),
> with >> being arithmetic shift.  It doesn't make sense to use
> logical right shift on a potentially negative offset_int, since
> the precision of 128 bits has no meaning on the target.
>
> Tested on x86_64-linux-gnu and aarch64-linux-gnu.  OK to install?

Ok.

Richard.

> Thanks,
> Richard
>
>
> gcc/
> * wide-int.h: Update offset_int and widest_int documentation.
> (WI_SIGNED_SHIFT_RESULT): New macro.
> (wi::binary_shift): Define signed_shift_result_type for
> shifts on offset_int- and widest_int-like types.
> (generic_wide_int): Support <<= and >>= if << and >> are supported.
> * tree.h (int_bit_position): Use shift operators instead of wi::
>  shifts.
> * alias.c (adjust_offset_for_component_ref): Likewise.
> * expr.c (get_inner_reference): Likewise.
> * fold-const.c (fold_comparison): Likewise.
> * gimple-fold.c (fold_nonarray_ctor_reference): Likewise.
> * gimple-ssa-strength-reduction.c (restructure_reference): Likewise.
> * tree-dfa.c (get_ref_base_and_extent): Likewise.
> * tree-ssa-alias.c (indirect_ref_may_alias_decl_p): Likewise.
> (stmt_kills_ref_p): Likewise.
> * tree-ssa-ccp.c (bit_value_binop_1): Likewise.
> * tree-ssa-math-opts.c (find_bswap_or_nop_load): Likewise.
> * tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
> (ao_ref_init_from_vn_reference): Likewise.
>
> gcc/cp/
> * init.c (build_new_1): Use shift operators instead of wi:: shifts.
>
> Index: gcc/wide-int.h
> ===
> --- gcc/wide-int.h
> +++ gcc/wide-int.h
> @@ -68,6 +68,8 @@ along with GCC; see the file COPYING3.  If not see
>   Since the values are logically signed, there is no need to
>   distinguish between signed and unsigned operations.  Sign-sensitive
>   comparison operators <, <=, > and >= are therefore supported.
> + Shift operators << and >> are also supported, with >> being
> + an _arithmetic_ right shift.
>
>   [ Note that, even though offset_int is effectively int128_t,
> it can still be useful to use unsigned comparisons like
> @@ -82,7 +84,8 @@ along with GCC; see the file COPYING3.  If not see
>
>   Like offset_int, widest_int is wider than all the values that
>   it needs to represent, so the integers are logically signed.
> - Sign-sensitive comparison operators <, <=, > and >= are supported.
> + Sign-sensitive comparison operators <, <=, > and >= are supported,
> + as are << and >>.
>
>   There are several places in the GCC where this should/must be used:
>
> @@ -259,6 +262,11 @@ along with GCC; see the file COPYING3.  If not see
>  #define WI_BINARY_RESULT(T1, T2) \
>typename wi::binary_traits ::result_type
>
> +/* The type of result produced by T1 << T2.  Leads to substitution failure
> +   if the operation isn't supported.  Defined purely for brevity.  */
> +#define WI_SIGNED_SHIFT_RESULT(T1, T2) \
> +  typename wi::binary_traits ::signed_shift_result_type
> +
>  /* The type of result produced by a signed binary predicate on types T1 and 
> T2.
> This is bool if signed comparisons make sense for T1 and T2 and leads to
> substitution failure otherwise.  */
> @@ -405,6 +413,7 @@ namespace wi
> so as not to confuse gengtype.  */
>  typedef generic_wide_int < fixed_wide_int_storage
>::precision> > result_type;
> +typedef result_type signed_shift_result_type;
>  typedef bool signed_predicate_result;
>};
>
> @@ -416,6 +425,7 @@ namespace wi
>  STATIC_ASSERT (int_traits ::precision == int_traits ::precision);
>  typedef generic_wide_int < fixed_wide_int_storage
>::precision> > result_type;
> +typedef result_type signed_shift_result_type;
>  typedef bool signed_predicate_result;
>};
>
> @@ -681,6 +691,11 @@ public:
>template  \
>  generic_wide_int &OP (const T &c) { return (*this = wi::F (*this, c)); }
>
> +/* Restrict these to cases where the shift operator is defined.  */
> +#define SHIFT_ASSIGNMENT_OPERATOR(OP, OP2) \
> +  template  \
> +generic_wide_int &OP (const T &c) { return (*this = *this OP2 c); }
> +
>  #define INCDEC_OPERATOR(OP, DELTA) \
>generic_wide_int &OP () { *this += DELTA; return *this; }
>
> @@ -702,12 +717,15 @@ public:
>ASSIGNMENT_OPERATOR (operator +=, add)
>ASSIGNMENT_OPERATOR (operator -=, sub)
>ASSIGNMENT_OPERATOR (operator *=, mul)
> +  SHIFT_ASSIGNMENT_OPERATOR (operator <<=, <<)
> +  SHIFT_ASSIGNMENT_OPERATOR (operator >>=, >>)
>INCDEC_OPERATOR (operator ++, 1)
>INCDEC_OPERATOR (operator --, -1)
>
>  #undef BINARY_PREDICATE
>  #undef UNARY_OPERATOR
>  #un
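
The argument for making >> an arithmetic shift can be seen on a small
stand-alone example. This is only a sketch on int64_t rather than
offset_int, with made-up function names; it assumes the usual two's
complement conversion from uint64_t back to int64_t.

```cpp
#include <cstdint>

// Arithmetic right shift for 0 <= n < 64: the sign bit is replicated,
// so the result is the value divided by 2^n, rounded towards -infinity.
inline int64_t arshift_sketch (int64_t x, unsigned n)
{
  uint64_t u = static_cast<uint64_t> (x) >> n;
  if (x < 0 && n > 0)
    u |= ~uint64_t (0) << (64 - n);   // fill the vacated bits with ones
  return static_cast<int64_t> (u);    // assumes two's complement wrap-around
}

// Logical right shift: zeros are shifted in, so the result for a negative
// value depends on the container width (64 here, 128 for offset_int).
inline uint64_t lrshift_sketch (int64_t x, unsigned n)
{
  return static_cast<uint64_t> (x) >> n;
}
```

For x = -16 and n = 2 the arithmetic shift gives -4, whereas the logical
shift gives a large positive number that is an artifact of the 64-bit
container and has no meaning as an offset.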

Re: [PATCH GCC]Proving no-trappness for array ref in tree if-conv using loop niter information.

2016-05-02 Thread Richard Biener
On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng  wrote:
> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener
>  wrote:
>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng  wrote:
>>> Hi,
>>> Tree if-conversion sometimes cannot convert conditional array reference 
>>> into unconditional one.  Root cause is GCC conservatively assumes newly 
>>> introduced array reference could be out of array bound and thus trapping.  
>>> This patch improves the situation by proving the converted unconditional 
>>> array reference is within array bound using loop niter information.  To be 
>>> specific, it checks every index of array reference to see if it's within 
>>> bound in ifcvt_memrefs_wont_trap.  This patch also factors out 
>>> base_object_writable checking if the base object is writable or not.
>>> Bootstrap and test on x86_64 and aarch64, is it OK?
>>
>> I think you fail to optimally handle the case where the only
>> non-ARRAY_REF idx is the dereference of the
>> base-pointer for, say, p->a[i].  In this case we can use
>> base_master_dr to see if p is unconditionally dereferenced
> Yes, will pick up this case.
>
>> in the loop.  You also fail to handle the case where we have
>> MEM_REF[&x].a[i] that is, you see a decl base.
> I am having difficulty creating this case for ifcvt; any advice?  Thanks.

Sth like

float a[128];
float foo (int n, int i)
{
  return (*((float(*)[n])a))[i];
}

should do the trick (w/o the component-ref).  Any other type-punning
would do it, too.

>> I suppose for_each_index should be fixed for this particular case (to
>> return true), same for TARGET_MEM_REF TMR_BASE.
>>
>> +  /* The case of nonconstant bounds could be handled, but it would be
>> + complicated.  */
>> +  if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low)
>> +  || !high || TREE_CODE (high) != INTEGER_CST)
>> +return false;
>> +
>>
>> handling of a non-zero but constant low bound is important - otherwise
>> all this is a no-op for Fortran.  It
>> shouldn't be too difficult to handle after all.  In fact I think your
>> code does handle it correctly already.
>>
>> +  if (!init || TREE_CODE (init) != INTEGER_CST
>> +  || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
>> +return false;
>>
>> step == 0 should be easy to handle as well, no?  The index will simply
>> always be 'init' ...
>>
>> +  /* In case the relevant bound of the array does not fit in type, or
>> + it does, but bound + step (in type) still belongs into the range of the
>> + array, the index may wrap and still stay within the range of the array
>> + (consider e.g. if the array is indexed by the full range of
>> + unsigned char).
>> +
>> + To make things simpler, we require both bounds to fit into type, 
>> although
>> + there are cases where this would not be strictly necessary.  */
>> +  if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type))
>> +return false;
>> +
>> +  low = fold_convert (type, low);
>>
>> please use wide_int for all of this.
> Now I use wi::fits_to_tree_p instead of int_fits_type_p.  But I am not
> sure what you mean by handling the "low = fold_convert (type, low);"
> related code in wide_int.  Do you mean to use tree_int_cst_compare
> instead of tree_int_cst_compare in the following code?

I don't think you need any kind of fits-to-type check here.  You'd simply
use to_widest () when operating on / comparing with high/low.

And no, I mean to do it all with widest_ints.

>>
>> I wonder if we can do sth for wrapping IVs like
>>
>> int a[2048];
>>
>> for (int i = 0; i < 4096; ++i)
>>   ... a[(unsigned char)i];
>>
>> as well.  Like if the IVs type max and min value are within the array bounds
>> simply return true?
> I think we can only do this for read.  For write this is not safe.
> From vectorizer's point of view, is this worth handling?  Could
> vectorizer handles wrapping IV in a smaller range than loop IV?

Possibly, but dependence analysis might get confused.

Richard.

> Thanks,
> bin
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> bin
>>>
>>> 2016-04-28  Bin Cheng  
>>>
>>> * tree-if-conv.c (tree-ssa-loop.h): Include header file.
>>> (tree-ssa-loop-niter.h): Ditto.
>>> (idx_within_array_bound, ref_within_array_bound): New functions.
>>> (ifcvt_memrefs_wont_trap): Check if array ref is within bound.
>>> Factor out check on writable base object to ...
>>> (base_object_writable): ... here.
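
The bound check discussed in this thread can be sketched with plain
integers. This is not the code from the patch: idx_within_bound_sketch is a
hypothetical name, the real functions work on trees and scalar-evolution
results, and the sketch assumes all values are small enough that
init + step * niter does not overflow.

```cpp
#include <cstdint>

// Index IDX starts at INIT and advances by STEP on each of NITER further
// iterations, so it takes the values INIT, INIT + STEP, ...,
// INIT + STEP * NITER.  Return true if all of them lie in [LOW, HIGH].
// Assumption of this sketch: no overflow in the arithmetic below.
inline bool idx_within_bound_sketch (int64_t low, int64_t high,
                                     int64_t init, int64_t step,
                                     int64_t niter)
{
  if (niter < 0 || low > high)
    return false;
  int64_t last = init + step * niter;
  int64_t min_idx = step >= 0 ? init : last;
  int64_t max_idx = step >= 0 ? last : init;
  return low <= min_idx && max_idx <= high;
}
```

A zero step falls out naturally here (the index is always INIT), and so
does a non-zero constant low bound, which are the two cases raised in the
review.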


Re: [PATCH GCC]Do more tree if-conversions by handling PHIs with more than two arguments.

2016-05-02 Thread Richard Biener
On Fri, Apr 29, 2016 at 5:51 PM, Bin.Cheng  wrote:
> On Thu, Apr 28, 2016 at 10:18 AM, Richard Biener
>  wrote:
>> On Wed, Apr 27, 2016 at 5:49 PM, Bin Cheng  wrote:
>>> Hi,
>>> Currently tree if-conversion only supports PHIs with no more than two 
>>> arguments unless the loop is marked with "simd pragma".  This patch makes 
>>> such PHIs supported unconditionally if they have no more than 
>>> MAX_PHI_ARG_NUM arguments, thus cases like PR56541 can be fixed.  Note that,
>>> because a chain of "?:" operators is needed to compute a multi-arg PHI, this 
>>> patch records the case and versions loop so that vectorizer can fall back 
>>> to the original loop if if-conversion+vectorization isn't beneficial.  
>>> Ideally, cost computation in vectorizer should be improved to measure 
>>> benefit against the original loop, rather than if-converted loop.  So far 
>>> MAX_PHI_ARG_NUM is set to (4) because cases with more arguments are rare 
>>> and not likely beneficial.
>>>
>>> Apart from the above change, the patch also makes changes like: only split 
>>> critical edges when we have to; clean up code logic in if_convertible_loop_p 
>>> about aggressive_if_conv.
>>>
>>> Bootstrap and test on x86_64 and AArch64, is it OK?
>>
>> Can you make this magic number a --param please?  Otherwise ok.
> Hi,
> Here is the updated patch.  I also added a vectorization test case
> since PR56541 was reported against it.
> Bootstrap & test on x86_64, is it OK?

+/* { dg-options "-O3 -fdump-tree-ifcvt-stats" { target *-*-* } } */

you can omit { target *-*-* } here.

Ok with that change.

Thanks,
Richard.

> Thanks,
> bin
>
>>
>> Thanks,
>> Richard.
>>
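
The "chain of ?: operators" mentioned in the patch description can be
sketched as follows. This is an illustration at the source level only, with
made-up function names; the pass itself works on GIMPLE PHI nodes and
predicates, not on C++ code.

```cpp
// Branchy form: after the if/else chain, x is conceptually a PHI with
// three arguments, one per incoming path.
inline int branchy_sketch (int v, int a, int b, int c)
{
  int x;
  if (v < 0)
    x = a;
  else if (v == 0)
    x = b;
  else
    x = c;
  return x;
}

// If-converted form: the three-argument PHI becomes a chain of two
// conditional selects, one fewer than the number of PHI arguments.
inline int if_converted_sketch (int v, int a, int b, int c)
{
  bool c1 = v < 0;    // predicate of the first incoming path
  bool c2 = v == 0;   // predicate of the second incoming path
  return c1 ? a : (c2 ? b : c);
}
```

Each extra PHI argument adds one more select to the chain, which is why
the number of arguments is capped.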


RE: [PATCHv2 0/7] ARC: Add support for nps400 variant

2016-05-02 Thread Claudiu Zissulescu
Please also consider addressing the following newly introduced warnings:

mainline/gcc/gcc/config/arc/arc.md:888: warning: source missing a mode?
mainline/gcc/gcc/config/arc/arc.md:906: warning: source missing a mode?
mainline/gcc/gcc/config/arc/arc.md:921: warning: source missing a mode?
mainline/gcc/gcc/config/arc/arc.md:6146: warning: source missing a mode?

Thanks,
Claudiu

> -----Original Message-----
> From: Andrew Burgess [mailto:andrew.burg...@embecosm.com]
> Sent: Saturday, April 30, 2016 12:17 AM
> To: Claudiu Zissulescu; Joern Wolfgang Rennecke
> Cc: gcc-patches@gcc.gnu.org; noa...@mellanox.com
> Subject: Re: [PATCHv2 0/7] ARC: Add support for nps400 variant
> 
> * Claudiu Zissulescu  [2016-04-29
> 09:03:53 +]:
> 
> > I see the next tests failing:
> >
> > FAIL: gcc.target/arc/movb-1.c scan-assembler movb[ \t]+r[0-5]+, *r[0-5]+,
> *r[0-5]+, *19, *21, *8
> > FAIL: gcc.target/arc/movb-2.c scan-assembler movb[ \t]+r[0-5]+, *r[0-5]+,
> *r[0-5]+, *23, *23, *9
> > FAIL: gcc.target/arc/movb-5.c scan-assembler movb[ \t]+r[0-5]+, *r[0-5]+,
> *r[0-5]+, *23, *(23|7), *9
> > FAIL: gcc.target/arc/movh_cl-1.c scan-assembler movh.cl r[0-
> 9]+,0xc000>>16
> 
> Claudiu, Joern,
> 
> I believe that the patch below should resolve the issues that you're
> seeing for little endian arc tests.
> 
> It's mostly just updating the expected results, though one test needed
> improving for l/e arc.
> 
> In the final test the layout used for bitfields within a struct on
> little endian arc just happened to result in a movb (move bits) not
> being generated when it could / should have been.  I've added a new
> peephole2 case to catch this.
> 
> Thanks,
> Andrew
> 
> ---
> 
> gcc/arc: New peephole2 and little endian arc test fixes
> 
> Resolve some test failures introduced for little endian arc as a result
> of the recent arc/nps400 additions.
> 
> There's a new peephole2 optimisation to merge together two zero_extracts
> in order that the movb instruction can be used.
> 
> One of the test cases is extended so that the test does something
> meaningful in both big and little endian arc mode.
> 
> Other tests have their expected results updated to reflect improvements
> in other areas of GCC.
> 
> gcc/ChangeLog:
> 
>   * config/arc/arc.md (movb peephole2): New peephole2 to merge
> two
>   zero_extract operations to allow a movb to occur.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arc/movb-1.c: Update little endian arc results.
>   * gcc.target/arc/movb-2.c: Likewise.
>   * gcc.target/arc/movb-5.c: Likewise.
>   * gcc.target/arc/movh_cl-1.c: Extend test to cover little endian
>   arc.
> ---
>  gcc/ChangeLog|  5 +
>  gcc/config/arc/arc.md| 14 ++
>  gcc/testsuite/ChangeLog  |  8 
>  gcc/testsuite/gcc.target/arc/movb-1.c|  2 +-
>  gcc/testsuite/gcc.target/arc/movb-2.c|  2 +-
>  gcc/testsuite/gcc.target/arc/movb-5.c|  2 +-
>  gcc/testsuite/gcc.target/arc/movh_cl-1.c | 11 +++
>  7 files changed, 41 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index c61107f..0b92594 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -6144,6 +6144,20 @@
>  (zero_extract:SI (match_dup 1) (match_dup 5) (match_dup
> 7)))])
> (match_dup 1)])
> 
> +(define_peephole2
> +  [(set (match_operand:SI 0 "register_operand" "")
> +(zero_extract:SI (match_dup 0)
> +  (match_operand:SI 1 "const_int_operand" "")
> +  (match_operand:SI 2 "const_int_operand" "")))
> +   (set (zero_extract:SI (match_operand:SI 3 "register_operand" "")
> +  (match_dup 1)
> + (match_dup 2))
> + (match_dup 0))]
> +  "TARGET_NPS_BITOPS
> +   && !reg_overlap_mentioned_p (operands[0], operands[3])"
> +  [(set (zero_extract:SI (match_dup 3) (match_dup 1) (match_dup 2))
> +(zero_extract:SI (match_dup 0) (match_dup 1) (match_dup 2)))])
> +
>  ;; include the arc-FPX instructions
>  (include "fpx.md")
> 
> diff --git a/gcc/testsuite/gcc.target/arc/movb-1.c
> b/gcc/testsuite/gcc.target/arc/movb-1.c
> index 65d4ba4..94d9f5f 100644
> --- a/gcc/testsuite/gcc.target/arc/movb-1.c
> +++ b/gcc/testsuite/gcc.target/arc/movb-1.c
> @@ -10,4 +10,4 @@ f (void)
>bar.b = foo.b;
>  }
>  /* { dg-final { scan-assembler "movb\[ \t\]+r\[0-5\]+, *r\[0-5\]+, 
> *r\[0-5\]+,
> *5, *3, *8" { target arceb-*-* } } } */
> -/* { dg-final { scan-assembler "movb\[ \t\]+r\[0-5\]+, *r\[0-5\]+, 
> *r\[0-5\]+,
> *19, *21, *8" { target arc-*-* } } } */
> +/* { dg-final { scan-assembler "movb\[ \t\]+r\[0-5\]+, *r\[0-5\]+, 
> *r\[0-5\]+,
> *3, *5, *8" { target arc-*-* } } } */
> diff --git a/gcc/testsuite/gcc.target/arc/movb-2.c
> b/gcc/testsuite/gcc.target/arc/movb-2.c
> index 1ba9976..708f393 100644
> --- a/gcc/testsuite/gcc.target/arc/movb-2.c
> +++ b/gcc/testsuite/gcc.target/arc/movb-2.c
> @@

Re: [PATCH GCC]Don't clobber cbase when computing iv_use cost.

2016-05-02 Thread Richard Biener
On Fri, Apr 29, 2016 at 5:55 PM, Bin Cheng  wrote:
> Hi,
> This fixes a latent bug I introduced.  Variable "cbase" shouldn't be modified 
> since it will be used afterwards.  Bootstrap and test on x86_64.  I think 
> it's an obvious change, is it OK?

Ok.

Richard.

> Thanks,
> bin
>
> 2016-04-28  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (get_computation_cost_at): Don't clobber
> cbase.


Re: [PATCH GCC]Check depends_on before recording invariant expressions

2016-05-02 Thread Richard Biener
On Fri, Apr 29, 2016 at 6:04 PM, Bin Cheng  wrote:
> Hi,
> This patch fixes a latent bug in IVOPT.  Variable "depends_on" should be 
> checked before recording invariant expression, otherwise we end up with 
> constant (even ZERO) invariant expressions.  Apparently this results in wrong 
> register pressure.
>
> Bootstrap and test on x86_64.  Is it OK?

Ok.

Richard.

> Thanks,
> bin
>
> 2016-04-28  Bin Cheng  
>
> * tree-ssa-loop-ivopts.c (get_computation_cost_at): Check depends_on
> before using it.


[Ada] Fix bug on Get_Line when incomplete last line in file

2016-05-02 Thread Arnaud Charlet
On some occasions, Get_Line incorrectly sets its Last parameter to one past
the correct value. This can only occur when the line being copied is the
last line of the file and does not end with a newline character.

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Yannick Moy  

* a-tigeli.adb (Get_Line): Fix bound for test to
decide when to compensate for character 0 added by call to fgets.

Index: a-tigeli.adb
===
--- a-tigeli.adb(revision 235706)
+++ a-tigeli.adb(working copy)
@@ -120,10 +120,15 @@
 K : Natural := Natural (P - S);
 
  begin
---  Now Buf (K + 2) should be 0, or otherwise Buf (K) is the 0
---  put in by fgets, so compensate.
+--  If K + 2 is greater than N, then Buf (K + 1) cannot be a LM
+--  character from the source file, as the call to fgets copied at
+--  most N - 1 characters. Otherwise, either LM is a character from
+--  the source file and then Buf (K + 2) should be 0, or LM is a
+--  character put in Buf by memset and then Buf (K) is the 0 put in
+--  by fgets. In both cases where LM does not come from the source
+--  file, compensate.
 
-if K + 2 > Buf'Last or else Buf (K + 2) /= ASCII.NUL then
+if K + 2 > N or else Buf (K + 2) /= ASCII.NUL then
 
--  Incomplete last line, so remove the extra 0
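
The case analysis in the new comment can be sketched in C++ rather than
Ada. This is not the runtime code: line_length_sketch and
first_line_length are hypothetical names, indices are 0-based (so
buf[k + 1] plays the role of Buf (K + 2)), and the helper uses a temporary
file purely to make the sketch testable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Read one line from F into BUF (capacity N >= 2) and return how many
// characters belong to the line, excluding any line mark.
inline std::size_t line_length_sketch (std::FILE *f, char *buf, std::size_t n)
{
  assert (n >= 2);
  const char LM = '\n';
  std::memset (buf, LM, n);                 // pre-fill with line marks
  if (std::fgets (buf, static_cast<int> (n), f) == nullptr)
    return 0;                               // end of file, nothing read

  const char *p = static_cast<const char *> (std::memchr (buf, LM, n));
  if (p == nullptr)
    return n - 1;                           // buffer full: n - 1 chars plus 0

  std::size_t k = static_cast<std::size_t> (p - buf);  // index of first LM
  // If there is no room for a 0 after the LM, or the next byte is not 0,
  // the LM was put there by memset and buf[k - 1] is the 0 from fgets.
  if (k + 1 >= n || buf[k + 1] != '\0')
    return k - 1;
  return k;                                 // the LM came from the file
}

// Hypothetical helper: write CONTENTS to a temporary file and measure its
// first line using a buffer of N bytes (N <= 64).
inline std::size_t first_line_length (const char *contents, std::size_t n)
{
  std::FILE *f = std::tmpfile ();
  assert (f != nullptr);
  std::fputs (contents, f);
  std::rewind (f);
  char buf[64];
  assert (n <= sizeof buf);
  std::size_t len = line_length_sketch (f, buf, n);
  std::fclose (f);
  return len;
}
```

The interesting case is a last line of exactly N - 2 characters without a
newline: the memset LM then sits in the last buffer position, so the test
must compare against N before looking at the following byte.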
 


Re: [PATCH][GCC7] Remove scaling of COMPONENT_REF/ARRAY_REF ops 2/3

2016-05-02 Thread Eric Botcazou
> The following experiment resulted from looking at making
> array_ref_low_bound and array_ref_element_size non-mutating.  Again
> I wondered why we do this strange scaling by offset/element alignment.

The idea is to expose the alignment factor to the RTL expander:

tree tem
  = get_inner_reference (exp, &bitsize, &bitpos, &offset, &mode1,
 &unsignedp, &reversep, &volatilep, true);

[...]

rtx offset_rtx = expand_expr (offset, NULL_RTX, VOIDmode,
  EXPAND_SUM);

[...]

op0 = offset_address (op0, offset_rtx,
  highest_pow2_factor (offset));

With the scaling, offset is something like _69 * 4 so highest_pow2_factor can 
see the factor and passes it down to offset_address:

(gdb) p debug_rtx(op0)
(mem/c:SI (plus:SI (reg/f:SI 193)
(reg:SI 194)) [3 *s.16_63 S4 A32])

With your patch in the same situation:

(gdb) p debug_rtx(op0)
(mem/c:SI (plus:SI (reg/f:SI 139)
(reg:SI 116 [ _33 ])) [3 *s.16_63 S4 A8])

On strict-alignment targets, this makes a big difference, e.g. SPARC:

ld  [%i4+%i5], %i0

vs

ldub[%i5+%i4], %g1
sll %g1, 24, %g1
add %i5, %i4, %i5
ldub[%i5+1], %i0
sll %i0, 16, %i0
or  %i0, %g1, %i0
ldub[%i5+2], %g1
sll %g1, 8, %g1
or  %g1, %i0, %g1
ldub[%i5+3], %i0
or  %i0, %g1, %i0


Now this is mitigated by a couple of things:

  1. the above pessimization only happens on the RHS; on the LHS, the expander 
calls highest_pow2_factor_for_target instead of highest_pow2_factor and the 
former takes into account the type's alignment thanks to the MAX:

/* Similar, except that the alignment requirements of TARGET are
   taken into account.  Assume it is at least as aligned as its
   type, unless it is a COMPONENT_REF in which case the layout of
   the structure gives the alignment.  */

static unsigned HOST_WIDE_INT
highest_pow2_factor_for_target (const_tree target, const_tree exp)
{
  unsigned HOST_WIDE_INT talign = target_align (target) / BITS_PER_UNIT;
  unsigned HOST_WIDE_INT factor = highest_pow2_factor (exp);

  return MAX (factor, talign);
}

  2. highest_pow2_factor can be rescued by the set_nonzero_bits machinery of 
the SSA CCP pass because it calls tree_ctz.  The above example was compiled 
with -O -fno-tree-ccp on SPARC; at -O, the code isn't pessimized.
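
To make the role of the scaling concrete, here is a toy model of the two
helpers on plain integers. It is only a sketch under stated assumptions:
the real highest_pow2_factor walks a tree expression, whereas here an
offset is modelled as "unknown index * constant scale" and the function
names are made up.

```cpp
#include <algorithm>
#include <cstdint>

// Largest power of two that divides X, for X != 0: isolate the lowest
// set bit.
inline uint64_t pow2_factor (uint64_t x)
{
  return x & (~x + 1);
}

// Toy model of an offset "index * SCALE" with an unknown index: only the
// constant scale contributes a known power-of-two factor.
inline uint64_t offset_factor (uint64_t scale)
{
  return pow2_factor (scale);
}

// LHS variant: never go below the alignment (in bytes) of the target.
inline uint64_t factor_for_target (uint64_t factor, uint64_t talign_bytes)
{
  return std::max (factor, talign_bytes);
}
```

With the scaling the offset looks like "_69 * 4" and the factor is 4, which
matches the A32 in the first dump; without it the scale is effectively 1,
the factor drops to 1, and only the MAX on the LHS path recovers the
alignment.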

> So - the following patch gets rid of that scaling.  For a "simple"
> C testcase
> 
> void bar (void *);
> void foo (int n)
> {
>   struct S { struct R { int b[n]; } a[2]; int k; } s;
>   s.k = 1;
>   s.a[1].b[7] = 3;
>   bar (&s);
> }

This only exposes the LHS case, here's a more complete testcase:

void bar (void *);

int foo (int n)
{
  struct S { struct R { char b[n]; } a[2]; int k; } s;
  s.k = 1;
  s.a[1].b[7] = 3;
  bar (&s);
  return s.k;
}

-- 
Eric Botcazou


Re: Canonicalize X u< X to UNORDERED_EXPR

2016-05-02 Thread Marc Glisse

On Mon, 2 May 2016, Richard Biener wrote:


On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse  wrote:

Hello,

this case seemed to be missing in the various X cmp X transformations. It
does not change the generated code in the testcase.
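
As a sanity check of the new transformation outside the compiler, the
equivalence can be tested with the <cmath> comparison functions. This is a
minimal sketch with made-up wrapper names; it assumes IEEE NaN semantics
(so no -ffast-math).

```cpp
#include <cmath>

// "a u< b": unordered or less than, i.e. not (a >= b).
inline bool unlt_sketch (double a, double b)
{
  return !std::isgreaterequal (a, b);
}

// "a u> b": unordered or greater than, i.e. not (a <= b).
inline bool ungt_sketch (double a, double b)
{
  return !std::islessequal (a, b);
}

// True if at least one operand is a NaN.
inline bool unord_sketch (double a, double b)
{
  return std::isunordered (a, b);
}
```

For equal operands the "less than" and "greater than" parts can never
hold, so X u< X and X u> X are true exactly when X is a NaN, which is what
unordered (X, X) computes.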

The missing :c is rather trivial. I can commit it separately if you prefer.


I think it's not missing.  Commutating the first one is enough to eventually
make the @1s match up.  I think you should get a diagnostic on a duplicate
pattern when adding another :c (hmm, no, it's indeed "different" patterns but
still redundant).


Let's see. The current pattern is
  (bitop (bit_and:c @0 @1) (bit_and @2 @1))

This matches:
(X & Y) ^ (Z & Y)
(Y & X) ^ (Z & Y)

If I have for instance (Y & X) ^ (Y & Z), I don't see how that is going to 
match, and indeed we don't simplify that. On the other hand, if I have 
bit_ior instead of bit_xor, then we have another very similar 
transformation about 100 lines up in match.pd


(for op (bit_and bit_ior)
 rop (bit_ior bit_and)
 (simplify
  (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2)))

That one also commutes only one of the two, but starting from a different match. We 
should probably reorganize them a bit.
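
The matching argument above can be checked with a toy model. This is only
a sketch and not how genmatch works internally: operands are single
characters, and_expr and pattern_matches are hypothetical names, and the
model only asks whether the shared capture @1 can be bound consistently.

```cpp
// One "A & B" operand of the outer bitop, with operands named by letters.
struct and_expr
{
  char lhs;
  char rhs;
};

// Model of (bitop (bit_and:c @0 @1) (bit_and @2 @1)).  The first AND is
// always tried in both operand orders; the second one only if
// COMMUTE_SECOND is set (i.e. it is written as bit_and:c).
inline bool pattern_matches (and_expr first, and_expr second,
                             bool commute_second)
{
  const char first_at1[2] = { first.rhs, first.lhs };   // candidates for @1
  for (char at1 : first_at1)
    {
      if (second.rhs == at1)
        return true;                  // second AND matches as written
      if (commute_second && second.lhs == at1)
        return true;                  // second AND matches after swapping
    }
  return false;
}
```

In this model (Y & X) ^ (Y & Z) is rejected when only the first bit_and
is commutative and accepted once the second one is marked :c as well.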



Bootstrap+regtest on powerpc64le-unknown-linux-gnu.


Ok for the new pattern.

Thanks,
Richard.


2016-05-02  Marc Glisse  

gcc/
* match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
(X u< X, X u> X): New transformations

gcc/testsuite/
* gcc.dg/tree-ssa/unord.c: New testcase.

--
Marc Glisse
Index: trunk/gcc/match.pd
===
--- trunk/gcc/match.pd  (revision 235654)
+++ trunk/gcc/match.pd  (working copy)
@@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   @0)
  /* (~x | y) & x -> x & y */
  /* (~x & y) | x -> x | y */
  (simplify
   (bitop:c (rbitop:c (bit_not @0) @1) @0)
   (bitop @0 @1)))

 /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
 (for bitop (bit_and bit_ior bit_xor)
  (simplify
-  (bitop (bit_and:c @0 @1) (bit_and @2 @1))
+  (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
   (bit_and (bitop @0 @2) @1)))

 /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
 (simplify
   (bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
   (bit_ior (bit_and @0 @2) (bit_and @1 @2)))

 /* Combine successive equal operations with constants.  */
 (for bitop (bit_and bit_ior bit_xor)
  (simplify
@@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (simplify
   (cmp @0 @0)
   (if (cmp != NE_EXPR
|| ! FLOAT_TYPE_P (TREE_TYPE (@0))
|| ! HONOR_NANS (@0))
{ constant_boolean_node (false, type); })))
 (for cmp (unle unge uneq)
  (simplify
   (cmp @0 @0)
   { constant_boolean_node (true, type); }))
+(for cmp (unlt ungt)
+ (simplify
+  (cmp @0 @0)
+  (unordered @0 @0)))
 (simplify
  (ltgt @0 @0)
  (if (!flag_trapping_math)
   { constant_boolean_node (false, type); }))

 /* Fold ~X op ~Y as Y op X.  */
 (for cmp (simple_comparison)
  (simplify
   (cmp (bit_not@2 @0) (bit_not@3 @1))
   (if (single_use (@2) && single_use (@3))
Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
===
--- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0)
+++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy)
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int f(double a){double b=a;return !__builtin_islessequal(a,b);}
+int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
+
+/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */
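
For readers who want to see the equivalence outside the compiler, a minimal
C sketch using the standard <math.h> comparison macros instead of the
builtins. The function names are invented for the example; the claim is
only that both comparisons of a value with itself are true exactly when the
value is a NaN.

```c
#include <assert.h>
#include <math.h>

/* a u> a, written as in f() from the testcase.  */
int
ungt_self (double a)
{
  double b = a;
  return !islessequal (a, b);
}

/* a u< a, written as in g() from the testcase.  */
int
unlt_self (double a)
{
  double b = a;
  return !isgreaterequal (a, b);
}

/* What both are expected to fold to: unordered (a, a).  */
int
unordered_self (double a)
{
  return isunordered (a, a) != 0;
}
```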


--
Marc Glisse


[Ada] Spurious error on container "of" loop

2016-05-02 Thread Arnaud Charlet
This patch modifies the implementation of container indexing to inspect the
base type of the container in case the container is a subtype. No
simple reproducer found.

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Hristian Kirtchev  

* sem_ch4.adb (Find_Indexing_Operations): Use the underlying type
of the container base type in case the container is a subtype.
* sem_ch5.adb (Analyze_Iterator_Specification): Ensure that
the selector has an entity when checking for a component of a
mutable object.

Index: sem_ch5.adb
===
--- sem_ch5.adb (revision 235706)
+++ sem_ch5.adb (working copy)
@@ -1817,7 +1817,7 @@
   Bas : Entity_Id;
   Typ : Entity_Id;
 
-   --   Start of processing for Analyze_iterator_Specification
+   --   Start of processing for Analyze_Iterator_Specification
 
begin
   Enter_Name (Def_Id);
@@ -2207,6 +2207,8 @@
  --  be performed.
 
  if Nkind (Orig_Iter_Name) = N_Selected_Component
+   and then
+ Present (Entity (Selector_Name (Orig_Iter_Name)))
and then Ekind_In
   (Entity (Selector_Name (Orig_Iter_Name)),
E_Component,
Index: sem_ch4.adb
===
--- sem_ch4.adb (revision 235711)
+++ sem_ch4.adb (working copy)
@@ -7619,12 +7619,14 @@
   begin
  Typ := T;
 
+ --  Use the specific type when the parameter type is class-wide
+
  if Is_Class_Wide_Type (Typ) then
 Typ := Root_Type (Typ);
  end if;
 
  Ref := Empty;
- Typ := Underlying_Type (Typ);
+ Typ := Underlying_Type (Base_Type (Typ));
 
  Inspect_Primitives   (Typ, Ref);
  Inspect_Declarations (Typ, Ref);


[Ada] Race condition in allocator with finalization

2016-05-02 Thread Arnaud Charlet
This patch fixes a race condition in an allocator for a type that needs
finalization. The race condition is unlikely to occur in practice;
it occurs when the allocator is in a Finalize that occurs after the
corresponding master has already started its finalization. Finalize
operations often deallocate memory, but rarely allocate.

However, this fix is also an efficiency improvement, because it reduces the
number of lock/unlock calls.

No test is available; it's too hard to force the race condition to happen.
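
The idea of the fix, checking and attaching inside one critical section
instead of two, can be sketched in C with a POSIX mutex. This is only an
illustration: struct master and attach_object are hypothetical names, and
the real code is the Ada in s-stposu.adb.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a finalization master.  */
struct master
{
  pthread_mutex_t lock;
  bool finalization_started;
  int attached;                 /* number of objects attached so far */
};

/* Attach one object to M.  The check and the attachment happen inside a
   single critical section, so finalization cannot start in between.
   Returns 0 on success, -1 if finalization has already started.  */
int
attach_object (struct master *m)
{
  int result = 0;

  assert (m);
  pthread_mutex_lock (&m->lock);
  if (m->finalization_started)
    result = -1;
  else
    m->attached++;
  pthread_mutex_unlock (&m->lock);   /* released on every path */

  return result;
}
```

With two separate lock/unlock pairs, another thread could set
finalization_started between the check and the attachment; holding the lock
across both closes that window and halves the number of lock operations.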

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Bob Duff  

* s-stposu.adb (Allocate_Any_Controlled): Don't lock/unlock twice.

Index: s-stposu.adb
===
--- s-stposu.adb(revision 235706)
+++ s-stposu.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 2011-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 2011-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -123,9 +123,6 @@
   N_Size  : Storage_Count;
   Subpool : Subpool_Handle := null;
 
-  Allocation_Locked : Boolean;
-  --  This flag stores the state of the associated collection
-
   Header_And_Padding : Storage_Offset;
   --  This offset includes the size of a FM_Node plus any additional
   --  padding due to a larger alignment.
@@ -170,25 +167,25 @@
 
   else
  --  If the master is missing, then the expansion of the access type
- --  failed to create one. This is a serious error.
+ --  failed to create one. This is a compiler bug.
 
- if Context_Master = null then
-raise Program_Error
-  with "missing master in pool allocation";
+ pragma Assert
+   (Context_Master /= null, "missing master in pool allocation");
 
  --  If a subpool is present, then this is the result of erroneous
  --  allocator expansion. This is not a serious error, but it should
  --  still be detected.
 
- elsif Context_Subpool /= null then
+ if Context_Subpool /= null then
 raise Program_Error
   with "subpool not required in pool allocation";
+ end if;
 
  --  If the allocation is intended to be on a subpool, but the access
  --  type's pool does not support subpools, then this is the result of
- --  erroneous end-user code.
+ --  incorrect end-user code.
 
- elsif On_Subpool then
+ if On_Subpool then
 raise Program_Error
   with "pool of access type does not support subpools";
  end if;
@@ -209,24 +206,20 @@
  --Write - finalization
 
  Lock_Task.all;
- Allocation_Locked := Finalization_Started (Master.all);
- Unlock_Task.all;
 
  --  Do not allow the allocation of controlled objects while the
  --  associated master is being finalized.
 
- if Allocation_Locked then
+ if Finalization_Started (Master.all) then
 raise Program_Error with "allocation after finalization started";
  end if;
 
  --  Check whether primitive Finalize_Address is available. If it is
  --  not, then either the expansion of the designated type failed or
- --  the expansion of the allocator failed. This is a serious error.
+ --  the expansion of the allocator failed. This is a compiler bug.
 
- if Fin_Address = null then
-raise Program_Error
-  with "primitive Finalize_Address not available";
- end if;
+ pragma Assert
+   (Fin_Address /= null, "primitive Finalize_Address not available");
 
  --  The size must acount for the hidden header preceding the object.
  --  Account for possible padding space before the header due to a
@@ -262,7 +255,7 @@
   --  Step 4: Attachment
 
   if Is_Controlled then
- Lock_Task.all;
+ --  Note that we already did "Lock_Task.all;" in Step 2 above.
 
  --  Map the allocated memory into a FM_Node record. This converts the
  --  top of the allocated bits into a list header. If there is padding
@@ -334,6 +327,16 @@
   else
  Addr := N_Addr;
   end if;
+
+   exception
+  when others =>
+ --  If we locked, we want to unlock
+
+ if Is_Controlled then
+Unlock_Task.all;
+ end if;
+
+ raise;
end Allocate_Any_Controlled;
 

[Ada] Error in handling of convention of formal parameter

2016-05-02 Thread Arnaud Charlet
This patch fixes an error in the determination of the convention to be used for
a formal parameter, when the corresponding type carries a convention pragma.

Executing:

gcc -c p.ads -gnatRm

must yield:

Representation information for unit P (spec)

for Rec'Size use 1024;
for Rec'Alignment use 1;
for Rec use record
   I at 0 range  0 .. 1023;
end record;

for Arr'Size use 64;
for Arr'Alignment use 4;
for Arr'Component_Size use 32;

procedure proc1 declared at p.ads:11:13
  convention : Ada
  r : passed by copy
  i : passed by copy

procedure proc2 declared at p.ads:12:13
  convention : Ada
  a : passed by reference
  i : passed by copy

---
package P is

  type Rec is record
I : String (1 .. 128);
  end record;
  pragma Convention (Ada_Pass_By_Copy, Rec);

  type Arr is array (1 .. 2) of Integer;
  pragma Convention (Ada_Pass_By_Reference, Arr);

  procedure Proc1 (R : Rec; I : Integer) is null;
  procedure Proc2 (A : Arr; I : Integer) is null;

end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Ed Schonberg  

* sem_ch6.adb (Process_Formals): Check properly the type of a
formal to determine whether a given convention applies to it.
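
The decision order in the patch below can be summarized with a small C
sketch. The enums and the function are hypothetical mirrors of the Ada
entities, and MECH_DEFAULT stands for "leave whatever mechanism was chosen
earlier", which is an assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirrors of the conventions and mechanisms in the patch.  */
enum convention
{
  CONV_ADA,
  CONV_ADA_PASS_BY_COPY,
  CONV_ADA_PASS_BY_REFERENCE
};

enum mechanism { MECH_DEFAULT, MECH_BY_COPY, MECH_BY_REFERENCE };

/* Choose the passing mechanism for a formal from the convention of its
   type.  *WARN is set when an aliased formal was requested by copy.  */
enum mechanism
formal_mechanism (bool is_aliased, enum convention type_conv, bool *warn)
{
  assert (warn);
  *warn = false;

  if (is_aliased)
    {
      /* Aliased formals are always passed by reference.  */
      *warn = (type_conv == CONV_ADA_PASS_BY_COPY);
      return MECH_BY_REFERENCE;
    }
  if (type_conv == CONV_ADA_PASS_BY_COPY)
    return MECH_BY_COPY;
  if (type_conv == CONV_ADA_PASS_BY_REFERENCE)
    return MECH_BY_REFERENCE;
  return MECH_DEFAULT;
}
```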

Index: sem_ch6.adb
===
--- sem_ch6.adb (revision 235706)
+++ sem_ch6.adb (working copy)
@@ -10792,24 +10792,28 @@
 
  --  Force call by reference if aliased
 
- if Is_Aliased (Formal) then
-Set_Mechanism (Formal, By_Reference);
+ declare
+Conv : constant Convention_Id := Convention (Etype (Formal));
+ begin
+if Is_Aliased (Formal) then
+   Set_Mechanism (Formal, By_Reference);
 
---  Warn if user asked this to be passed by copy
+   --  Warn if user asked this to be passed by copy
 
-if Convention (Formal_Type) = Convention_Ada_Pass_By_Copy then
-   Error_Msg_N
- ("cannot pass aliased parameter & by copy??", Formal);
-end if;
+   if Conv = Convention_Ada_Pass_By_Copy then
+  Error_Msg_N
+("cannot pass aliased parameter & by copy??", Formal);
+   end if;
 
- --  Force mechanism if type has Convention Ada_Pass_By_Ref/Copy
+--  Force mechanism if type has Convention Ada_Pass_By_Ref/Copy
 
- elsif Convention (Formal_Type) = Convention_Ada_Pass_By_Copy then
-Set_Mechanism (Formal, By_Copy);
+elsif Conv = Convention_Ada_Pass_By_Copy then
+   Set_Mechanism (Formal, By_Copy);
 
- elsif Convention (Formal_Type) = Convention_Ada_Pass_By_Reference then
-Set_Mechanism (Formal, By_Reference);
- end if;
+elsif Conv = Convention_Ada_Pass_By_Reference then
+   Set_Mechanism (Formal, By_Reference);
+end if;
+ end;
 
   <>
  Next (Param_Spec);


[Ada] Representation information for nested subprograms

2016-05-02 Thread Arnaud Charlet
With this patch the representation information generated with the -gnatR
compilation switch includes information on subprograms nested within
subprogram bodies.

Executing

   gcc -c -gnatRm p.adb

must yield:

---

Representation information for unit P (body)

procedure inner declared at p.adb:9:17
  convention : Ada

procedure inner2 declared at p.adb:11:20
  convention : Ada
  f : passed by copy

function sum declared at p.adb:23:16
  convention : Ada
  vec : passed by reference
  returns by copy

Representation information for unit P (spec)

for T'Alignment use 4;
for T'Component_Size use 32;

procedure s declared at p.ads:5:14
  convention : Ada
  a : passed by reference

---
package P
is
   type T is array (Positive range <>) of Integer;
   
   procedure S (A : in out T);
end P;
---
package body P is

   ---
   -- S --
   ---

   procedure S (A : in out T) is
  X, Y : Integer;
  procedure Inner
  is
 procedure Inner2 (F : in out Integer)
 is
 begin
F := F * 45;
 end Inner2;
 
  begin
 X := Y + 1;
 Inner2 (X);
 Y := Y - 3;
  end Inner;
  
  function Sum (Vec : T) return Integer is
 Res : Integer := 0;
  begin
 for I in Vec'range loop
Res := Res + Vec (I);
 end loop;
 return Res;
  end Sum;

   begin
  X := A (1);
  Y := A (2);
  
  Inner;
  
  A (1) := X + 1;
  A (2) := X + 3;
  
   end S;

end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Ed Schonberg  

* repinfo.adb (List_Entities): Make procedure recursive, to
provide representation information for subprograms declared
within subprogram bodies.
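
The recursion itself is simple; a reduced C model of it, assuming a
First_Entity/Next_Entity style chain, could look like the sketch below. The
struct and the function are hypothetical and ignore everything repinfo does
besides finding nested subprograms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-reduced entity: a name, a kind flag and the two
   links used to walk the entities local to a scope.  */
struct entity
{
  const char *name;
  bool is_subprogram;
  const struct entity *first_entity;  /* first local entity, or NULL */
  const struct entity *next_entity;   /* next sibling, or NULL */
};

/* Append to OUT the names of all subprograms declared inside SCOPE, at
   any nesting depth, in declaration order.  N is the number of names
   already stored; the new count is returned.  */
size_t
list_subprograms (const struct entity *scope, const char **out,
                  size_t cap, size_t n)
{
  for (const struct entity *e = scope->first_entity; e; e = e->next_entity)
    if (e->is_subprogram)
      {
        assert (n < cap);
        out[n++] = e->name;
        /* Recurse into entities local to the subprogram.  */
        n = list_subprograms (e, out, cap, n);
      }
  return n;
}
```

Applied to a model of procedure S from the example above, this visits
inner, inner2 and sum in that order, which matches the order of the
expected output for the body.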

Index: repinfo.adb
===
--- repinfo.adb (revision 235710)
+++ repinfo.adb (working copy)
@@ -135,10 +135,15 @@
--  Called before outputting anything for an entity. Ensures that
--  a blank line precedes the output for a particular entity.
 
-   procedure List_Entities (Ent : Entity_Id; Bytes_Big_Endian : Boolean);
+   procedure List_Entities
+ (Ent : Entity_Id;
+  Bytes_Big_Endian : Boolean;
+  In_Subprogram: Boolean := False);
--  This procedure lists the entities associated with the entity E, starting
--  with the First_Entity and using the Next_Entity link. If a nested
--  package is found, entities within the package are recursively processed.
+   --  When recursing within a subprogram body, Is_Subprogram suppresses
+   --  duplicate information about signature.
 
procedure List_Name (Ent : Entity_Id);
--  List name of entity Ent in appropriate case. The name is listed with
@@ -314,7 +319,11 @@
-- List_Entities --
---
 
-   procedure List_Entities (Ent : Entity_Id; Bytes_Big_Endian : Boolean) is
+   procedure List_Entities
+ (Ent : Entity_Id;
+  Bytes_Big_Endian : Boolean;
+  In_Subprogram: Boolean := False)
+   is
   Body_E : Entity_Id;
   E  : Entity_Id;
 
@@ -353,12 +362,15 @@
 and then Nkind (Declaration_Node (Ent)) not in N_Renaming_Declaration
   then
  --  If entity is a subprogram and we are listing mechanisms,
- --  then we need to list mechanisms for this entity.
+ --  then we need to list mechanisms for this entity. We skip this
+ --  if it is a nested subprogram, as the information has already
+ --  been produced when listing the enclosing scope.
 
  if List_Representation_Info_Mechanisms
and then (Is_Subprogram (Ent)
   or else Ekind (Ent) = E_Entry
   or else Ekind (Ent) = E_Entry_Family)
+   and then not In_Subprogram
  then
 Need_Blank_Line := True;
 List_Mechanisms (Ent);
@@ -386,6 +398,13 @@
  List_Mechanisms (E);
   end if;
 
+  --  Recurse into entities local to subprogram
+
+  List_Entities (E, Bytes_Big_Endian, True);
+
+   elsif Ekind (E) in Formal_Kind and then In_Subprogram then
+  null;
+
elsif Ekind_In (E, E_Entry,
   E_Entry_Family,
   E_Subprogram_Type)


[Ada] Fix Ada.Directories.Delete_Tree not to change current directory

2016-05-02 Thread Arnaud Charlet
... which is very unfriendly to tasking programs, where a task running
Delete_Tree could cause an unexpected change of current directory in another
task running concurrently.

The program below is expected to display

  No directory change observed

--

with Ada.Text_IO; use Ada.Text_IO;
with Ada.Directories;

procedure P0 is

   Root_Directory : constant String := Ada.Directories.Current_Directory;
   Temp_Path : constant String := Root_Directory & "/tmp/";

   --  The idea is to have 2 tasks:

   --  One which repeatedly creates and deletes a dir until requested to stop.

   --  and

   --  Another which monitors changes to its current directory.

   --  The two tasks aren't explicitly synchronized, except for the monitor
   --  requesting the creation/deletion task to stop when it's done monitoring
   --  for a while (fixed number of iterations).

   --  We expect the OS scheduler to let the monitor run concurrently with the
   --  creation/deletion task, hopefully while the latter is performing
   --  Delete_Tree, so a change of current directory there would be observed.

   --  This is not foolproof but was showing the problem consistently on
   --  at least a couple of native Linux and Windows platforms before the
   --  correction was applied.

   task Create_Delete_Dir is
  entry Stop;
   end;

   task Monitor_Current_Dir;

   task body Create_Delete_Dir is
  Stop_Requested : Boolean := False;
   begin
  while not Stop_Requested loop
 select
accept Stop;
Stop_Requested := True;
 else
Ada.Directories.Create_Path (Temp_Path);
Ada.Directories.Delete_Tree (Temp_Path);
 end select;
  end loop;
   end;

   task body Monitor_Current_Dir is
  Dir_Change_Iteration : Integer := 0;
   begin
  for I in 1 .. 10 loop
 if Ada.Directories.Current_Directory /= Root_Directory then
Dir_Change_Iteration := I;
exit;
 end if;
  end loop;

  if Dir_Change_Iteration > 0 then
 Put_Line ("Directory change at "
 & Integer'Image(Dir_Change_Iteration));
  else
 Put_Line ("No directory change observed");
  end if;

  --  Done monitoring, request the creation/deletion task
  --  to stop and exit.

  Create_Delete_Dir.Stop;

   end;

begin
   null;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Olivier Hainque  

* a-direct.adb (Delete_Tree): Use full names to designate subdirs
and files therein, instead of local names after a change of
current directory.
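
The same approach expressed in C on a POSIX system, as a rough sketch and
not a translation of a-direct.adb: every entry is reached through a full
name built from the directory and the simple name, and chdir is never
called. The 4096-byte name buffer is an arbitrary choice for the example.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Recursively delete DIR without ever changing the current directory.
   Returns 0 on success, -1 on the first failure.  */
int
delete_tree (const char *dir)
{
  DIR *d = opendir (dir);
  if (d == NULL)
    return -1;

  int status = 0;
  struct dirent *ent;
  while (status == 0 && (ent = readdir (d)) != NULL)
    {
      const char *sname = ent->d_name;
      if (strcmp (sname, ".") == 0 || strcmp (sname, "..") == 0)
        continue;

      /* Full name of the entry, relative to nothing but DIR itself.  */
      char fname[4096];
      int len = snprintf (fname, sizeof fname, "%s/%s", dir, sname);
      if (len < 0 || (size_t) len >= sizeof fname)
        {
          status = -1;          /* full name does not fit */
          break;
        }

      struct stat st;
      if (lstat (fname, &st) != 0)
        status = -1;
      else if (S_ISDIR (st.st_mode))
        status = delete_tree (fname);
      else
        status = unlink (fname);
    }

  closedir (d);
  return status == 0 ? rmdir (dir) : -1;
}
```

Because the process-wide current directory is never touched, other threads
calling getcwd concurrently see a stable value.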

Index: a-direct.adb
===
--- a-direct.adb(revision 235706)
+++ a-direct.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 2004-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 2004-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -597,7 +597,6 @@
-
 
procedure Delete_Tree (Directory : String) is
-  Current_Dir : constant String := Current_Directory;
   Search  : Search_Type;
   Dir_Ent : Directory_Entry_Type;
begin
@@ -611,28 +610,32 @@
  raise Name_Error with '"' & Directory & """ not a directory";
 
   else
- Set_Directory (Directory);
 
- Start_Search (Search, Directory => ".", Pattern => "");
+ --  We used to change the current directory to Directory here,
+ --  allowing the use of a local Simple_Name for all references. This
+ --  turned out unfriendly to multitasking programs, where tasks
+ --  running in parallel of this Delete_Tree could see their current
+ --  directory change unpredictably. We now resort to Full_Name
+ --  computations to reach files and subdirs instead.
+
+ Start_Search (Search, Directory => Directory, Pattern => "");
  while More_Entries (Search) loop
 Get_Next_Entry (Search, Dir_Ent);
 
 declare
-   File_Name : constant String := Simple_Name (Dir_Ent);
-
+   Sname : constant String := Simple_Name (Dir_Ent);
+   Fname : constant String := Full_Name (Dir_Ent);
 begin
-   if OS_Lib.Is_Directory (File_Name) then
-  if File_Name /= "." and then File_Name /= ".." then
- Delete_Tree (File_Name);
+   if OS_Lib.Is_Directory (Fname) then
+  if 

[PATCH][Committed] [ARC] Fix warnings, update source code.

2016-05-02 Thread Claudiu Zissulescu
Small syntactic fixes. Committed as obvious.

Best,
Claudiu

gcc/
2016-05-02  Claudiu Zissulescu  

* config/arc/arc.c (arc_preferred_simd_mode): Remove enum keyword.
(arc_save_restore): Likewise.
(arc_dwarf_register_span): Likewise.
(arc_output_pic_addr_const): Initialize suffix variable.
---
 gcc/ChangeLog| 7 +++
 gcc/config/arc/arc.c | 9 +
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index fecfdab..35b24f5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,12 @@
 2016-05-02  Claudiu Zissulescu  
 
+   * config/arc/arc.c (arc_preferred_simd_mode): Remove enum keyword.
+   (arc_save_restore): Likewise.
+   (arc_dwarf_register_span): Likewise.
+   (arc_output_pic_addr_const): Initialize suffix variable.
+
+2016-05-02  Claudiu Zissulescu  
+
* config/arc/arc-protos.h (compact_memory_operand_p): Declare.
* config/arc/arc.c (arc_output_commutative_cond_exec): Consider
bmaskn instruction.
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index a54fddb..49edc0a 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -264,8 +264,8 @@ arc_vector_mode_supported_p (machine_mode mode)
 
 /* Implements target hook TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
 
-static enum machine_mode
-arc_preferred_simd_mode (enum machine_mode mode)
+static machine_mode
+arc_preferred_simd_mode (machine_mode mode)
 {
   switch (mode)
 {
@@ -2347,7 +2347,7 @@ arc_save_restore (rtx base_reg,
 
   for (regno = 0; regno <= 31; regno++)
{
- enum machine_mode mode = SImode;
+ machine_mode mode = SImode;
  bool found = false;
 
  if (TARGET_LL64
@@ -5124,6 +5124,7 @@ arc_output_pic_addr_const (FILE * file, rtx x, int code)
suffix = "@dtpoff";
  break;
default:
+ suffix = "@invalid";
  output_operand_lossage ("invalid UNSPEC as operand: %d", XINT (x,1));
  break;
}
@@ -9847,7 +9848,7 @@ arc_no_speculation_in_delay_slots_p ()
 static rtx
 arc_dwarf_register_span (rtx rtl)
 {
-   enum machine_mode mode = GET_MODE (rtl);
+   machine_mode mode = GET_MODE (rtl);
unsigned regno;
rtx p;
 
-- 
1.9.1



Re: Support << and >> for offset_int and widest_int

2016-05-02 Thread Richard Sandiford
"H.J. Lu"  writes:
> On Fri, Apr 29, 2016 at 5:30 AM, Richard Sandiford
>  wrote:
>> Following on from the comparison patch, I think it makes sense to
>> support << and >> for offset_int (int128_t) and widest_int (intNNN_t),
>> with >> being arithmetic shift.  It doesn't make sense to use
>> logical right shift on a potentially negative offset_int, since
>> the precision of 128 bits has no meaning on the target.
>>
>> Tested on x86_64-linux-gnu and aarch64-linux-gnu.  OK to install?
>>
>> Thanks,
>> Richard
>>
>>
>> gcc/
>> * wide-int.h: Update offset_int and widest_int documentation.
>> (WI_SIGNED_SHIFT_RESULT): New macro.
>> (wi::binary_shift): Define signed_shift_result_type for
>> shifts on offset_int- and widest_int-like types.
>> (generic_wide_int): Support <<= and >>= if << and >> are supported.
>> * tree.h (int_bit_position): Use shift operators instead of wi::
>>  shifts.
>> * alias.c (adjust_offset_for_component_ref): Likewise.
>> * expr.c (get_inner_reference): Likewise.
>> * fold-const.c (fold_comparison): Likewise.
>> * gimple-fold.c (fold_nonarray_ctor_reference): Likewise.
>> * gimple-ssa-strength-reduction.c (restructure_reference): Likewise.
>> * tree-dfa.c (get_ref_base_and_extent): Likewise.
>> * tree-ssa-alias.c (indirect_ref_may_alias_decl_p): Likewise.
>> (stmt_kills_ref_p): Likewise.
>> * tree-ssa-ccp.c (bit_value_binop_1): Likewise.
>> * tree-ssa-math-opts.c (find_bswap_or_nop_load): Likewise.
>> * tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
>> (ao_ref_init_from_vn_reference): Likewise.
>>
>> gcc/cp/
>> * init.c (build_new_1): Use shift operators instead of wi:: shifts.
>
> Can you also update change_zero_ext in combine.c:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70687
>
> It should use wide_int << instead of HOST_WIDE_INT << to
> support __int128.

The patch doesn't add wide_int shift operators since wide_ints have no
sign.  You need to use wi:: shift routines for them instead.  However,
there's already a wi::mask function for creating this kind of value
directly.
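
To illustrate the kind of value in question without using any wide-int
API, here is a plain C sketch that builds a mask of the low WIDTH bits in a
128-bit type. It relies on unsigned __int128, a GCC/Clang extension on
64-bit targets; the typedef and function name are made up for the example
and this is not how wi::mask is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit unsigned type; a compiler extension, assumed to be available.  */
typedef unsigned __int128 u128;

/* Return a value with the low WIDTH bits set and all others clear.  */
u128
low_mask (unsigned int width)
{
  assert (width <= 128);
  if (width == 128)
    return ~(u128) 0;           /* shifting by the full width is undefined */
  return ((u128) 1 << width) - 1;
}
```

Doing the same shift in a 64-bit HOST_WIDE_INT stops working once the
width reaches 64, which is the __int128 case the PR is about.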

Like you say, the PR is about converting other code to use wi::, which
is very different from what this patch is doing.  I'll take the PR
anyway though.  Hope to post a patch on Wednesday.

It also looks like the code is missing a check that the ZERO_EXTEND mode
is scalar, since the transformation would be incorrect for vectors.

Thanks,
Richard


Re: [PATCH][GCC7] Remove scaling of COMPONENT_REF/ARRAY_REF ops 2/3

2016-05-02 Thread Richard Biener
On Mon, 2 May 2016, Eric Botcazou wrote:

> > The following experiment resulted from looking at making
> > array_ref_low_bound and array_ref_element_size non-mutating.  Again
> > I wondered why we do this strange scaling by offset/element alignment.
> 
> The idea is to expose the alignment factor to the RTL expander:
> 
>   tree tem
> = get_inner_reference (exp, &bitsize, &bitpos, &offset, &mode1,
>&unsignedp, &reversep, &volatilep, true);
> 
> [...]
> 
>   rtx offset_rtx = expand_expr (offset, NULL_RTX, VOIDmode,
> EXPAND_SUM);
> 
> [...]
> 
>   op0 = offset_address (op0, offset_rtx,
> highest_pow2_factor (offset));
> 
> With the scaling, offset is something like _69 * 4 so highest_pow2_factor can 
> see the factor and passes it down to offset_address:
> 
> (gdb) p debug_rtx(op0)
> (mem/c:SI (plus:SI (reg/f:SI 193)
> (reg:SI 194)) [3 *s.16_63 S4 A32])
> 
> With your patch in the same situation:
> 
> (gdb) p debug_rtx(op0)
> (mem/c:SI (plus:SI (reg/f:SI 139)
> (reg:SI 116 [ _33 ])) [3 *s.16_63 S4 A8])
> 
> On strict-alignment targets, this makes a big difference, e.g. SPARC:
> 
>   ld  [%i4+%i5], %i0
> 
> vs
> 
>   ldub[%i5+%i4], %g1
>   sll %g1, 24, %g1
>   add %i5, %i4, %i5
>   ldub[%i5+1], %i0
>   sll %i0, 16, %i0
>   or  %i0, %g1, %i0
>   ldub[%i5+2], %g1
>   sll %g1, 8, %g1
>   or  %g1, %i0, %g1
>   ldub[%i5+3], %i0
>   or  %i0, %g1, %i0
> 
> 
> Now this is mitigated by a couple of things:
> 
>   1. the above pessimization only happens on the RHS; on the LHS, the 
> expander 
> calls highest_pow2_factor_for_target instead of highest_pow2_factor and the 
> former takes into account the type's alignment thanks to the MAX:
> 
> /* Similar, except that the alignment requirements of TARGET are
>taken into account.  Assume it is at least as aligned as its
>type, unless it is a COMPONENT_REF in which case the layout of
>the structure gives the alignment.  */
> 
> static unsigned HOST_WIDE_INT
> highest_pow2_factor_for_target (const_tree target, const_tree exp)
> {
>   unsigned HOST_WIDE_INT talign = target_align (target) / BITS_PER_UNIT;
>   unsigned HOST_WIDE_INT factor = highest_pow2_factor (exp);
> 
>   return MAX (factor, talign);
> }
> 
>   2. highest_pow2_factor can be rescued by the set_nonzero_bits machinery of 
> the SSA CCP pass because it calls tree_ctz.  The above example was compiled 
> with -O -fno-tree-ccp on SPARC; at -O, the code isn't pessimized.
> 
> > So - the following patch gets rid of that scaling.  For a "simple"
> > C testcase
> > 
> > void bar (void *);
> > void foo (int n)
> > {
> >   struct S { struct R { int b[n]; } a[2]; int k; } s;
> >   s.k = 1;
> >   s.a[1].b[7] = 3;
> >   bar (&s);
> > }
> 
> This only exposes the LHS case, here's a more complete testcase:
> 
> void bar (void *);
> 
> int foo (int n)
> {
>   struct S { struct R { char b[n]; } a[2]; int k; } s;
>   s.k = 1;
>   s.a[1].b[7] = 3;
>   bar (&s);
>   return s.k;
> }

Ok.  Note that on x86_64 at least SLSR wrecks the component-ref case:

@@ -30,10 +34,14 @@
   _9 = _8 + 4;
   s.1_11 = __builtin_alloca_with_align (_9, 32);
   _12 = _7 >> 2;
-  s.1_11->k{off: _12 * 4} = 1;
+  _18 = _12 * 4;
+  _19 = s.1_11 + _18;
+  MEM[(struct S *)_19] = 1;
   s.1_11->a[1]{lb: 0 sz: _5}.b[7] = 3;
   bar (s.1_11);
-  _16 = s.1_11->k{off: _12 * 4};
+  _20 = _12 * 4;
+  _21 = s.1_11 + _20;
+  _16 = MEM[(struct S *)_21];
   __builtin_stack_restore (saved_stack.3_3);
   return _16;

It seems to me that the underlying issue is that we compute alignment
from the pieces gathered by get_inner_reference, instead of computing
it alongside that information inside get_inner_reference, where we
could take advantage of DECL_OFFSET_ALIGN and the array element type
alignment.  This would be another opportunity to merge
get_inner_reference and get_object_alignment_2 (or have a common
worker).

That we expose the exact division in the IL has unfortunate side-effects
like above.
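
For reference, the property that an explicit "* 4" makes visible to
highest_pow2_factor can be sketched in C as below. The functions work on
plain integers rather than trees and their names are invented; the second
one mirrors the MAX in highest_pow2_factor_for_target quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two that divides X.  For X == 0 this sketch returns 0;
   that convention is an assumption of the example.  */
uint64_t
pow2_factor (uint64_t x)
{
  return x & (~x + 1);          /* isolate the lowest set bit */
}

/* Variant that never reports less than the alignment (in bytes) already
   known for the target.  */
uint64_t
pow2_factor_for_target (uint64_t x, uint64_t talign)
{
  uint64_t factor = pow2_factor (x);
  return factor > talign ? factor : talign;
}
```

An offset of the form idx * 4 always yields a factor of at least 4, whereas
an opaque SSA name gives nothing unless nonzero-bits information is
available for it.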

Do I understand you correctly that without using -fno-tree-ccp there
are currently no regressions?  What about -O0 then?  The code
generated by -O0 on x86_64 currently is quite horrible of course,
so maybe we don't care too much...  I think -Og disables CCP's
bit-tracking though.

I see that the constant offset parts are only sometimes folded
into op0 before calling offset_address with the variable offset parts.
Now with get_object_alignment_2 computing alignment and misalignment
one could split out the misaligning offset from bitpos in some way
to have op0 always be aligned.
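
On plain integers, the split I have in mind would look roughly like the C
sketch below: given a known power-of-two alignment, separate a bit position
into an aligned part and the remaining misalignment. The struct and
function are hypothetical and say nothing about how expand_expr_real_1
would actually carry the two parts.

```c
#include <assert.h>
#include <stdint.h>

/* Result of splitting a bit position against an alignment.  */
struct split_pos
{
  uint64_t aligned;             /* multiple of the alignment */
  uint64_t misalign;            /* remainder, smaller than the alignment */
};

/* Split BITPOS against ALIGN_BITS, which must be a power of two.  */
struct split_pos
split_bitpos (uint64_t bitpos, uint64_t align_bits)
{
  assert (align_bits != 0 && (align_bits & (align_bits - 1)) == 0);

  struct split_pos s;
  s.misalign = bitpos & (align_bits - 1);
  s.aligned = bitpos - s.misalign;
  return s;
}
```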

Something like

Index: gcc/expr.c
===
--- gcc/expr.c  (revision 235706)
+++ gcc/expr.c  (working copy)
@@ -10334,6 +10334,9 @@ expand_expr_real_1 (tree exp, rtx target
offset_rtx =

Re: Canonicalize X u< X to UNORDERED_EXPR

2016-05-02 Thread Richard Biener
On Mon, May 2, 2016 at 11:18 AM, Marc Glisse  wrote:
> On Mon, 2 May 2016, Richard Biener wrote:
>
>> On Sat, Apr 30, 2016 at 8:44 PM, Marc Glisse  wrote:
>>>
>>> Hello,
>>>
>>> this case seemed to be missing in the various X cmp X transformations. It
>>> does not change the generated code in the testcase.
>>>
>>> The missing :c is rather trivial. I can commit it separately if you
>>> prefer.
>>
>>
>> I think it's not missing.  Commuting the first one is enough to eventually
>> make the @1s match up.  I think you should get a diagnostic on a duplicate
>> pattern when adding another :c (hmm, no, it's indeed "different" patterns
>> but still redundant).
>
>
> Let's see. The current pattern is
>   (bitop (bit_and:c @0 @1) (bit_and @2 @1))
>
> This matches:
> (X & Y) ^ (Z & Y)
> (Y & X) ^ (Z & Y)
>
> If I have for instance (Y & X) ^ (Y & Z), I don't see how that is going to
> match, and indeed we don't simplify that.

Hmm, indeed.  I might have wrongly omitted some :c then in other places
as well.  So the original patch is ok.

> On the other hand, if I have
> bit_ior instead of bit_xor, then we have another very similar transformation
> about 100 lines up in match.pd:
>
> (for op (bit_and bit_ior)
>  rop (bit_ior bit_and)
>  (simplify
>   (op (convert? (rop:c @0 @1)) (convert? (rop @0 @2)))
>
> That one also marks only one of the two inner operations with :c, but there
> the shared operand is the first one rather than the second. We should
> probably reorganize them a bit.

Yeah.

Richard.

>
>>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>>
>>
>> Ok for the new pattern.
>>
>> Thanks,
>> Richard.
>>
>>> 2016-05-02  Marc Glisse  
>>>
>>> gcc/
>>> * match.pd ((A & B) OP (C & B)): Mark '&' as commutative.
>>> (X u< X, X u> X): New transformations
>>>
>>> gcc/testsuite/
>>> * gcc.dg/tree-ssa/unord.c: New testcase.
>>>
>>> --
>>> Marc Glisse
>>> Index: trunk/gcc/match.pd
>>> ===
>>> --- trunk/gcc/match.pd  (revision 235654)
>>> +++ trunk/gcc/match.pd  (working copy)
>>> @@ -783,21 +783,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>>@0)
>>>   /* (~x | y) & x -> x & y */
>>>   /* (~x & y) | x -> x | y */
>>>   (simplify
>>>(bitop:c (rbitop:c (bit_not @0) @1) @0)
>>>(bitop @0 @1)))
>>>
>>>  /* Simplify (A & B) OP0 (C & B) to (A OP0 C) & B. */
>>>  (for bitop (bit_and bit_ior bit_xor)
>>>   (simplify
>>> -  (bitop (bit_and:c @0 @1) (bit_and @2 @1))
>>> +  (bitop (bit_and:c @0 @1) (bit_and:c @2 @1))
>>>(bit_and (bitop @0 @2) @1)))
>>>
>>>  /* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
>>>  (simplify
>>>(bit_and (bit_ior @0 CONSTANT_CLASS_P@1) CONSTANT_CLASS_P@2)
>>>(bit_ior (bit_and @0 @2) (bit_and @1 @2)))
>>>
>>>  /* Combine successive equal operations with constants.  */
>>>  (for bitop (bit_and bit_ior bit_xor)
>>>   (simplify
>>> @@ -1914,20 +1914,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>>   (simplify
>>>(cmp @0 @0)
>>>(if (cmp != NE_EXPR
>>> || ! FLOAT_TYPE_P (TREE_TYPE (@0))
>>> || ! HONOR_NANS (@0))
>>> { constant_boolean_node (false, type); })))
>>>  (for cmp (unle unge uneq)
>>>   (simplify
>>>(cmp @0 @0)
>>>{ constant_boolean_node (true, type); }))
>>> +(for cmp (unlt ungt)
>>> + (simplify
>>> +  (cmp @0 @0)
>>> +  (unordered @0 @0)))
>>>  (simplify
>>>   (ltgt @0 @0)
>>>   (if (!flag_trapping_math)
>>>{ constant_boolean_node (false, type); }))
>>>
>>>  /* Fold ~X op ~Y as Y op X.  */
>>>  (for cmp (simple_comparison)
>>>   (simplify
>>>(cmp (bit_not@2 @0) (bit_not@3 @1))
>>>(if (single_use (@2) && single_use (@3))
>>> Index: trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c
>>> ===
>>> --- trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (revision 0)
>>> +++ trunk/gcc/testsuite/gcc.dg/tree-ssa/unord.c (working copy)
>>> @@ -0,0 +1,7 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O -fdump-tree-optimized" } */
>>> +
>>> +int f(double a){double b=a;return !__builtin_islessequal(a,b);}
>>> +int g(double a){double b=a;return !__builtin_isgreaterequal(a,b);}
>>> +
>>> +/* { dg-final { scan-tree-dump-times " unord " 2 "optimized" } } */
>
>
> --
> Marc Glisse


Re: [PATCH][genrecog] Fix warning about potentially uninitialised use of label

2016-05-02 Thread Richard Sandiford
Kyrill Tkachov  writes:
> Hi all,
>
> I'm getting a warning when building genrecog that 'label' may be used
> uninitialised in:
>
>uint64_t label = 0;
>
>if (d->test.kind == rtx_test::CODE
>&& d->if_statement_p (&label)
>&& label == CONST_INT)
>
> This is because if_statement_p looks like this:
>   inline bool
>   decision::if_statement_p (uint64_t *label) const
>   {
> if (singleton () && first->labels.length () == 1)
>   {
> if (label)
>   *label = first->labels[0];
> return true;
>   }
> return false;
>   }
>
> It's not guaranteed to write label.

It is guaranteed to write to label on a true return though, so it looks
like a false positive.  Is current GCC warning for this or are you using
an older host compiler?
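
The contract can be shown with a reduced C model: the out-parameter is
written if and only if the function returns true, so a caller that reads it
only after a true return never sees an uninitialized value. The struct
below is a hypothetical stand-in for genrecog's decision class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much-reduced stand-in for genrecog's decision class.  */
struct decision
{
  size_t num_labels;
  uint64_t first_label;
};

/* Return true if D has exactly one label; only then store it in *LABEL
   (when LABEL is nonnull).  On a false return *LABEL is left untouched.  */
bool
if_statement_p (const struct decision *d, uint64_t *label)
{
  assert (d);
  if (d->num_labels == 1)
    {
      if (label)
        *label = d->first_label;
      return true;
    }
  return false;
}
```

A compiler that cannot see through the call has to assume the read after
the && may happen without a prior write, which is where a
maybe-uninitialized warning would come from.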

Thanks,
Richard


Re: Fix PR rtl-optimization/70886

2016-05-02 Thread Bernd Schmidt

On 05/02/2016 09:12 AM, Eric Botcazou wrote:

The pointer comparison is not stable for VALUEs when cselib is used (this is
the business of canonical cselib values).  I tried rtx_equal_for_cselib_p here
but this doesn't work because there are dangling VALUEs during scheduling
(VALUEs whose associated cselib value has been reclaimed but still referenced
as addresses of MEMs on lists).


Dangling as in it has a null VAL_PTR because it was decided it's useless?


2016-05-02  Eric Botcazou  

PR rtl-optimization/70886
* sched-deps.c (estimate_dep_weak): Canonicalize cselib values.

* cselib.h (rtx_equal_for_cselib_1): Declare.
(rtx_equal_for_cselib_p): New inline function.
* cselib.c (rtx_equal_for_cselib_p): Delete.
(rtx_equal_for_cselib_1): Make public.


I think this is OK.


Bernd


[PATCH, i386]: Merge FP ops patterns

2016-05-02 Thread Uros Bizjak
Hello!

2016-05-02  Uros Bizjak  

* config/i386/predicates.md (nonimm_ssenomem_operand): New predicate.
(register_mixssei387nonimm_operand): Remove predicate.
* config/i386/i386.md (*fop__comm): Merge from
*fop__comm_mixed and *fop__comm_i387.  Disable unsupported
alternatives using "enabled" attribute.  Also check X87_ENABLE_ARITH
for TARGET_MIX_SSE_I387 alternatives.
(*fop__1): Merge from *fop__1_mixed and *fop__1_i387.
Disable unsupported alternatives using "enabled" attribute.  Use
nonimm_ssenomem_operand as operand 1 predicate.  Also check
X87_ENABLE_ARITH for TARGET_MIX_SSE_I387 alternatives.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 235693)
+++ i386.md (working copy)
@@ -13989,12 +13989,13 @@
 ;; Gcc is slightly more smart about handling normal two address instructions
 ;; so use special patterns for add and mull.
 
-(define_insn "*fop__comm_mixed"
+(define_insn "*fop__comm"
   [(set (match_operand:MODEF 0 "register_operand" "=f,x,v")
(match_operator:MODEF 3 "binary_fp_operator"
  [(match_operand:MODEF 1 "nonimmediate_operand" "%0,0,v")
   (match_operand:MODEF 2 "nonimmediate_operand" "fm,xm,vm")]))]
-  "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH
+  "((SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
+|| (TARGET_80387 && X87_ENABLE_ARITH (mode)))
&& COMMUTATIVE_ARITH_P (operands[3])
&& !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "* return output_387_binary_op (insn, operands);"
@@ -14010,26 +14011,18 @@
(set_attr "prefix" "orig,orig,vex")
(set_attr "mode" "")
(set (attr "enabled")
- (cond [(eq_attr "alternative" "0")
-  (symbol_ref "TARGET_MIX_SSE_I387")
-  ]
-   (const_string "*")))])
+ (if_then_else
+   (match_test ("SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH"))
+   (if_then_else
+(eq_attr "alternative" "0")
+(symbol_ref "TARGET_MIX_SSE_I387
+ && X87_ENABLE_ARITH (mode)")
+(const_string "*"))
+   (if_then_else
+(eq_attr "alternative" "0")
+(symbol_ref "true")
+(symbol_ref "false"])
 
-(define_insn "*fop__comm_i387"
-  [(set (match_operand:MODEF 0 "register_operand" "=f")
-   (match_operator:MODEF 3 "binary_fp_operator"
- [(match_operand:MODEF 1 "nonimmediate_operand" "%0")
-  (match_operand:MODEF 2 "nonimmediate_operand" "fm")]))]
-  "TARGET_80387 && X87_ENABLE_ARITH (mode)
-   && COMMUTATIVE_ARITH_P (operands[3])
-   && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
-  "* return output_387_binary_op (insn, operands);"
-  [(set (attr "type")
-   (if_then_else (match_operand:MODEF 3 "mult_operator")
-  (const_string "fmul")
-  (const_string "fop")))
-   (set_attr "mode" "")])
-
 (define_insn "*rcpsf2_sse"
   [(set (match_operand:SF 0 "register_operand" "=x")
(unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm")]
@@ -14042,14 +14035,15 @@
(set_attr "prefix" "maybe_vex")
(set_attr "mode" "SF")])
 
-(define_insn "*fop__1_mixed"
+(define_insn "*fop__1"
   [(set (match_operand:MODEF 0 "register_operand" "=f,f,x,v")
(match_operator:MODEF 3 "binary_fp_operator"
  [(match_operand:MODEF 1
-"register_mixssei387nonimm_operand" "0,fm,0,v")
+"nonimm_ssenomem_operand" "0,fm,0,v")
   (match_operand:MODEF 2
-"nonimmediate_operand"  "fm,0,xm,vm")]))]
-  "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH
+"nonimmediate_operand""fm,0,xm,vm")]))]
+  "((SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
+|| (TARGET_80387 && X87_ENABLE_ARITH (mode)))
&& !COMMUTATIVE_ARITH_P (operands[3])
&& !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "* return output_387_binary_op (insn, operands);"
@@ -14065,28 +14059,18 @@
(set_attr "prefix" "orig,orig,orig,vex")
(set_attr "mode" "")
(set (attr "enabled")
- (cond [(eq_attr "alternative" "0,1")
-  (symbol_ref "TARGET_MIX_SSE_I387")
-  ]
-   (const_string "*")))])
+ (if_then_else
+   (match_test ("SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH"))
+   (if_then_else
+(eq_attr "alternative" "0,1")
+(symbol_ref "TARGET_MIX_SSE_I387
+ && X87_ENABLE_ARITH (mode)")
+(const_string "*"))
+   (if_then_else
+(eq_attr "alternative" "0,1")
+(symbol_ref "true")
+(symbol_ref "false"])
 
-;; This pattern is not fully shadowed by the pattern above.
-(define_insn "*fop__1_i387"
-  [(set (match_operand:MODEF 0 "register_operand" "=f,f")
-   (match_operator:MODEF 3 "binary_fp_operator"
- [(match_operand:MODEF 1 "

[Ada] Correctly set Last when calling Text_IO.Get_Line on empty string

2016-05-02 Thread Arnaud Charlet
The Ada RM requires that procedure Text_IO.Get_Line set Last to
Item'First - 1 when parameter Item is the empty string, which was not
done until now.
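
The ordering issue can be sketched in C. This is not the a-tigeli.adb code: get_line_sketch and its parameters are hypothetical, and Ada's index range Item'First .. Item'Last is modelled with two ints. The sketch only shows why the default for Last has to be assigned before the early exit for an empty Item.

```c
#include <assert.h>
#include <limits.h>

/* Reads characters from INPUT up to a newline or the terminating NUL into
   ITEM, which covers the Ada-style index range FIRST .. LAST_INDEX and is
   empty when LAST_INDEX < FIRST.  Returns the index of the last character
   stored, or FIRST - 1 when nothing was stored.  */
int get_line_sketch(const char *input, char *item, int first, int last_index)
{
  int last;
  int i;

  /* Ada raises Constraint_Error when Item'First - 1 is not representable;
     this sketch simply rejects that case.  */
  assert(first > INT_MIN);

  /* Set the default before any early return, as the patch does.  */
  last = first - 1;

  /* Immediate exit for the empty string.  */
  if (last_index < first)
    return last;

  for (i = first; ; i++) {
    char c = input[i - first];

    if (c == '\0' || c == '\n')
      break;
    item[i - first] = c;
    last = i;
    if (i == last_index)
      break;
  }
  return last;
}
```

If the assignment to last were placed after the empty-string check, the early return would hand back an unset value, which is the situation the patch addresses.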

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Yannick Moy  

* a-tigeli.adb (Get_Line): Always set Last prior to returning.

Index: a-tigeli.adb
===
--- a-tigeli.adb(revision 235710)
+++ a-tigeli.adb(working copy)
@@ -150,6 +150,12 @@
 begin
FIO.Check_Read_Status (AP (File));
 
+   --  Set Last to Item'First - 1 when no characters are read, as mandated by
+   --  Ada RM. In the case where Item'First is negative or null, this results
+   --  in Constraint_Error being raised.
+
+   Last := Item'First - 1;
+
--  Immediate exit for null string, this is a case in which we do not
--  need to test for end of file and we do not skip a line mark under
--  any circumstances.
@@ -160,8 +166,6 @@
 
N := Item'Last - Item'First + 1;
 
-   Last := Item'First - 1;
-
--  Here we have at least one character, if we are immediately before
--  a line mark, then we will just skip past it storing no characters.
 


libgomp: In OpenACC testing, cycle through $offload_targets, and by default only build for the offload target that we're actually going to test (was: check-target-libgomp wall time, without vs. with of

2016-05-02 Thread Thomas Schwinge
Hi Jakub!

On Fri, 29 Apr 2016 09:43:41 +0200, Jakub Jelinek  wrote:
> On Thu, Apr 28, 2016 at 12:43:43PM +0200, Thomas Schwinge wrote:
> > commit 3b521f3e35fdb4b320e95b5f6a82b8d89399481a
> > Author: Thomas Schwinge 
> > Date:   Thu Apr 21 11:36:39 2016 +0200
> > 
> > libgomp: Unconfuse offload plugins vs. offload targets
> 
> I don't like this patch at all, rather than unconfusing stuff it
> makes stuff confusing.  Plugins are just a way to support various
> offloading targets.

Huh; my patch exactly clarifies that the offload_targets variable does
not actually list offload target names, but does list libgomp offload
plugin names...
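
As an illustration of the distinction, here is a minimal sketch in C of the target-to-plugin mapping, modelled on the case statement in the configfrag.ac hunk further down in this message. The function name is hypothetical, and the plugin name for hsa is an assumption, since that part of the hunk is cut off in this copy.

```c
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Maps an offload target name to the name of the libgomp plugin that
   serves it.  Returns NULL for targets that match none of the patterns.  */
const char *offload_plugin_for_target(const char *tgt)
{
  /* *-intelmic-* | *-intelmicemul-*  */
  if (fnmatch("*-intelmic-*", tgt, 0) == 0
      || fnmatch("*-intelmicemul-*", tgt, 0) == 0)
    return "intelmic";

  /* nvptx*  */
  if (fnmatch("nvptx*", tgt, 0) == 0)
    return "nvptx";

  /* hsa (exact match after the patch; plugin name assumed).  */
  if (strcmp(tgt, "hsa") == 0)
    return "hsa";

  return NULL;
}
```

Several target names can map to one plugin name, so a list of targets and a list of plugins carry different information even when they are built in the same order.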

> Can you please post just a short patch without all those changes
> that does what you want, rather than renaming everything at the same time?

I thought incremental, self-contained patches were easier to review.
Anyway, here's the three patches merged into one:

commit 8060ae3474072eef685381d80f566d1c0942c603
Author: Thomas Schwinge 
Date:   Thu Apr 21 11:36:39 2016 +0200

libgomp: In OpenACC testing, cycle through $offload_targets, and by default 
only build for the offload target that we're actually going to test

libgomp/
* plugin/configfrag.ac (offload_targets): Actually enumerate
offload targets, and add...
(offload_plugins): ... this one to enumerate offload plugins.
(OFFLOAD_PLUGINS): Renamed from OFFLOAD_TARGETS.
* target.c (gomp_target_init): Adjust to that.
* testsuite/lib/libgomp.exp: Likewise.
(offload_targets_s, offload_targets_s_openacc): Remove variables.
(offload_target_to_openacc_device_type): New proc.
(check_effective_target_openacc_nvidia_accel_selected)
(check_effective_target_openacc_host_selected): Examine
$openacc_device_type instead of $offload_target_openacc.
* Makefile.in: Regenerate.
* config.h.in: Likewise.
* configure: Likewise.
* testsuite/Makefile.in: Likewise.
* testsuite/libgomp.oacc-c++/c++.exp: Cycle through
$offload_targets (plus "disable") instead of
$offload_targets_s_openacc, and add "-foffload=$offload_target" to
tagopt.
* testsuite/libgomp.oacc-c/c.exp: Likewise.
* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
---
 libgomp/Makefile.in|  1 +
 libgomp/config.h.in|  4 +-
 libgomp/configure  | 44 +++--
 libgomp/plugin/configfrag.ac   | 39 +++-
 libgomp/target.c   |  8 +--
 libgomp/testsuite/Makefile.in  |  1 +
 libgomp/testsuite/lib/libgomp.exp  | 72 ++
 libgomp/testsuite/libgomp.oacc-c++/c++.exp | 30 +
 libgomp/testsuite/libgomp.oacc-c/c.exp | 30 +
 libgomp/testsuite/libgomp.oacc-fortran/fortran.exp | 22 ---
 10 files changed, 142 insertions(+), 109 deletions(-)

diff --git libgomp/Makefile.in libgomp/Makefile.in
[snipped]
diff --git libgomp/config.h.in libgomp/config.h.in
[snipped]
diff --git libgomp/configure libgomp/configure
[snipped]
diff --git libgomp/plugin/configfrag.ac libgomp/plugin/configfrag.ac
index 88b4156..de0a6f6 100644
--- libgomp/plugin/configfrag.ac
+++ libgomp/plugin/configfrag.ac
@@ -26,8 +26,6 @@
 # see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 # .
 
-offload_targets=
-AC_SUBST(offload_targets)
 plugin_support=yes
 AC_CHECK_LIB(dl, dlsym, , [plugin_support=no])
 if test x"$plugin_support" = xyes; then
@@ -142,7 +140,13 @@ AC_SUBST(PLUGIN_HSA_LIBS)
 
 
 
-# Get offload targets and path to install tree of offloading compiler.
+# Parse offload targets, and figure out libgomp plugin, and configure the
+# corresponding offload compiler.  offload_plugins and offload_targets will be
+# populated in the same order.
+offload_plugins=
+offload_targets=
+AC_SUBST(offload_plugins)
+AC_SUBST(offload_targets)
 offload_additional_options=
 offload_additional_lib_paths=
 AC_SUBST(offload_additional_options)
@@ -151,13 +155,13 @@ if test x"$enable_offload_targets" != x; then
   for tgt in `echo $enable_offload_targets | sed -e 's#,# #g'`; do
 tgt_dir=`echo $tgt | grep '=' | sed 's/.*=//'`
 tgt=`echo $tgt | sed 's/=.*//'`
-tgt_name=
+tgt_plugin=
 case $tgt in
   *-intelmic-* | *-intelmicemul-*)
-   tgt_name=intelmic
+   tgt_plugin=intelmic
;;
   nvptx*)
-tgt_name=nvptx
+   tgt_plugin=nvptx
PLUGIN_NVPTX=$tgt
PLUGIN_NVPTX_CPPFLAGS=$CUDA_DRIVER_CPPFLAGS
PLUGIN_NVPTX_LDFLAGS=$CUDA_DRIVER_LDFLAGS
@@ -184,7 +188,7 @@ if test x"$enable_offload_targets" != x; then
;;
esac
;;
-  hsa*)
+  hsa)
case "${target}" in
  x86_64-*-*)
case " ${CC} ${CFLAGS} " in
@@ -192,7 +196,7 @@ if test x"$enable_offload_

[Ada] Undefined symbol with interface types

2016-05-02 Thread Arnaud Charlet
When compiling with optimizations enabled, the linker may report undefined
symbols of the form "xxxTs" associated with types that implement
interface types. This issue was introduced on 2016-01-16 as part of
improving the support of subprograms that are inlined across units
with optimization.

No small reproducer available.

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Javier Miranda  

* exp_disp.adb (Make_Tags): Do not generate the
external name of interface tags adding the suffix counter since
it causes problems at link time when the IP routines are inlined
across units with optimization.

Index: exp_disp.adb
===
--- exp_disp.adb(revision 235706)
+++ exp_disp.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -6751,8 +6751,7 @@
if Building_Static_DT (Typ) then
   Iface_DT :=
 Make_Defining_Identifier (Loc,
-  Chars => New_External_Name
- (Typ_Name, 'T', Suffix_Index => -1));
+  Chars => New_External_Name (Typ_Name, 'T'));
   Import_DT
 (Tag_Typ => Related_Type (Node (AI_Tag_Comp)),
  DT  => Iface_DT,


[Ada] Predicate checks when Assertion policy is Ignore

2016-05-02 Thread Arnaud Charlet
This patch implements the proper semantics of predicated subtypes in various
contexts when the assertion policy is Ignore. This affects the semantics of
case constructs and object declarations when values that do not satisfy the
predicate are present.
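
The case-statement part of this change can be pictured with a small C analogue. Everything here is hypothetical: the subtype is imagined as an integer whose static predicate admits only 2, 4 and 6, and the status codes stand in for raising Constraint_Error. It is only a sketch of the control flow described in the exp_ch5.adb comment below, not of the expander itself.

```c
#include <stddef.h>

/* Stand-ins for normal completion and for raising Constraint_Error.  */
enum case_status { CASE_OK = 0, CASE_CONSTRAINT_ERROR = 1 };

/* The alternatives cover exactly the values that satisfy the imagined
   predicate.  When predicate checks are ignored, any other value can
   still reach the case statement, so an extra "others" branch reports
   the error instead of falling into the last alternative.  */
enum case_status name_small_even(int value, const char **name)
{
  switch (value) {
  case 2:
    *name = "two";
    return CASE_OK;
  case 4:
    *name = "four";
    return CASE_OK;
  case 6:
    *name = "six";
    return CASE_OK;
  default:
    /* Value does not satisfy the (ignored) predicate.  */
    *name = NULL;
    return CASE_CONSTRAINT_ERROR;
  }
}
```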

Tested in ACATS 4.0J tests C54003 and C457005

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Ed Schonberg  

* einfo.ads, einfo.adb (Predicates_Ignored): New flag to indicate
that predicate checking is disabled for predicated subtypes in
the context of an Assertion_Policy pragma.
* checks.adb (Apply_Predicate_Check): Do nothing if
Predicates_Ignored is true.
* exp_ch3.adb (Expand_Freeze_Enumeration_Type): If
Predicates_Ignored is true, the function Rep_To_Pos does raise
an exception for invalid data.
* exp_ch4.adb (Expand_N_Type_Conversion): If target is a predicated
type do not apply check if Predicates_Ignored is true.
* exp_ch5.adb (Expand_N_Case_Statement): If Predicates_Ignored
is true, sem_prag.adb:
* sem_ch3.adb (Analyze_Object_Declaration): If Predicates_Ignored
is true do not emit predicate check on initializing expression.

Index: exp_ch5.adb
===
--- exp_ch5.adb (revision 235711)
+++ exp_ch5.adb (working copy)
@@ -2573,10 +2573,11 @@
   --  does not obey the predicate, the value is marked non-static, and
   --  there can be no corresponding static alternative. In that case we
   --  replace the case statement with an exception, regardless of whether
-  --  assertions are enabled or not.
+  --  assertions are enabled or not, unless predicates are ignored.
 
   if Compile_Time_Known_Value (Expr)
 and then Has_Predicates (Etype (Expr))
+and then not Predicates_Ignored (Etype (Expr))
 and then not Is_OK_Static_Expression (Expr)
   then
  Rewrite (N,
@@ -2659,7 +2660,9 @@
  --  comes from source -- no need to validity check internally
  --  generated case statements).
 
- if Validity_Check_Default then
+ if Validity_Check_Default
+   and then not Predicates_Ignored (Etype (Expr))
+ then
 Ensure_Valid (Expr);
  end if;
 
@@ -2788,9 +2791,31 @@
 
  if not Others_Present then
 Others_Node := Make_Others_Choice (Sloc (Last_Alt));
-Set_Others_Discrete_Choices
-  (Others_Node, Discrete_Choices (Last_Alt));
-Set_Discrete_Choices (Last_Alt, New_List (Others_Node));
+
+--  If Predicates_Ignored is true the value does not satisfy the
+--  predicate, and there is no Others choice, Constraint_Error
+--  must be raised (4.5.7 (21/3)).
+
+if Predicates_Ignored (Etype (Expr)) then
+   declare
+  Except : constant Node_Id :=
+   Make_Raise_Constraint_Error (Loc,
+ Reason => CE_Invalid_Data);
+  New_Alt : constant Node_Id :=
+Make_Case_Statement_Alternative (Loc,
+  Discrete_Choices => New_List (Make_Others_Choice (Loc)),
+  Statements => New_List (Except));
+   begin
+  Append (New_Alt, Alternatives (N));
+  Analyze_And_Resolve (Except);
+   end;
+
+else
+   Set_Others_Discrete_Choices
+ (Others_Node, Discrete_Choices (Last_Alt));
+   Set_Discrete_Choices (Last_Alt, New_List (Others_Node));
+end if;
+
  end if;
 
  --  Deal with possible declarations of controlled objects, and also
Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 235706)
+++ sem_ch3.adb (working copy)
@@ -3814,14 +3814,15 @@
   --  do this in the analyzer and not the expander because the analyzer
   --  does some substantial rewriting in some cases.
 
-  --  We need a predicate check if the type has predicates, and if either
-  --  there is an initializing expression, or for default initialization
-  --  when we have at least one case of an explicit default initial value
-  --  and then this is not an internal declaration whose initialization
-  --  comes later (as for an aggregate expansion).
+  --  We need a predicate check if the type has predicates that are not
+  --  ignored, and if either there is an initializing expression, or for
+  --  default initialization when we have at least one case of an explicit
+  --  default initial value and then this is not an internal declaration
+  --  whose initialization comes later (as for an aggregate expansion).
 
   if not Suppress_Assignment_Checks (N)
 and then Present (Predicate_Function (T))
+and then not Predicates_Ignored (T)

[Ada] Handling of attribute definition clauses for ASIS with GNSA

2016-05-02 Thread Arnaud Charlet
This patch introduces a new switch -gnatd.H to enable ASIS_GNSA mode. When
active, this mode disables the call to gigi. In addition, the patch suppresses
various error checks with respect to attribute definition clauses in ASIS mode.


-- Source --


--  clauses.ads

package Clauses is

   --  Alignment

   type Align_T is tagged record
  Comp : Integer := 1;
   end record;

   for Align_T'Alignment use 7;

   Align_Obj : Align_T;

   for Align_Obj'Alignment use 7;

   --  Component_Size

   type Comp_Siz_T is array (1 .. 5) of Integer;

   for Comp_Siz_T'Component_Size use -1;

   --  Machine_Radix

   type Mach_Rad_T is delta 0.01 digits 15;

   for Mach_Rad_T'Machine_Radix use -1;

   --  Object_Size

   type Obj_Siz_T is record
  Comp : Integer := 1;
   end record;

   for Obj_Siz_T'Object_Size use -1;

   --  Size

   type Siz_Elem_T is new Integer;

   for Siz_Elem_T'Size use -1;

   type Siz_Rec_T is record
  Comp : Integer := 1;
   end record;

   for Siz_Rec_T'Size use -1;

   Siz_Elem_Obj : Siz_Elem_T;

   for Siz_Elem_Obj'Size use -1;

   Siz_Rec_Obj : Siz_Rec_T;

   for Siz_Rec_Obj'Size use -1;

   --  Storage_Size

   task type Stor_Siz_T;

   for Stor_Siz_T'Storage_Size use -1;

   --  Stream_Size

   type Str_Siz_Elem_T is new Integer;

   for Str_Siz_Elem_T'Stream_Size use -1;

   --  Value_Size

   type Val_Siz_T is array (1 .. 5) of Integer;

   for Val_Siz_T'Value_Size use -1;
end Clauses;


-- Compilation and output --


$ gcc -c clauses.ads
$ gcc -c clauses.ads -gnatct -gnatd.H
clauses.ads:9:30: alignment value must be positive
clauses.ads:13:32: alignment value must be positive
clauses.ads:19:38: size for "Integer" too small, minimum allowed is 32
clauses.ads:25:37: machine radix value must be 2 or 10
clauses.ads:33:34: Object_Size must be a multiple of 8
clauses.ads:39:28: size for "Siz_Elem_T" too small, minimum allowed is 32
clauses.ads:49:30: size for "Siz_Elem_T" too small, minimum allowed is 32
clauses.ads:65:04: stream size for elementary type must be a power of 2 and at
  least 8

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Hristian Kirtchev  

* debug.adb: Document the use of switch -gnatd.H.
* gnat1drv.adb (Adjust_Global_Switches): Set ASIS_GNSA mode when
-gnatd.H is present.
(Gnat1drv): Suppress the call to gigi when ASIS_GNSA mode is active.
* opt.ads: Add new option ASIS_GNSA_Mode.
* sem_ch13.adb (Alignment_Error): New routine.
(Analyze_Attribute_Definition_Clause): Suppress certain errors in
ASIS mode for attribute clause Alignment, Machine_Radix, Size, and
Stream_Size.
(Check_Size): Use routine Size_Too_Small_Error to
suppress certain errors in ASIS mode.
(Get_Alignment_Value): Use routine Alignment_Error to suppress certain
errors in ASIS mode.
(Size_Too_Small_Error): New routine.

Index: debug.adb
===
--- debug.adb   (revision 235710)
+++ debug.adb   (working copy)
@@ -125,7 +125,7 @@
--  d.E  Turn selected errors into warnings
--  d.F  Debug mode for GNATprove
--  d.G  Ignore calls through generic formal parameters for elaboration
-   --  d.H
+   --  d.H  GNSA mode for ASIS
--  d.I  Do not ignore enum representation clauses in CodePeer mode
--  d.J  Disable parallel SCIL generation mode
--  d.K
@@ -630,6 +630,9 @@
--   now fixed, but we provide this debug flag to revert to the previous
--   situation of ignoring such calls to aid in transition.
 
+   --  d.H  Sets ASIS_GNSA_Mode to True. This signals the front end to suppress
+   --   the call to gigi in ASIS_Mode.
+
--  d.I  Do not ignore enum representation clauses in CodePeer mode.
--   The default of ignoring representation clauses for enumeration
--   types in CodePeer is good for the majority of Ada code, but in some
Index: gnat1drv.adb
===
--- gnat1drv.adb(revision 235706)
+++ gnat1drv.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -180,6 +180,12 @@
   if Operating_Mode = Check_Semantics and then Tree_Output then
  ASIS_Mode := True;
 
+ --  Set ASIS 

[Ada] Optimization of anonymous access-to-controlled types

2016-05-02 Thread Arnaud Charlet
This patch modifies the creation of finalization masters for anonymous access-
to-controlled types. Prior to this change, each compilation unit utilized a
single heterogeneous finalization master to service all allocations where the
associated type is anonymous access-to-controlled. This patch removes the use
of the single heterogeneous finalization master and instead introduces multiple
homogeneous finalization masters. This leads to an increase in performance because
allocation no longer needs to maintain a mapping between allocated object and
corresponding Finalize_Address primitive in a runtime hash data structure. As
a result, anonymous access-to-controlled types are on par with named access-to-
controlled types.
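
A rough C sketch of why a homogeneous master is cheaper, under the assumption that a master is simply a collection of objects to finalize. The struct and function names are hypothetical and do not correspond to the GNAT runtime; the sketch only shows that one finalize routine stored per master removes the need for a per-object lookup.

```c
#include <assert.h>
#include <stddef.h>

#define MASTER_CAPACITY 16

typedef void (*finalize_fn)(void *object);

/* Homogeneous master: every attached object has the same type, so a
   single finalize routine is stored once for the whole master.  A
   heterogeneous master would instead need a map from each object to
   its own routine.  */
struct homogeneous_master {
  finalize_fn finalize_address;
  void *objects[MASTER_CAPACITY];
  size_t count;
};

void master_init(struct homogeneous_master *m, finalize_fn fn)
{
  m->finalize_address = fn;
  m->count = 0;
}

/* Attaching is a plain append; no object-to-routine mapping is updated.  */
void master_attach(struct homogeneous_master *m, void *object)
{
  assert(m->count < MASTER_CAPACITY);
  m->objects[m->count++] = object;
}

/* Finalizes attached objects in reverse order of attachment.  */
void master_finalize_all(struct homogeneous_master *m)
{
  while (m->count > 0)
    m->finalize_address(m->objects[--m->count]);
}

/* Hypothetical controlled type used to exercise the master.  */
struct ctrl_sketch {
  int id;
  int finalized;
};

void ctrl_finalize(void *object)
{
  ((struct ctrl_sketch *)object)->finalized = 1;
}
```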


-- Source --


--  types.ads

with Ada.Finalization; use Ada.Finalization;

package Types is
   type Ctrl is new Controlled with record
  Id : Natural;
   end record;

   --  Anonymous types

   type Anon_Discr (Discr : access Ctrl) is null record;

   type Anon_Comps is record
  Comp_1 : access Ctrl;
  Comp_2 : access Ctrl;
   end record;

   type Anon_Array is array (1 .. 5) of access Ctrl;

   --  Named types

   type Ctrl_Ptr is access all Ctrl;

   type Named_Discr (Discr : Ctrl_Ptr) is null record;

   type Named_Discr_Ptr is access all Named_Discr;

   type Named_Comps is record
  Comp_1 : Ctrl_Ptr;
  Comp_2 : Ctrl_Ptr;
   end record;
end Types;

--  performance.adb

with Ada.Calendar; use Ada.Calendar;
with Ada.Finalization; use Ada.Finalization;
with Ada.Text_IO;  use Ada.Text_IO;
with Types; use Types;

procedure Performance is
   Percentage : constant := 0.3; --  30%
   Max_Iterations : constant := 50_000;

   Diff_A  : Duration;
   Diff_N  : Duration;
   Factor  : Duration;
   Start_A : Time;
   Start_N : Time;

begin
   Start_A := Clock;

   for Iteration in 1 .. Max_Iterations loop
  declare
 Anon_Discr_Obj : access Anon_Discr :=
new Anon_Discr'(Discr =>
  new Ctrl'(Controlled with Id => 1));
 Anon_Comps_Obj : constant Anon_Comps :=
(Comp_1 => new Ctrl'(Controlled with Id => 2),
 Comp_2 => new Ctrl'(Controlled with Id => 3));
  begin null; end;
   end loop;

   Diff_A  := Clock - Start_A;
   Start_N := Clock;

   for Iteration in 1 .. Max_Iterations loop
  declare
 Named_Discr_Obj : Named_Discr_Ptr :=
 new Named_Discr'(Discr =>
   new Ctrl'(Controlled with Id => 4));
 Named_Comps_Obj : constant Named_Comps :=
 (Comp_1 => new Ctrl'(Controlled with Id => 5),
  Comp_2 => new Ctrl'(Controlled with Id => 6));
  begin null; end;
   end loop;

   Diff_N := Clock - Start_N;
   Factor := Diff_N * Percentage;

   if Diff_N - Factor < Diff_A and then Diff_A < Diff_N + Factor then
  Put_Line ("Anonymous vs Named within expected percentage");
   else
  Put_Line ("ERROR");
   end if;
end Performance;


-- Compilation and output --


$ gnatmake -q performance.adb
$ ./performance
Anonymous vs Named within expected percentage

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Hristian Kirtchev  

* einfo.adb: Anonymous_Master now uses Node35.
(Anonymous_Master): Update the assertion and node reference.
(Set_Anonymous_Master): Update the assertion and node reference.
(Write_Field35_Name): Add output for Anonymous_Master.
(Write_Field36_Name): The output is now undefined.
* einfo.ads: Update the node and description of attribute
Anonymous_Master. Remove prior occurrences in entities as this
is now a type attribute.
* exp_ch3.adb (Expand_Freeze_Array_Type): Remove local variable
Ins_Node. Anonymous access-to-controlled component types no
longer need finalization masters. The master is now built when
a related allocator is expanded.
(Expand_Freeze_Record_Type): Remove local variable Has_AACC. Do not
detect whether the record type has at least one component of anonymous
access-to-controlled type. These types no longer need finalization
masters. The master is now built when a related allocator is expanded.
* exp_ch4.adb: Remove with and use clauses for Lib and Sem_Ch8.
(Current_Anonymous_Master): Removed.
(Expand_N_Allocator): Call Build_Anonymous_Master to create a
finalization master for an anonymous access-to-controlled type.
* exp_ch6.adb (Add_Finalization_Master_Actual_To_Build_In_Place_Call):
Call routine Build_Anonymous_Master to create a finalization master
for an anonymous access-to-controlled type.
* exp_ch7.adb (Allows_Finalization_Master): New routine.
(Build_Anonymous_Master): New ro

Re: Fix PR rtl-optimization/70886

2016-05-02 Thread Eric Botcazou
> Dangling as in it has a null VAL_PTR because it was decided it's useless?

Yes, discard_useless_values was invoked on it and the VALUE had not been 
preserved (no VALUE is preserved during scheduling).  But it's referenced in 
the address of a MEM present in one of the lists maintained by the scheduler.

I think that the definitive fix would be to do what var-tracking does: start to 
preserve VALUEs and call rtx_equal_for_cselib_p.  But preserving VALUEs is a 
tricky business, see the var-tracking code, and the usage of cselib in the 
scheduler is almost transparent at the moment, so I don't feel like messing 
with that for a comparison failure on IA-64.
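
To make the pointer-stability problem concrete, here is a minimal sketch in C of comparing values through a canonical representative rather than by raw pointer. It does not use the cselib data structures; struct value_sketch and both functions are hypothetical, and the convention that a canonical value points to itself is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a VALUE.  Several objects may denote the
   same value; one of them is the canonical representative and points
   to itself.  */
struct value_sketch {
  struct value_sketch *canonical;
  int uid;
};

/* Follows the links until the canonical representative is reached.  */
const struct value_sketch *canonical_value(const struct value_sketch *v)
{
  while (v->canonical != v)
    v = v->canonical;
  return v;
}

/* Comparing raw pointers gives an answer that depends on which
   equivalent object each side happens to hold; comparing canonical
   representatives does not.  */
bool same_value(const struct value_sketch *a, const struct value_sketch *b)
{
  return canonical_value(a) == canonical_value(b);
}
```

Two distinct objects that share a representative compare equal here although their addresses differ, which is the kind of stability the sched-deps.c change is after.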

> I think this is OK.

Thanks.

-- 
Eric Botcazou


[Ada] Minimize internally built wrappers of protected types

2016-05-02 Thread Arnaud Charlet
This patch minimizes the generation of wrappers of protected type
dispatching primitives since they may cause spurious ambiguity
errors to be reported on correct code. After this patch the following test
compiles without errors:

package Ambig is
   type Lim_Iface is limited interface;

   type Prot is synchronized new Lim_Iface with private;

   function Not_Ambiguous (Obj : Prot) return Boolean;

private
   protected type Prot is new Lim_Iface with
  entry Dummy;
   end Prot;

   Obj : Prot;
   Val : constant Boolean := Not_Ambiguous (Obj);  --  OK
end Ambig;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Javier Miranda  

* sem_ch3.adb (Process_Full_View): Remove from visibility
wrappers of synchronized types to avoid spurious errors with
their wrapped entity.
* exp_ch9.adb (Build_Wrapper_Spec): Do not generate the wrapper
if no interface primitive is covered by the subprogram and this is
not a primitive declared between two views; see Process_Full_View.
(Build_Protected_Sub_Specification): Link the dispatching
subprogram with its original non-dispatching protected subprogram
since their names differ.
(Expand_N_Protected_Type_Declaration):
If a protected subprogram overrides an interface primitive then
do not build a wrapper if it was already built.
* einfo.ads, einfo.adb (Original_Protected_Subprogram): New attribute.
* sem_ch4.adb (Names_Match): New subprogram.
* sem_ch6.adb (Check_Synchronized_Overriding): Moved
to library level and defined in the public part of the
package to invoke it from Exp_Ch9.Build_Wrapper_Spec.
(Has_Matching_Entry_Or_Subprogram): New subprogram.
(Report_Conflict): New subprogram.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 235732)
+++ sem_ch3.adb (working copy)
@@ -19835,6 +19835,13 @@
Curr_Nod := Wrap_Spec;
 
Analyze (Wrap_Spec);
+
+   --  Remove the wrapper from visibility to avoid
+   --  spurious conflict with the wrapped entity.
+
+   Set_Is_Immediately_Visible
+ (Defining_Entity (Specification (Wrap_Spec)),
+  False);
 end if;
 
 Next_Elmt (Prim_Elmt);
Index: exp_ch9.adb
===
--- exp_ch9.adb (revision 235706)
+++ exp_ch9.adb (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -2443,13 +2443,6 @@
   Obj_Typ : Entity_Id;
   Formals : List_Id) return Node_Id
is
-  Loc   : constant Source_Ptr := Sloc (Subp_Id);
-  First_Param   : Node_Id;
-  Iface : Entity_Id;
-  Iface_Elmt: Elmt_Id;
-  Iface_Op  : Entity_Id;
-  Iface_Op_Elmt : Elmt_Id;
-
   function Overriding_Possible
 (Iface_Op : Entity_Id;
  Wrapper  : Entity_Id) return Boolean;
@@ -2631,6 +2624,16 @@
  return New_Formals;
   end Replicate_Formals;
 
+  --  Local variables
+
+  Loc : constant Source_Ptr := Sloc (Subp_Id);
+  First_Param : Node_Id := Empty;
+  Iface   : Entity_Id;
+  Iface_Elmt  : Elmt_Id;
+  Iface_Op: Entity_Id;
+  Iface_Op_Elmt   : Elmt_Id;
+  Overridden_Subp : Entity_Id;
+
--  Start of processing for Build_Wrapper_Spec
 
begin
@@ -2638,17 +2641,24 @@
 
   pragma Assert (Is_Tagged_Type (Obj_Typ));
 
+  --  Check if this subprogram has a profile that matches some interface
+  --  primitive
+
+  Check_Synchronized_Overriding (Subp_Id, Overridden_Subp);
+
+  if Present (Overridden_Subp) then
+ First_Param :=
+   First (Parameter_Specifications (Parent (Overridden_Subp)));
+
   --  An entry or a protected procedure can override a routine where the
   --  controlling formal is either IN OUT, OUT or is of access-to-variable
   --  type. Since the wrapper must have the exact same signature as that of
   --  the overridden subprogram, we try to find the overriding candidate
   --  and use its controlling formal.
 
-  First_Param := Empty;
-
   --  Check ev

Re: [PATCH] Fix spec-options.c test case

2016-05-02 Thread Bernd Schmidt

On 05/01/2016 09:52 AM, Bernd Edlinger wrote:

Hi,

I took a closer look at this test case, and I found that, besides
triggering a dejagnu bug, it is also wrong.  I have tested with
a cross-compiler for target=sh-elf and found that the test case
actually FAILs because the foo.specs uses "cppruntime" which
is only referenced in gcc/config/sh/superh.h, but sh/superh.h
is only included for target sh*-superh-elf, see gcc/config.gcc.

This means that it can only pass for target=sh-superh-elf.

The attached patch fixes the testcase and makes it run always,
so that it no longer triggers the dejagnu bug.


So, two things. Why not use a string in the specs file that exists on 
all targets? If it's a sh-specific thing we want to test, why not 
move it to gcc.target?



Bernd


[C PATCH] Fix ICE-on-invalid with enum forward declaration (PR c/70851)

2016-05-02 Thread Marek Polacek
Here, the problem was that we weren't diagnosing invalid code when an array
dimension was of incomplete enum type.  That led to an ICE in the gimplifier.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-05-02  Marek Polacek  

PR c/70851
* c-decl.c (grokdeclarator): Diagnose when array's size has an
incomplete type.

* gcc.dg/enum-incomplete-3.c: New test.

diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index 16e4250..7094efc 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -5799,10 +5799,21 @@ grokdeclarator (const struct c_declarator *declarator,
  {
if (name)
  error_at (loc, "size of array %qE has non-integer type",
-   name);
+   name);
else
  error_at (loc,
-   "size of unnamed array has non-integer type");
+   "size of unnamed array has non-integer type");
+   size = integer_one_node;
+ }
+   /* This can happen with enum forward declaration.  */
+   else if (!COMPLETE_TYPE_P (TREE_TYPE (size)))
+ {
+   if (name)
+ error_at (loc, "size of array %qE has incomplete type",
+   name);
+   else
+ error_at (loc, "size of unnamed array has incomplete "
+   "type");
size = integer_one_node;
  }
 
diff --git gcc/testsuite/gcc.dg/enum-incomplete-3.c gcc/testsuite/gcc.dg/enum-incomplete-3.c
index e69de29..db1138b 100644
--- gcc/testsuite/gcc.dg/enum-incomplete-3.c
+++ gcc/testsuite/gcc.dg/enum-incomplete-3.c
@@ -0,0 +1,20 @@
+/* PR c/70851 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+enum E e; /* { dg-error "storage size" } */
+
+void bar (int [e]); /* { dg-error "size of unnamed array has incomplete type" } */
+void bar2 (int [][e]); /* { dg-error "size of unnamed array has incomplete type" } */
+
+void
+foo (void)
+{
+  int a1[e]; /* { dg-error "size of array .a1. has incomplete type" } */
+  int a2[e][3]; /* { dg-error "size of array .a2. has incomplete type" } */
+
+  struct S
+  {
+int a3[e]; /* { dg-error "size of array .a3. has incomplete type" } */
+  };
+}

Marek


[Ada] Crash on inlined call to subprogram declared in package instance

2016-05-02 Thread Arnaud Charlet
This patch fixes a compiler crash on an inlined call in a nested instantiation
when the corresponding generic unit has aspect specifications.

The following must compile quietly:

   gcc -c inst.ads -gnatn -O

---
with Pointer;
package Inst is
   package I is new Pointer (Integer);
end;
---
package body Pointer is
   function Deref (Pointer : WI.W) return P is
   begin
  return WI.Load (Pointer);
   end;
end;
---
with Wrapper;
generic
   type Any_Type;
package Pointer is
   pragma Elaborate_Body;
   type P is access Any_Type;
   package WI is new Wrapper (P);
end;
---
package body Wrapper is
   function Load (A : W) return Data_Type is
   begin
  return A.Data;
   end;
end;
---
generic
   type Data_Type is private;
package Wrapper with Preelaborate
 is
   pragma Preelaborate;
   type W is record
  Data : Data_Type;
   end record;
   function Load (A : W) return Data_Type with Inline;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Ed Schonberg  

* exp_ch6.adb (Expand_Call): When inlining a call to a function
declared in a package instance, locate the instance node of the
package after the actual package declaration, skipping over
pragmas that may have been introduced when the generic unit
carries aspects that are transformed into pragmas.
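
The skip-over-pragmas search described in this entry can be modelled outside
the compiler. The following is only a minimal sketch in C, not GNAT code: the
node type, the kind names and find_instance_node are hypothetical stand-ins
for Node_Id, Nkind and the loop added to Expand_Call. The NULL guard is an
addition of the sketch; the patch itself assumes the instantiation node is
always present.

```c
#include <stddef.h>

/* Hypothetical node kinds, loosely mirroring the Nkind values in the patch. */
enum node_kind {
    N_PACKAGE_DECLARATION,
    N_PRAGMA,
    N_PACKAGE_INSTANTIATION
};

/* Hypothetical list node: each declaration points to the next one. */
struct node {
    enum node_kind kind;
    struct node *next;
};

/* Starting after DECL, skip any pragmas until the instantiation node is
   found.  Returns NULL if the list ends first (a guard the sketch adds).  */
struct node *find_instance_node(struct node *decl)
{
    struct node *n = decl->next;

    while (n != NULL && n->kind != N_PACKAGE_INSTANTIATION)
        n = n->next;

    return n;
}
```

With a list of the form declaration, pragma, pragma, instantiation, the
function returns the last node, where a plain Next (Decl) would have returned
the first pragma.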

Index: exp_ch6.adb
===
--- exp_ch6.adb (revision 235742)
+++ exp_ch6.adb (working copy)
@@ -3970,8 +3970,9 @@
   and then Optimization_Level > 0
 then
declare
-  Inst : Entity_Id;
-  Decl : Node_Id;
+  Decl  : Node_Id;
+  Inst  : Entity_Id;
+  Inst_Node : Node_Id;
 
begin
   Inst := Scope (Subp);
@@ -4001,7 +4002,19 @@
 null;
 
  else
-Add_Pending_Instantiation (Next (Decl), Decl);
+--  The instantiation node follows the package
+--  declaration for the instance. If the generic
+--  unit had aspect specifications, they have
+--  been transformed into pragmas in the instance,
+--  and the instance node appears after them.
+
+Inst_Node := Next (Decl);
+
+while Nkind (Inst_Node) /= N_Package_Instantiation loop
+   Inst_Node := Next (Inst_Node);
+end loop;
+
+Add_Pending_Instantiation (Inst_Node, Decl);
  end if;
   end if;
end;


[Ada] Speed up memory management

2016-05-02 Thread Arnaud Charlet
This patch speeds up "new" and Unchecked_Deallocation in simple cases.
No change in behavior; no test available.

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Bob Duff  

* s-memory.adb (Alloc, Realloc): Move checks
for Size = 0 or size_t'Last into the Result = System.Null_Address
path for efficiency. Improve comments (based on actual C language
requirements for malloc).
* exp_util.adb (Build_Allocate_Deallocate_Proc): Optimize the
case where we are using the default Global_Pool_Object, and we
don't need the heavy finalization machinery.
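
The reordering in s-memory.adb can be illustrated with a rough C analogue.
This is a sketch under stated assumptions, not the runtime's code: model_alloc
and alloc_error are hypothetical names, and a message string stands in for
raising Storage_Error. The point it shows is that the Size = 0 and
size_t'Last tests are only reached when malloc has already returned NULL.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Storage_Error message; NULL means no error. */
static const char *alloc_error;

/* Model of Alloc after the patch: call malloc first, and only look at the
   special sizes on the failure path.  */
void *model_alloc(size_t size)
{
    void *result = malloc(size);

    if (result == NULL) {
        /* malloc(0) may legitimately return NULL; ask for one byte instead
           so that distinct allocators yield distinct pointers.  */
        if (size == 0)
            return model_alloc(1);

        alloc_error = (size == SIZE_MAX) ? "object too large"
                                         : "heap exhausted";
    }

    return result;
}
```

On the common path (a successful, non-zero allocation) the function performs a
single comparison against NULL, which is where the speed-up comes from.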

Index: exp_util.adb
===
--- exp_util.adb(revision 235744)
+++ exp_util.adb(working copy)
@@ -584,6 +584,14 @@
   elsif Is_RTE (Pool_Id, RE_SS_Pool) then
  return;
 
+  --  Optimize the case where we are using the default Global_Pool_Object,
+  --  and we don't need the heavy finalization machinery.
+
+  elsif Pool_Id = RTE (RE_Global_Pool_Object)
+and then not Needs_Finalization (Desig_Typ)
+  then
+ return;
+
   --  Do not replicate the machinery if the allocator / free has already
   --  been expanded and has a custom Allocate / Deallocate.
 
Index: s-memory.adb
===
--- s-memory.adb(revision 235706)
+++ s-memory.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 2001-2013, Free Software Foundation, Inc. --
+--  Copyright (C) 2001-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -43,14 +43,12 @@
 
 pragma Compiler_Unit_Warning;
 
-with Ada.Exceptions;
+with System.CRTL;
+with System.Parameters;
 with System.Soft_Links;
-with System.Parameters;
-with System.CRTL;
 
 package body System.Memory is
 
-   use Ada.Exceptions;
use System.Soft_Links;
 
function c_malloc (Size : System.CRTL.size_t) return System.Address
@@ -68,33 +66,41 @@
---
 
function Alloc (Size : size_t) return System.Address is
-  Result  : System.Address;
-  Actual_Size : size_t := Size;
-
+  Result : System.Address;
begin
-  if Size = size_t'Last then
- Raise_Exception (Storage_Error'Identity, "object too large");
-  end if;
-
-  --  Change size from zero to non-zero. We still want a proper pointer
-  --  for the zero case because pointers to zero length objects have to
-  --  be distinct, but we can't just go ahead and allocate zero bytes,
-  --  since some malloc's return zero for a zero argument.
-
-  if Size = 0 then
- Actual_Size := 1;
-  end if;
-
   if Parameters.No_Abort then
- Result := c_malloc (System.CRTL.size_t (Actual_Size));
+ Result := c_malloc (System.CRTL.size_t (Size));
   else
  Abort_Defer.all;
- Result := c_malloc (System.CRTL.size_t (Actual_Size));
+ Result := c_malloc (System.CRTL.size_t (Size));
  Abort_Undefer.all;
   end if;
 
   if Result = System.Null_Address then
- Raise_Exception (Storage_Error'Identity, "heap exhausted");
+ --  If Size = 0, we can't allocate 0 bytes, because then two different
+ --  allocators, one of which has Size = 0, could return pointers that
+ --  compare equal, which is wrong. (Nonnull pointers compare equal if
+ --  and only if they designate the same object, and two different
+ --  allocators allocate two different objects).
+
+ --  malloc(0) is defined to allocate a non-zero-sized object (in which
+ --  case we won't get here, and all is well) or NULL, in which case we
+ --  get here. We also get here in case of error. So check for the
+ --  zero-size case, and allocate 1 byte. Otherwise, raise
+ --  Storage_Error.
+
+ --  We check for zero size here, rather than at the start, for
+ --  efficiency.
+
+ if Size = 0 then
+return Alloc (1);
+ end if;
+
+ if Size = size_t'Last then
+raise Storage_Error with "object too large";
+ end if;
+
+ raise Storage_Error with "heap exhausted";
   end if;
 
   return Result;
@@ -125,23 +131,21 @@
   return System.Address
is
   Result  : System.Address;
-  Actual_Size : constant size_t := Size;
-
begin
-  if Size = size_t'Last then
- Raise_Exception (Storage_Error'Identity, 

[Ada] Remove spurious accessibility check for aggregate component

2016-05-02 Thread Arnaud Charlet
This patch removes an accessibility check that was improperly applied, via
a type conversion, to an operand that is an access parameter that also
requires a non-null check.

Test in ACATS 4.0K C3A0030.

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Ed Schonberg  

* sem_util.adb (Aggregate_Constraint_Checks): Separate
accessibility checks and non-null checks for aggregate components,
to prevent spurious accessibility errors.

Index: sem_util.adb
===
--- sem_util.adb(revision 235732)
+++ sem_util.adb(working copy)
@@ -326,21 +326,19 @@
   --  Ada 2005 (AI-230): Generate a conversion to an anonymous access
   --  component's type to force the appropriate accessibility checks.
 
-  --  Ada 2005 (AI-231): Generate conversion to the null-excluding
-  --  type to force the corresponding run-time check
+  --  Ada 2005 (AI-231): Generate conversion to the null-excluding type to
+  --  force the corresponding run-time check
 
   if Is_Access_Type (Check_Typ)
-and then ((Is_Local_Anonymous_Access (Check_Typ))
-or else (Can_Never_Be_Null (Check_Typ)
-  and then not Can_Never_Be_Null (Exp_Typ)))
+and then Is_Local_Anonymous_Access (Check_Typ)
   then
  Rewrite (Exp, Convert_To (Check_Typ, Relocate_Node (Exp)));
  Analyze_And_Resolve (Exp, Check_Typ);
  Check_Unset_Reference (Exp);
   end if;
 
-  --  This is really expansion activity, so make sure that expansion is
-  --  on and is allowed. In GNATprove mode, we also want check flags to
+  --  What follows is really expansion activity, so check that expansion
+  --  is on and is allowed. In GNATprove mode, we also want check flags to
   --  be added in the tree, so that the formal verification can rely on
   --  those to be present. In GNATprove mode for formal verification, some
   --  treatment typically only done during expansion needs to be performed
@@ -353,6 +351,13 @@
  return;
   end if;
 
+  if Is_Access_Type (Check_Typ)
+and then Can_Never_Be_Null (Check_Typ)
+and then not Can_Never_Be_Null (Exp_Typ)
+  then
+ Install_Null_Excluding_Check (Exp);
+  end if;
+
   --  First check if we have to insert discriminant checks
 
   if Has_Discriminants (Exp_Typ) then


[Ada] Crash on illegal allocator for limited type.

2016-05-02 Thread Arnaud Charlet
This patch fixes a compiler crash on an allocator for an object of a limited
type, when the expression of the qualified expression is a type conversion.

Compiling bug1.adb must yield:
bug1.adb:29:08: illegal expression
  for initialized allocator of a limited type (RM 7.5 (2.7/2))

---
pragma Ada_2012;
procedure Bug1 is
   package Interf is
  type T is limited interface;
  subtype Implementor is T'Class;
  function Init return T is abstract;
   end Interf;

   package Impl is
  type T is limited new Interf.T with private;
  overriding function Init return T;
   private
  type T is limited new Interf.T with null record;
   end Impl;

   package body Impl is
  function Init return T is
  begin
 return Obj : T do
null;
 end return;
  end Init;
   end Impl;
   use Impl;

   V : access Interf.T'Class;
   Thing : T := Init;
begin
  V := new Interf.T'class'(Interf.Implementor(Impl.Init));  
end Bug1;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-05-02  Ed Schonberg  

* sem_ch3.adb (OK_For_Limited_Init): A type conversion is not
always legal in the in-place initialization of a limited entity
(e.g. an allocator).
* sem_res.adb (Resolve_Allocator): Improve error message with RM
reference when allocator expression is illegal.

Index: sem_ch3.adb
===
--- sem_ch3.adb (revision 235740)
+++ sem_ch3.adb (working copy)
@@ -18656,11 +18656,14 @@
is
begin
   --  An object of a limited interface type can be initialized with any
-  --  expression of a nonlimited descendant type.
+  --  expression of a nonlimited descendant type. However this does not
+  --  apply if this is a view conversion of some other expression. This
+  --  is checked below.
 
   if Is_Class_Wide_Type (Typ)
 and then Is_Limited_Interface (Typ)
 and then not Is_Limited_Type (Etype (Exp))
+and then Nkind (Exp) /= N_Type_Conversion
   then
  return True;
   end if;
Index: sem_res.adb
===
--- sem_res.adb (revision 235714)
+++ sem_res.adb (working copy)
@@ -4767,13 +4767,21 @@
and then not In_Instance_Body
  then
 if not OK_For_Limited_Init (Etype (E), Expression (E)) then
-   Error_Msg_N ("initialization not allowed for limited types", N);
+   if Nkind (Parent (N)) = N_Assignment_Statement then
+  Error_Msg_N
+("illegal expression for initialized allocator of a "
+ & "limited type (RM 7.5 (2.7/2))", N);
+   else
+  Error_Msg_N
+("initialization not allowed for limited types", N);
+   end if;
+
Explain_Limited_Type (Etype (E), N);
 end if;
  end if;
 
- --  A qualified expression requires an exact match of the type.
- --  Class-wide matching is not allowed.
+ --  A qualified expression requires an exact match of the type. Class-
+ --  wide matching is not allowed.
 
  if (Is_Class_Wide_Type (Etype (Expression (E)))
   or else Is_Class_Wide_Type (Etype (E)))


Re: [PATCH 2/4] PR c++/62314: add fixit hint for "expected ';' after class definition"

2016-05-02 Thread Bernd Schmidt

On 04/28/2016 04:28 PM, David Malcolm wrote:

whereas clang reportedly emits:

test.c:2:12: error: expected ';' after struct
  struct a {}
 ^
 ;

(note the offset of the location, and the fix-it hint)

The following patch gives us the latter, more readable output.


Huh. Only the non-C++ parts remain to be reviewed, and I have no 
technical objections, but do people really want this? To me that looks 
like unnecessary visual clutter that eats up vertical space for no 
reason. I know what a semicolon looks like without the compiler telling 
me twice.



Bernd


Re: [C PATCH] Fix ICE-on-invalid with enum forward declaration (PR c/70851)

2016-05-02 Thread Bernd Schmidt



On 05/02/2016 12:27 PM, Marek Polacek wrote:

Here, the problem was that we weren't diagnosing invalid code when an array
dimension was of incomplete enum type.  That led to an ICE in gimplifier.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-05-02  Marek Polacek  

PR c/70851
* c-decl.c (grokdeclarator): Diagnose when array's size has an
incomplete type.

* gcc.dg/enum-incomplete-3.c: New test.


Ok.


Bernd



Re: [PATCH #2], Fix _Complex when there are multiple FP types the same size

2016-05-02 Thread Bernd Schmidt

On 04/30/2016 06:00 PM, Segher Boessenkool wrote:

On Fri, Apr 29, 2016 at 04:51:27PM -0400, Michael Meissner wrote:

2016-04-29  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Add
support for __float128 complex datatypes.
(rs6000_hard_regno_mode_ok): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_complex_function_value): Likewise.
* config/rs6000/rs6000.h (FLOAT128_IEEE_P): Likewise.
__float128 and __ibm128 complex.
(FLOAT128_IBM_P): Likewise.
(ALTIVEC_ARG_MAX_RETURN): Likewise.
* doc/extend.texi (Additional Floating Types): Document that
-mfloat128 must be used to enable __float128.  Document complex
__float128 and __ibm128 support.

[gcc/testsuite]
2016-04-29  Michael Meissner  

* gcc.target/powerpc/float128-complex-1.c: New tests for complex
__float128.
* gcc.target/powerpc/float128-complex-2.c: Likewise.


The powerpc parts are okay for trunk.  Thank you!


Ok for the other parts as well. Although I wonder:


+  /* build_complex_type has set the expected mode to allow having multiple
+complex types for multiple floating point types that have the same
+size such as the PowerPC with __ibm128 and __float128.  If this was
+not set, figure out the mode manually.  */
+  if (TYPE_MODE (type) == VOIDmode)
+   {
+ unsigned int precision = TYPE_PRECISION (TREE_TYPE (type));
+ enum mode_class mclass = (TREE_CODE (TREE_TYPE (type)) == REAL_TYPE
+   ? MODE_COMPLEX_FLOAT : MODE_COMPLEX_INT);
+ SET_TYPE_MODE (type, mode_for_size (2 * precision, mclass, 0));
+   }


What happens if you assert that it isn't VOIDmode?


Bernd


Re: [PATCH v2] [libatomic] Add RTEMS support

2016-05-02 Thread Sebastian Huber

If nobody objects, I will back port this to GCC 6 next Monday.

On 20/04/16 14:40, Joel Sherrill wrote:


As the other RTEMS target maintainer, I second the request.

--joel

On Apr 20, 2016 7:35 AM, "Sebastian Huber" wrote:


Hello,

I know that I am pretty late, but is there a chance to get this
into the GCC 6.1 release?

On 19/04/16 14:56, Sebastian Huber wrote:

v2: Do not use architecture configuration due to broken ARM
libatomic
support.

gcc/

* config/rtems.h (LIB_SPEC): Add -latomic.

libatomic/

* configure.tgt (*-*-rtems*): New supported target.
* config/rtems/host-config.h: New file.
* config/rtems/lock.c: Likewise.


-- 
Sebastian Huber, embedded brains GmbH


Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16 
Fax : +49 89 189 47 41-09 
E-Mail  : sebastian.hu...@embedded-brains.de

PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.







Re: Enabling -frename-registers?

2016-05-02 Thread Uros Bizjak
Hello!

> On 04/17/2016 08:59 PM, Jeff Law wrote:
>
>> invoke.texi has an independent list (probably incomplete! ;( of all the
>> things that -O2 enables.  Make sure to add -frename-registers to that
>> list and this is Ok for the trunk (gcc-7).
>
> This is what I committed.

The patch introduced bootstrap failure on alpha-linux-gnu.

Non-bootstrapped build regresses:

FAIL: gcc.dg/torture/pr69542.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/pr69542.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr69542.c   -Os  (test for excess errors)

where all failures are -fcompare-debug failures.

The failure can be reproduced with a crosscompiler to alpha-linux-gnu:

$ /ssd/uros/gcc-build-alpha/gcc/xgcc -B /ssd/uros/gcc-build-alpha/gcc
-O2 -S -fcompare-debug pr69542.c
xgcc: error: pr69542.c: -fcompare-debug failure

$ /ssd/uros/gcc-build-alpha/gcc/xgcc -B /ssd/uros/gcc-build-alpha/gcc
-O2 -S -fcompare-debug -fno-rename-registers pr69542.c
[OK]

Uros.


Re: Enabling -frename-registers?

2016-05-02 Thread Bernd Schmidt

On 05/02/2016 01:12 PM, Uros Bizjak wrote:


On 04/17/2016 08:59 PM, Jeff Law wrote:


invoke.texi has an independent list (probably incomplete! ;( of all the
things that -O2 enables.  Make sure to add -frename-registers to that
list and this is Ok for the trunk (gcc-7).


This is what I committed.


The patch introduced bootstrap failure on alpha-linux-gnu.

Non-bootstrapped build regresses:

FAIL: gcc.dg/torture/pr69542.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/pr69542.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr69542.c   -Os  (test for excess errors)

where all failures are -fcompare-debug failures.


Is the bootstrap failure also of this nature, or do you suspect 
something else?



Bernd


Re: Enabling -frename-registers?

2016-05-02 Thread Uros Bizjak
On Mon, May 2, 2016 at 1:15 PM, Bernd Schmidt  wrote:
> On 05/02/2016 01:12 PM, Uros Bizjak wrote:
>
>>> On 04/17/2016 08:59 PM, Jeff Law wrote:
>>>
>>>> invoke.texi has an independent list (probably incomplete! ;( of all the
>>>> things that -O2 enables.  Make sure to add -frename-registers to that
>>>> list and this is Ok for the trunk (gcc-7).
>>>
>>>
>>> This is what I committed.
>>
>>
>> The patch introduced bootstrap failure on alpha-linux-gnu.
>>
>> Non-bootstrapped build regresses:
>>
>> FAIL: gcc.dg/torture/pr69542.c   -O2  (test for excess errors)
>> FAIL: gcc.dg/torture/pr69542.c   -O2 -flto -fno-use-linker-plugin
>> -flto-partition=none  (test for excess errors)
>> FAIL: gcc.dg/torture/pr69542.c   -Os  (test for excess errors)
>>
>> where all failures are -fcompare-debug failures.
>
>
> Is the bootstrap failure also of this nature, or do you suspect something
> else?

I'm analysing the dumps, but I strongly suspect that it is the same problem.

(Please also note that the compiler includes Eric's PR 70886 patch [1]).

[1] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00026.html

Uros.


RE: [PATCH 6/6] [ARC] Various instruction pattern fixes

2016-05-02 Thread Claudiu Zissulescu
> > gcc/
> > 2016-04-18  Claudiu Zissulescu  
> >
> > * config/arc/arc.md (mulsidi3): Change operand 0 predicate to
> > register_operand.
> > (umulsidi3): Likewise.
> > (indirect_jump): Fix jump instruction assembly patterns.
> > (arcset): Change operand 1 predicate to
> nonmemory_operand.
> > (arcsetltu, arcsetgeu): Likewise.
> ChangeLog omission: You are also adding an r/n/r alternative.
> > (arcsethi, arcsetls): Fix pattern.
> Otherwise this is OK.
> 
> If the constant / register comparisons come from an expander, in
> general the expander should be fixed to swap the operands and
> use the swapped comparison code, to get canonical rtl.
> OTOH, constant re-materialization during register allocation can change
> a reg-reg into
> a constant-reg comparison, and at that stage, canonicalization would not
> be expected.

I will commit this patch without the arcset* mods, this is safer. 

Thanks!
Claudiu


Re: Avoid NULL cfun ICE in gcc/config/nvptx/nvptx.c:nvptx_libcall_value

2016-05-02 Thread Thomas Schwinge
Hi!

On Thu, 28 Apr 2016 13:42:39 +0200, Bernd Schmidt  wrote:
> On 04/28/2016 01:15 PM, Alexander Monakov wrote:
> > So if my understanding is correct, additional !cfun check can be acceptable
> > as a fix along the existing hack.  Perhaps with a note about the nature of
> > the hack.
> 
> Yes, I think Thomas' patch is ok.

Thanks for the review of the patch as well as the existing code; I filed
 "[nvptx] Revisit cfun->machine->doing_call".
To resolve the immediate problem, I committed to trunk in r235748:

commit 5fbb617747498ce3b0fd97aaa5334611e6220263
Author: tschwinge 
Date:   Mon May 2 11:25:17 2016 +

[PR target/70860] [nvptx] Handle NULL cfun in nvptx_libcall_value

gcc/
PR target/70860
* config/nvptx/nvptx.c (nvptx_libcall_value): Handle NULL cfun.
(nvptx_function_value): Assert non-NULL cfun.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@235748 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog| 6 ++
 gcc/config/nvptx/nvptx.c | 3 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git gcc/ChangeLog gcc/ChangeLog
index 49a314a..f344e0f 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,3 +1,9 @@
+2016-05-02  Thomas Schwinge  
+
+   PR target/70860
+   * config/nvptx/nvptx.c (nvptx_libcall_value): Handle NULL cfun.
+   (nvptx_function_value): Assert non-NULL cfun.
+
 2016-05-02  Eric Botcazou  
 
PR rtl-optimization/70886
diff --git gcc/config/nvptx/nvptx.c gcc/config/nvptx/nvptx.c
index b088cf8..a6c90b6 100644
--- gcc/config/nvptx/nvptx.c
+++ gcc/config/nvptx/nvptx.c
@@ -483,7 +483,7 @@ nvptx_strict_argument_naming (cumulative_args_t cum_v)
 static rtx
 nvptx_libcall_value (machine_mode mode, const_rtx)
 {
-  if (!cfun->machine->doing_call)
+  if (!cfun || !cfun->machine->doing_call)
 /* Pretend to return in a hard reg for early uses before pseudos can be
generated.  */
 return gen_rtx_REG (mode, NVPTX_RETURN_REGNUM);
@@ -502,6 +502,7 @@ nvptx_function_value (const_tree type, const_tree ARG_UNUSED (func),
 
   if (outgoing)
 {
+  gcc_assert (cfun);
   cfun->machine->return_mode = mode;
   return gen_rtx_REG (mode, NVPTX_RETURN_REGNUM);
 }
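
The guard added to nvptx_libcall_value can be shown in isolation. What follows
is a minimal sketch, not the backend code: the struct and enum names are
hypothetical, and the RETURN_PSEUDO branch is an assumption about what the
function does once a call is being expanded (that part is not shown in the
hunk above). Thanks to short-circuit evaluation the machine field is never
touched when the current-function pointer is NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical models of cfun->machine and cfun.  */
struct machine_function_model {
    bool doing_call;
};

struct function_model {
    struct machine_function_model *machine;
};

/* May be NULL when no function is being compiled, as in PR target/70860.  */
static struct function_model *cfun_model;

enum return_kind {
    RETURN_HARD_REG,   /* "pretend to return in a hard reg" */
    RETURN_PSEUDO      /* assumed behaviour while expanding a call */
};

enum return_kind libcall_value_kind(void)
{
    /* The left operand is tested first, so a NULL cfun_model never reaches
       the dereference on the right.  */
    if (!cfun_model || !cfun_model->machine->doing_call)
        return RETURN_HARD_REG;

    return RETURN_PSEUDO;
}
```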


Grüße
 Thomas


[testsuite,AArch64] Make scan for 'br' more robust

2016-05-02 Thread Christophe Lyon
Hi,

I've noticed a "regression" of AArch64's noplt_3.c in the gcc-6-branch
because my validation script adds the branch name to gcc/REVISION.

As a result scan-assembler-times "br" also matched "gcc-6-branch",
hence the failure.

The small attached patch replaces "br" by "br\t" to fix the problem.

I've also made a similar change to tail_indirect_call_1 although the
problem did not happen for this test because it uses scan-assembler
instead of scan-assembler-times. I think it's better to make it more
robust too.
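
The false match is easy to reproduce outside the testsuite. The sketch below
is only an illustration: count_occurrences is a hypothetical helper that does
plain substring counting (scan-assembler-times really matches a regular
expression), and the .ident string in sample_asm is made up to resemble what a
compiler built with the branch name in gcc/REVISION might emit.

```c
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of NEEDLE in TEXT.  */
size_t count_occurrences(const char *text, const char *needle)
{
    size_t count = 0;
    size_t len = strlen(needle);

    if (len == 0)
        return 0;

    for (const char *p = strstr(text, needle); p != NULL;
         p = strstr(p + len, needle))
        count++;

    return count;
}

/* Hypothetical assembly output: two indirect branches plus an .ident line
   that happens to contain the branch name.  */
static const char sample_asm[] =
    "\tbr\tx1\n"
    "\tbr\tx2\n"
    "\t.ident\t\"GCC: (GNU) 6.1.1 [gcc-6-branch]\"\n";
```

On this sample, searching for "br" finds three matches (the two instructions
and the "br" in "branch"), while searching for "br\t" finds only the two
instructions.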

OK?

Christophe

2016-05-02  Christophe Lyon  

* gcc.target/aarch64/noplt_3.c: Scan for "br\t".
* gcc.target/aarch64/tail_indirect_call_1.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/aarch64/noplt_3.c b/gcc/testsuite/gcc.target/aarch64/noplt_3.c
index ef6e65d..a382618 100644
--- a/gcc/testsuite/gcc.target/aarch64/noplt_3.c
+++ b/gcc/testsuite/gcc.target/aarch64/noplt_3.c
@@ -16,5 +16,5 @@ cal_novalue (int a)
   dec (a);
 }
 
-/* { dg-final { scan-assembler-times "br" 2 } } */
+/* { dg-final { scan-assembler-times "br\t" 2 } } */
 /* { dg-final { scan-assembler-not "b\t" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/tail_indirect_call_1.c b/gcc/testsuite/gcc.target/aarch64/tail_indirect_call_1.c
index 4759d20..e863323 100644
--- a/gcc/testsuite/gcc.target/aarch64/tail_indirect_call_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/tail_indirect_call_1.c
@@ -3,7 +3,7 @@
 
 typedef void FP (int);
 
-/* { dg-final { scan-assembler "br" } } */
+/* { dg-final { scan-assembler "br\t" } } */
 /* { dg-final { scan-assembler-not "blr" } } */
 void
 f1 (FP fp, int n)


Re: Enabling -frename-registers?

2016-05-02 Thread Uros Bizjak
On Mon, May 2, 2016 at 1:21 PM, Uros Bizjak  wrote:

>>>> On 04/17/2016 08:59 PM, Jeff Law wrote:
>>>>
>>>>> invoke.texi has an independent list (probably incomplete! ;( of all the
>>>>> things that -O2 enables.  Make sure to add -frename-registers to that
>>>>> list and this is Ok for the trunk (gcc-7).
>>>>
>>>>
>>>> This is what I committed.
>>>
>>>
>>> The patch introduced bootstrap failure on alpha-linux-gnu.
>>>
>>> Non-bootstrapped build regresses:
>>>
>>> FAIL: gcc.dg/torture/pr69542.c   -O2  (test for excess errors)
>>> FAIL: gcc.dg/torture/pr69542.c   -O2 -flto -fno-use-linker-plugin
>>> -flto-partition=none  (test for excess errors)
>>> FAIL: gcc.dg/torture/pr69542.c   -Os  (test for excess errors)
>>>
>>> where all failures are -fcompare-debug failures.
>>
>>
>> Is the bootstrap failure also of this nature, or do you suspect something
>> else?
>
> I'm analysing the dumps, but I strongly suspect that it is the same problem.
>
> (Please also note that the compiler includes Eric's PR 70886 patch [1]).
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00026.html

With the referenced testcase, the compare-debug failure trips on (debug_insn 101):

;; basic block 8, loop depth 0, count 0, freq 9788, maybe hot
;;  prev block 7, next block 9, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:   7 [100.0%]  (FALLTHRU)
;; bb 8 artificial_defs: { }
;; bb 8 artificial_uses: { u-1(30){ }}
;; lr  in   1 [$1] 2 [$2] 3 [$3] 4 [$4] 5 [$5] 6 [$6] 29 [$29] 30 [$30]
;; lr  use  1 [$1] 2 [$2] 4 [$4] 30 [$30]
;; lr  def  1 [$1] 2 [$2] 4 [$4]
;; live  in   1 [$1] 2 [$2] 3 [$3] 4 [$4] 5 [$5] 6 [$6] 29 [$29] 30 [$30]
;; live  gen  1 [$1] 2 [$2] 4 [$4]
;; live  kill
(note 100 99 101 8 [bb 8] NOTE_INSN_BASIC_BLOCK)
(debug_insn 101 100 104 8 (var_location:DI y (mem/f:DI (plus:DI
(plus:DI (ashift:DI (reg:DI 2 $2 [orig:110 _19 ] [110])
(const_int 3 [0x3]))
(reg/v/f:DI 1 $1 [orig:109 n ] [109]))
(const_int 8 [0x8])) [3 n_18->h S8 A64])) pr69542.c:31 -1
 (nil))
(insn 104 101 105 8 (set (reg:DI 17 $17 [154])
(ashift:DI (reg:DI 2 $2 [orig:110 _19 ] [110])
(const_int 3 [0x3]))) pr69542.c:31 64 {ashldi3}
 (nil))

The difference starts in the rnreg pass, where without debug:

processing block 8:
opening incoming chain
Creating chain $1 (53)
opening incoming chain
Creating chain $2 (54)
opening incoming chain
Creating chain $3 (55)
opening incoming chain
Creating chain $4 (56)
opening incoming chain
Creating chain $5 (57)
opening incoming chain
Creating chain $6 (58)
opening incoming chain
Creating chain $29 (59)
opening incoming chain
Creating chain $30 (60)
Closing chain $2 (54) at insn 92 (terminate_write, superset)
Creating chain $2 (61) at insn 92
Closing chain $1 (53) at insn 93 (terminate_dead, superset)
...

and with debug:

processing block 8:
opening incoming chain
Creating chain $1 (53)
opening incoming chain
Creating chain $2 (54)
opening incoming chain
Creating chain $3 (55)
opening incoming chain
Creating chain $4 (56)
opening incoming chain
Creating chain $5 (57)
opening incoming chain
Creating chain $6 (58)
opening incoming chain
Creating chain $29 (59)
opening incoming chain
Creating chain $30 (60)
==> Cannot rename chain $1 (53) at insn 101 (mark_read)
==> Widening register in chain $1 (53) at insn 101
Closing chain $2 (54) at insn 104 (terminate_write, superset)
Creating chain $2 (61) at insn 104
Closing chain $1 (53) at insn 105 (terminate_dead, superset)
...

> Uros.


Re: [PATCH] Fix PR tree-optimization/51513

2016-05-02 Thread Peter Bergner
On Mon, 2016-05-02 at 10:49 +0200, Richard Biener wrote:
> Again, the wild jump is not a bug but at most a missed optimization
> (to remove it).

Sorry, came down with a cold and haven't looked into this yet.
I'll do that today.

I agree it's a missed optimization bug. We noticed this with a
post-compilation analysis tool that itself had problems with the wild
branch (since fixed), which is why we wanted to fix this.

Peter



[ada, build] Fix make install-gcc-specs with empty GCC_SPEC_FILES

2016-05-02 Thread Rainer Orth
Installing gcc 6.1.0 on Solaris 10 with /bin/ksh failed for
install-gcc-specs (I'd already seen that for 5.1.0, but forgotten about
it):

make[3]: Entering directory `/var/gcc/gcc-6.1.0/10-gcc-gas/gcc/ada'
for f in ; do \
cp -p /vol/src/gnu/gcc/gcc-6.1.0/gcc/ada/$f \
/vol/gcc-5/lib/gcc/i386-pc-solaris2.10/6.1.0/$(echo $f|sed -e 
's#_[a-zA-Z0-9]*##g'); \
done
/bin/ksh: syntax error at line 1 : `;' unexpected
make[3]: *** [install-gcc-specs] Error 2

For most targets, GCC_SPEC_FILES is empty, and for f in ; do makes the
ancient Solaris 10 ksh88 choke.  The problem can be avoided by using
$(foreach instead.  During testing, I noticed that the target also
doesn't honor DESTDIR.

The following patch fixes both.  Tested by running make
install-gcc-specs on i386-pc-solaris2.10 and additionally with
GCC_SPEC_FILES=vxworks-x86-link.spec.

Ok for mainline and the gcc-6 and gcc-5 branches?

Rainer


2016-04-29  Rainer Orth  

* gcc-interface/Makefile.in (install-gcc-specs): Use foreach.
Honor DESTDIR.

# HG changeset patch
# Parent  6a07b8a8ce8b870d4cf37ebbcac7d7965340d4d6
Fix make install-gcc-specs with empty GCC_SPEC_FILES

diff --git a/gcc/ada/gcc-interface/Makefile.in b/gcc/ada/gcc-interface/Makefile.in
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -2670,10 +2670,9 @@ gnatlink-re: ../stamp-tools gnatmake-re
 install-gcc-specs:
 #	Install all the requested GCC spec files.
 
-	for f in $(GCC_SPEC_FILES); do \
-	$(INSTALL_DATA_DATE) $(srcdir)/ada/$$f \
-	$(libsubdir)/$$(echo $$f|sed -e 's#_[a-zA-Z0-9]*##g'); \
-	done
+	$(foreach f,$(GCC_SPEC_FILES), \
+	$(INSTALL_DATA_DATE) $(srcdir)/ada/$(f) \
+	$(DESTDIR)$(libsubdir)/$$(echo $(f)|sed -e 's#_[a-zA-Z0-9]*##g');)
 
 install-gnatlib: ../stamp-gnatlib-$(RTSDIR) install-gcc-specs
 	$(RMDIR) $(DESTDIR)$(ADA_RTL_OBJ_DIR)

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [ada, build] Fix make install-gcc-specs with empty GCC_SPEC_FILES

2016-05-02 Thread Arnaud Charlet
> Installing gcc 6.1.0 on Solaris 10 with /bin/ksh failed for
> install-gcc-specs (I'd already seen that for 5.1.0, but forgotten about
> it):
> 
> make[3]: Entering directory `/var/gcc/gcc-6.1.0/10-gcc-gas/gcc/ada'
> for f in ; do \
> cp -p /vol/src/gnu/gcc/gcc-6.1.0/gcc/ada/$f \
> /vol/gcc-5/lib/gcc/i386-pc-solaris2.10/6.1.0/$(echo
> $f|sed -e 's#_[a-zA-Z0-9]*##g'); \
> done
> /bin/ksh: syntax error at line 1 : `;' unexpected
> make[3]: *** [install-gcc-specs] Error 2
> 
> For most targets, GCC_SPEC_FILES is empty, and for f in ; do makes the
> ancient Solaris 10 ksh88 choke.  The problem can be avoided by using
> $(foreach instead.  During testing, I noticed that the target also
> doesn't honor DESTDIR.
> 
> The following patch fixes both.  Tested by running make
> install-gcc-specs on i386-pc-solaris2.10 and additionally with
> GCC_SPEC_FILES=vxworks-x86-link.spec.
> 
> Ok for mainline and the gcc-6 and gcc-5 branches?

OK, thanks.


Re: Thoughts on memcmp expansion (PR43052)

2016-05-02 Thread Richard Biener
On Thu, Apr 28, 2016 at 8:36 PM, Bernd Schmidt  wrote:
> On 01/18/2016 10:22 AM, Richard Biener wrote:
>>
>> See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171 - the
>> inline expansion
>> for small sizes and equality compares should be done on GIMPLE.  Today the
>> strlen pass might be an appropriate place to do this given its
>> superior knowledge
>> about string lengths.
>>
>> The idea of turning eq feeding memcmp into a special memcmp_eq is good but
>> you have to avoid doing that too early - otherwise you'd lose on
>>
>>    res = memcmp (p, q, sz);
>>    if (memcmp (p, q, sz) == 0)
>>      ...
>>
>> that is, you have to make sure CSE got the chance to common the two calls.
>> This is why I think this kind of transform needs to happen in specific
>> places
>> (like during strlen opt) rather than in generic folding.
>
>
> Ok, here's an update. I kept pieces of your patch from that PR, but also
> translating memcmps larger than a single operation into memcmp_eq as in my
> previous patch.
>
> Then, I added by_pieces infrastructure for memcmp expansion. To avoid any
> more code duplication in this area, I abstracted the existing code and
> converted it to C++ classes since that seemed to fit pretty well.
>
> There are a few possible ways I could go with this, which is why I'm posting
> it more as a RFD at this point.
>  - should store_by_pieces be eliminated in favour of doing just
>    move_by_pieces with constfns?
>  - the C++ification could be extended, making move_by_pieces_d and
>    compare_by_pieces_d classes inheriting from a common base. This
>    would get rid of the callbacks, replacing them with virtuals,
>    and also make some of the current struct members private.
>  - could move all of the by_pieces stuff out into a separate file?
>
> Later, I think we'll also want to extend this to allow vector mode
> operations, but I think that's a separate patch.
>
> So, opinions on what I should be doing with this patch? FWIW it bootstraps and
> tests OK on x86_64-linux.

+struct pieces_addr
+{
...
+  void *m_cfndata;
+public:
+  pieces_addr (rtx, bool, by_pieces_constfn, void *);

unless you stick private: somewhere the public: is redundant
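
To illustrate the access-specifier point, here is a minimal sketch. The type
names and the accessor are made up for illustration and are not the ones from
the patch. Members of a struct are public by default, so a public: label only
has an effect after an earlier private: (or protected:) section.

```cpp
#include <cassert>

// Hypothetical stand-in for the struct in the patch; names are illustrative.
struct pieces_addr_like
{
  void *m_cfndata;                 // public by default in a struct
public:                            // redundant: nothing above is private
  explicit pieces_addr_like (void *data) : m_cfndata (data)
  {
    assert (data != nullptr);      // sketch-only sanity check
  }
};

// Same data with an explicit private section; here public: is required
// for the constructor and the accessor to be usable from outside.
struct pieces_addr_private
{
private:
  void *m_cfndata;
public:
  explicit pieces_addr_private (void *data) : m_cfndata (data)
  {
    assert (data != nullptr);      // sketch-only sanity check
  }
  void *cfndata () const { return m_cfndata; }
};
```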

Index: gcc/tree-ssa-forwprop.c
===
--- gcc/tree-ssa-forwprop.c (revision 235474)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -1435,6 +1435,76 @@ simplify_builtin_call (gimple_stmt_itera
}
}
   break;
+
+case BUILT_IN_MEMCMP:
+  {

I think this doesn't belong in forwprop.  If we want to stick it into
a pass rather than
folding it should be in tree-ssa-strlen.c.

+   if (tree_fits_uhwi_p (len)
+   && (leni = tree_to_uhwi (len)) <= GET_MODE_SIZE (word_mode)
+   && exact_log2 (leni) != -1
+   && (align1 = get_pointer_alignment (arg1)) >= leni * BITS_PER_UNIT
+   && (align2 = get_pointer_alignment (arg2)) >= leni * BITS_PER_UNIT)
+ {
+   location_t loc = gimple_location (stmt2);
+   tree type, off;
+   type = build_nonstandard_integer_type (leni * BITS_PER_UNIT, 1);
+   gcc_assert (GET_MODE_SIZE (TYPE_MODE (type)) == leni);
+   off = build_int_cst
+(build_pointer_type_for_mode (char_type_node, ptr_mode, true), 0);
+   arg1 = build2_loc (loc, MEM_REF, type, arg1, off);
+   arg2 = build2_loc (loc, MEM_REF, type, arg2, off);
+   res = fold_convert_loc (loc, TREE_TYPE (res),
+   fold_build2_loc (loc, NE_EXPR,
+boolean_type_node,
+arg1, arg2));
+   gimplify_and_update_call_from_tree (gsi_p, res);
+   return true;
+ }

for this part see gimple_fold_builtin_memory_op handling of

  /* If we can perform the copy efficiently with first doing all loads
 and then all stores inline it that way.  Currently efficiently
 means that we can load all the memory into a single integer
 register which is what MOVE_MAX gives us.  */

we might want to share a helper yielding the type of the load/store
or NULL if not possible.
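
As a rough source-level picture of what the quoted hunk builds, here is a
minimal sketch for the 4-byte case. The function name is hypothetical, and
memcpy stands in for the aligned MEM_REF loads so the sketch carries no
alignment or aliasing requirements of its own. It only answers "do the blocks
differ?"; the integer compare says nothing about ordering.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Sketch: equality-only memcmp of 4 bytes done as two integer loads and
// one compare.  Returns nonzero when the blocks differ, mirroring the
// NE_EXPR built in the quoted hunk.
int memcmp_ne4 (const void *p, const void *q)
{
  assert (p != nullptr && q != nullptr);
  std::uint32_t a, b;
  std::memcpy (&a, p, sizeof a);   // load 4 bytes from the first block
  std::memcpy (&b, q, sizeof b);   // load 4 bytes from the second block
  return a != b;
}
```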

Note that we can handle size-1 memcmp even for ordered compares.
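
A minimal sketch of why the size-1 case also works for ordered compares
(memcmp1 is a hypothetical name): memcmp compares bytes as unsigned char, so
for a single byte the sign of the difference is the sign of the result.

```cpp
#include <cassert>
#include <cstring>

// Sketch: memcmp (p, q, 1) as a subtraction of the two bytes, each read
// as unsigned char.  The result is negative, zero or positive exactly
// when memcmp's result is.
int memcmp1 (const void *p, const void *q)
{
  assert (p != nullptr && q != nullptr);
  const unsigned char a = *static_cast<const unsigned char *> (p);
  const unsigned char b = *static_cast<const unsigned char *> (q);
  return static_cast<int> (a) - static_cast<int> (b);
}
```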

Jakub, where do you think this fits best?  Note that gimple-fold.c may
not use immediate uses but would have to match this from the
comparison (I still have to find a way to handle this in match.pd where
the result expression contains virtual operands in the not toplevel stmt).

Richard.

>
> Bernd


Re: Thoughts on memcmp expansion (PR43052)

2016-05-02 Thread Bernd Schmidt

On 05/02/2016 02:52 PM, Richard Biener wrote:

+struct pieces_addr
+{
...
+  void *m_cfndata;
+public:
+  pieces_addr (rtx, bool, by_pieces_constfn, void *);

unless you stick private: somewhere the public: is redundant


Yeah, ideally I want to turn these into classes rather than structs.
Maybe that particular one can already be done, but I'm kind of wondering 
how far to take the C++ification of the other one.



Index: gcc/tree-ssa-forwprop.c
===
--- gcc/tree-ssa-forwprop.c (revision 235474)
+++ gcc/tree-ssa-forwprop.c (working copy)
@@ -1435,6 +1435,76 @@ simplify_builtin_call (gimple_stmt_itera
 }
 }
break;
+
+case BUILT_IN_MEMCMP:
+  {

I think this doesn't belong in forwprop.  If we want to stick it into
a pass rather than
folding it should be in tree-ssa-strlen.c.


This part (and the other one you quoted) was essentially your prototype 
patch from PR52171. I can put it wherever you like, really.



Note that we can handle size-1 memcmp even for ordered compares.


One would hope this doesn't occur very often...


Jakub, where do you think this fits best?  Note that gimple-fold.c may
not use immediate uses but would have to match this from the
comparison (I still have to find a way to handle this in match.pd where
the result expression contains virtual operands in the not toplevel stmt).



Bernd


Re: Thoughts on memcmp expansion (PR43052)

2016-05-02 Thread Richard Biener
On Mon, May 2, 2016 at 2:57 PM, Bernd Schmidt  wrote:
> On 05/02/2016 02:52 PM, Richard Biener wrote:
>>
>> +struct pieces_addr
>> +{
>> ...
>> +  void *m_cfndata;
>> +public:
>> +  pieces_addr (rtx, bool, by_pieces_constfn, void *);
>>
>> unless you stick private: somewhere the public: is redundant
>
>
> Yeah, ideally I want to turn these into classes rather than structs. Maybe
> that particular one can already be done, but I'm kind of wondering how far
> to take the C++ification of the other one.
>
>> Index: gcc/tree-ssa-forwprop.c
>> ===
>> --- gcc/tree-ssa-forwprop.c (revision 235474)
>> +++ gcc/tree-ssa-forwprop.c (working copy)
>> @@ -1435,6 +1435,76 @@ simplify_builtin_call (gimple_stmt_itera
>>  }
>>  }
>> break;
>> +
>> +case BUILT_IN_MEMCMP:
>> +  {
>>
>> I think this doesn't belong in forwprop.  If we want to stick it into
>> a pass rather than
>> folding it should be in tree-ssa-strlen.c.
>
>
> This part (and the other one you quoted) was essentially your prototype
> patch from PR52171. I can put it wherever you like, really.

I think it fits best in tree-ssa-strlen.c:strlen_optimize_stmt for the moment.

>> Note that we can handle size-1 memcmp even for ordered compares.
>
>
> One would hope this doesn't occur very often...

C++ templates  but yes.

Richard.

>
>> Jakub, where do you think this fits best?  Note that gimple-fold.c may
>> not use immediate uses but would have to match this from the
>> comparison (I still have to find a way to handle this in match.pd where
>> the result expression contains virtual operands in the not toplevel stmt).
>
>
>
> Bernd


Re: Enabling -frename-registers?

2016-05-02 Thread Bernd Schmidt

On 05/02/2016 01:57 PM, Uros Bizjak wrote:


With the referred testcase, the compare-debug failure trips on (debug_insn 101)


Ok, INDEX_REG_CLASS is NO_REGS on alpha, and apparently the contents of 
the MEM isn't a valid address.


Try this?


Bernd
Index: gcc/regrename.c
===
--- gcc/regrename.c	(revision 235753)
+++ gcc/regrename.c	(working copy)
@@ -1238,6 +1238,19 @@ scan_rtx_reg (rtx_insn *insn, rtx *loc,
 }
 }
 
+/* A wrapper around base_reg_class which returns ALL_REGS if INSN is a
+   DEBUG_INSN.  The arguments MODE, AS, CODE and INDEX_CODE are as for
+   base_reg_class.  */
+
+static reg_class
+base_reg_class_for_rename (rtx_insn *insn, machine_mode mode, addr_space_t as,
+			   rtx_code code, rtx_code index_code)
+{
+  if (DEBUG_INSN_P (insn))
+return ALL_REGS;
+  return base_reg_class (mode, as, code, index_code);
+}
+
 /* Adapted from find_reloads_address_1.  CL is INDEX_REG_CLASS or
BASE_REG_CLASS depending on how the register is being considered.  */
 
@@ -1343,12 +1356,16 @@ scan_rtx_address (rtx_insn *insn, rtx *l
 	  }
 
 	if (locI)
-	  scan_rtx_address (insn, locI, INDEX_REG_CLASS, action, mode, as);
+	  {
+	reg_class iclass = DEBUG_INSN_P (insn) ? ALL_REGS : INDEX_REG_CLASS;
+	scan_rtx_address (insn, locI, iclass, action, mode, as);
+	  }
 	if (locB)
-	  scan_rtx_address (insn, locB,
-			base_reg_class (mode, as, PLUS, index_code),
-			action, mode, as);
-
+	  {
+	reg_class bclass = base_reg_class_for_rename (insn, mode, as, PLUS,
+			  index_code);
+	scan_rtx_address (insn, locB, bclass, action, mode, as);
+	  }
 	return;
   }
 
@@ -1366,10 +1383,13 @@ scan_rtx_address (rtx_insn *insn, rtx *l
   break;
 
 case MEM:
-  scan_rtx_address (insn, &XEXP (x, 0),
-			base_reg_class (GET_MODE (x), MEM_ADDR_SPACE (x),
-	MEM, SCRATCH),
-			action, GET_MODE (x), MEM_ADDR_SPACE (x));
+  {
+	reg_class bclass = base_reg_class_for_rename (insn, GET_MODE (x),
+		  MEM_ADDR_SPACE (x),
+		  MEM, SCRATCH);
+	scan_rtx_address (insn, &XEXP (x, 0), bclass, action, GET_MODE (x),
+			  MEM_ADDR_SPACE (x));
+  }
   return;
 
 case REG:
@@ -1416,10 +1436,14 @@ scan_rtx (rtx_insn *insn, rtx *loc, enum
   return;
 
 case MEM:
-  scan_rtx_address (insn, &XEXP (x, 0),
-			base_reg_class (GET_MODE (x), MEM_ADDR_SPACE (x),
-	MEM, SCRATCH),
-			action, GET_MODE (x), MEM_ADDR_SPACE (x));
+  {
+	reg_class bclass = base_reg_class_for_rename (insn, GET_MODE (x),
+		  MEM_ADDR_SPACE (x),
+		  MEM, SCRATCH);
+
+	scan_rtx_address (insn, &XEXP (x, 0), bclass, action, GET_MODE (x),
+			  MEM_ADDR_SPACE (x));
+  }
   return;
 
 case SET:


Re: Enabling -frename-registers?

2016-05-02 Thread Jeff Law

On 04/29/2016 07:32 AM, Bernd Schmidt wrote:

On 04/29/2016 03:02 PM, David Edelsohn wrote:

How has this shown general benefit for all architectures to deserve
enabling it by default at -O2?


It should improve postreload scheduling in general, and it can also help
clear up bad code generation left behind by register allocation.

Right.  ISTM the round-robin renaming reduces the false dependencies
that are inherently created by register allocation.


It should benefit any architecture that utilizes post-reload scheduling.

jeff




[libvtv, build] Don't install libvtv without --enable-vtable-verify

2016-05-02 Thread Rainer Orth
When installing gcc 6.1.0 on Solaris 12, installation failed in libvtv:

libtool: install: /usr/gnu/bin/install -c .libs/libvtv.lai 
/var/gcc/gcc-6.1.0/12-gcc-gas/install/vol/gcc-6/lib/amd64/libvtv.la
libtool: install: /usr/gnu/bin/install -c .libs/libvtv.a 
/var/gcc/gcc-6.1.0/12-gcc-gas/install/vol/gcc-6/lib/amd64/libvtv.a
/usr/gnu/bin/install: cannot stat '.libs/libvtv.a': No such file or directory
make[10]: *** [install-toolexeclibLTLIBRARIES] Error 1
make[10]: Leaving directory 
`/var/gcc/gcc-6.1.0/12-gcc-gas/i386-pc-solaris2.12/amd64/libvtv'

The problem is that libvtv.a is created like this

libtool: link: ar rc .libs/libvtv.a 
libtool: link: ranlib .libs/libvtv.a

i.e. with no objects, when --enable-vtable-verify isn't specified, and
Solaris ar does nothing in this case, unlike GNU ar which creates an
archive containing only the 8-byte archive header.

Given that in this situation libvtv is useless anyway (the vtv_*.o files
in libgcc aren't built either), I've chosen to avoid the installation
completely.
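
For reference, the 8-byte header mentioned above is the ar magic string
"!<arch>\n". A minimal sketch of checking for such a member-less archive
follows; the helper name is hypothetical and it works on bytes already read
into a string.

```cpp
#include <cassert>
#include <string>

// The global header that starts every ar archive: 8 bytes, "!<arch>\n".
static const std::string ar_magic = "!<arch>\n";

// Hypothetical helper: true if BYTES is an archive with no members,
// i.e. exactly the 8-byte global header and nothing else.
bool is_empty_ar_archive (const std::string &bytes)
{
  assert (ar_magic.size () == 8);
  return bytes == ar_magic;
}
```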

Tested on i386-pc-solaris2.11 without and with --enable-vtable-verify.

Ok for mainline and gcc-6 branch?

Thanks.
Rainer


2016-04-29  Rainer Orth  

* Makefile.am (toolexeclib_LTLIBRARIES): Only set if
ENABLE_VTABLE_VERIFY.
Simplify.
* Makefile.in: Regenerate.

# HG changeset patch
# Parent  47d2bbf59155ec37a1fa565f8774657b7045ef41
Don't install libvtv without --enable-vtable-verify

diff --git a/libvtv/Makefile.am b/libvtv/Makefile.am
--- a/libvtv/Makefile.am
+++ b/libvtv/Makefile.am
@@ -38,10 +38,11 @@ AM_CXXFLAGS = $(XCFLAGS)
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 AM_CXXFLAGS += -Wl,-u_vtable_map_vars_start,-u_vtable_map_vars_end
 
+if ENABLE_VTABLE_VERIFY
+  toolexeclib_LTLIBRARIES = libvtv.la
 if VTV_CYGMIN
-  toolexeclib_LTLIBRARIES = libvtv.la libvtv_stubs.la
-else
-  toolexeclib_LTLIBRARIES = libvtv.la
+  toolexeclib_LTLIBRARIES += libvtv_stubs.la
+endif
 endif
 
 vtv_headers = \

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, rs6000] Add support for int versions of vec_adde

2016-05-02 Thread Andreas Schwab
Bill Seurer  writes:

> * gcc.target/powerpc/vec-adde.c: New test.
> * gcc.target/powerpc/vec-adde-int128.c: New test.

-m32:

FAIL: gcc.target/powerpc/vec-adde.c execution test
FAIL: gcc.target/powerpc/vec-adde-int128.c (test for excess errors)
Excess errors:
/daten/gcc/gcc-20160501/gcc/testsuite/gcc.target/powerpc/vec-adde-int128.c:66:30:
 error: '__int128' is not supported on this target

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


PING: [PATCH] PR target/70454: Build x86 libgomp with -march=i486 or better

2016-05-02 Thread H.J. Lu
On Mon, Apr 25, 2016 at 1:36 PM, H.J. Lu  wrote:
> If x86 libgomp isn't compiled with -march=i486 or better, append
> -march=i486 XCFLAGS for x86 libgomp build.
>
> Tested on i686 with and without --with-arch=i386.  Tested on
> x86-64 with and without --with-arch_32=i386.  OK for trunk?
>
>
> H.J.
> ---
> PR target/70454
> * configure.tgt (XCFLAGS): Append -march=i486 to compile x86
> libgomp if needed.
> ---
>  libgomp/configure.tgt | 36 
>  1 file changed, 16 insertions(+), 20 deletions(-)
>
> diff --git a/libgomp/configure.tgt b/libgomp/configure.tgt
> index 77e73f0..c876e80 100644
> --- a/libgomp/configure.tgt
> +++ b/libgomp/configure.tgt
> @@ -67,28 +67,24 @@ if test x$enable_linux_futex = xyes; then
> ;;
>
>  # Note that bare i386 is not included here.  We need cmpxchg.
> -i[456]86-*-linux*)
> +i[456]86-*-linux* | x86_64-*-linux*)
> config_path="linux/x86 linux posix"
> -   case " ${CC} ${CFLAGS} " in
> - *" -m64 "*|*" -mx32 "*)
> -   ;;
> - *)
> -   if test -z "$with_arch"; then
> - XCFLAGS="${XCFLAGS} -march=i486 -mtune=${target_cpu}"
> +   # Need i486 or better.
> +   cat > conftestx.c <<EOF
> +#if defined __x86_64__ || defined __i486__ || defined __pentium__ \
> +  || defined __pentiumpro__ || defined __pentium4__ \
> +  || defined __geode__ || defined __SSE__
> +# error Need i486 or better
> +#endif
> +EOF
> +   if ${CC} ${CFLAGS} -c -o conftestx.o conftestx.c > /dev/null 2>&1; then
> +   if test "${target_cpu}" = x86_64; then
> +   XCFLAGS="${XCFLAGS} -march=i486 -mtune=generic"
> +   else
> +   XCFLAGS="${XCFLAGS} -march=i486 -mtune=${target_cpu}"
> fi
> -   esac
> -   ;;
> -
> -# Similar jiggery-pokery for x86_64 multilibs, except here we
> -# can't rely on the --with-arch configure option, since that
> -# applies to the 64-bit side.
> -x86_64-*-linux*)
> -   config_path="linux/x86 linux posix"
> -   case " ${CC} ${CFLAGS} " in
> - *" -m32 "*)
> -   XCFLAGS="${XCFLAGS} -march=i486 -mtune=generic"
> -   ;;
> -   esac
> +   fi
> +   rm -f conftestx.c conftestx.o
> ;;
>
>  # Note that sparcv7 and sparcv8 is not included here.  We need cas.
> --
> 2.5.5
>

PING.


-- 
H.J.


Re: [PATCH] Drop excess size used for run time allocated stack variables.

2016-05-02 Thread Dominik Vogt
>  static rtx
> -round_push (rtx size)
> +round_push (rtx size, int already_added)

round_push also needs to know about the required alignment in case
that is more strict than a simple stack slot alignment.

>  {
> -  rtx align_rtx, alignm1_rtx;
> +  rtx align_rtx, add_rtx;
>  
>if (!SUPPORTS_STACK_ALIGNMENT
>|| crtl->preferred_stack_boundary == MAX_SUPPORTED_STACK_ALIGNMENT)
>  {
>int align = crtl->preferred_stack_boundary / BITS_PER_UNIT;

=>

  int align = MAX (required_align,
   crtl->preferred_stack_boundary) / BITS_PER_UNIT;

Unfortunately the testsuite didn't detect this bug; it showed up
while testing another patch related to stack layout.  I'm testing
the modified patch right now.
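
The effect of taking the stricter alignment can be sketched as follows. This
is not the explow.c code: the helper name, the parameter names and the value
chosen for BITS_PER_UNIT are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>

const unsigned bits_per_unit = 8;   // assumed value of BITS_PER_UNIT

// Hypothetical helper, not GCC code: round SIZE (in bytes) up to the
// stricter of the two alignments.  Both alignments are given in bits
// and are expected to be powers of two.
unsigned long round_to_stack_align (unsigned long size,
                                    unsigned required_align_bits,
                                    unsigned preferred_boundary_bits)
{
  unsigned long align
    = std::max (required_align_bits, preferred_boundary_bits) / bits_per_unit;
  assert (align != 0 && (align & (align - 1)) == 0);
  return (size + align - 1) & ~(align - 1);
}
```

With a 64-bit preferred boundary, a request aligned to 256 bits is rounded to
a multiple of 32 bytes instead of 8, which is the case the original patch
missed.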

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PATCH] Fix spec-options.c test case

2016-05-02 Thread Bernd Edlinger
On 02.05.2016 12:26, Bernd Schmidt wrote:
> On 05/01/2016 09:52 AM, Bernd Edlinger wrote:
>> Hi,
>>
>> I took a closer look at this test case, and I found that, besides
>> triggering a dejagnu bug, it is also wrong.  I have tested with
>> a cross-compiler for target=sh-elf and found that the test case
>> actually FAILs because the foo.specs uses "cppruntime" which
>> is only referenced in gcc/config/sh/superh.h, but sh/superh.h
>> is only included for target sh*-superh-elf, see gcc/config.gcc.
>>
>> This means that it can only pass for target=sh-superh-elf.
>>
>> The attached patch fixes the testcase and makes it run always,
>> so that it no longer triggers the dejagnu bug.
>
> So, two things. Why not use a string in the specs file that exists on
> all targets? If it's a sh-specific thing we want to test, why not
> move it to gcc.target?


Yes, you are right.  Only the original use-case seems to be
sh-superh-elf specific.  But there are also spec strings
that are always available.  I think adding -DFOO to
"cpp_unique_options" will work on any target, and make the
test case even more useful.


So is the updated patch OK?


Thanks
Bernd.


2016-05-02  Bernd Edlinger  

	* gcc.dg/spec-options.c: Run the test on all targets.
	* gcc.dg/foo.specs: Use cpp_unique_options.

Index: gcc/testsuite/gcc.dg/foo.specs
===
--- gcc/testsuite/gcc.dg/foo.specs	(revision 235675)
+++ gcc/testsuite/gcc.dg/foo.specs	(working copy)
@@ -1,2 +1,2 @@
-*cppruntime:
+*cpp_unique_options:
 + %{tfoo: -DFOO}
Index: gcc/testsuite/gcc.dg/spec-options.c
===
--- gcc/testsuite/gcc.dg/spec-options.c	(revision 235675)
+++ gcc/testsuite/gcc.dg/spec-options.c	(working copy)
@@ -1,8 +1,7 @@
 /* Check that -mfoo is accepted if defined in a user spec
and that it is not passed on the command line.  */
 /* Must be processed in EXTRA_SPECS to run.  */
-/* { dg-do compile } */
-/* { dg-do run { target sh*-*-* } } */
+/* { dg-do run } */
 /* { dg-options "-B${srcdir}/gcc.dg --specs=foo.specs -tfoo" } */
 
 extern void abort(void);


[PATCH] [ARC] Use GOTOFFPC relocation for pc-relative accesses.

2016-05-02 Thread Claudiu Zissulescu
This patch makes pc-relative accesses safer by using the @pcl
syntax. This new syntax generates a pc-relative relocation which will
be handled by the assembler.

OK to apply?
Claudiu

gcc/
2016-05-02  Claudiu Zissulescu  
Joern Rennecke  

* config/arc/arc.c (arc_print_operand_address): Handle pc-relative
addresses.
(arc_needs_pcl_p): Add GOTOFFPC.
(arc_legitimate_pic_addr_p): Likewise.
(arc_output_pic_addr_const): Likewise.
(arc_legitimize_pic_address): Generate a pc-relative address using
GOTOFFPC.
(arc_output_libcall): Use @pcl syntax.
(arc_delegitimize_address_0): Delegitimize ARC_UNSPEC_GOTOFFPC.
* config/arc/arc.md ("unspec"): Add ARC_UNSPEC_GOTOFFPC.
(*movsi_insn): Use @pcl syntax.
(doloop_begin_i): Likewise.
---
 gcc/config/arc/arc.c  | 53 ---
 gcc/config/arc/arc.md |  6 --
 2 files changed, 33 insertions(+), 26 deletions(-)

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 49edc0a..c0aa075 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -3528,7 +3528,8 @@ arc_print_operand_address (FILE *file , rtx addr)
 || XINT (c, 1) == UNSPEC_TLS_IE))
|| (GET_CODE (c) == PLUS
&& GET_CODE (XEXP (c, 0)) == UNSPEC
-   && (XINT (XEXP (c, 0), 1) == UNSPEC_TLS_OFF)))
+   && (XINT (XEXP (c, 0), 1) == UNSPEC_TLS_OFF
+   || XINT (XEXP (c, 0), 1) == ARC_UNSPEC_GOTOFFPC)))
  {
arc_output_pic_addr_const (file, c, 0);
break;
@@ -4636,6 +4637,7 @@ arc_needs_pcl_p (rtx x)
 switch (XINT (x, 1))
   {
   case ARC_UNSPEC_GOT:
+  case ARC_UNSPEC_GOTOFFPC:
   case UNSPEC_TLS_GD:
   case UNSPEC_TLS_IE:
return true;
@@ -4698,9 +4700,10 @@ arc_legitimate_pic_addr_p (rtx addr)
   || XVECLEN (addr, 0) != 1)
 return false;
 
-  /* Must be one of @GOT, @GOTOFF, @tlsgd, tlsie.  */
+  /* Must be one of @GOT, @GOTOFF, @GOTOFFPC, @tlsgd, tlsie.  */
   if (XINT (addr, 1) != ARC_UNSPEC_GOT
   && XINT (addr, 1) != ARC_UNSPEC_GOTOFF
+  && XINT (addr, 1) != ARC_UNSPEC_GOTOFFPC
   && XINT (addr, 1) != UNSPEC_TLS_GD
   && XINT (addr, 1) != UNSPEC_TLS_IE)
 return false;
@@ -4917,26 +4920,15 @@ arc_legitimize_pic_address (rtx orig, rtx oldx)
   else if (!flag_pic)
return orig;
   else if (CONSTANT_POOL_ADDRESS_P (addr) || SYMBOL_REF_LOCAL_P (addr))
-   {
- /* This symbol may be referenced via a displacement from the
-PIC base address (@GOTOFF).  */
+   return gen_rtx_CONST (Pmode,
+ gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr),
+ ARC_UNSPEC_GOTOFFPC));
 
- /* FIXME: if we had a way to emit pc-relative adds that
-don't create a GOT entry, we could do without the use of
-the gp register.  */
- crtl->uses_pic_offset_table = 1;
- pat = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr), ARC_UNSPEC_GOTOFF);
- pat = gen_rtx_CONST (Pmode, pat);
- pat = gen_rtx_PLUS (Pmode, pic_offset_table_rtx, pat);
-   }
-  else
-   {
- /* This symbol must be referenced via a load from the
-Global Offset Table (@GOTPC).  */
- pat = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr), ARC_UNSPEC_GOT);
- pat = gen_rtx_CONST (Pmode, pat);
- pat = gen_const_mem (Pmode, pat);
-   }
+  /* This symbol must be referenced via a load from the Global
+Offset Table (@GOTPC).  */
+  pat = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, addr), ARC_UNSPEC_GOT);
+  pat = gen_rtx_CONST (Pmode, pat);
+  pat = gen_const_mem (Pmode, pat);
 
   if (oldx == NULL)
oldx = gen_reg_rtx (Pmode);
@@ -4952,6 +4944,7 @@ arc_legitimize_pic_address (rtx orig, rtx oldx)
  if (GET_CODE (addr) == UNSPEC)
{
  /* Check that the unspec is one of the ones we generate?  */
+ return orig;
}
  else
gcc_assert (GET_CODE (addr) == PLUS);
@@ -5105,6 +5098,9 @@ arc_output_pic_addr_const (FILE * file, rtx x, int code)
case ARC_UNSPEC_GOTOFF:
  suffix = "@gotoff";
  break;
+   case ARC_UNSPEC_GOTOFFPC:
+ suffix = "@pcl",   pcrel = true;
+ break;
case ARC_UNSPEC_PLT:
  suffix = "@plt";
  break;
@@ -5389,6 +5385,7 @@ arc_legitimate_constant_p (machine_mode mode, rtx x)
  {
  case ARC_UNSPEC_PLT:
  case ARC_UNSPEC_GOTOFF:
+ case ARC_UNSPEC_GOTOFFPC:
  case ARC_UNSPEC_GOT:
  case UNSPEC_TLS_GD:
  case UNSPEC_TLS_IE:
@@ -7648,7 +7645,7 @@ arc_output_libcall (const char *fname)
  || (TARGET_MEDIUM_CALLS && arc_ccfsm_cond_exec_p ()))
 {
   if (flag_pic)
-   sprintf (buf, "add r12,pcl,@%s-(.&-4)\n\tjl%%!%%* [r12]", fname);
+  

Re: [PATCH] Fix PR target/70669 (allow __float128 to use direct move)

2016-05-02 Thread Andreas Schwab
Michael Meissner  writes:

>   PR target/70669
>   * gcc.target/powerpc/pr70669.c: New test.

FAIL: gcc.target/powerpc/pr70669.c scan-assembler mtvsrd
FAIL: gcc.target/powerpc/pr70669.c scan-assembler-times stxvd2x 1

foo:
.quad   .L.foo,.TOC.@tocbase,0
.previous
.type   foo, @function
.L.foo:
lxvd2x 12,0,4
xxpermdi 11,12,12,3
mfvsrd 10,12
mfvsrd 11,11
#APP
 # 14 "/daten/gcc/gcc-20160501/gcc/testsuite/gcc.target/powerpc/pr70669.c" 1
 # 10
 # 0 "" 2
#NO_APP
std 10,0(3)
std 11,8(3)
blr

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH] Fix spec-options.c test case

2016-05-02 Thread Bernd Schmidt

On 05/02/2016 03:43 PM, Bernd Edlinger wrote:

Yes, you are right.  Only the original use-case seems to be
sh-superh-elf specific.  But there are also spec strings
that are always available.  I think adding -DFOO to
"cpp_unique_options" will work on any target, and make the
test case even more useful.


So is the updated patch OK?


If that passes testing on non-sh, yes.


Bernd



Re: [PATCH] Drop excess size used for run time allocated stack variables.

2016-05-02 Thread Jeff Law

On 04/29/2016 04:12 PM, Dominik Vogt wrote:

The attached patch removes excess stack space allocation with
alloca in some situations.  Please check the commit message in the
patch for details.

Ciao

Dominik ^_^  ^_^

-- Dominik Vogt IBM Germany


0001-ChangeLog


gcc/ChangeLog

* explow.c (round_push): Use known adjustment.
(allocate_dynamic_stack_space): Pass known adjustment to round_push.

If I understand the state of this patch correctly, you're working on
another iteration, so I'm not going to dig into this version.


However, I would strongly recommend some tests, even if they are target 
specific.  You can always copy pr36728-1 into the s390x directory and
look at the size of the generated stack.  Similarly for pr50938 for x86.


jeff


Re: [PATCH][genrecog] Fix warning about potentially uninitialised use of label

2016-05-02 Thread Jeff Law

On 05/02/2016 03:47 AM, Richard Sandiford wrote:

Kyrill Tkachov  writes:

Hi all,

I'm getting a warning when building genrecog that 'label' may be used
uninitialised in:

   uint64_t label = 0;

   if (d->test.kind == rtx_test::CODE
   && d->if_statement_p (&label)
   && label == CONST_INT)

This is because if_statement_p looks like this:
  inline bool
  decision::if_statement_p (uint64_t *label) const
  {
if (singleton () && first->labels.length () == 1)
  {
if (label)
  *label = first->labels[0];
return true;
  }
return false;
  }

It's not guaranteed to write label.


It is guaranteed to write to label on a true return though, so it looks
like a false positive.  Is current GCC warning for this or are you using
an older host compiler?

And if it's warning with the current trunk, we should open a BZ with a
suitable testcase.  DOM/VRP should have detected this and removed the 
path with the uninitialized use from the CFG, if it didn't then it's a 
bug worth filing.
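
The pattern under discussion can be reduced to the following sketch. The type
and function names are hypothetical and this is not genrecog code. The
out-parameter is written on every path that returns true, and the && chain
reads it only after such a return, so the read is safe even without an
initializer; a warning here would be a false positive.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical reduction of the pattern; not the genrecog types.
struct decision_like
{
  std::vector<std::uint64_t> labels;

  // Writes *LABEL only on the path that returns true.
  bool if_statement_p (std::uint64_t *label = nullptr) const
  {
    if (labels.size () == 1)
      {
        if (label)
          *label = labels[0];
        return true;
      }
    return false;
  }
};

// LABEL is deliberately left uninitialized: short-circuit evaluation
// guarantees it is read only after if_statement_p has stored to it.
bool has_single_label (const decision_like &d, std::uint64_t wanted)
{
  std::uint64_t label;
  return d.if_statement_p (&label) && label == wanted;
}
```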


jeff


Re: [PATCH] c++/66561 - __builtin_LINE at al. should yield constant expressions

2016-05-02 Thread Jeff Law

On 04/29/2016 01:32 PM, Jason Merrill wrote:

On 04/26/2016 07:59 PM, Martin Sebor wrote:

* builtins.c (fold_builtin_FILE): New function.
(fold_builtin_FUNCTION, fold_builtin_LINE): New functions.
(fold_builtin_0): Call them.


Can we now remove the handling for these built-ins from gimplify_call_expr?

Dunno.  Might be worth an experiment.




+// Verify the line numbe returned by the built-in.


Typo.

OK with those adjustments and Jeff's.

No objections from me.  Just wanted to make sure I understood why we had
to handle the builtins differently than _DECL nodes.


Thanks,
jeff


Fix for PR70909 in Libiberty Demangler (4)

2016-05-02 Thread Marcel Böhme
Hi,

This fixes several stack overflows due to infinite recursion in d_print_comp 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70909).

The method d_print_comp in cp-demangle.c recursively constructs the 
d_print_info dpi from the demangle_component dc. The method d_print_comp_inner 
traverses dc as a graph. Now, dc can be a graph with cycles leading to infinite 
recursion in several distinct cases. The patch uses the component stack to find 
whether the current node dc has itself as ancestor more than once. 

Bootstrapped and regression tested on x86_64-pc-linux-gnu. Test cases added to 
libiberty/testsuite/demangler-expected and checked PR70909 and related stack 
overflows are resolved.
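
The check described above can be sketched in isolation as follows. The two
structs are hypothetical miniatures, not the libiberty types, and the
function name is made up. The idea is to walk the stack of components
currently being printed and give up once the component about to be printed
already appears there twice.

```cpp
#include <cassert>

// Hypothetical miniature of a demangle component.
struct component { int id; };

// Hypothetical miniature of the printing stack: one entry per component
// currently being printed, linked towards the root.
struct component_stack
{
  const component *dc;
  const component_stack *parent;
};

// True if DC already occurs at least twice among the ancestors on STACK.
// A null DC never counts as a repeat, matching the dc != NULL guard in
// the patch.
bool seen_twice_as_ancestor (const component *dc,
                             const component_stack *stack)
{
  if (dc == nullptr)
    return false;
  int hits = 0;
  for (const component_stack *s = stack; s != nullptr; s = s->parent)
    if (s->dc == dc && ++hits == 2)
      return true;
  return false;
}
```

Allowing one repeat rather than none keeps legitimate cases working where a
component is reached a second time while it is still on the stack; only the
second repeat is treated as a cycle.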

Best regards,
- Marcel



Index: ChangeLog
===
--- ChangeLog   (revision 235760)
+++ ChangeLog   (working copy)
@@ -1,3 +1,19 @@
+2016-05-02  Marcel Böhme  
+
+   PR c++/70909
+   PR c++/61460
+   PR c++/68700
+   PR c++/67738
+   PR c++/68383
+   PR c++/70517
+   PR c++/61805
+   PR c++/62279
+   PR c++/67264
+   * cp-demangle.c: Prevent infinite recursion when traversing cyclic
+   demangle component.
+   (d_print_comp): Return when demangle component has itself as ancestor
+   more than once.
+
 2016-04-30  Oleg Endo  
 
* configure: Remove SH5 support.
Index: cp-demangle.c
===
--- cp-demangle.c   (revision 235760)
+++ cp-demangle.c   (working copy)
@@ -5436,6 +5436,24 @@ d_print_comp (struct d_print_info *dpi, int option
 {
   struct d_component_stack self;
 
+  self.parent = dpi->component_stack;
+
+  while (self.parent)
+{
+  self.dc = self.parent->dc;
+  self.parent = self.parent->parent;
+  if (dc != NULL && self.dc == dc)
+   {
+ while (self.parent)
+   {
+ self.dc = self.parent->dc;
+ self.parent = self.parent->parent;
+ if (self.dc == dc)
+   return;
+   }
+   }
+}
+
   self.dc = dc;
   self.parent = dpi->component_stack;
   dpi->component_stack = &self;
Index: testsuite/demangle-expected
===
--- testsuite/demangle-expected (revision 235760)
+++ testsuite/demangle-expected (working copy)
@@ -4431,3 +4431,69 @@ _Q.__0
 
 _Q10-__9cafebabe.
 cafebabe.::-(void)
+#
+# Test demangler crash PR62279
+
+_ZN5Utils9transformIPN15ProjectExplorer13BuildStepListEZNKS1_18BuildConfiguration14knownStepListsEvEUlS3_E_EE5QListIDTclfp0_cvT__RKS6_IS7_ET0_
+QList 
Utils::transform(ProjectExplorer::BuildConfiguration::knownStepLists()
 const::{lambda(ProjectExplorer::BuildStepList*)#1} const&, 
ProjectExplorer::BuildConfiguration::knownStepLists() 
const::{lambda(ProjectExplorer::BuildStepList*)#1})
+#
+
+_ZSt7forwardIKSaINSt6thread5_ImplISt12_Bind_simpleIFZN6WIM_DL5Utils9AsyncTaskC4IMNS3_8Hardware12FpgaWatchdogEKFvvEIPS8_EEEibOT_DpOT0_EUlvE_vEESD_RNSt16remove_referenceISC_E4typeE
+std::allocator(int, bool, void 
(WIM_DL::Hardware::FpgaWatchdog::*&&)() const, 
WIM_DL::Hardware::FpgaWatchdog*&&)::{lambda()#1} ()> > > const&& 
std::forward(int, bool, 
std::allocator(int, bool, void 
(WIM_DL::Hardware::FpgaWatchdog::*&&)() const, 
WIM_DL::Hardware::FpgaWatchdog*&&)::{lambda()#1} ()> > > const&&, 
WIM_DL::Hardware::FpgaWatchdog*&&)::{lambda()#1} ()> > > 
const>(std::remove_reference(int, bool, void 
(WIM_DL::Hardware::FpgaWatchdog::*&&)() const, 
WIM_DL::Hardware::FpgaWatchdog*&&)::{lambda()#1} ()> > > const>::type&)
+#
+# Test demangler crash PR61805
+
+_ZNK5niven5ColorIfLi4EEdvIfEENSt9enable_ifIXsrSt13is_arithmeticIT_E5valueEKNS0_IDTmlcvS5__Ecvf_EELi44typeES5_
+std::enable_if::value, niven::Color const>::type niven::Color::operator/(float) const
+#
+# Test recursion PR70517
+
+_ZSt4moveIRZN11tconcurrent6futureIvE4thenIZ5awaitIS2_EDaOT_EUlRKS6_E_EENS1_INSt5decayIDTclfp_defpTEEE4typeEEES7_EUlvE_EONSt16remove_referenceIS6_E4typeES7_
+std::remove_reference::type> tconcurrent::future::then 
>(tconcurrent::future&&)::{lambda(tconcurrent::future::type> tconcurrent::future::then >()::{lambda( const&)#1}>( 
const)::{lambda()#1}& const&)#1}>(auto await 
>()::{lambda( const&)#1}&& const)::{lambda()#1}&>::type&& 
std::move::type> 
tconcurrent::future::then 
>(tconcurrent::future::type> 
tconcurrent::future::then 
>(tconcurrent::future&&)::{lambda(& const&)#1}>(auto 
await >()::{lambda( const&)#1}&& 
const)::{lambda()#1}&)::{lambda(tconcurrent::future::type> tconcurrent::future::then >(tconcurrent::future&&)::{lambda(& 
const&)#1}>(auto await >()::{lambda(&)#1}&& 
const)::{lambda()#1}& const&)#1}>(tconcurrent::future::type> tconcurrent::future::then >(tconcurrent::future&&)::{lambda(& 
const&)#1}>(auto await >()::{lambda(&)#1}&& 
const)::{lambda()#1}& 
const)::{lambda()#1}&>(tconcurrent::future::type> tconcurrent::future::then 
>(tconcurrent::future&&)::{lambda(tconcurrent::future

Re: [PATCH] Clean up tests where a later dg-do completely overrides another.

2016-05-02 Thread Jeff Law

On 04/29/2016 05:56 PM, Dominik Vogt wrote:


Yeah, sorry, I really should have mentioned this but forgot about
it.  It's a bug in DejaGnu.  When it encounters a conditional
dg-do and the condition does not match, it *still* replaces the
do-action of a prior dg-do with the current one.  With DejaGnu
prior to 1.6, there are two possibilities.

  /* { dg-do run { target sh*-*-* } } */
  /* { dg-do compile } */

The "compile" always wins.  The test is just compiled on all
platforms and not run anywhere.  With

  /* { dg-do compile } */
  /* { dg-do run { target sh*-*-* } } */

The "run" wins and the test is compiled and run everywhere, even
on targets that do not match sh*-*-*.
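
The behaviour described above can be captured in a toy model. This is only an
illustration of the description in this mail, not DejaGnu's Tcl
implementation; the struct, the function and the "compile" default are
assumptions made for the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One dg-do directive: the requested action and whether its target
// selector matches the current target (true for an unconditional dg-do).
struct dg_do
{
  std::string action;
  bool target_matches;
};

// Toy model: returns the action in effect after processing DIRECTIVES
// in file order.  With BUGGY set, a later directive replaces the action
// even when its selector does not match, as described for old DejaGnu.
std::string effective_action (const std::vector<dg_do> &directives,
                              bool buggy)
{
  std::string action = "compile";  // assumed default for the sketch
  for (const dg_do &d : directives)
    if (buggy || d.target_matches)
      action = d.action;
  return action;
}
```

On a non-sh target, the sequence "compile" followed by a non-matching "run"
yields "run" in the buggy model and "compile" in the fixed one.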

The bug is fixed in DejaGnu-1.6 which has been released on the
15th of April.  This is the fix:

  
http://git.savannah.gnu.org/gitweb/?p=dejagnu.git;a=commit;h=569f8718b534a2cd9511a7d640352eb0126ff492

(The patch could easily be backported to earlier DejaGnu
releases.)

I think it might be best to either update DejaGnu locally or to
live with the failure.  It really indicates a bug - in DejaGnu
though, not in Gcc.  However, there are some target specific test
cases that rely on multiple conditional dg-do to work properly
that are not executed as they should be (some stuff on Power and
another target that I can't remember; Mips?).  The only way to
deal with the situation properly is to upgrade DejaGnu.  Otherwise
you either have failing test cases or test cases don't do what the
test file says.

Maybe a comment should be added to the test case

  /* If this test is *run* (not just compiled) and therefore fails
 on non sh*-targets, this is because of a bug in older DejaGnu
 versions.  This is fixed with DejaGnu-1.6.  */

I think we have a couple issues now that are resolved if we step forward 
to a newer version of dejagnu.


Given dejagnu-1.6 was recently released, should we just bite the bullet 
and ask everyone to step forward?


jeff


Re: [libvtv, build] Don't install libvtv without --enable-vtable-verify

2016-05-02 Thread Jeff Law

On 05/02/2016 07:33 AM, Rainer Orth wrote:

When installing gcc 6.1.0 on Solaris 12, installation failed in libvtv:

libtool: install: /usr/gnu/bin/install -c .libs/libvtv.lai 
/var/gcc/gcc-6.1.0/12-gcc-gas/install/vol/gcc-6/lib/amd64/libvtv.la
libtool: install: /usr/gnu/bin/install -c .libs/libvtv.a 
/var/gcc/gcc-6.1.0/12-gcc-gas/install/vol/gcc-6/lib/amd64/libvtv.a
/usr/gnu/bin/install: cannot stat '.libs/libvtv.a': No such file or directory
make[10]: *** [install-toolexeclibLTLIBRARIES] Error 1
make[10]: Leaving directory 
`/var/gcc/gcc-6.1.0/12-gcc-gas/i386-pc-solaris2.12/amd64/libvtv'

The problem is that libvtv.a is created like this

libtool: link: ar rc .libs/libvtv.a
libtool: link: ranlib .libs/libvtv.a

i.e. with no objects, when --enable-vtable-verify isn't specified, and
Solaris ar does nothing in this case, unlike GNU ar which creates an
archive containing only the 8-byte archive header.
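
For reference, the 8-byte header mentioned above is the ar magic string
"!<arch>\n".  A minimal sketch of a check for such a member-less archive
(is_empty_archive and the constant names are made up for illustration):

```cpp
#include <cstddef>
#include <cstring>

// The global header that starts every ar archive.
static const char kArMagic[] = "!<arch>\n";
static const std::size_t kArMagicLen = sizeof(kArMagic) - 1;  // 8 bytes

// True if the buffer is exactly the archive header with no members after
// it, i.e. what "ar rc lib.a" with no objects is described as producing
// with GNU ar.
bool is_empty_archive(const char *data, std::size_t size) {
  return size == kArMagicLen && std::memcmp(data, kArMagic, kArMagicLen) == 0;
}
```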

Given that in this situation libvtv is useless anyway (the vtv_*.o files
in libgcc aren't built either), I've chosen to avoid the installation
completely.

Tested on i386-pc-solaris2.11 without and with --enable-vtable-verify.

Ok for mainline and gcc-6 branch?

Thanks.
Rainer


2016-04-29  Rainer Orth  

* Makefile.am (toolexeclib_LTLIBRARIES): Only set if
ENABLE_VTABLE_VERIFY.
Simplify.
* Makefile.in: Regenerate.

OK.
jeff


Re: [patch] cleanups for vtable-verify

2016-05-02 Thread Jeff Law

On 05/01/2016 07:34 AM, Steven Bosscher wrote:

Hello,

This patch is random cleanups on the vtable-verify code.
OK for trunk?

Ciao!
Steven

gcc/
  * vtable-verify.h (verify_vtbl_ptr_fndecl): Add GTY markers.
  (num_vtable_map_nodes): Remove extern declaration.
  (vtbl_mangled_name_types, vtbl_mangled_name_ids): Likewise.
  * vtable-verify.c (num_vtable_map_nodes): Make static.
  (vtbl_mangled_name_types, vtbl_mangled_name_ids): Likewise.
  (verify_vtbl_ptr_fndecl): Remove redundant extern declaration.

cp/
  * vtable-class-hierarchy.c (vtable_find_or_create_map_decl):
  Make static.
  (vtv_compute_class_hierarchy_transitive_closure): Eliminate uses of
  num_vtable_map_nodes in lieu of vtbl_map_nodes_vec.length() and of
  vtbl_map_nodes_vec.iterate().
  (guess_num_vtable_pointers, register_all_pairs,
  write_out_vtv_count_data, vtv_register_class_hierarchy_information,
  vtv_generate_init_routine): Likewise.


OK.
jeff


Re: [PATCH][CilkPlus] Merge libcilkrts from upstream

2016-05-02 Thread Jeff Law

On 04/29/2016 05:36 AM, Ilya Verbin wrote:

Hi!

This patch brings the latest libcilkrts from upstream.
Regtested on i686-linux and x86_64-linux.

Abidiff:
Functions changes summary: 0 Removed, 1 Changed (16 filtered out), 2 Added functions
Variables changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added variable
2 Added functions:
  'function void __cilkrts_resume()'{__cilkrts_resume@@CILKABI1}
  'function void __cilkrts_suspend()'{__cilkrts_suspend@@CILKABI1}
1 function with some indirect sub-type change:
  [C]'function __cilkrts_worker_ptr __cilkrts_bind_thread_1()' at 
cilk-abi.c:412:1 has some indirect sub-type changes:
Please note that the symbol of this function is 
__cilkrts_bind_thread@@CILKABI0
 and it aliases symbol: __cilkrts_bind_thread_1@@CILKABI1
return type changed:
  underlying type '__cilkrts_worker*' changed:
in pointed to type 'struct __cilkrts_worker' at abi.h:161:1:
  1 data member changes (8 filtered):
   type of 'global_state_t* __cilkrts_worker::g' changed:
 in pointed to type 'typedef global_state_t' at abi.h:113:1:
   underlying type 'struct global_state_t' at global_state.h:119:1 
changed:
   [...]

OK for trunk?

libcilkrts/
* Makefile.am: Merge from upstream, version 2.0.4420.0.
* README: Likewise.
* configure.ac: Likewise.
* configure.tgt: Likewise.
* include/cilk/cilk.h: Likewise.
* include/cilk/cilk_api.h: Likewise.
* include/cilk/cilk_api_linux.h: Likewise.
* include/cilk/cilk_stub.h: Likewise.
* include/cilk/cilk_undocumented.h: Likewise.
* include/cilk/common.h: Likewise.
* include/cilk/holder.h: Likewise.
* include/cilk/hyperobject_base.h: Likewise.
* include/cilk/metaprogramming.h: Likewise.
* include/cilk/reducer.h: Likewise.
* include/cilk/reducer_file.h: Likewise.
* include/cilk/reducer_list.h: Likewise.
* include/cilk/reducer_max.h: Likewise.
* include/cilk/reducer_min.h: Likewise.
* include/cilk/reducer_min_max.h: Likewise.
* include/cilk/reducer_opadd.h: Likewise.
* include/cilk/reducer_opand.h: Likewise.
* include/cilk/reducer_opmul.h: Likewise.
* include/cilk/reducer_opor.h: Likewise.
* include/cilk/reducer_opxor.h: Likewise.
* include/cilk/reducer_ostream.h: Likewise.
* include/cilk/reducer_string.h: Likewise.
* include/cilktools/cilkscreen.h: Likewise.
* include/cilktools/cilkview.h: Likewise.
* include/cilktools/fake_mutex.h: Likewise.
* include/cilktools/lock_guard.h: Likewise.
* include/internal/abi.h: Likewise.
* include/internal/cilk_fake.h: Likewise.
* include/internal/cilk_version.h: Likewise.
* include/internal/metacall.h: Likewise.
* include/internal/rev.mk: Likewise.
* mk/cilk-version.mk: Likewise.
* runtime/acknowledgements.dox: Likewise.
* runtime/bug.cpp: Likewise.
* runtime/bug.h: Likewise.
* runtime/c_reducers.c: Likewise.
* runtime/cilk-abi-cilk-for.cpp: Likewise.
* runtime/cilk-abi-vla-internal.c: Likewise.
* runtime/cilk-abi-vla-internal.h: Likewise.
* runtime/cilk-abi.c: Likewise.
* runtime/cilk-ittnotify.h: Likewise.
* runtime/cilk-tbb-interop.h: Likewise.
* runtime/cilk_api.c: Likewise.
* runtime/cilk_fiber-unix.cpp: Likewise.
* runtime/cilk_fiber-unix.h: Likewise.
* runtime/cilk_fiber.cpp: Likewise.
* runtime/cilk_fiber.h: Likewise.
* runtime/cilk_malloc.c: Likewise.
* runtime/cilk_malloc.h: Likewise.
* runtime/component.h: Likewise.
* runtime/config/generic/cilk-abi-vla.c: Likewise.
* runtime/config/generic/os-fence.h: Likewise.
* runtime/config/generic/os-unix-sysdep.c: Likewise.
* runtime/config/x86/cilk-abi-vla.c: Likewise.
* runtime/config/x86/os-fence.h: Likewise.
* runtime/config/x86/os-unix-sysdep.c: Likewise.
* runtime/doxygen-layout.xml: Likewise.
* runtime/doxygen.cfg: Likewise.
* runtime/except-gcc.cpp: Likewise.
* runtime/except-gcc.h: Likewise.
* runtime/except.h: Likewise.
* runtime/frame_malloc.c: Likewise.
* runtime/frame_malloc.h: Likewise.
* runtime/full_frame.c: Likewise.
* runtime/full_frame.h: Likewise.
* runtime/global_state.cpp: Likewise.
* runtime/global_state.h: Likewise.
* runtime/jmpbuf.c: Likewise.
* runtime/jmpbuf.h: Likewise.
* runtime/linux-symbols.ver: Likewise.
* runtime/local_state.c: Likewise.
* runtime/local_state.h: Likewise.
* runtime/mac-symbols.txt: Likewise.
* runtime/metacall_impl.c: Likewise.
* runtime/metacal

Re: [PATCH] Re-use cc1-checksum.c for stage-final

2016-05-02 Thread Jeff Law

On 04/29/2016 05:36 AM, Richard Biener wrote:

On Thu, 28 Apr 2016, Jeff Law wrote:


On 04/28/2016 02:49 AM, Richard Biener wrote:


The following prototype patch re-uses cc1-checksum.c from the
previous stage when compiling stage-final.  This eventually
allows to compare cc1 from the last two stages to fix the
lack of a true comparison when doing LTO bootstrap (it
compiles LTO bytecode from the compile-stage there, not the
final optimization result).

Bootstrapped on x86_64-unknown-linux-gnu.

When stripping gcc/cc1 and prev-gcc/cc1 after the bootstrap
they now compare identical (with LTO bootstrap it should
not require stripping as that doesn't do a bootstrap-debug AFAIK).

Is sth like this acceptable?  (consider it also done for cp/Make-lang.in)

In theory we can compare all stage1 languages but I guess comparing
the required ones for a LTO bootstrap, cc1, cc1plus and lto1 would
be sufficient (or even just comparing one binary in which case
comparing lto1 would not require any patches).

This also gets rid of the annoying warning that cc1-checksum.o
differs (obviously).

Thanks,
Richard.

2016-04-28  Richard Biener  

c/
* Make-lang.in (cc1-checksum.c): For stage-final re-use
the checksum from the previous stage.

I won't object if you add a comment into the fragment indicating why you're
doing this.


So the following is a complete patch (not considering people may
add objc or obj-c++ to stage1 languages).  Built with --disable-bootstrap,
bootstrapped and profilebootstrapped with verifying it works as
intended (looks like we don't compare with profiledbootstrap - huh,
we're building stagefeedback only once)

Ok for trunk?

Step 2 will now be to figure out how to also compare cc1 (for example)
when using bootstrap-lto ... (we don't want to do this unconditionally
as it is a waste of time when the objects are not only LTO bytecode).

Thanks,
Richard.

2016-04-29  Richard Biener  

c/
* Make-lang.in (cc1-checksum.c): For stage-final re-use
the checksum from the previous stage.

cp/
* Make-lang.in (cc1plus-checksum.c): For stage-final re-use
the checksum from the previous stage.

LGTM.
jeff



[committed] Fix internal-fn handling in ipa-pure-const.c (PR rtl-optimization/70467)

2016-05-02 Thread Jakub Jelinek
Hi!

During testing of my PR70467 patch I've run into execute/va-arg-13.c
miscompilation, caused by ipa-pure-const.c in ipa mode saying a function
using VA_ARG internal function is pure - it might not be, e.g. on x86_64
where va_list is [1] array of struct and is passed to this "pure" function,
it modifies the va_list object in the caller.

As internal functions don't have corresponding cgraph edges, nothing handles
them after the initial walk over the function, so we need to always handle
internal calls there.
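
A small illustration of why such a function cannot be treated as pure.
This is only a sketch with made-up names (next_int, sum_ints); it passes
the va_list by pointer so that the result is well defined on every target,
whereas the miscompiled test passes the va_list itself:

```cpp
#include <cstdarg>

// Fetches the next int argument.  Not a pure function: va_arg advances
// the va_list object that ap points to, which lives in the caller.
int next_int(va_list *ap) {
  return va_arg(*ap, int);
}

// Sums count int arguments by calling next_int repeatedly.  If the
// calls were treated as pure, they could be merged into a single call
// and the sum would be wrong.
int sum_ints(int count, ...) {
  va_list ap;
  va_start(ap, count);
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += next_int(&ap);
  va_end(ap);
  return total;
}
```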

Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved by Honza
on IRC, committed to trunk.

2016-05-02  Jakub Jelinek  

PR rtl-optimization/70467
* ipa-pure-const.c (check_call): Handle internal calls even in
ipa mode like in local mode.

--- gcc/ipa-pure-const.c.jj 2016-04-22 18:21:36.0 +0200
+++ gcc/ipa-pure-const.c 2016-05-02 16:10:46.232077435 +0200
@@ -616,8 +616,10 @@ check_call (funct_state local, gcall *ca
   /* Either callee is unknown or we are doing local analysis.
  Look to see if there are any bits available for the callee (such as by
  declaration or because it is builtin) and process solely on the basis of
- those bits. */
-  else if (!ipa)
+ those bits.  Handle internal calls always, those calls don't have
+ corresponding cgraph edges and thus aren't processed during
+ the propagation.  */
+  else if (!ipa || gimple_call_internal_p (call))
 {
   enum pure_const_state_e call_state;
   bool call_looping;

Jakub


Re: [PATCH] Clean up tests where a later dg-do completely overrides another.

2016-05-02 Thread Dominik Vogt
On Mon, May 02, 2016 at 09:29:50AM -0600, Jeff Law wrote:
> On 04/29/2016 05:56 PM, Dominik Vogt wrote:
> > ...
> >Maybe a comment should be added to the test case
> >
> >  /* If this test is *run* (not just compiled) and therefore fails
> > > on non sh*-targets, this is because of a bug in older DejaGnu
> > versions.  This is fixed with DejaGnu-1.6.  */
> I think we have a couple issues now that are resolved if we step
> forward to a newer version of dejagnu.
> 
> Given dejagnu-1.6 was recently released, should we just bite the
> bullet and ask everyone to step forward?

I'm all for that.  I've recently added s390 test cases that
require Dejagnu 1.6.  Apart from the discussed problem with
spec-options.c, there are a number of Power (and some other
target) test cases that do not work properly with older Dejagnu
version but would finally work (read: actually test something) if
the new version were required.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



[PATCH] Improve add/sub TImode double word splitters (PR rtl-optimization/70467)

2016-05-02 Thread Jakub Jelinek
On Fri, Apr 01, 2016 at 08:29:17PM +0200, Uros Bizjak wrote:
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for stage1
> > (while the previous patch looks simple enough that I'd like to see it in
> > 6.x, this one IMHO can wait).
> 
> Yes, please. This is not a regression.

So, I'm reposting this patch now, bootstrapped/regtested again on
x86_64-linux and i686-linux, ok for trunk?

Had to commit the ipa-pure-const.c fix first, because in va-arg-13.c
CSE with this patch optimized away the stores in the second va_start,
assuming the (incorrectly marked pure) function call would not modify
those.

2016-05-02  Jakub Jelinek  

PR rtl-optimization/70467
* cse.c (cse_insn): Handle no-op MEM moves after folding.

* gcc.target/i386/pr70467-1.c: New test.

--- gcc/cse.c.jj 2016-04-01 17:21:25.615271730 +0200
+++ gcc/cse.c   2016-04-01 17:31:27.705243745 +0200
@@ -4575,6 +4575,7 @@ cse_insn (rtx_insn *insn)
   for (i = 0; i < n_sets; i++)
 {
   bool repeat = false;
+  bool mem_noop_insn = false;
   rtx src, dest;
   rtx src_folded;
   struct table_elt *elt = 0, *p;
@@ -5166,7 +5167,7 @@ cse_insn (rtx_insn *insn)
}
 
  /* Avoid creation of overlapping memory moves.  */
- if (MEM_P (trial) && MEM_P (SET_DEST (sets[i].rtl)))
+ if (MEM_P (trial) && MEM_P (dest) && !rtx_equal_p (trial, dest))
{
  rtx src, dest;
 
@@ -5277,6 +5278,21 @@ cse_insn (rtx_insn *insn)
  break;
}
 
+ /* Similarly, lots of targets don't allow no-op
+(set (mem x) (mem x)) moves.  */
+ else if (n_sets == 1
+  && MEM_P (trial)
+  && MEM_P (dest)
+  && rtx_equal_p (trial, dest)
+  && !side_effects_p (dest)
+  && (cfun->can_delete_dead_exceptions
+  || insn_nothrow_p (insn)))
+   {
+ SET_SRC (sets[i].rtl) = trial;
+ mem_noop_insn = true;
+ break;
+   }
+
  /* Reject certain invalid forms of CONST that we create.  */
  else if (CONSTANT_P (trial)
   && GET_CODE (trial) == CONST
@@ -5494,6 +5510,16 @@ cse_insn (rtx_insn *insn)
  /* No more processing for this set.  */
  sets[i].rtl = 0;
}
+
+  /* Similarly for no-op MEM moves.  */
+  else if (mem_noop_insn)
+   {
+ if (cfun->can_throw_non_call_exceptions && can_throw_internal (insn))
+   cse_cfg_altered = true;
+ delete_insn_and_edges (insn);
+ /* No more processing for this set.  */
+ sets[i].rtl = 0;
+   }
 
   /* If this SET is now setting PC to a label, we know it used to
 be a conditional or computed branch.  */
--- gcc/testsuite/gcc.target/i386/pr70467-1.c.jj 2016-04-01 17:28:20.297742549 +0200
+++ gcc/testsuite/gcc.target/i386/pr70467-1.c   2016-04-01 17:28:20.297742549 +0200
@@ -0,0 +1,55 @@
+/* PR rtl-optimization/70467 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-sse" } */
+
+void foo (unsigned long long *);
+
+void
+bar (void)
+{
+  unsigned long long a;
+  foo (&a);
+  a &= 0x7fffULL;
+  foo (&a);
+  a &= 0x7fffULL;
+  foo (&a);
+  a &= 0x7fffULL;
+  foo (&a);
+  a &= 0x7fffULL;
+  foo (&a);
+  a &= 0xULL;
+  foo (&a);
+  a &= 0xULL;
+  foo (&a);
+  a |= 0x7fffULL;
+  foo (&a);
+  a |= 0x7fffULL;
+  foo (&a);
+  a |= 0x7fffULL;
+  foo (&a);
+  a |= 0x7fffULL;
+  foo (&a);
+  a |= 0xULL;
+  foo (&a);
+  a |= 0xULL;
+  foo (&a);
+  a ^= 0x7fffULL;
+  foo (&a);
+  a ^= 0x7fffULL;
+  foo (&a);
+  a ^= 0x7fffULL;
+  foo (&a);
+  a ^= 0x7fffULL;
+  foo (&a);
+  a ^= 0xULL;
+  foo (&a);
+  a ^= 0xULL;
+  foo (&a);
+}
+
+/* { dg-final { scan-assembler-not "andl\[ \t\]*.-1," { target ia32 } } } */
+/* { dg-final { scan-assembler-not "andl\[ \t\]*.0," { target ia32 } } } */
+/* { dg-final { scan-assembler-not "orl\[ \t\]*.-1," { target ia32 } } } */
+/* { dg-final { scan-assembler-not "orl\[ \t\]*.0," { target ia32 } } } */
+/* { dg-final { scan-assembler-not "xorl\[ \t\]*.-1," { target ia32 } } } */
+/* { dg-final { scan-assembler-not "xorl\[ \t\]*.0," { target ia32 } } } */


Jakub


[PATCH] Allow arg promotion in gimple_builtin_call_types_compatible_p (PR target/49244)

2016-05-02 Thread Jakub Jelinek
Hi!

Most of the builtins don't pass arguments in char/short types,
except for some sync/atomic builtins, some sanitizer builtins and
TM builtins.
On targets where targetm.calls.promote_prototypes returns true (e.g. always
on x86_64/i686), unfortunately this means that gimple_call_builtin_p
often returns false for those, e.g. on __sync_fetch_and_add_2, because
the second argument has been promoted to int from unsigned short.
We actually expand those right, because we don't check that during
expansion, but e.g. such atomics aren't instrumented with -fsanitize=thread
because of this etc.
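
For illustration, a call of the kind affected.  This is a sketch with a
made-up wrapper name (bump), assuming a target where unsigned short is
16 bits wide, so that the generic builtin resolves to the _2 variant:

```cpp
// Atomically adds delta to *counter and returns the previous value.
// At the source level the second argument is an unsigned short; the
// promotion to int discussed above happens inside the compiler.
unsigned short bump(unsigned short *counter, unsigned short delta) {
  return __sync_fetch_and_add(counter, delta);
}
```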

The following patch allows those cases.  Bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?

2016-05-02  Jakub Jelinek  

PR target/49244
* gimple.c (gimple_builtin_call_types_compatible_p): Allow
char/short arguments promoted to int because of promote_prototypes.

--- gcc/gimple.c.jj 2016-03-11 17:37:43.0 +0100
+++ gcc/gimple.c 2016-05-02 12:20:16.490716014 +0200
@@ -2486,7 +2486,16 @@ gimple_builtin_call_types_compatible_p (
   if (!targs)
return true;
   tree arg = gimple_call_arg (stmt, i);
-  if (!useless_type_conversion_p (TREE_VALUE (targs), TREE_TYPE (arg)))
+  tree type = TREE_VALUE (targs);
+  if (!useless_type_conversion_p (type, TREE_TYPE (arg))
+ /* char/short integral arguments are promoted to int
+by several frontends if targetm.calls.promote_prototypes
+is true.  Allow such promotion too.  */
+ && !(INTEGRAL_TYPE_P (type)
+  && TYPE_PRECISION (type) < TYPE_PRECISION (integer_type_node)
+  && targetm.calls.promote_prototypes (TREE_TYPE (fndecl))
+  && useless_type_conversion_p (integer_type_node,
+TREE_TYPE (arg))))
return false;
   targs = TREE_CHAIN (targs);
 }


Jakub


[PATCH] Optimize bit test and * atomics into lock; bt[src] (PR target/49244)

2016-05-02 Thread Jakub Jelinek
Hi!

This patch adds pattern recognition (see attached testcase on what it e.g.
can handle) of the i?86/x86_64 lock; bt[src] operations.
It is too late to do this during or after RTL expansion, so it is done late
during gimple, by recognizing these sequences in the fold builtins pass,
turning those into an internal call which represents atomically setting,
complementing or resetting a bit and remembering the previous value of the
bit.
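
The kind of source that gives rise to these sequences looks roughly like
the following.  This is a sketch with made-up function names, not the
attached testcase, and it assumes bit is smaller than the width of
unsigned int:

```cpp
// Atomically sets bit `bit` of *word; returns whether it was set before.
bool test_and_set_bit(unsigned int *word, unsigned int bit) {
  unsigned int mask = 1u << bit;
  return (__atomic_fetch_or(word, mask, __ATOMIC_SEQ_CST) & mask) != 0;
}

// Atomically flips bit `bit` of *word; returns whether it was set before.
bool test_and_complement_bit(unsigned int *word, unsigned int bit) {
  unsigned int mask = 1u << bit;
  return (__atomic_fetch_xor(word, mask, __ATOMIC_SEQ_CST) & mask) != 0;
}

// Atomically clears bit `bit` of *word; returns whether it was set before.
// Note that the builtin takes the one's complement of the mask here.
bool test_and_reset_bit(unsigned int *word, unsigned int bit) {
  unsigned int mask = 1u << bit;
  return (__atomic_fetch_and(word, ~mask, __ATOMIC_SEQ_CST) & mask) != 0;
}
```

In each function only the single masked bit of the fetched value is used,
which is what lets the whole operation be expressed as one lock; bt[src]
style instruction.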

The patch doesn't handle (yet) the weirdo handling of memory operands where
the counter can be actually not just in between 0 and bitsize - 1 of the
particular mode, but can be much larger and the CPU locates the right memory
word first, but could be extended to handle that later.
I'd like to find out if there are other targets that have similar
instructions in their ISAs, or if x86_64/i686 is the only one.

Bootstrapped/regtested on x86_64-linux and i686-linux (relies on the
gimple.c patch I've just posted, otherwise the expected number of
scan-assembler-times would need to be tweaked for the short int cases).
Ok for trunk?

2016-05-02  Jakub Jelinek  

PR target/49244
* tree-ssa-ccp.c: Include stor-layout.h and optabs-query.h.
(optimize_atomic_bit_test_and): New function.
(pass_fold_builtins::execute): Use it.
* optabs.def (atomic_bit_test_and_set_optab,
atomic_bit_test_and_complement_optab,
atomic_bit_test_and_reset_optab): New optabs.
* internal-fn.def (ATOMIC_BIT_TEST_AND_SET,
ATOMIC_BIT_TEST_AND_COMPLEMENT, ATOMIC_BIT_TEST_AND_RESET): New ifns.
* builtins.h (expand_ifn_atomic_bit_test_and): New prototype.
* builtins.c (expand_ifn_atomic_bit_test_and): New function.
* internal-fn.c (expand_ATOMIC_BIT_TEST_AND_SET,
expand_ATOMIC_BIT_TEST_AND_COMPLEMENT,
expand_ATOMIC_BIT_TEST_AND_RESET): New functions.
* doc/md.texi (atomic_bit_test_and_set@var{mode},
atomic_bit_test_and_complement@var{mode},
atomic_bit_test_and_reset@var{mode}): Document.
* config/i386/sync.md (atomic_bit_test_and_set,
atomic_bit_test_and_complement,
atomic_bit_test_and_reset): New expanders.
(atomic_bit_test_and_set_1,
atomic_bit_test_and_complement_1,
atomic_bit_test_and_reset_1): New insns.

* gcc.target/i386/pr49244-1.c: New test.
* gcc.target/i386/pr49244-2.c: New test.

--- gcc/tree-ssa-ccp.c.jj   2016-05-01 12:21:05.063587549 +0200
+++ gcc/tree-ssa-ccp.c  2016-05-02 13:01:36.367044729 +0200
@@ -140,6 +140,8 @@ along with GCC; see the file COPYING3.
 #include "builtins.h"
 #include "tree-chkp.h"
 #include "cfgloop.h"
+#include "stor-layout.h"
+#include "optabs-query.h"
 
 
 /* Possible lattice values.  */
@@ -2697,6 +2699,224 @@ optimize_unreachable (gimple_stmt_iterat
   return ret;
 }
 
+/* Optimize
+ mask_2 = 1 << cnt_1;
+ _4 = __atomic_fetch_or_* (ptr_6, mask_2, _3);
+ _5 = _4 & mask_2;
+   to
+ _4 = ATOMIC_BIT_TEST_AND_SET (ptr_6, cnt_1, 0, _3);
+ _5 = _4;
+   If _5 is only used in _5 != 0 or _5 == 0 comparisons, 1
+   is passed instead of 0, and the builtin just returns a zero
+   or 1 value instead of the actual bit.
+   Similarly for __sync_fetch_and_or_* (without the ", _3" part
+   in there), and/or if mask_2 is a power of 2 constant.
+   Similarly for xor instead of or, use ATOMIC_BIT_TEST_AND_COMPLEMENT
+   in that case.  And similarly for and instead of or, except that
+   the second argument to the builtin needs to be one's complement
+   of the mask instead of mask.  */
+
+static void
+optimize_atomic_bit_test_and (gimple_stmt_iterator *gsip,
+ enum internal_fn fn, bool has_model_arg,
+ bool after)
+{
+  gimple *call = gsi_stmt (*gsip);
+  tree lhs = gimple_call_lhs (call);
+  use_operand_p use_p;
+  gimple *use_stmt;
+  tree mask, bit;
+  optab optab;
+
+  if (!flag_inline_atomics
+  || optimize_debug
+  || !gimple_call_builtin_p (call, BUILT_IN_NORMAL)
+  || !lhs
+  || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)
+  || !single_imm_use (lhs, &use_p, &use_stmt)
+  || !is_gimple_assign (use_stmt)
+  || gimple_assign_rhs_code (use_stmt) != BIT_AND_EXPR
+  || !gimple_vdef (call))
+return;
+
+  switch (fn)
+{
+case IFN_ATOMIC_BIT_TEST_AND_SET:
+  optab = atomic_bit_test_and_set_optab;
+  break;
+case IFN_ATOMIC_BIT_TEST_AND_COMPLEMENT:
+  optab = atomic_bit_test_and_complement_optab;
+  break;
+case IFN_ATOMIC_BIT_TEST_AND_RESET:
+  optab = atomic_bit_test_and_reset_optab;
+  break;
+default:
+  return;
+}
+
+  if (optab_handler (optab, TYPE_MODE (TREE_TYPE (lhs))) == CODE_FOR_nothing)
+return;
+
+  mask = gimple_call_arg (call, 1);
+  tree use_lhs = gimple_assign_lhs (use_stmt);
+  if (!use_lhs)
+return;
+
+  if (TREE_CODE (mask) == INTEGER_CST)
+{
+  if (fn == IFN_ATOMIC_BIT_TEST_AND_RESET)
+   mask = const_uno

Re: [PATCH] Improve add/sub TImode double word splitters (PR rtl-optimization/70467)

2016-05-02 Thread Bernd Schmidt



2016-05-02  Jakub Jelinek  

PR rtl-optimization/70467
* cse.c (cse_insn): Handle no-op MEM moves after folding.

* gcc.target/i386/pr70467-1.c: New test.


I seem to have a memory of acking this before. Certainly looks OK.


Bernd



Re: [PATCH] Better location info for "incomplete type" error msg (PR c/70756)

2016-05-02 Thread Marek Polacek
On Fri, Apr 29, 2016 at 04:04:13PM -0400, Jason Merrill wrote:
> On 04/28/2016 11:59 AM, Marek Polacek wrote:
> > 3) for the C++ FE I used a macro so that I don't have to change all the
> > cxx_incomplete_type_error calls now,
> 
> How about an inline overload, instead?
 
I realized the macro was already there, but inline overloads should probably
be preferred these days.  So I used them instead.

> It seems sad to discard the location information; could we pass it into
> cxx_incomplete_type_diagnostic?

I suppose I can, though it required another inline overload.  I'm not sure
if the patch will make the C++ diagnostics about incomplete types better,
most likely not :/.
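
The inline overload approach, reduced to a sketch.  All names here
(location_t as a plain integer, kDefaultLocation, incomplete_type_message)
are stand-ins made up for illustration and are not the GCC declarations:

```cpp
#include <string>

// Hypothetical stand-ins for this sketch; not GCC's real types.
typedef unsigned int location_t;
const location_t kDefaultLocation = 0;

// New form: the caller passes the location explicitly.
std::string incomplete_type_message(location_t loc,
                                    const std::string &type_name) {
  return "loc " + std::to_string(loc) + ": invalid use of incomplete type '"
         + type_name + "'";
}

// Inline overload with the old signature, so that existing callers do
// not have to change; it forwards with a default location.
inline std::string incomplete_type_message(const std::string &type_name) {
  return incomplete_type_message(kDefaultLocation, type_name);
}
```

Unlike a macro, the overload is type checked and callers can be converted
to the location-taking form one at a time.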

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-05-02  Marek Polacek  

PR c/70756
* c-common.c (pointer_int_sum): Call size_in_bytes_loc instead of
size_in_bytes and pass LOC to it.

* c-decl.c (build_compound_literal): Pass LOC down to
c_incomplete_type_error.
* c-tree.h (require_complete_type): Adjust declaration.
(c_incomplete_type_error): Likewise.
* c-typeck.c (require_complete_type): Add location parameter, pass it
down to c_incomplete_type_error.
(c_incomplete_type_error): Add location parameter, pass it down to
error_at.
(build_component_ref): Pass location down to c_incomplete_type_error.
(default_conversion): Pass location down to require_complete_type.
(build_array_ref): Likewise.
(build_function_call_vec): Likewise.
(convert_arguments): Likewise.
(build_unary_op): Likewise.
(build_c_cast): Likewise.
(build_modify_expr): Likewise.
(convert_for_assignment): Likewise.
(c_finish_omp_clauses): Likewise.

* cp-tree.h (cxx_incomplete_type_error,
cxx_incomplete_type_diagnostic): New inline overloads.
* typeck2.c (cxx_incomplete_type_diagnostic,
cxx_incomplete_type_error): Add location parameter.

* langhooks-def.h (lhd_incomplete_type_error): Adjust declaration.
* langhooks.c (lhd_incomplete_type_error): Add location parameter.
* langhooks.h (incomplete_type_error): Likewise.
* tree.c (size_in_bytes_loc): Renamed from size_in_bytes.  Add location
parameter, pass it down to incomplete_type_error.
* tree.h (size_in_bytes): New inline overload.
(size_in_bytes_loc): Renamed from size_in_bytes.

* gcc.dg/pr70756.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index d45bf1b..747d55d 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -4269,7 +4269,7 @@ pointer_int_sum (location_t loc, enum tree_code resultcode,
   size_exp = integer_one_node;
 }
   else
-size_exp = size_in_bytes (TREE_TYPE (result_type));
+size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
 
   /* We are manipulating pointer values, so we don't need to warn
  about relying on undefined signed overflow.  We disable the
diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index 7094efc..48fa65c 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -5112,7 +5112,7 @@ build_compound_literal (location_t loc, tree type, tree init, bool non_const)
 
   if (type == error_mark_node || !COMPLETE_TYPE_P (type))
 {
-  c_incomplete_type_error (NULL_TREE, type);
+  c_incomplete_type_error (loc, NULL_TREE, type);
   return error_mark_node;
 }
 
diff --git gcc/c/c-tree.h gcc/c/c-tree.h
index 4633182..d3a6c4c 100644
--- gcc/c/c-tree.h
+++ gcc/c/c-tree.h
@@ -588,13 +588,13 @@ extern tree c_last_sizeof_arg;
 extern struct c_switch *c_switch_stack;
 
 extern tree c_objc_common_truthvalue_conversion (location_t, tree);
-extern tree require_complete_type (tree);
+extern tree require_complete_type (location_t, tree);
 extern int same_translation_unit_p (const_tree, const_tree);
 extern int comptypes (tree, tree);
 extern int comptypes_check_different_types (tree, tree, bool *);
 extern bool c_vla_type_p (const_tree);
 extern bool c_mark_addressable (tree);
-extern void c_incomplete_type_error (const_tree, const_tree);
+extern void c_incomplete_type_error (location_t, const_tree, const_tree);
 extern tree c_type_promotes_to (tree);
 extern struct c_expr default_function_array_conversion (location_t,
struct c_expr);
diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 58c2139..32fd504 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -183,11 +183,12 @@ struct tagged_tu_seen_cache {
 static const struct tagged_tu_seen_cache * tagged_tu_seen_base;
 static void free_all_tagged_tu_seen_up_to (const struct tagged_tu_seen_cache *);
 
-/* Do `exp = require_complete_type (exp);' to make sure exp
-   does not have an incomplete type.  (That includes void types.)  */
+/* Do `exp = require_complete_type (loc, exp);' to make sure exp
+   does not have an incomplete type.  (That includes void types.)
+   LOC 

Re: Inline across -ffast-math boundary

2016-05-02 Thread Jan Hubicka
> On Thu, 21 Apr 2016, Jan Hubicka wrote:
> 
> > Hi,
> > this patch implements the long promised logic to inline across -ffast-math
> > boundary when either caller or callee has no fp operations in it.  This is
> > needed to resolve code quality regression on Firefox with LTO where
> > -O3/-O2/-Ofast flags are combined and we fail to inline a lot of comdats
> > otherwise.
> > 
> > Bootstrapped/regtested x86_64-linux. Richard, I would like to know your
> > opinion on fp_expression_p predicate - it is a bit ugly but I do not know
> > how to implement it better.
> > 
> > We still won't inline -O1 code into -O2+ because flag_strict_overflow
> > differs.  I will implement similar logic for overflows incrementally.
> > Similarly flag_errno_math can be handled better, but I am not sure it
> > matters - I think the vast majority of the time users set errno_math in
> > sync with other -ffast-math implied flags.
> 
> Note that for reasons PR70586 shows (const functions having possible
> trapping side-effect because of FP math or division) we'd like to
> have sth like "uses FP math" "uses possibly trapping integer math"
> "uses integer math with undefined overflow" on a per function level
> and propagated alongside pure/const/nothrow state.
> 
> So maybe you can fit that into a more suitable place than just the
> inliner (which of course is interested in "uses FP math locally",
> not the transitive answer we need for PR70586).

We don't really have a much more suitable place - ipa-inline-analysis is 
doing most of the analysis of function body that is useful for IPA passes,
not only for the inliner. It should perhaps be renamed to something like
function_body_summary.  I will do that later this stage1.

For PR70586, in addition to the transitive answer we will need to know that
the transformation is a win. A const function may take a lot of time, and
introducing a new call on a code path that did not use it previously is
a bad idea unless we know that the function is very cheap (which may
be true only for fast builtins, I don't know).

This patch implements the suggested check for presence of FP parameters.
We can play with special casing the moves incrementally.

Bootstrapped/regtested x86_64-linux.

Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog (revision 235765)
+++ testsuite/ChangeLog (working copy)
@@ -1,3 +1,7 @@
+2016-05-02  Jan Hubicka  
+
+   * gcc.dg/ipa/inline-8.c: New testcase.
+
 2016-05-02  Jakub Jelinek  
 
PR rtl-optimization/70467
Index: testsuite/gcc.dg/ipa/inline-8.c
===
--- testsuite/gcc.dg/ipa/inline-8.c (revision 0)
+++ testsuite/gcc.dg/ipa/inline-8.c (working copy)
@@ -0,0 +1,36 @@
+/* Verify that we do not inline isnanf test into -ffast-math code but that we
+   do inline trivial functions across -Ofast boundary.  */
+/* { dg-do run } */
+/* { dg-options "-O2"  } */
+#include 
+/* Can't be inlined because isnanf will be optimized out.  */
+int
+cmp (float a)
+{
+  return isnanf (a);
+}
+/* Can be inlined.  */
+int
+move (int a)
+{
+  return a;
+}
+float a;
+void
+set ()
+{
+ a=nan("");
+}
+float b;
+__attribute__ ((optimize("Ofast")))
+int
+main()
+{
+  b++;
+  if (cmp(a))
+__builtin_abort ();
+  float a = move (1);
+  if (!__builtin_constant_p (a))
+__builtin_abort ();
+  return 0;
+}
Index: ChangeLog
===
--- ChangeLog   (revision 235765)
+++ ChangeLog   (working copy)
@@ -1,3 +1,18 @@
+2016-05-02  Jan Hubicka  
+
+   * ipa-inline-analysis.c (reset_inline_summary): Clear fp_expressions
+   (dump_inline_summary): Dump it.
+   (fp_expression_p): New predicate.
+   (estimate_function_body_sizes): Use it.
+   (inline_merge_summary): Merge fp_expressions.
+   (inline_read_section): Read fp_expressions.
+   (inline_write_summary): Write fp_expressions.
+   * ipa-inline.c (can_inline_edge_p): Permit inlining across fp math
+   codegen boundary if either caller or callee is !fp_expressions.
+   * ipa-inline.h (inline_summary): Add fp_expressions.
+   * ipa-inline-transform.c (inline_call): When inlining !fp_expressions
+   to fp_expressions be sure the fp generation flags are updated.
+
 2016-05-02  Jakub Jelinek  
 
PR rtl-optimization/70467
Index: ipa-inline-analysis.c
===
--- ipa-inline-analysis.c   (revision 235693)
+++ ipa-inline-analysis.c   (working copy)
@@ -850,7 +850,8 @@ evaluate_conditions_for_known_args (stru
  if (known_aggs.exists ())
{
  agg = known_aggs[c->operand_num];
- val = ipa_find_agg_cst_for_param (agg, c->offset, c->by_ref);
+ val = ipa_find_agg_cst_for_param (agg, known_vals[c->operand_num],
+   c->offset, c->by_ref);
}
  

Re: Fix for PR70498 in Libiberty Demangler

2016-05-02 Thread Bernd Schmidt

On 05/01/2016 06:24 PM, Marcel Böhme wrote:


Please attach it (text/plain) instead.

Done.


That still seemed to be inlined, but I managed to apply this version. 
Committed.



Bernd


Re: Inline across -ffast-math boundary

2016-05-02 Thread Marek Polacek
On Mon, May 02, 2016 at 07:01:38PM +0200, Jan Hubicka wrote:
> This patch implements the suggested check for presence of FP parameters.
> We can play with special casing the moves incrementally.

This patch has broken the build:

/home/marek/src/gcc/gcc/ipa-inline.c: In function ‘bool
can_inline_edge_p(cgraph_edge*, bool, bool, bool)’:
/home/marek/src/gcc/gcc/ipa-inline.c:341:55: error: ‘CIF_THUNK’ was not
declared in this scope
 e->inline_failed = e->caller->thunk.thunk_p ? CIF_THUNK :
CIF_MISMATCHED_ARGUMENTS;
   ^
Makefile:1085: recipe for target 'ipa-inline.o' failed
make: *** [ipa-inline.o] Error 1
make: *** Waiting for unfinished jobs
/home/marek/src/gcc/gcc/ipa-inline-analysis.c: In function ‘clause_t
evaluate_conditions_for_known_args(cgraph_node*, bool, vec,
vec)’:
/home/marek/src/gcc/gcc/ipa-inline-analysis.c:854:27: error: invalid conversion
from ‘tree_node*’ to ‘long int’ [-fpermissive]
   c->offset, c->by_ref);
   ^
/home/marek/src/gcc/gcc/ipa-inline-analysis.c:854:27: error: too many arguments
to function ‘tree_node* ipa_find_agg_cst_for_param(ipa_agg_jump_function*, long
int, bool)’
In file included from /home/marek/src/gcc/gcc/ipa-inline-analysis.c:90:0:
/home/marek/src/gcc/gcc/ipa-prop.h:639:6: note: declared here
 tree ipa_find_agg_cst_for_param (struct ipa_agg_jump_function *,
HOST_WIDE_INT,
  ^
Makefile:1085: recipe for target 'ipa-inline-analysis.o' failed
make: *** [ipa-inline-analysis.o] Error 1

Marek


Re: [PATCH] Better location info for "incomplete type" error msg (PR c/70756)

2016-05-02 Thread Jason Merrill

On 05/02/2016 12:41 PM, Marek Polacek wrote:

On Fri, Apr 29, 2016 at 04:04:13PM -0400, Jason Merrill wrote:

On 04/28/2016 11:59 AM, Marek Polacek wrote:

3) for the C++ FE I used a macro so that I don't have to change all the
 cxx_incomplete_type_error calls now,


How about an inline overload, instead?


I realized the macro was already there, but inline overloads should probably
be preferred these days.  So I used them instead.


It seems sad to discard the location information; could we pass it into
cxx_incomplete_type_diagnostic?


I suppose I can, though it required another inline overload.  I'm not sure
if the patch will make the C++ diagnostics about incomplete types better,
most likely not :/.
+inline void
+cxx_incomplete_type_diagnostic (const_tree value, const_tree type,
+   diagnostic_t diag_kind)
+{
+  cxx_incomplete_type_diagnostic (input_location, value, type, diag_kind);
+}
+


...


-cxx_incomplete_type_diagnostic (const_tree value, const_tree type,
-   diagnostic_t diag_kind)
+cxx_incomplete_type_diagnostic (location_t loc, const_tree value,

-  location_t loc = EXPR_LOC_OR_LOC (value, input_location);


Shouldn't we use EXPR_LOC_OR_LOC in the inline?

Jason



Re: Inline across -ffast-math boundary

2016-05-02 Thread Jan Hubicka
> On Mon, May 02, 2016 at 07:01:38PM +0200, Jan Hubicka wrote:
> > This patch implements the suggested check for presence of FP parameters.
> > We can play with special casing the moves incrementally.
> 
> This patch has broken the build:
> 
> /home/marek/src/gcc/gcc/ipa-inline.c: In function ‘bool
> can_inline_edge_p(cgraph_edge*, bool, bool, bool)’:
> /home/marek/src/gcc/gcc/ipa-inline.c:341:55: error: ‘CIF_THUNK’ was not
> declared in this scope
>  e->inline_failed = e->caller->thunk.thunk_p ? CIF_THUNK :
> CIF_MISMATCHED_ARGUMENTS;

Sorry, I managed to miss independent changes that were in my tree.  I will revert
the evaluate_conditions_for_known_args change and add CIF_THUNK shortly.

Honza
>^
> Makefile:1085: recipe for target 'ipa-inline.o' failed
> make: *** [ipa-inline.o] Error 1
> make: *** Waiting for unfinished jobs
> /home/marek/src/gcc/gcc/ipa-inline-analysis.c: In function ‘clause_t
> evaluate_conditions_for_known_args(cgraph_node*, bool, vec,
> vec)’:
> /home/marek/src/gcc/gcc/ipa-inline-analysis.c:854:27: error: invalid 
> conversion
> from ‘tree_node*’ to ‘long int’ [-fpermissive]
>c->offset, c->by_ref);
>^
> /home/marek/src/gcc/gcc/ipa-inline-analysis.c:854:27: error: too many 
> arguments
> to function ‘tree_node* ipa_find_agg_cst_for_param(ipa_agg_jump_function*, 
> long
> int, bool)’
> In file included from /home/marek/src/gcc/gcc/ipa-inline-analysis.c:90:0:
> /home/marek/src/gcc/gcc/ipa-prop.h:639:6: note: declared here
>  tree ipa_find_agg_cst_for_param (struct ipa_agg_jump_function *,
> HOST_WIDE_INT,
>   ^
> Makefile:1085: recipe for target 'ipa-inline-analysis.o' failed
> make: *** [ipa-inline-analysis.o] Error 1
> 
>   Marek


[SPARC] Support for --with-{cpu,tune}-{32,64} in sparc*-* targets

2016-05-02 Thread Jose E. Marchesi

Hi people.

This patch adds support for the --with-{cpu,tune}-{32,64} configure
options to sparc*-* targets.  This allows selecting CPUs and tune
options separately for -m32 and -m64 modes in multilib compilers.
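
A rough sketch of why the order of the cpu* entries matters, assuming the
entries are applied in table order and that an earlier entry which adds -mcpu
suppresses the later ones through the %{!mcpu=*:...} guard (this is what the
comment added to linux64.h in this patch implies).  The types and the function
pick_mcpu are hypothetical and only model the DEFAULT_ARCH32_P table; they are
not driver code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Condition on -m64 carried by an entry's spec (hypothetical model).  */
enum when { WHEN_M64, WHEN_NOT_M64 };

/* Hypothetical model of one cpu* entry of OPTION_DEFAULT_SPECS.  */
struct cpu_default
{
  const char *name;   /* configure option name, e.g. "cpu_32" */
  enum when when;     /* %{m64:...} or %{!m64:...} */
  const char *value;  /* configured value, or NULL if not configured */
};

/* Return the -mcpu value that ends up in effect: an explicit -mcpu wins,
   otherwise the first configured entry whose -m64 condition holds.  */
static const char *
pick_mcpu (const struct cpu_default *table, size_t n, bool m64,
           const char *user_mcpu)
{
  if (user_mcpu)
    return user_mcpu;

  for (size_t i = 0; i < n; i++)
    {
      bool applies = (table[i].when == WHEN_M64) == m64;
      if (table[i].value && applies)
        return table[i].value;
    }
  return NULL;
}
```

With the table ordered cpu_32, cpu_64, cpu, a configured --with-cpu-32 value is
picked for -m32 before the with_cpu value that config.gcc always sets; if the
"cpu" entry came first, it would always win for -m32.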

Tested on sparc64-*-* and sparcv9-*-* targets.

2016-04-28  Jose E. Marchesi  

* config.gcc (sparc*-*-*): Support cpu_32, cpu_64, tune_32 and
tune_64.
* doc/install.texi (--with-cpu-32, --with-cpu-64): Document
support on SPARC.
* config/sparc/linux64.h (OPTION_DEFAULT_SPECS): Add entries for
cpu_32, cpu_64, tune_32 and tune_64.



diff --git a/gcc/config.gcc b/gcc/config.gcc
index f66e48c..e4bf17a 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4280,9 +4280,9 @@ case "${target}" in
esac
;;
sparc*-*-*)
-   supported_defaults="cpu float tune"
+   supported_defaults="cpu cpu_32 cpu_64 float tune tune_32 tune_64"
 
-   for which in cpu tune; do
+   for which in cpu cpu_32 cpu_64 tune tune_32 tune_64; do
eval "val=\$with_$which"
case ${val} in
"" | sparc | sparcv9 | sparc64 \
diff --git a/gcc/config/sparc/linux64.h b/gcc/config/sparc/linux64.h
index a1ef325..9d53c29 100644
--- a/gcc/config/sparc/linux64.h
+++ b/gcc/config/sparc/linux64.h
@@ -164,22 +164,42 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 #endif
 
 /* Support for a compile-time default CPU, et cetera.  The rules are:
-   --with-cpu is ignored if -mcpu is specified.
-   --with-tune is ignored if -mtune is specified.
+   --with-cpu is ignored if -mcpu is specified; likewise --with-cpu-32
+ and --with-cpu-64.
+   --with-tune is ignored if -mtune is specified; likewise --with-tune-32
+ and --with-tune-64.
--with-float is ignored if -mhard-float, -msoft-float, -mfpu, or -mno-fpu
  are specified.
In the SPARC_BI_ARCH compiler we cannot pass %{!mcpu=*:-mcpu=%(VALUE)}
here, otherwise say -mcpu=v7 would be passed even when -m64.
-   CC1_SPEC above takes care of this instead.  */
+   CC1_SPEC above takes care of this instead.
+
+   Note that the order of the cpu* and tune* options matters: the
+   config.gcc file always sets with_cpu to some value, even if the
+   user didn't use --with-cpu when invoking the configure script.
+   This value is based on the target name.  Therefore we have to make
+   sure that --with-cpu-32 takes precedence over --with-cpu in < v9
+   systems, and that --with-cpu-64 takes precedence over --with-cpu in
+   >= v9 systems.  As for the tune* options, on some platforms
+   config.gcc also sets a default value for them if the user didn't use
+   --with-tune when invoking the configure script.  */
 #undef OPTION_DEFAULT_SPECS
 #if DEFAULT_ARCH32_P
 #define OPTION_DEFAULT_SPECS \
+  {"cpu_32", "%{!m64:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
+  {"cpu_64", "%{m64:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
   {"cpu", "%{!m64:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
+  {"tune_32", "%{!m64:%{!mtune=*:-mtune=%(VALUE)}}" }, \
+  {"tune_64", "%{m64:%{!mtune=*:-mtune=%(VALUE)}}" }, \
   {"tune", "%{!mtune=*:-mtune=%(VALUE)}" }, \
   {"float", "%{!msoft-float:%{!mhard-float:%{!mfpu:%{!mno-fpu:-m%(VALUE)-float}}}}" }
 #else
 #define OPTION_DEFAULT_SPECS \
+  {"cpu_32", "%{m32:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
+  {"cpu_64", "%{!m32:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
   {"cpu", "%{!m32:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \
+  {"tune_32", "%{m32:%{!mtune=*:-mtune=%(VALUE)}}" },  \
+  {"tune_64", "%{!m32:%{!mtune=*:-mtune=%(VALUE)}}" }, \
   {"tune", "%{!mtune=*:-mtune=%(VALUE)}" }, \
   {"float", "%{!msoft-float:%{!mhard-float:%{!mfpu:%{!mno-fpu:-m%(VALUE)-float}}}}" }
 #endif
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index e1ca26c..ee2494e 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1241,7 +1241,7 @@ This option is only supported on some targets, including ARC, ARM, i386, M68k,
 PowerPC, and SPARC@.  It is mandatory for ARC@.  The @option{--with-cpu-32} and
 @option{--with-cpu-64} options specify separate default CPUs for
 32-bit and 64-bit modes; these options are only supported for i386,
-x86-64 and PowerPC.
+x86-64, PowerPC, and SPARC@.
 
 @item --with-schedule=@var{cpu}
 @itemx --with-arch=@var{cpu}


[OpenACC] minor code cleanup

2016-05-02 Thread Nathan Sidwell
While working on some more loop partitioning improvements, I noticed some
unnecessary checking.  By construction, an OpenACC loop must have at least one
head/tail marker, so we should assert that earlier when lowering the loop (in
the common compiler) and then rely on it later when processing the loop (in the
accelerator compiler).
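
A small standalone sketch of the resulting loop shape, assuming each iteration
clears the bit it has just handled (that part of the loop body is not shown in
the diff below, so it is an assumption here).  DIM_MAX, DIM_MASK and
collect_dims are hypothetical stand-ins for illustration, not the omp-low.c
code.

```c
#include <assert.h>

#define DIM_MAX 3                 /* stand-in for GOMP_DIM_MAX */
#define DIM_MASK(d) (1u << (d))   /* stand-in for GOMP_DIM_MASK */

/* Store in DIMS the dimensions named by MASK, lowest first, and return
   how many there are.  */
static unsigned
collect_dims (unsigned mask, unsigned dims[DIM_MAX])
{
  unsigned ix, dim = 0;

  /* Asserted once up front instead of on every iteration: at least one
     dimension is marked, and none lies beyond DIM_MAX.  */
  assert (mask != 0);
  assert ((mask >> DIM_MAX) == 0);

  /* The mask itself terminates the loop once all set bits are consumed.  */
  for (ix = 0; ix != DIM_MAX && mask; ix++)
    {
      /* Skip dimensions that are not in the mask.  */
      while (!(DIM_MASK (dim) & mask))
        dim++;

      dims[ix] = dim;
      mask ^= DIM_MASK (dim);   /* assumed: consume the handled dimension */
    }
  return ix;
}
```

For a mask with bits 0 and 2 set, the sketch visits dimension 0, then
dimension 2, and stops without needing a per-iteration assertion.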


ok for trunk?

nathan
2016-05-02  Nathan Sidwell  

	* omp-low.c (lower_oacc_head_tail): Assert there is at least one
	marker.
	(oacc_loop_process): Check mask for loop termination.

Index: omp-low.c
===
--- omp-low.c	(revision 235758)
+++ omp-low.c	(working copy)
@@ -6402,12 +6402,10 @@ lower_oacc_head_tail (location_t loc, tr
   gimple_seq_add_stmt (head, gimple_build_assign (ddvar, integer_zero_node));
 
   unsigned count = lower_oacc_head_mark (loc, ddvar, clauses, head, ctx);
-  if (!count)
-    lower_oacc_loop_marker (loc, ddvar, false, integer_zero_node, tail);
-  
   tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK);
   tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
 
+  gcc_assert (count);
   for (unsigned done = 1; count; count--, done++)
 {
   gimple_seq fork_seq = NULL;
@@ -19331,10 +19329,8 @@ oacc_loop_process (oacc_loop *loop)
 
   oacc_loop_xform_loop (loop->head_end, loop->ifns, mask_arg, chunk_arg);
 
-  for (ix = 0; ix != GOMP_DIM_MAX && loop->heads[ix]; ix++)
+  for (ix = 0; ix != GOMP_DIM_MAX && mask; ix++)
 	{
-	  gcc_assert (mask);
-
 	  while (!(GOMP_DIM_MASK (dim) & mask))
 	dim++;
 

