date:20160224

[PATCH] Fix PR69907

2016-02-24 Thread Richard Biener


The following fixes PR69907, BB vectorization not properly avoiding
loading excess elements beyond the last scalar access.  As seen by
the testsuite adjustments this pessimizes the cases where we'd know
the underlying object is larger, but that's a general issue also
for loop vectorization with gaps.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.
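
To make the new condition concrete, here is a minimal sketch in plain C.  It
is not GCC code: excess_lanes and bb_load_rejected_p are hypothetical helpers
that only mirror the shape of the check added to vectorizable_load, under the
assumption that a load group of group_size elements is covered by whole
vectors of nunits lanes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: number of vector lanes that would
   be loaded past the last scalar access when GROUP_SIZE elements are
   covered by whole vectors of NUNITS lanes each.  */
unsigned
excess_lanes (unsigned group_size, unsigned nunits)
{
  assert (nunits > 0);
  unsigned rem = group_size % nunits;
  return rem == 0 ? 0 : nunits - rem;
}

/* Mirrors the shape of the new check: a permuted load in BB vectorization
   is rejected whenever the group does not fill whole vectors.  */
bool
bb_load_rejected_p (bool bb_vinfo, bool slp_perm,
		    unsigned group_size, unsigned nunits)
{
  return bb_vinfo && slp_perm && group_size % nunits != 0;
}
```

For example, a group of 6 elements with 4-lane vectors leaves 2 excess lanes
and is rejected, while a group of 8 fills two vectors exactly and passes.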

2016-02-24  Richard Biener  

PR tree-optimization/69907
* tree-vect-stmts.c (vectorizable_load): Check for gaps at the
end of permutations for BB vectorization.

* gcc.dg/vect/bb-slp-pr69907.c: New testcase.
* gcc.dg/vect/bb-slp-34.c: XFAIL.
* gcc.dg/vect/bb-slp-pr68892.c: Likewise.

Index: gcc/tree-vect-stmts.c
===
*** gcc/tree-vect-stmts.c   (revision 233620)
--- gcc/tree-vect-stmts.c   (working copy)
*** vectorizable_load (gimple *stmt, gimple_
*** 6395,6400 
--- 6394,6412 
slp_perm = true;
  
group_size = GROUP_SIZE (vinfo_for_stmt (first_stmt));
+ 
+   /* ???  The following is overly pessimistic (as well as the loop
+  case above) in the case we can statically determine the excess
+elements loaded are within the bounds of a decl that is accessed.
+Likewise for BB vectorizations using masked loads is a possibility.  */
+   if (bb_vinfo && slp_perm && group_size % nunits != 0)
+   {
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+  "BB vectorization with gaps at the end of a load "
+  "is not supported\n");
+ return false;
+   }
+ 
if (!slp
  && !PURE_SLP_STMT (stmt_info)
  && !STMT_VINFO_STRIDED_P (stmt_info))
Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr69907.c
===
*** gcc/testsuite/gcc.dg/vect/bb-slp-pr69907.c  (revision 0)
--- gcc/testsuite/gcc.dg/vect/bb-slp-pr69907.c  (working copy)
***
*** 0 
--- 1,12 
+ /* { dg-do compile } */
+ /* { dg-additional-options "-O3" } */
+ /* { dg-require-effective-target vect_unpack } */
+ 
+ void foo(unsigned *p1, unsigned short *p2)
+ {
+   int n;
+   for (n = 0; n < 320; n++)
+ p1[n] = p2[n * 2];
+ }
+ 
+ /* { dg-final { scan-tree-dump "BB vectorization with gaps at the end of a load is not supported" "slp1" } } */
Index: gcc/testsuite/gcc.dg/vect/bb-slp-34.c
===
*** gcc/testsuite/gcc.dg/vect/bb-slp-34.c   (revision 233620)
--- gcc/testsuite/gcc.dg/vect/bb-slp-34.c   (working copy)
*** int main()
*** 32,35 
return 0;
  }
  
! /* { dg-final { scan-tree-dump "basic block vectorized" "slp2" { target vect_perm } } } */
--- 32,36 
return 0;
  }
  
! /* ??? XFAILed because we access "excess" elements with the permutation.  */
! /* { dg-final { scan-tree-dump "basic block vectorized" "slp2" { target vect_perm xfail *-*-* } } } */
Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c
===
*** gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c  (revision 233620)
--- gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c  (working copy)
*** void foo(void)
*** 13,17 
b[3] = a[3][0];
  }
  
! /* { dg-final { scan-tree-dump "not profitable" "slp2" } } */
  /* { dg-final { scan-tree-dump-times "Basic block will be vectorized" 0 "slp2" } } */
--- 13,19 
b[3] = a[3][0];
  }
  
! /* ???  The profitability check is not reached because we give up on the
!gaps we access earlier.  */
! /* { dg-final { scan-tree-dump "not profitable" "slp2" { xfail *-*-* } } } */
  /* { dg-final { scan-tree-dump-times "Basic block will be vectorized" 0 "slp2" } } */

Re: [PATCH] Bump max-ssa-name-query-depth default (PR c/69918)

2016-02-24 Thread Richard Biener

On Tue, 23 Feb 2016, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned in the PR, the builtin-integral-1.c testcase fails on
> i?86 Solaris.  The problem is that on Solaris the dg-add-options c99_runtime 
> adds -std=c99, which turns -fexcess-precision=standard, and that on some
> arches changes
>   _388 = (double) i1_63(D);
>   _389 = (double) i2_335(D);
>   _390 = _388 * _389;
>   _391 = (long double) _390;
>   _392 = __builtin_ceill (_391);
> into
>   _398 = (double) i1_63(D);
>   _399 = (long double) _398;
>   _400 = (double) i2_335(D);
>   _401 = (long double) _400;
>   _402 = _399 * _401;
>   _403 = __builtin_ceill (_402);
> But the default value of the max-ssa-name-query-depth param prevents in this
> case from recognizing the argument to __builtin_ceill will always be
> integral value.  We have the possibility to either tweak the testcase
> (e.g. add -fexcess-precision=fast, or --param max-ssa-name-query-depth=3),
> or change the IMHO way too low default.  As only the latter will help
> for excess precision code in real-world even for simple addition of two
> values, I think it is best to bump the default.  Perhaps even 5 wouldn't
> hurt, but maybe we can increase it more for gcc 7.

Note that as we do not limit the number of PHI args to visit, the
actual number of lookups can be unbounded already; it's generally
O(max-depth^N) with N being the max number of operands in a stmt.

I suppose another max-phi-args-query parameter would be good to
limit this to max-depth^2 in practice.

> Bootstrapped/regtested on x86_64-linux and i686-linux, tested on the
> testcase using cross to i386-pc-solaris2.11.  Ok for trunk?

Ok.

Thanks,
Richard.

> 2016-02-23  Jakub Jelinek  
> 
>   PR c/69918
>   * params.def (PARAM_MAX_SSA_NAME_QUERY_DEPTH): Bump default from
>   2 to 3.
> 
> --- gcc/params.def.jj 2016-02-01 23:34:34.0 +0100
> +++ gcc/params.def	2016-02-23 18:43:04.359322654 +0100
> @@ -1191,7 +1191,7 @@ DEFPARAM (PARAM_MAX_SSA_NAME_QUERY_DEPTH
> "max-ssa-name-query-depth",
> "Maximum recursion depth allowed when querying a property of an"
> " SSA name.",
> -   2, 1, 0)
> +   3, 1, 0)
>  
>  DEFPARAM (PARAM_MAX_RTL_IF_CONVERSION_INSNS,
> "max-rtl-if-conversion-insns",
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
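
The cost concern raised in the reply can be illustrated with a toy model.  The
sketch below is not GCC's query and does not try to reproduce the bound quoted
above; toy_lookups is a hypothetical function that assumes every statement
has the same number of operands (fanout) and that the query stops after
max_depth levels.

```c
#include <assert.h>

/* Toy model, not GCC code: number of names visited by a recursive query
   that follows FANOUT operands per statement and descends at most
   MAX_DEPTH levels below the starting name.  */
unsigned long
toy_lookups (unsigned max_depth, unsigned fanout)
{
  assert (fanout > 0);
  unsigned long total = 0;
  unsigned long level = 1;	/* Names visited at the current level.  */
  for (unsigned d = 0; d <= max_depth; d++)
    {
      total += level;
      level *= fanout;
    }
  return total;
}
```

In this model, going from depth 2 to depth 3 with two operands per statement
raises the count from 7 to 15, whereas a wide fan-out (as with a PHI that has
many arguments) dominates the cost even at a small depth.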

Re: [PATCH] Fix normal_inner_ref expansion (PR middle-end/69909)

2016-02-24 Thread Richard Biener

On Tue, 23 Feb 2016, Jakub Jelinek wrote:

> Hi!
> 
> When the base of a handled component (BIT_FIELD_REF in the testcase)
> is SSA_NAME which happens to be expanded as some MEM (on the testcase
> it is SSA_NAME set to VIEW_CONVERT_EXPR of an SSA_NAME that has MEM as
> DECL_RTL), expand_expr_real_1 can try to update the MEM attributes from
> exp, but that is wrong, it might change the alias set of the MEM, etc.
> If the base is SSA_NAME, we should keep the attributes unmodified.
> The patch actually also tests for !MEM_P (orig_op0), so that if
> the SSA_NAME expanded to non-MEM (e.g. constant or REG), but the
> normal_inner_ref expansion forces it for whatever reason in memory,
> we still set the attributes of such a MEM, it is a temporary in that
> case, rather than the original read.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2016-02-23  Jakub Jelinek  
>   Richard Biener  
> 
>   PR middle-end/69909
>   * expr.c (expand_expr_real_1) : Avoid
>   set_mem_attributes if tem is SSA_NAME which got expanded
>   as a MEM.
> 
>   * gcc.dg/torture/pr69909.c: New test.
> 
> --- gcc/expr.c.jj 2016-02-23 13:54:02.0 +0100
> +++ gcc/expr.c	2016-02-23 14:30:23.810657866 +0100
> @@ -10521,7 +10521,11 @@ expand_expr_real_1 (tree exp, rtx target
>   if (op0 == orig_op0)
> op0 = copy_rtx (op0);
>  
> - set_mem_attributes (op0, exp, 0);
> + /* Don't set memory attributes if the base expression is
> +SSA_NAME that got expanded as a MEM.  In that case, we should
> +just honor its original memory attributes.  */
> + if (TREE_CODE (tem) != SSA_NAME || !MEM_P (orig_op0))
> +   set_mem_attributes (op0, exp, 0);
>  
>   if (REG_P (XEXP (op0, 0)))
> mark_reg_pointer (XEXP (op0, 0), MEM_ALIGN (op0));
> --- gcc/testsuite/gcc.dg/torture/pr69909.c.jj	2016-02-23 14:25:27.819719259 +0100
> +++ gcc/testsuite/gcc.dg/torture/pr69909.c	2016-02-23 14:25:27.818719272 +0100
> @@ -0,0 +1,35 @@
> +/* PR middle-end/69909 */
> +/* { dg-do run { target int128 } } */
> +/* { dg-additional-options "-w" } */
> +
> +typedef unsigned V __attribute__ ((vector_size (32)));
> +typedef __int128 T;
> +typedef __int128 U __attribute__ ((vector_size (32)));
> +
> +__attribute__((noinline, noclone)) T
> +foo (T a, V b, V c, V d, V e, U f)
> +{
> +  d[6] ^= 0x10;
> +  f -= (U) d;
> +  f[1] |= f[1] << (a & 127);
> +  c ^= d;
> +  return b[7] + c[2] + c[2] + d[6] + e[2] + f[1];
> +}
> +
> +int
> +main ()
> +{
> +  if (__CHAR_BIT__ != 8 || sizeof (unsigned) != 4 || sizeof (T) != 16)
> +return 0;
> +
> +  T x = foo (1, (V) { 9, 2, 5, 8, 1, 2, 9, 3 },
> + (V) { 1, 2, 3, 4, 5, 6, 7, 8 },
> + (V) { 4, 1, 2, 9, 8, 3, 5, 2 },
> + (V) { 3, 6, 1, 3, 2, 9, 4, 8 }, (U) { 3, 5 });
> +  if (((unsigned long long) (x >> 64) != 0xULL
> +   || (unsigned long long) x != 0xfffe001aULL)
> +  && ((unsigned long long) (x >> 64) != 0xfffdULL
> +   || (unsigned long long) x != 0x0022ULL))
> +__builtin_abort ();
> +  return 0;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
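
The guard added to expand_expr_real_1 in the patch above boils down to a
two-input predicate.  The sketch below restates it in standalone C;
toy_set_mem_attributes_p is a hypothetical name, and the two booleans stand in
for TREE_CODE (tem) == SSA_NAME and MEM_P (orig_op0).

```c
#include <stdbool.h>

/* Toy restatement of the patched condition, not GCC code.  Attributes are
   left alone only when the base is an SSA_NAME that was itself expanded
   as a MEM; a MEM created as a temporary still gets its attributes set.  */
bool
toy_set_mem_attributes_p (bool base_is_ssa_name, bool orig_op0_is_mem)
{
  return !base_is_ssa_name || !orig_op0_is_mem;
}
```

Only one of the four input combinations skips set_mem_attributes: an SSA_NAME
base whose original expansion was already a MEM.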

Re: [PATCH] Fix PR68963

2016-02-24 Thread Richard Biener

On Mon, 22 Feb 2016, Richard Biener wrote:

> 
> The following patch fixes invalid upper bounds recorded for conditional
> array accesses - it doesn't depend on whether their IVs wrap or not
> (and we were unsetting 'reliable' only anyway).  In fact conditional
> accesses should be good enough for an estimate, just wrapping ones
> not.  Until we determine whether the controlling expression is
> dependent on the loop IV that's probably the best to do here.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
> Honza, about reliable vs. non-reliable - was that the original intent
> or did you want to always record 'reliable' but really reset 'upper'
> here?

Ok, so this was the wrong place to fix and we should actually record
those bounds.  But the actual bound recorded is wrong as
record_nonwrapping_iv for IVs { -y, +, 1 }_1 and an array index
range of [0, 3] falls back to using 0 for the initial value of the
IV for the purpose of niter compute - that's of course only valid
if the stmt is always executed.

And in fact if the IV were { -1, +, 1 }_1 and the array index
range [0, 3] we could conclude the loop doesn't loop at all as
the first iteration would contain undefined behavior (if the
stmt was always executed) - currently we compute an upper bound
of 3 - (-1) here which is overly conservative.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.
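
A small sketch of why the fallback is wrong for the testcase below.  This is
not GCC code: both functions are hypothetical, and the sketch assumes a unit
step and valid indices 0..2 for the three-element array a.  It counts how
often an access with index IV { base, +, 1 } could execute in bounds if it ran
on every iteration, and compares that with the real trip count of the loop
for (i = 0; i < 2 * y; ++i).

```c
#include <assert.h>

/* Toy model: an access indexed by { BASE, +, 1 } that executes on every
   iteration stays within [.., HIGH] for at most this many executions.  */
long
toy_max_executions (long base, long high)
{
  assert (base <= high);
  return high - base + 1;
}

/* Trip count of the testcase loop for (i = 0; i < 2 * y; ++i).  */
long
toy_actual_iterations (long y)
{
  return y > 0 ? 2 * y : 0;
}
```

With y = 3 the index of a[i - y] starts at -3.  Substituting 0 for the unknown
base yields 3 executions, which is below the 6 iterations the loop really
performs; using the true base -3 yields 6.  The substitution is only sound
when the access executes on every iteration, which is what the added
dominated_by_p test checks.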

2016-02-24  Richard Biener  

PR middle-end/68963
* tree-ssa-loop-niter.c (derive_constant_upper_bound_ops): Fix
bogus check.
(record_nonwrapping_iv): Do not fall back to the low/high bound
for non-constant IV bases if the stmt is not always executed.

* gcc.dg/torture/pr68963.c: New testcase.

Index: gcc/testsuite/gcc.dg/torture/pr68963.c
===
*** gcc/testsuite/gcc.dg/torture/pr68963.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr68963.c  (working copy)
***
*** 0 
--- 1,41 
+ /* { dg-do run } */
+ 
+ static const float a[3] = { 1, 2, 3 };
+ int b = 3;
+ 
+ __attribute__((noinline, noclone)) void
+ bar (int x)
+ {
+   if (x != b++)
+ __builtin_abort ();
+ }
+ 
+ void
+ foo (float *x, int y)
+ {
+   int i;
+   for (i = 0; i < 2 * y; ++i)
+ {
+   if (i < y)
+   x[i] = a[i];
+   else
+   {
+ bar (i);
+ x[i] = a[i - y];
+   }
+ }
+ }
+ 
+ int
+ main ()
+ {
+   float x[10];
+   unsigned int i;
+   for (i = 0; i < 10; ++i)
+ x[i] = 1337;
+   foo (x, 3);
+   for (i = 0; i < 10; ++i)
+ if (x[i] != (i < 6 ? (i % 3) + 1 : 1337))
+   __builtin_abort ();
+   return 0;
+ }
Index: gcc/tree-ssa-loop-niter.c
===
*** gcc/tree-ssa-loop-niter.c   (revision 233634)
--- gcc/tree-ssa-loop-niter.c   (working copy)
*** derive_constant_upper_bound_ops (tree ty
*** 2757,2763 
 enum tree_code code, tree op1)
  {
tree subtype, maxt;
!   widest_int bnd, max, mmax, cst;
gimple *stmt;
  
if (INTEGRAL_TYPE_P (type))
--- 2757,2763 
 enum tree_code code, tree op1)
  {
tree subtype, maxt;
!   widest_int bnd, max, cst;
gimple *stmt;
  
if (INTEGRAL_TYPE_P (type))
*** derive_constant_upper_bound_ops (tree ty
*** 2823,2830 
  /* OP0 + CST.  We need to check that
 BND <= MAX (type) - CST.  */
  
! mmax -= cst;
! if (wi::ltu_p (bnd, max))
return max;
  
  return bnd + cst;
--- 2823,2830 
  /* OP0 + CST.  We need to check that
 BND <= MAX (type) - CST.  */
  
! widest_int mmax = max - cst;
! if (wi::leu_p (bnd, mmax))
return max;
  
  return bnd + cst;
*** record_nonwrapping_iv (struct loop *loop
*** 3065,3071 
  && get_range_info (orig_base, &min, &max) == VR_RANGE
  && wi::gts_p (high, max))
base = wide_int_to_tree (unsigned_type, max);
!   else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, high);
delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
--- 3065,3073 
  && get_range_info (orig_base, &min, &max) == VR_RANGE
  && wi::gts_p (high, max))
base = wide_int_to_tree (unsigned_type, max);
!   else if (TREE_CODE (base) != INTEGER_CST
!  && dominated_by_p (CDI_DOMINATORS,
! loop->latch, gimple_bb (stmt)))
base = fold_convert (unsigned_type, high);
delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
*** record_nonwrapping_iv (struct loop *loop
*** 3080,3086 
  && get_range_info (orig_base, &min, &max) == VR_RANGE
  && wi

[PATCH][gcse] PR rtl-optimization/69886: Check target mode in can_assign_to_reg_without_clobbers_p

2016-02-24 Thread Kyrill Tkachov


Hi all,

In this PR we get an ICE when the hoist pass ends up creating an
(insn 88 0 0 (set (reg:OI 136)
(const_int 0 [0])) -1
 (nil))

instruction. AArch64 doesn't support such an OImode set.
The only OImode set operations that aarch64 supports are load/store-multiple
operations on vector registers.

want_to_gcse_p should have rejected this move long before process_insert_insn
tried to insert it in the stream.  But it didn't because
can_assign_to_reg_without_clobbers_p is only given the (const_int 0) expression
and asked whether there can be a valid SET operation on that.
It should also consider the mode that such an operation is requested in, rather
than extracting the mode from the operand (VOIDmode for CONST_INTs). Luckily,
want_to_gcse_p already has a mode argument that it uses in its costs
calculation, so we can just pass it down.

This patch extends can_assign_to_reg_without_clobbers_p to take a mode argument
and use it when testing the validity of the SET instructions that it creates,
so such an OImode move is properly rejected.
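
The difference between asking in the operand's own mode and asking in the
requested mode can be sketched with a toy model.  Nothing below is GCC code:
the toy_* names are hypothetical, and the toy target is assumed to support
plain moves only in SImode and DImode.

```c
#include <stdbool.h>

enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode, TOY_OImode };

struct toy_rtx
{
  enum toy_mode mode;	/* TOY_VOIDmode for integer constants.  */
  bool is_const_int;
};

/* Assumption of this sketch: plain moves exist only in SImode and DImode.  */
static bool
toy_move_supported_p (enum toy_mode mode)
{
  return mode == TOY_SImode || mode == TOY_DImode;
}

/* Toy stand-in for an operand check in a given mode.  A mode-less query
   (TOY_VOIDmode) accepts any integer constant.  */
static bool
toy_valid_operand_p (const struct toy_rtx *x, enum toy_mode mode)
{
  if (mode == TOY_VOIDmode)
    return x->is_const_int;
  if (!toy_move_supported_p (mode))
    return false;
  return x->is_const_int || x->mode == mode;
}

/* Before the patch: the mode is taken from the operand itself.  */
bool
toy_can_assign_old (const struct toy_rtx *x)
{
  return toy_valid_operand_p (x, x->mode);
}

/* After the patch: the caller passes the mode of the requested set.  */
bool
toy_can_assign_new (const struct toy_rtx *x, enum toy_mode mode)
{
  return toy_valid_operand_p (x, mode);
}
```

In this model a (const_int 0) operand is accepted by the old query because it
carries no mode, while the new query rejects it for TOY_OImode and still
accepts it for TOY_DImode.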

Bootstrapped and tested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf,
x86_64-unknown-linux-gnu.
There are no codegen differences on SPEC2006 for aarch64 resulting from this
patch.

This bug appears in all versions that have aarch64, so it's not a regression,
but I think it's a fairly low risk patch.

Is this ok for trunk now or when stage 1 reopens?

Thanks,
Kyrill

2016-02-24  Kyrylo Tkachov  

PR rtl-optimization/69886
* gcse.c (can_assign_to_reg_without_clobbers_p): Accept mode
argument.  Use it when checking validity of set instructions.
(want_to_gcse_p): Pass mode to can_assign_to_reg_without_clobbers_p.
(compute_ld_motion_mems): Update can_assign_to_reg_without_clobbers_p
callsite.
* rtl.h (can_assign_to_reg_without_clobbers_p): Update prototype.
* store-motion.c (find_moveable_store): Update
can_assign_to_reg_without_clobbers_p callsite.

2016-02-24  Kyrylo Tkachov  

PR rtl-optimization/69886
* gcc.dg/torture/pr69886.c: New test.
diff --git a/gcc/gcse.c b/gcc/gcse.c
index 500de7a95ce856d7cd710d3be1efed865d40703c..51277a1cb613a4005ee240c51949083daed8c54c 100644
--- a/gcc/gcse.c
+++ b/gcc/gcse.c
@@ -810,7 +810,7 @@ want_to_gcse_p (rtx x, machine_mode mode, int *max_distance_ptr)
 	*max_distance_ptr = max_distance;
 	}
 
-  return can_assign_to_reg_without_clobbers_p (x);
+  return can_assign_to_reg_without_clobbers_p (x, mode);
 }
 }
 
@@ -818,9 +818,9 @@ want_to_gcse_p (rtx x, machine_mode mode, int *max_distance_ptr)
 
 static GTY(()) rtx_insn *test_insn;
 
-/* Return true if we can assign X to a pseudo register such that the
-   resulting insn does not result in clobbering a hard register as a
-   side-effect.
+/* Return true if we can assign X to a pseudo register of mode MODE
+   such that the resulting insn does not result in clobbering a hard
+   register as a side-effect.
 
Additionally, if the target requires it, check that the resulting insn
can be copied.  If it cannot, this means that X is special and probably
@@ -831,14 +831,14 @@ static GTY(()) rtx_insn *test_insn;
maybe live hard regs.  */
 
 bool
-can_assign_to_reg_without_clobbers_p (rtx x)
+can_assign_to_reg_without_clobbers_p (rtx x, machine_mode mode)
 {
   int num_clobbers = 0;
   int icode;
   bool can_assign = false;
 
   /* If this is a valid operand, we are OK.  If it's VOIDmode, we aren't.  */
-  if (general_operand (x, GET_MODE (x)))
+  if (general_operand (x, mode))
 return 1;
   else if (GET_MODE (x) == VOIDmode)
 return 0;
@@ -857,7 +857,7 @@ can_assign_to_reg_without_clobbers_p (rtx x)
 
   /* Now make an insn like the one we would make when GCSE'ing and see if
  valid.  */
-  PUT_MODE (SET_DEST (PATTERN (test_insn)), GET_MODE (x));
+  PUT_MODE (SET_DEST (PATTERN (test_insn)), mode);
   SET_SRC (PATTERN (test_insn)) = x;
 
   icode = recog (PATTERN (test_insn), test_insn, &num_clobbers);
@@ -3830,12 +3830,13 @@ compute_ld_motion_mems (void)
 		  if (MEM_P (dest) && simple_mem (dest))
 		{
 		  ptr = ldst_entry (dest);
-
+		  machine_mode src_mode = GET_MODE (src);
 		  if (! MEM_P (src)
 			  && GET_CODE (src) != ASM_OPERANDS
 			  /* Check for REG manually since want_to_gcse_p
 			 returns 0 for all REGs.  */
-			  && can_assign_to_reg_without_clobbers_p (src))
+			  && can_assign_to_reg_without_clobbers_p (src,
+src_mode))
 			ptr->stores = alloc_INSN_LIST (insn, ptr->stores);
 		  else
 			ptr->invalid = 1;
diff --git a/gcc/rtl.h b/gcc/rtl.h
index 703dffe1d3e479fa787eb3491f68370f3b68048c..a5f20d9ce3bfee6e803ac5aeec79e1955b3e1687 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3565,7 +3565,7 @@ extern void init_lower_subreg (void);
 
 /* In gcse.c */
 extern bool can_copy_p (machine_mode);
-extern bool can_assign_to_reg_without_clobbers_p (rtx);
+extern bool can_assign_to_reg_without_clobbers_p (rtx, machine_mode);
 extern rtx fis_get_condit

[PATCH][wwwdocs][committed] Fix typo and tws in changes.html

2016-02-24 Thread Kyrill Tkachov


Hi all,

I'm committing this patch as obvious to fix a typo and a TWS occurrence in
changes.html.

Thanks,
Kyrill
Index: htdocs/gcc-6/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.61
diff -U 3 -r1.61 changes.html
--- htdocs/gcc-6/changes.html	19 Feb 2016 05:00:54 -	1.61
+++ htdocs/gcc-6/changes.html	22 Feb 2016 16:56:57 -
@@ -59,8 +59,8 @@
 	alias attributes. This makes it possible to access
 	both a variable and its alias in one translation unit which is common
 	with link-time optimization.
-Value range propagation now assumes the that this pointer
-	of C++ member functions is non-null.  This eliminates 
+Value range propagation now assumes that the this pointer
+	of C++ member functions is non-null.  This eliminates
 	common null pointer checks
 	but also breaks some non-conforming code-bases (such as Qt-5, Chromium,
 	KDevelop). As a temporary work-around

Re: [PATCH, PR middle-end/68134] Reject scalar modes in default get_mask_mode hook

2016-02-24 Thread Ilya Enkovich

2016-02-22 14:50 GMT+03:00 Alan Lawrence :
> On 20/02/16 09:29, Ilya Enkovich wrote:
>>
>> 2016-02-19 20:36 GMT+03:00 Alan Lawrence :
>>>
>>> On 17/11/15 11:49, Ilya Enkovich wrote:

 Hi,

 Default hook for get_mask_mode is supposed to return integer vector
 modes.
 This means it should reject scalar modes returned by mode_for_vector.
 Bootstrapped and regtested on x86_64-unknown-linux-gnu, regtested on
 aarch64-unknown-linux-gnu.  OK for trunk?

 Thanks,
 Ilya
 --
 gcc/

 2015-11-17  Ilya Enkovich  

  PR middle-end/68134
  * targhooks.c (default_get_mask_mode): Filter out
  scalar modes returned by mode_for_vector.
>>>
>>>
>>>
>>> I've been playing around with a patch that expands arbitrary
>>> VEC_COND_EXPRs
>>> using the vec_cmp and vcond_mask optabs - allowing platforms to drop the
>>> ugly vcond pattern (for which a cross-product of modes is
>>> usually required) where vcond = vec_cmp + vcond_mask. (E.g. ARM and
>>> AArch64.)
>>>
>>> Mostly this is fairly straightforward, relatively little midend code is
>>> required, and the backend cleans up quite a bit. However, I get stuck on
>>> the
>>> case of singleton vectors (64x1). No surprises there, then...
>>>
>>> The PR/68134 fix, makes the 'mask mode' for comparing 64x1 vectors, into
>>> BLKmode, so that we get past this in expand_vector_operations:
>>>
>>> /* A scalar operation pretending to be a vector one.  */
>>>if (VECTOR_BOOLEAN_TYPE_P (type)
>>>&& !VECTOR_MODE_P (TYPE_MODE (type))
>>>&& TYPE_MODE (type) != BLKmode)
>>>  return;
>>>
>>> and we do the operation piecewise. (Which is what we want; there is only
>>> one
>>> piece!)
>>>
>>> However, with my vec_cmp + vcond_mask changes, dropping vconddidi, this
>>> means we look for a vcond_maskdiblk and vec_cmpdiblk. Which doesn't
>>> really
>>> feel right - it feels like the 64x1 mask should be a DImode, just like
>>> other
>>> 64x1 vectors.
>>
>>
>> The problem here is to distinguish vector mask of one DI element and
>> DI scalar mask.  We don't want to lower scalar mask manipulations
>> because they are simple integer operations, not vector ones. Probably
>> vector of a single DI should have V1DI mode and not pretend to be a
>> scalar?  This would make things easier.
>
>
> Thanks for the quick reply, Ilya.
>
> What's the difference between, as you say, a "simple integer operation" and
> a "vector" operation of just one element?

The difference is at least in how this operation is expanded.  You
would use different optabs for scalar and vector cases. Also note that
default_get_mask_mode uses BLKmode for scalar modes not just because
of a single element vector.  mode_for_vector may return DImode for
V2SI and V4HI vectors in case target doesn't define such vector modes.
To distinguish true scalar masks I avoid scalar mode usage for
non-scalar masks.  One element vector might be an exception here.  You
may try to define TARGET_VECTORIZE_GET_MASK_MODE for your target and
keep scalar mode when you want it.

>
> This is why we do *not* have V1DImode in the AArch64 (or ARM) backends, but
> instead treat 64x1 vectors as DImode - the operations are the same; so
> keeping them as the same mode, enables CSE and lots of other optimizations,
> plus we don't have to have two near-identical copies (DI + V1DI) for many
> patterns, etc...

Well, you don't have to keep V1DI mode after expand.  You may also
simply not add any new patterns for vector optabs and therefore get
these vector operations lowered into 'true' scalar operations with
both scalar type and mode.

>
> If the operations were on a "DI scalar mask", when would the first part of
> that test, VECTOR_BOOLEAN_TYPE_P, hold?

Any vector comparison produces a value of a boolean vector type.  Even
if these vectors and mask have a scalar mode.

Thanks,
Ilya

>
> Thanks, Alan
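
The behaviour Ilya describes for the default hook can be sketched as a small
filter.  This is a toy model in standalone C, not the GCC implementation: the
toy_* names are hypothetical, and candidate stands for whatever a
mode_for_vector-like lookup returned for the mask.

```c
#include <stdbool.h>

enum toy_mode { TOY_BLKmode, TOY_DImode, TOY_V2SImode, TOY_V4HImode };

/* Toy predicate: which of the modes above are vector modes.  */
static bool
toy_vector_mode_p (enum toy_mode mode)
{
  return mode == TOY_V2SImode || mode == TOY_V4HImode;
}

/* Toy version of the filtering step: a scalar candidate (for example
   DImode standing in for V2SI on a target without that vector mode) is
   replaced by BLKmode so it cannot be confused with a true scalar mask.  */
enum toy_mode
toy_default_mask_mode (enum toy_mode candidate)
{
  return toy_vector_mode_p (candidate) ? candidate : TOY_BLKmode;
}
```

A target that wants to keep a scalar mode for single-element vectors would,
as suggested above, provide its own TARGET_VECTORIZE_GET_MASK_MODE instead of
relying on this default.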

Re: [PATCH][ARM] PR target/69875 Fix atomic_loaddi expansion

2016-02-24 Thread Ramana Radhakrishnan



On 19/02/16 15:24, Kyrill Tkachov wrote:
> Hi all,
> 
> The atomic_loaddi expander on arm has some issues and can benefit from a
> rewrite to properly perform double-word atomic loads on various
> architecture levels.
> 
> Consider the code:
> --
> #include <stdatomic.h>
> 
> atomic_ullong foo;
> 
> int glob;
> 
> int main(void) {
> atomic_load_explicit(&foo, memory_order_acquire);
> return glob;
> }
> -
> 
> Compiled with -O2 -march=armv7-a -std=c11 this gives:
>     movw    r3, #:lower16:glob
>     movt    r3, #:upper16:glob
>     dmb     ish
>     movw    r2, #:lower16:foo
>     movt    r2, #:upper16:foo
>     ldrexd  r0, r1, [r2]
>     ldr     r0, [r3]
>     bx      lr
> 
> For the acquire memory model the barrier should be after the ldrexd, not
> before.
> The same code is generated when compiled with -march=armv7ve. However, we can
> get away with a single LDRD on such systems. Issue C.c of the ARM
> Architecture Reference Manual for ARMv7-A and ARMv7-R states in chapter
> A3.5.3:
> "In an implementation that includes the Large Physical Address Extension,
> LDRD and STRD accesses to 64-bit aligned locations are 64-bit single-copy
> atomic".
> We still need the barrier after the LDRD to enforce the acquire ordering
> semantics.
> 
> For ARMv8-A we can do even better and use the load double-word acquire
> instruction: LDAEXD, with no need for a barrier afterwards.
> 
> I've discussed the required sequences with some kernel folk and had a read 
> through:
> https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
> and this is the patch I've come up with.
> 
> This patch handles all three of the above cases by rewriting the
> atomic_loaddi expander.
> With this patch for the above code with -march=armv7-a we would now generate:
>     movw    r3, #:lower16:foo
>     movt    r3, #:upper16:foo
>     ldrexd  r0, r1, [r3]
>     movw    r3, #:lower16:glob
>     movt    r3, #:upper16:glob
>     dmb     ish
>     ldr     r0, [r3]
>     bx      lr
> 
> For -march=armv7ve:
>     movw    r3, #:lower16:foo
>     movt    r3, #:upper16:foo
>     ldrd    r2, r3, [r3]
>     movw    r3, #:lower16:glob
>     movt    r3, #:upper16:glob
>     dmb     ish
>     ldr     r0, [r3]
>     bx      lr
> 
> and for -march=armv8-a:
>     movw    r3, #:lower16:foo
>     movt    r3, #:upper16:foo
>     ldaexd  r2, r3, [r3]
>     movw    r3, #:lower16:glob
>     movt    r3, #:upper16:glob
>     ldr     r0, [r3]
>     bx      lr
> 
> For the relaxed memory model the armv7ve and armv8-a sequences can be
> relaxed to a single LDRD instruction, without any barriers.
> 
> Bootstrapped and tested on arm-none-linux-gnueabihf.
> 
> Ok for trunk?


This is OK and needed for the release branches. Thanks.

regards
Ramana
> 
> Thanks,
> Kyrill
> 
> P.S. The backport to the previous branches will look a bit different because
> the ARM_FSET_HAS_CPU1 machinery in arm.h was introduced for GCC 6. I'll
> prepare a backport separately if this is accepted.
> 
> 2016-02-19  Kyrylo Tkachov  
> 
> PR target/69875
> * config/arm/arm.h (TARGET_HAVE_LPAE): Define.
> * config/arm/unspecs.md (VUNSPEC_LDRD_ATOMIC): New value.
> * config/arm/sync.md (arm_atomic_loaddi2_ldrd): New pattern.
> (atomic_loaddi_1): Delete.
> (atomic_loaddi): Rewrite expander using the above changes.
> 
> 2016-02-19  Kyrylo Tkachov  
> 
> PR target/69875
> * gcc.target/arm/atomic_loaddi_acquire.x: New file.
> * gcc.target/arm/atomic_loaddi_relaxed.x: Likewise.
> * gcc.target/arm/atomic_loaddi_seq_cst.x: Likewise.
> * gcc.target/arm/atomic_loaddi_1.c: New test.
> * gcc.target/arm/atomic_loaddi_2.c: Likewise.
> * gcc.target/arm/atomic_loaddi_3.c: Likewise.
> * gcc.target/arm/atomic_loaddi_4.c: Likewise.
> * gcc.target/arm/atomic_loaddi_5.c: Likewise.
> * gcc.target/arm/atomic_loaddi_6.c: Likewise.
> * gcc.target/arm/atomic_loaddi_7.c: Likewise.
> * gcc.target/arm/atomic_loaddi_8.c: Likewise.
> * gcc.target/arm/atomic_loaddi_9.c: Likewise.
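
The sequences described in the quoted mail can be summarised as a small
selection table.  The sketch below is plain C and not the arm backend: all
toy_* names are hypothetical, only the acquire and relaxed cases stated in
the mail are modelled, and anything the mail does not spell out is reported
as TOY_NONE.

```c
#include <stdbool.h>

enum toy_arch { TOY_ARMV7_A, TOY_ARMV7VE, TOY_ARMV8_A };
enum toy_model { TOY_RELAXED, TOY_ACQUIRE };
enum toy_load { TOY_NONE, TOY_LDREXD, TOY_LDRD, TOY_LDAEXD };

struct toy_seq
{
  enum toy_load load;	/* Instruction performing the 64-bit load.  */
  bool dmb_after;	/* Whether a "dmb ish" follows the load.  */
};

/* Toy summary of the sequences listed in the mail.  */
struct toy_seq
toy_atomic_loaddi (enum toy_arch arch, enum toy_model model)
{
  struct toy_seq seq = { TOY_NONE, false };

  if (model == TOY_ACQUIRE)
    switch (arch)
      {
      case TOY_ARMV7_A:		/* ldrexd, then the barrier.  */
	seq.load = TOY_LDREXD;
	seq.dmb_after = true;
	break;
      case TOY_ARMV7VE:		/* LPAE: ldrd, then the barrier.  */
	seq.load = TOY_LDRD;
	seq.dmb_after = true;
	break;
      case TOY_ARMV8_A:		/* Load-acquire, no barrier needed.  */
	seq.load = TOY_LDAEXD;
	break;
      }
  else if (arch == TOY_ARMV7VE || arch == TOY_ARMV8_A)
    seq.load = TOY_LDRD;	/* Relaxed: single LDRD, no barrier.  */

  /* Relaxed loads on plain ARMv7-A are not spelled out in the mail, so
     they stay TOY_NONE here.  */
  return seq;
}
```

The key point from the mail is visible in the table: for acquire ordering the
barrier, when one is needed at all, comes after the load and never before it.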

Re: [PATCH] 69780 - [4.9/5/6 Regression] ICE on __builtin_alloca_with_align, with small alignment

2016-02-24 Thread Dominique d'Humières

The test gcc.dg/builtins-68.c fails on x86_64-apple-darwin15:

FAIL: gcc.dg/builtins-68.c  (test for errors, line 79)
FAIL: gcc.dg/builtins-68.c (test for excess errors)
Excess errors:
/opt/gcc/work/gcc/testsuite/gcc.dg/builtins-68.c:107:40: warning: large integer implicitly truncated to unsigned type [-Woverflow]
/opt/gcc/work/gcc/testsuite/gcc.dg/builtins-68.c:11:19: warning: large integer implicitly truncated to unsigned type [-Woverflow]
/opt/gcc/work/gcc/testsuite/gcc.dg/builtins-68.c:109:40: warning: large integer implicitly truncated to unsigned type [-Woverflow]

TIA

Dominique

[PATCH][ARM] Add initial support for the Cortex-A32

2016-02-24 Thread Kyrill Tkachov


Hi all,

This patch adds initial support for the Cortex-A32 core.
It is an ARMv8-A core and this patch enables the -mcpu=cortex-a32 and
-mtune=cortex-a32 options.

The initial tunings are set to the same parameters as for Cortex-A35.

Bootstrapped and tested on arm-none-linux-gnueabihf together with a binutils
suitably patched to recognise -mcpu=cortex-a32 and the respective .cpu directive
(https://sourceware.org/ml/binutils/2016-02/msg00345.html)
The build was configured with --with-cpu=cortex-a32 --with-mode=thumb 
--with-fpu=neon-fp-armv8 --with-float=hard

Ok for trunk?

Thanks,
Kyrill

2016-02-24  Kyrylo Tkachov  

* config/arm/arm-cores.def (cortex-a32): New entry.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add mcpu=cortex-a32.
* config/arm/t-aprofile: Handle mcpu=cortex-a32.
* doc/invoke.texi (ARM Options): Document cortex-a32 as value
for -mcpu and -mtune.
diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 6538861898689e64a3554f709c5a3355cffad187..b61b7f82b68d3b1f42ee5e22b537fb69392ce337 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -165,6 +165,7 @@ ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, cortexa7,	7A,	ARM_FSET_MAKE_
 ARM_CORE("cortex-a17.cortex-a7", cortexa17cortexa7, cortexa7,	7A,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV | FL_FOR_ARCH7A), cortex_a12)
 
 /* V8 Architecture Processors */
+ARM_CORE("cortex-a32",	cortexa32, cortexa53,	8A,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a35)
 ARM_CORE("cortex-a35",	cortexa35, cortexa53,	8A,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a35)
 ARM_CORE("cortex-a53",	cortexa53, cortexa53,	8A,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a53)
 ARM_CORE("cortex-a57",	cortexa57, cortexa57,	8A,	ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a57)
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 6d6ee96828146fe076a6a1ee285f6a1d578b6c85..4b7522cb7afd189dc7edda1bb824b3ae509756b4 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -304,6 +304,9 @@ EnumValue
 Enum(processor_type) String(cortex-a17.cortex-a7) Value(cortexa17cortexa7)
 
 EnumValue
+Enum(processor_type) String(cortex-a32) Value(cortexa32)
+
+EnumValue
 Enum(processor_type) String(cortex-a35) Value(cortexa35)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 1c842180cee6afd7a560ef51b63632bb0f83b932..b66344a838e0579ea687dabc4e4b6343f16705ad 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -32,7 +32,8 @@ (define_attr "tune"
 	cortexr4f,cortexr5,cortexr7,
 	cortexm7,cortexm4,cortexm3,
 	marvell_pj4,cortexa15cortexa7,cortexa17cortexa7,
-	cortexa35,cortexa53,cortexa57,
-	cortexa72,exynosm1,qdf24xx,
-	xgene1,cortexa57cortexa53,cortexa72cortexa53"
+	cortexa32,cortexa35,cortexa53,
+	cortexa57,cortexa72,exynosm1,
+	qdf24xx,xgene1,cortexa57cortexa53,
+	cortexa72cortexa53"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
index 82128ef0735bb4b223908b2393a46d97e020156b..5d6c4ed51eac2d136871b52d87faefc2ebaa4a43 100644
--- a/gcc/config/arm/bpabi.h
+++ b/gcc/config/arm/bpabi.h
@@ -68,6 +68,7 @@
|mcpu=cortex-a15.cortex-a7\
|mcpu=cortex-a17.cortex-a7\
|mcpu=marvell-pj4	\
+   |mcpu=cortex-a32	\
|mcpu=cortex-a35	\
|mcpu=cortex-a53	\
|mcpu=cortex-a57	\
diff --git a/gcc/config/arm/t-aprofile b/gcc/config/arm/t-aprofile
index 609570643cab23ff699d48a0ea0ee3f991b71c85..b0ecc2fe45da581b6f1cf1a3e1aea7d428c0e533 100644
--- a/gcc/config/arm/t-aprofile
+++ b/gcc/config/arm/t-aprofile
@@ -86,6 +86,7 @@ MULTILIB_MATCHES   += march?armv7ve=mcpu?cortex-a12
 MULTILIB_MATCHES   += march?armv7ve=mcpu?cortex-a17
 MULTILIB_MATCHES   += march?armv7ve=mcpu?cortex-a15.cortex-a7
 MULTILIB_MATCHES   += march?armv7ve=mcpu?cortex-a17.cortex-a7
+MULTILIB_MATCHES   += march?armv8-a=mcpu?cortex-a32
 MULTILIB_MATCHES   += march?armv8-a=mcpu?cortex-a35
 MULTILIB_MATCHES   += march?armv8-a=mcpu?cortex-a53
 MULTILIB_MATCHES   += march?armv8-a=mcpu?cortex-a57
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0a2a6f45d7cf916a84dc48b6885cf04d43b12d8a..e6b52b4a4b59cbda59b3a8fc25e9bab2f25934c5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13874,8 +13874,8 @@ Permissible names are: @samp{arm2}, @samp{arm250},
 @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, @samp{arm1176jzf-s},
 @samp{generic-armv7-a}, @samp{cortex-a5}, @samp{cortex-a7}, @samp{cortex-a8},
 @samp{cortex-a9}, @samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a17},
-@samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a57}, @samp{cortex-a72},
-@samp{cortex-r4},
+@samp{cortex-a32}, @samp{cortex-a35}, @samp{cortex-

Re: [PATCH][ARM] Add initial support for the Cortex-A32

2016-02-24 Thread Richard Earnshaw (lists)

On 24/02/16 10:49, Kyrill Tkachov wrote:
> Hi all,
> 
> This patch adds initial support for the Cortex-A32 core.
> It is an ARMv8-A core and this patch enables the -mcpu=cortex-a32 and
> -mtune=cortex-a32 options.
> 
> The initial tunings are set to the same parameters as for Cortex-A35.
> 
> Bootstrapped and tested on arm-none-linux-gnueabihf together with a
> binutils suitably patched to recognise -mcpu=cortex-a32 and the
> respective .cpu directive
> (https://sourceware.org/ml/binutils/2016-02/msg00345.html).
> The build was configured with --with-cpu=cortex-a32 --with-mode=thumb
> --with-fpu=neon-fp-armv8 --with-float=hard
> 
> Ok for trunk?
> 
> Thanks,
> Kyrill
> 
> 2016-02-24  Kyrylo Tkachov  
> 
> * config/arm/arm-cores.def (cortex-a32): New entry.
> * config/arm/arm-tables.opt: Regenerate.
> * config/arm/arm-tune.md: Regenerate.
> * config/arm/bpabi.h (BE8_LINK_SPEC): Add mcpu=cortex-a32.
> * config/arm/t-aprofile: Handle mcpu=cortex-a32.
> * doc/invoke.texi (ARM Options): Document cortex-a32 as value
> for -mcpu and -mtune.
> 

OK.

R.

> cortex-a32-gcc.patch
> 
> 
> diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
> index 
> 6538861898689e64a3554f709c5a3355cffad187..b61b7f82b68d3b1f42ee5e22b537fb69392ce337
>  100644
> --- a/gcc/config/arm/arm-cores.def
> +++ b/gcc/config/arm/arm-cores.def
> @@ -165,6 +165,7 @@ ARM_CORE("cortex-a15.cortex-a7", cortexa15cortexa7, 
> cortexa7, 7A, ARM_FSET_MAKE_
>  ARM_CORE("cortex-a17.cortex-a7", cortexa17cortexa7, cortexa7,7A, 
> ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_THUMB_DIV | FL_ARM_DIV | FL_FOR_ARCH7A), 
> cortex_a12)
>  
>  /* V8 Architecture Processors */
> +ARM_CORE("cortex-a32",   cortexa32, cortexa53,   8A, 
> ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a35)
>  ARM_CORE("cortex-a35",   cortexa35, cortexa53,   8A, 
> ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a35)
>  ARM_CORE("cortex-a53",   cortexa53, cortexa53,   8A, 
> ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a53)
>  ARM_CORE("cortex-a57",   cortexa57, cortexa57,   8A, 
> ARM_FSET_MAKE_CPU1 (FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a57)
> diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
> index 
> 6d6ee96828146fe076a6a1ee285f6a1d578b6c85..4b7522cb7afd189dc7edda1bb824b3ae509756b4
>  100644
> --- a/gcc/config/arm/arm-tables.opt
> +++ b/gcc/config/arm/arm-tables.opt
> @@ -304,6 +304,9 @@ EnumValue
>  Enum(processor_type) String(cortex-a17.cortex-a7) Value(cortexa17cortexa7)
>  
>  EnumValue
> +Enum(processor_type) String(cortex-a32) Value(cortexa32)
> +
> +EnumValue
>  Enum(processor_type) String(cortex-a35) Value(cortexa35)
>  
>  EnumValue
> diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
> index 
> 1c842180cee6afd7a560ef51b63632bb0f83b932..b66344a838e0579ea687dabc4e4b6343f16705ad
>  100644
> --- a/gcc/config/arm/arm-tune.md
> +++ b/gcc/config/arm/arm-tune.md
> @@ -32,7 +32,8 @@ (define_attr "tune"
>   cortexr4f,cortexr5,cortexr7,
>   cortexm7,cortexm4,cortexm3,
>   marvell_pj4,cortexa15cortexa7,cortexa17cortexa7,
> - cortexa35,cortexa53,cortexa57,
> - cortexa72,exynosm1,qdf24xx,
> - xgene1,cortexa57cortexa53,cortexa72cortexa53"
> + cortexa32,cortexa35,cortexa53,
> + cortexa57,cortexa72,exynosm1,
> + qdf24xx,xgene1,cortexa57cortexa53,
> + cortexa72cortexa53"
>   (const (symbol_ref "((enum attr_tune) arm_tune)")))
> diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
> index 
> 82128ef0735bb4b223908b2393a46d97e020156b..5d6c4ed51eac2d136871b52d87faefc2ebaa4a43
>  100644
> --- a/gcc/config/arm/bpabi.h
> +++ b/gcc/config/arm/bpabi.h
> @@ -68,6 +68,7 @@
> |mcpu=cortex-a15.cortex-a7\
> |mcpu=cortex-a17.cortex-a7\
> |mcpu=marvell-pj4 \
> +   |mcpu=cortex-a32  \
> |mcpu=cortex-a35  \
> |mcpu=cortex-a53  \
> |mcpu=cortex-a57  \
> diff --git a/gcc/config/arm/t-aprofile b/gcc/config/arm/t-aprofile
> index 
> 609570643cab23ff699d48a0ea0ee3f991b71c85..b0ecc2fe45da581b6f1cf1a3e1aea7d428c0e533
>  100644
> --- a/gcc/config/arm/t-aprofile
> +++ b/gcc/config/arm/t-aprofile
> @@ -86,6 +86,7 @@ MULTILIB_MATCHES   += march?armv7ve=mcpu?cortex-a12
>  MULTILIB_MATCHES   += march?armv7ve=mcpu?cortex-a17
>  MULTILIB_MATCHES   += march?armv7ve=mcpu?cortex-a15.cortex-a7
>  MULTILIB_MATCHES   += march?armv7ve=mcpu?cortex-a17.cortex-a7
> +MULTILIB_MATCHES   += march?armv8-a=mcpu?cortex-a32
>  MULTILIB_MATCHES   += march?armv8-a=mcpu?cortex-a35
>  MULTILIB_MATCHES   += march?armv8-a=mcpu?cortex-a53
>  MULTILIB_MATCHES   += march?armv8-a=mcpu?cortex-a57
> diff --git a/gcc/doc/invoke

[PATCH][ARM][5 Backport] PR target/69875 Fix atomic_loaddi expansion

2016-02-24 Thread Kyrill Tkachov


Hi all,

This is the GCC 5 backport of
https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01338.html.
The differences are that TARGET_HAVE_LPAE has to be defined in arm.h in a
different way because the ARM_FSET_HAS_CPU1 mechanism doesn't exist on
this branch.
Also, the scan-assembler tests that check for the DMB instruction are
updated to check for "dmb sy" rather than "dmb ish", because the memory
barrier instruction changed on trunk for GCC 6.

Bootstrapped and tested on the GCC 5 branch on arm-none-linux-gnueabihf.

Ok for the branch after the trunk patch has had a few days to bake?

Thanks,
Kyrill
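
For readers skimming the sync.md hunk further down, the selection logic of the rewritten atomic_loaddi expander can be modelled roughly as follows. This is only a sketch: the struct, enum and function names are hypothetical stand-ins for the TARGET_HAVE_* macros and the emitted instructions, and it follows the comments in the patch rather than reproducing the expander itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the target features the expander tests.  */
struct target_features
{
  bool have_ldrexd;   /* stands in for TARGET_HAVE_LDREXD */
  bool have_lpae;     /* stands in for TARGET_HAVE_LPAE */
  bool have_ldacq;    /* stands in for TARGET_HAVE_LDACQ (ARMv8-A) */
};

enum load_insn { INSN_LDRD, INSN_LDREXD, INSN_LDAEXD };

struct expansion
{
  enum load_insn insn;   /* instruction used for the 64-bit load */
  bool barrier_after;    /* whether a barrier follows the load */
};

/* Choose how to expand a 64-bit atomic load.  RELAXED is true for the
   relaxed memory model and false for every stronger model.  */
struct expansion
choose_atomic_loaddi (struct target_features t, bool relaxed)
{
  struct expansion e;

  /* The expander is only enabled when one of the three features exists.  */
  assert (t.have_ldrexd || t.have_lpae || t.have_ldacq);

  if (t.have_ldacq)
    {
      /* ARMv8-A: the acquire load carries the ordering itself, and a
	 relaxed load needs no ordering, so no barrier either way.  */
      e.insn = relaxed ? INSN_LDRD : INSN_LDAEXD;
      e.barrier_after = false;
      return e;
    }

  /* Otherwise use LDRD on LPAE targets and LDREXD elsewhere; every
     non-relaxed model needs a barrier after the load.  */
  e.insn = t.have_lpae ? INSN_LDRD : INSN_LDREXD;
  e.barrier_after = !relaxed;
  return e;
}
```

On these branches the barrier in question is the "dmb sy" that the updated scan-assembler tests look for.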

2016-02-24  Kyrylo Tkachov  

PR target/69875
* config/arm/arm.h (TARGET_HAVE_LPAE): Define.
* config/arm/unspecs.md (VUNSPEC_LDRD_ATOMIC): New value.
* config/arm/sync.md (arm_atomic_loaddi2_ldrd): New pattern.
(atomic_loaddi_1): Delete.
(atomic_loaddi): Rewrite expander using the above changes.

2016-02-24  Kyrylo Tkachov  

PR target/69875
* gcc.target/arm/atomic_loaddi_acquire.x: New file.
* gcc.target/arm/atomic_loaddi_relaxed.x: Likewise.
* gcc.target/arm/atomic_loaddi_seq_cst.x: Likewise.
* gcc.target/arm/atomic_loaddi_1.c: New test.
* gcc.target/arm/atomic_loaddi_2.c: Likewise.
* gcc.target/arm/atomic_loaddi_3.c: Likewise.
* gcc.target/arm/atomic_loaddi_4.c: Likewise.
* gcc.target/arm/atomic_loaddi_5.c: Likewise.
* gcc.target/arm/atomic_loaddi_6.c: Likewise.
* gcc.target/arm/atomic_loaddi_7.c: Likewise.
* gcc.target/arm/atomic_loaddi_8.c: Likewise.
* gcc.target/arm/atomic_loaddi_9.c: Likewise.
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 928d18faeff645b598b5a7c678d8c8d6e35393e0..5561e433b2929df311aec3263033c14b366836d3 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -369,6 +369,10 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
 /* Nonzero if this chip supports ldrex and strex */
 #define TARGET_HAVE_LDREX	((arm_arch6 && TARGET_ARM) || arm_arch7)
 
+/* Nonzero if this chip supports LPAE.  */
+#define TARGET_HAVE_LPAE		\
+  (arm_arch7 && ((insn_flags & FL_FOR_ARCH7VE) == FL_FOR_ARCH7VE))
+
 /* Nonzero if this chip supports ldrex{bh} and strex{bh}.  */
 #define TARGET_HAVE_LDREXBH	((arm_arch6k && TARGET_ARM) || arm_arch7)
 
diff --git a/gcc/config/arm/sync.md b/gcc/config/arm/sync.md
index 75dd52ea3aa94a227c62b6f77ee78ebf0eee61d5..acafd0a6ec474466ff7e2c67ae16d9a0dbb9cf5c 100644
--- a/gcc/config/arm/sync.md
+++ b/gcc/config/arm/sync.md
@@ -99,32 +99,62 @@ (define_insn "atomic_store"
   [(set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
 
-;; Note that ldrd and vldr are *not* guaranteed to be single-copy atomic,
-;; even for a 64-bit aligned address.  Instead we use a ldrexd unparied
-;; with a store.
+;; An LDRD instruction usable by the atomic_loaddi expander on LPAE targets
+
+(define_insn "arm_atomic_loaddi2_ldrd"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec_volatile:DI
+	  [(match_operand:DI 1 "arm_sync_memory_operand" "Q")]
+	VUNSPEC_LDRD_ATOMIC))]
+  "ARM_DOUBLEWORD_ALIGN && TARGET_HAVE_LPAE"
+  "ldrd%?\t%0, %H0, %C1"
+  [(set_attr "predicable" "yes")
+   (set_attr "predicable_short_it" "no")])
+
+;; There are three ways to expand this depending on the architecture
+;; features available.  As for the barriers, a load needs a barrier
+;; after it on all non-relaxed memory models except when the load
+;; has acquire semantics (for ARMv8-A).
+
 (define_expand "atomic_loaddi"
   [(match_operand:DI 0 "s_register_operand")		;; val out
(match_operand:DI 1 "mem_noofs_operand")		;; memory
(match_operand:SI 2 "const_int_operand")]		;; model
-  "TARGET_HAVE_LDREXD && ARM_DOUBLEWORD_ALIGN"
+  "(TARGET_HAVE_LDREXD || TARGET_HAVE_LPAE || TARGET_HAVE_LDACQ)
+   && ARM_DOUBLEWORD_ALIGN"
 {
-  enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
-  expand_mem_thread_fence (model);
-  emit_insn (gen_atomic_loaddi_1 (operands[0], operands[1]));
-  if (is_mm_seq_cst (model))
+  memmodel model = memmodel_from_int (INTVAL (operands[2]));
+
+  /* For ARMv8-A we can use an LDAEXD to atomically load two 32-bit registers
+ when acquire or stronger semantics are needed.  When the relaxed model is
+ used this can be relaxed to a normal LDRD.  */
+  if (TARGET_HAVE_LDACQ)
+{
+  if (is_mm_relaxed (model))
+	emit_insn (gen_arm_atomic_loaddi2_ldrd (operands[0], operands[1]));
+  else
+	emit_insn (gen_arm_load_acquire_exclusivedi (operands[0], operands[1]));
+
+  DONE;
+}
+
+  /* On LPAE targets LDRD and STRD accesses to 64-bit aligned
+ locations are 64-bit single-copy atomic.  We still need barriers in the
+ appropriate places to implement the ordering constraints.  */
+  if (TARGET_HAVE_LPAE)
+emit_insn (gen_arm_atomic_loaddi2_ldrd (operands[0], operands[1]));
+  else
+emit_insn (gen_arm_load_exclusivedi (operands[0], operands[1]));
+
+
+  /* All non-relaxed models need a barrier after the l

[PATCH][ARM][4.9 Backport] PR target/69875 Fix atomic_loaddi expansion

2016-02-24 Thread Kyrill Tkachov


Hi all,

This is the GCC 4.9 backport of
https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01338.html.
The differences are that TARGET_HAVE_LPAE has to be defined in arm.h in a
different way because the ARM_FSET_HAS_CPU1 mechanism doesn't exist on
this branch.  Also, due to the location of insn_flags and the various
FL_* flags (on the 4.9 branch they're defined locally in arm.c rather
than in arm-protos.h), I chose to define TARGET_HAVE_LPAE in terms of
hardware divide instruction availability.  This should be an equivalent
definition.

Also, the scan-assembler tests that check for the DMB instruction are
updated to check for "dmb sy" rather than "dmb ish", because the memory
barrier instruction changed on trunk for GCC 6.

Bootstrapped and tested on the GCC 4.9 branch on arm-none-linux-gnueabihf.

Ok for the branch after the trunk patch has had a few days to bake?

Thanks,
Kyrill

2016-02-24  Kyrylo Tkachov  

PR target/69875
* config/arm/arm.h (TARGET_HAVE_LPAE): Define.
* config/arm/unspecs.md (VUNSPEC_LDRD_ATOMIC): New value.
* config/arm/sync.md (arm_atomic_loaddi2_ldrd): New pattern.
(atomic_loaddi_1): Delete.
(atomic_loaddi): Rewrite expander using the above changes.

2016-02-24  Kyrylo Tkachov  

PR target/69875
* gcc.target/arm/atomic_loaddi_acquire.x: New file.
* gcc.target/arm/atomic_loaddi_relaxed.x: Likewise.
* gcc.target/arm/atomic_loaddi_seq_cst.x: Likewise.
* gcc.target/arm/atomic_loaddi_1.c: New test.
* gcc.target/arm/atomic_loaddi_2.c: Likewise.
* gcc.target/arm/atomic_loaddi_3.c: Likewise.
* gcc.target/arm/atomic_loaddi_4.c: Likewise.
* gcc.target/arm/atomic_loaddi_5.c: Likewise.
* gcc.target/arm/atomic_loaddi_6.c: Likewise.
* gcc.target/arm/atomic_loaddi_7.c: Likewise.
* gcc.target/arm/atomic_loaddi_8.c: Likewise.
* gcc.target/arm/atomic_loaddi_9.c: Likewise.
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 7c9b1b91f0926b1530bd61bb187772962b2a227d..ad97c01312cdb215768644413dd1a4061f8bd046 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -363,6 +363,11 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
 /* Nonzero if this chip supports ldrex and strex */
 #define TARGET_HAVE_LDREX	((arm_arch6 && TARGET_ARM) || arm_arch7)
 
+/* Nonzero if this chip supports LPAE.  Such systems also support the
+   hardware divide instructions.  */
+#define TARGET_HAVE_LPAE		\
+  (arm_arch7 && arm_arch_arm_hwdiv && arm_arch_thumb_hwdiv)
+
 /* Nonzero if this chip supports ldrex{bh} and strex{bh}.  */
 #define TARGET_HAVE_LDREXBH	((arm_arch6k && TARGET_ARM) || arm_arch7)
 
diff --git a/gcc/config/arm/sync.md b/gcc/config/arm/sync.md
index 25ed926fc4e71db499d7b7fe40dc54a02182d6cb..5218ea3b8cbfee1f2fb04dd8720a1810317976af 100644
--- a/gcc/config/arm/sync.md
+++ b/gcc/config/arm/sync.md
@@ -103,31 +103,61 @@ (define_insn "atomic_store"
   [(set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")])
 
-;; Note that ldrd and vldr are *not* guaranteed to be single-copy atomic,
-;; even for a 64-bit aligned address.  Instead we use a ldrexd unparied
-;; with a store.
+;; An LDRD instruction usable by the atomic_loaddi expander on LPAE targets
+
+(define_insn "arm_atomic_loaddi2_ldrd"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (unspec_volatile:DI
+	[(match_operand:DI 1 "arm_sync_memory_operand" "Q")]
+	  VUNSPEC_LDRD_ATOMIC))]
+  "ARM_DOUBLEWORD_ALIGN && TARGET_HAVE_LPAE"
+  "ldrd%?\t%0, %H0, %C1"
+  [(set_attr "predicable" "yes")
+   (set_attr "predicable_short_it" "no")])
+
+;; There are three ways to expand this depending on the architecture
+;; features available.  As for the barriers, a load needs a barrier
+;; after it on all non-relaxed memory models except when the load
+;; has acquire semantics (for ARMv8-A).
+
 (define_expand "atomic_loaddi"
   [(match_operand:DI 0 "s_register_operand")		;; val out
(match_operand:DI 1 "mem_noofs_operand")		;; memory
(match_operand:SI 2 "const_int_operand")]		;; model
-  "TARGET_HAVE_LDREXD && ARM_DOUBLEWORD_ALIGN"
+  "(TARGET_HAVE_LDREXD || TARGET_HAVE_LPAE || TARGET_HAVE_LDACQ)
+   && ARM_DOUBLEWORD_ALIGN"
 {
   enum memmodel model = (enum memmodel) INTVAL (operands[2]);
-  expand_mem_thread_fence (model);
-  emit_insn (gen_atomic_loaddi_1 (operands[0], operands[1]));
-  if (model == MEMMODEL_SEQ_CST)
-expand_mem_thread_fence (model);
-  DONE;
-})
 
-(define_insn "atomic_loaddi_1"
-  [(set (match_operand:DI 0 "s_register_operand" "=r")
-	(unspec:DI [(match_operand:DI 1 "mem_noofs_operand" "Ua")]
-		   UNSPEC_LL))]
-  "TARGET_HAVE_LDREXD && ARM_DOUBLEWORD_ALIGN"
-  "ldrexd%?\t%0, %H0, %C1"
-  [(set_attr "predicable" "yes")
-   (set_attr "predicable_short_it" "no")])
+  /* For ARMv8-A we can use an LDAEXD to atomically load two 32-bit registers
+ when acquire or stronger semantics are needed.  When the relaxed model is
+ used this can be relaxed to a normal LDRD.  */
+  if (TARGET_HAVE_LDACQ)
+{
+

Re: [PATCH] Fix thinko in build_vector_from_ctor (PR middle-end/69915)

2016-02-24 Thread Richard Biener

On Tue, Feb 23, 2016 at 9:06 PM, Jakub Jelinek  wrote:
> Hi!
>
> This function has changed last year to support embedded VECTOR_CSTs in the
> ctor elements.  Before that change, there was no pos var and idx used to
> match exactly the indices in the new vector, but if there is any VECTOR_CST,
> it will fill in more positions.
> Unfortunately, the final loop which zeros in any positions not filled in yet
> has not changed, which is wrong for the case when there were any
> VECTOR_CSTs.  E.g. on the testcase, we have a V16HImode type ctor which
> contains two V8HImode VECTOR_CSTs (full of zeros).  Each of them fills in
> 8 positions, so the final loop shouldn't add anything, but as idx at that
> point is 2, it will add further 14 elements, resulting in alloca
> buffer overflow.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

Ok.

Thanks,
Richard.

> 2016-02-23  Jakub Jelinek  
>
> PR middle-end/69915
> * tree.c (build_vector_from_ctor): Fix handling of VECTOR_CST
> elements.
>
> * gcc.dg/pr69915.c: New test.
>
> --- gcc/tree.c.jj   2016-02-08 18:39:17.0 +0100
> +++ gcc/tree.c  2016-02-23 15:50:03.566700694 +0100
> @@ -1749,7 +1749,7 @@ build_vector_from_ctor (tree type, vecelse
> vec[pos++] = value;
>  }
> -  for (; idx < TYPE_VECTOR_SUBPARTS (type); ++idx)
> +  while (pos < TYPE_VECTOR_SUBPARTS (type))
>  vec[pos++] = build_zero_cst (TREE_TYPE (type));
>
>return build_vector (type, vec);
> --- gcc/testsuite/gcc.dg/pr69915.c.jj   2016-02-23 16:02:09.825732486 +0100
> +++ gcc/testsuite/gcc.dg/pr69915.c  2016-02-23 16:01:47.0 +0100
> @@ -0,0 +1,15 @@
> +/* PR middle-end/69915 */
> +/* { dg-do compile } */
> +/* { dg-options "-O -ftracer" } */
> +
> +typedef unsigned short V __attribute__ ((vector_size (32)));
> +
> +unsigned
> +foo (unsigned x, unsigned c, V *p)
> +{
> +  V v = *p;
> +  if (c < 360)
> +v = (V) { 0 };
> +  v *= (V) { x };
> +  return v[1];
> +}
>
> Jakub
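
The index mix-up Jakub describes can be reproduced in a small standalone model. The sketch below is not the tree.c code: struct ctor_elt and fill_vector are hypothetical names, and each embedded vector is simplified to a single repeated value. It only shows why the padding loop has to continue from the write position (pos) rather than from the count of constructor elements consumed (idx).

```c
#include <assert.h>
#include <stddef.h>

/* A constructor element supplies either one lane (a scalar) or several
   lanes at once (standing in for an embedded VECTOR_CST).  */
struct ctor_elt
{
  size_t width;   /* number of lanes this element provides */
  int value;      /* value copied into each of those lanes */
};

/* Fill OUT, which has room for NUNITS lanes, from ELTS and zero-pad the
   tail.  Returns the number of lanes written.  */
size_t
fill_vector (int *out, size_t nunits, const struct ctor_elt *elts,
	     size_t nelts)
{
  size_t pos = 0;

  for (size_t idx = 0; idx < nelts; ++idx)
    for (size_t k = 0; k < elts[idx].width; ++k)
      {
	assert (pos < nunits);
	out[pos++] = elts[idx].value;
      }

  /* Pad using POS, the number of lanes already written.  Continuing
     from IDX instead would append too many zeros whenever an element
     was wider than one lane, which is the overflow in the PR.  */
  while (pos < nunits)
    out[pos++] = 0;

  return pos;
}
```

With two 8-lane elements and a 16-lane vector, as in the testcase, the padding loop runs zero times, whereas a loop starting from idx == 2 would have added 14 more elements.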

Re: better debug info for C++ cdtors, aliases, thunks and other trampolines

2016-02-24 Thread Richard Biener

On Wed, Feb 24, 2016 at 5:07 AM, Alexandre Oliva  wrote:
> Hi, Richard,
>
> Thanks for the feedback.  I'm afraid I can't quite figure out what
> you're getting at.  Please see below.
>
> On Feb 22, 2016, Richard Biener  wrote:
>
>> I think this breaks early-debug assumptions in creating new decl DIEs
>> rather than just annotating old ones.
>
> Can you elaborate on it, or point at where these assumptions you allude
> to are documented?  I'm afraid I can't even tell whether the problem you
> allude to has to do with users of the debug hooks interface or the
> dwarf2out implementation thereof.
>
> Sure enough, we haven't created DIEs for user-introduced or C++ cdtor
> aliases before, so there are no DIEs for us to annotate, and there are
> no other uses of the debug hooks interface related with them, that could
> possibly interfere with them.
>
> Conversely, alias decls for which we *have* created DIEs are ones that
> cgraph turned into aliases; we do NOT want to pretend they're the same
> function, and ideally we'd emit separate debug information for them, but
> we can't retroactively figure out blocks, line numbers, variable
> locations and whatnot for the unrelated function that happened to
> optimize to the same executable code.  The best we can do for such
> aliases ATM is to leave them alone; that's unchanged.
>
>> So assemble_aliases is the wrong spot to do this.
>
> Here, you seem to be talking about *users* of the debug hooks interface.
> But then, I'd argue that the fact that debug info for aliases in
> dwarf2out is implemented as DIEs is an internal implementation detail,
> so why should assumptions made by the user side of the interface matter?

early-debug means that when dwarf2out_early_finish runs we have created
all type and decl DIEs.  From thereon only things like location expressions
should be added to existing DIEs.

So I was chiming in to see whether the added DIEs may cause any
issue when going forward with dropping frontends after dwarf2out_early_finish.
The function_decl call in expand_thunk looks dangerous; the new hooks
are self-contained enough that those late DIEs should (hopefully) cause
no issues (they can be seen as annotating an existing DIE).

Richard.

> --
> Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

Re: [PATCH][gcse] PR rtl-optimization/69886: Check target mode in can_assign_to_reg_without_clobbers_p

2016-02-24 Thread Richard Biener

On Wed, Feb 24, 2016 at 10:34 AM, Kyrill Tkachov
 wrote:
> Hi all,
>
> In this PR we get an ICE when the hoist pass ends up creating an
> (insn 88 0 0 (set (reg:OI 136)
> (const_int 0 [0])) -1
>  (nil))
>
> instruction. AArch64 doesn't support such an OImode set.
> The only OImode set operations that aarch64 supports are
> load/store-multiple operations on vector registers.
>
> want_to_gcse_p should have rejected this move long before
> process_insert_insn tried to insert it in the stream.  But it didn't
> because can_assign_to_reg_without_clobbers_p is only given the
> (const_int 0) expression and asked whether there can be a valid SET
> operation on that.  It should also consider the mode that such an
> operation is requested in, rather than extracting the mode from the
> operand (VOIDmode for CONST_INTs).  Luckily, want_to_gcse_p already has
> a mode argument that it uses in its costs calculation, so we can just
> pass it down.
>
> This patch extends can_assign_to_reg_without_clobbers_p to take a mode
> argument and use it when testing the validity of the SET instructions
> that it creates, so such an OImode move is properly rejected.
>
> Bootstrapped and tested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-unknown-linux-gnu.
> There are no codegen differences on SPEC2006 for aarch64 resulting from
> this patch.
>
> This bug appears in all versions that have aarch64, so it's not a
> regression, but I think it's a fairly low risk patch.
>
> Is this ok for trunk now or when stage 1 reopens?

Ok for trunk.

Thanks,
Richard.

> Thanks,
> Kyrill
>
> 2016-02-24  Kyrylo Tkachov  
>
> PR rtl-optimization/69886
> * gcse.c (can_assign_to_reg_without_clobbers_p): Accept mode
> argument.  Use it when checking validity of set instructions.
> (want_to_gcse_p): Pass mode to can_assign_to_reg_without_clobbers_p.
> (compute_ld_motion_mems): Update can_assign_to_reg_without_clobbers_p
> callsite.
> * rtl.h (can_assign_to_reg_without_clobbers_p): Update prototype.
> * store-motion.c (find_moveable_store): Update
> can_assign_to_reg_without_clobbers_p callsite.
>
> 2016-02-24  Kyrylo Tkachov  
>
> PR rtl-optimization/69886
> * gcc.dg/torture/pr69886.c: New test.
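
The gist of the fix, taking the mode from the caller instead of from the operand, can be illustrated with a toy model. Everything below is an assumption made for illustration: the enum, the struct and the three functions are hypothetical and far simpler than the real predicates in gcse.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy machine modes; MODE_VOID plays the role of VOIDmode.  */
enum mode { MODE_VOID, MODE_SI, MODE_DI, MODE_OI };

/* Toy operand: a constant integer carries no mode of its own.  */
struct operand
{
  bool is_const_int;
  enum mode mode;   /* ignored when is_const_int is true */
};

/* Hypothetical target query: which modes accept a plain register SET.
   OImode is rejected here, like the OImode set in the PR.  */
bool
target_supports_set (enum mode m)
{
  return m == MODE_SI || m == MODE_DI;
}

/* Old-style check: the mode is extracted from the operand, so a
   constant is examined in MODE_VOID and the mode actually requested is
   never consulted.  */
bool
can_assign_from_operand_mode (struct operand x)
{
  enum mode m = x.is_const_int ? MODE_VOID : x.mode;
  return m == MODE_VOID || target_supports_set (m);
}

/* New-style check: the caller passes the mode the SET is requested in.  */
bool
can_assign_in_mode (enum mode requested, struct operand x)
{
  assert (requested != MODE_VOID);
  /* A non-constant operand must already have the requested mode.  */
  assert (x.is_const_int || x.mode == requested);
  return target_supports_set (requested);
}
```

In this model a constant zero passes the old check regardless of the destination, while the new check rejects it for MODE_OI and still accepts it for MODE_SI.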

Re: [PR69315] enable finish_function to recurse for constexpr functions

2016-02-24 Thread Alexandre Oliva

On Feb 10, 2016, Alexandre Oliva  wrote:

> On Jan 26, 2016, Alexandre Oliva  wrote:
>> We don't want finish_function to be called recursively from mark_used.
>> However, it's desirable and necessary to call itself recursively when
>> performing delayed folding, because that may have to instantiate and
>> evaluate constexpr template functions.

>> So, arrange for finish_function to accept being called recursively
>> during delayed folding, save and restore the controlling variables,
>> and process the deferred mark_used calls only when the outermost call
>> completes.

>> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

> Ping?

> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg02010.html

Ping?

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
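
The scheme described in the quoted proposal (allow nested calls, save and restore the controlling state, and flush deferred work only at the outermost level) follows a common reentrancy pattern. A minimal sketch of that pattern is shown below; all names are hypothetical and it does not reflect the actual finish_function or mark_used code in the C++ front end.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 16

int current_function = -1;        /* controlling state, -1 means none */
int finish_depth;                 /* nesting level of finish calls */
int deferred[MAX_DEFERRED];       /* requests queued while finishing */
size_t n_deferred;
int processed[MAX_DEFERRED];      /* requests handled, in order */
size_t n_processed;

/* Queue a request instead of acting on it immediately.  */
void
defer_mark_used (int fn)
{
  assert (n_deferred < MAX_DEFERRED);
  deferred[n_deferred++] = fn;
}

/* Finish function FN.  NESTED_FN >= 0 models delayed folding that has
   to finish another function first.  */
void
finish_function_model (int fn, int nested_fn)
{
  int saved = current_function;   /* save the controlling state */
  current_function = fn;
  finish_depth++;

  if (nested_fn >= 0)
    finish_function_model (nested_fn, -1);

  defer_mark_used (fn);

  finish_depth--;
  current_function = saved;       /* restore it for the caller */

  /* Only the outermost call drains the queue.  */
  if (finish_depth == 0)
    {
      for (size_t i = 0; i < n_deferred; ++i)
	{
	  assert (n_processed < MAX_DEFERRED);
	  processed[n_processed++] = deferred[i];
	}
      n_deferred = 0;
    }
}
```

Calling finish_function_model (1, 2) queues the request for the nested function first and handles both requests once, after the outer call has restored its state.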

[PING][PATCH, PR69607] Mark offload symbols as global in lto

2016-02-24 Thread Tom de Vries


On 17/02/16 16:48, Tom de Vries wrote:

On 17/02/16 13:30, Jakub Jelinek wrote:

On Wed, Feb 17, 2016 at 01:02:17PM +0100, Tom de Vries wrote:

Mark offload symbols as global in lto


I'm really not familiar with that part of LTO, so I'm CCing Honza and
Richard here.

2016-02-08  Tom de Vries  

PR lto/69607
* lto-partition.c (promote_offload_tables): New function.
* lto-partition.h (promote_offload_tables):  Declare.


Just one space instead of two after :


* lto.c (do_whole_program_analysis): call promote_offload_tables.


Capital C in Call.



Done.


diff --git a/libgomp/testsuite/libgomp.c/target-37.c
b/libgomp/testsuite/libgomp.c/target-37.c
new file mode 100644
index 000..1edb21e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/target-37.c
@@ -0,0 +1,98 @@
+/* { dg-do run { target lto } } */
+/* { dg-additional-sources "target-38.c" } */
+/* { dg-additional-options "-flto -flto-partition=1to1
-fno-toplevel-reorder" } */
+
+extern
+#ifdef __cplusplus
+"C"
+#endif
+void abort (void);


Why the C++ stuff in there?  Do you intend to include the testcase
also in libgomp.c++?


No, that's just there because I started both target-37.c and target-38.c
by copying target-1.c.


If not, it is not needed.


Removed.


Otherwise, the tests LGTM.



Updated patch attached.



Ping.

[ Original submission here:
https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00547.html
  Latest patch with updated testcases:
https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01196.html
]

OK for trunk, stage1 (or stage4, if that's appropriate)?


Thanks,
- Tom


0001-Mark-offload-symbols-as-global-in-lto.patch


Mark offload symbols as global in lto

2016-02-17  Tom de Vries  

PR lto/69607
* lto-partition.c (promote_offload_tables): New function.
* lto-partition.h (promote_offload_tables): Declare.
* lto.c (do_whole_program_analysis): Call promote_offload_tables.

* testsuite/libgomp.c/target-36.c: New test.
* testsuite/libgomp.c/target-37.c: New test.
* testsuite/libgomp.c/target-38.c: New test.

---
  gcc/lto/lto-partition.c | 28 ++
  gcc/lto/lto-partition.h |  1 +
  gcc/lto/lto.c   |  2 +
  libgomp/testsuite/libgomp.c/target-36.c |  4 ++
  libgomp/testsuite/libgomp.c/target-37.c | 94 +
  libgomp/testsuite/libgomp.c/target-38.c | 91 +++
  6 files changed, 220 insertions(+)

diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 9eb63c2..56598d4 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "ipa-prop.h"
  #include "ipa-inline.h"
  #include "lto-partition.h"
+#include "omp-low.h"

  vec ltrans_partitions;

@@ -1003,6 +1004,33 @@ promote_symbol (symtab_node *node)
"Promoting as hidden: %s\n", node->name ());
  }

+/* Promote the symbols in the offload tables.  */
+
+void
+promote_offload_tables (void)
+{
+  if (vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars))
+return;
+
+  for (unsigned i = 0; i < vec_safe_length (offload_funcs); i++)
+{
+  tree fn_decl = (*offload_funcs)[i];
+  cgraph_node *node = cgraph_node::get (fn_decl);
+  if (node->externally_visible)
+   continue;
+  promote_symbol (node);
+}
+
+  for (unsigned i = 0; i < vec_safe_length (offload_vars); i++)
+{
+  tree var_decl = (*offload_vars)[i];
+  varpool_node *node = varpool_node::get (var_decl);
+  if (node->externally_visible)
+   continue;
+  promote_symbol (node);
+}
+}
+
  /* Return true if NODE needs named section even if it won't land in the 
partition
 symbol table.
 FIXME: we should really not use named sections for inline clones and master
diff --git a/gcc/lto/lto-partition.h b/gcc/lto/lto-partition.h
index 31e3764..1a38126 100644
--- a/gcc/lto/lto-partition.h
+++ b/gcc/lto/lto-partition.h
@@ -36,6 +36,7 @@ extern vec ltrans_partitions;
  void lto_1_to_1_map (void);
  void lto_max_map (void);
  void lto_balanced_map (int);
+extern void promote_offload_tables (void);
  void lto_promote_cross_file_statics (void);
  void free_ltrans_partitions (void);
  void lto_promote_statics_nonwpa (void);
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 9dd513f..2736c5c 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -3138,6 +3138,8 @@ do_whole_program_analysis (void)
   to globals with hidden visibility because they are accessed from multiple
   partitions.  */
lto_promote_cross_file_statics ();
+  /* Promote all the offload symbols.  */
+  promote_offload_tables ();
timevar_pop (TV_WHOPR_PARTITIONING);

timevar_stop (TV_PHASE_OPT_GEN);
diff --git a/libgomp/testsuite/libgomp.c/target-36.c 
b/libgomp/testsuite/libgomp.c/target-36.c
new file mode 100644
index 000..bafb718
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/target-36.c
@@ -0,0

[gomp4][PR 69916] Fix ICE

2016-02-24 Thread Nathan Sidwell

Bug 69916 is an instance of an OpenACC loop that is lowered during omp-low, but 
determined to be a nop and removed before oacc-device-lower (j is dead after 
the loop).  This confused the OpenACC loop transform code, which still detected 
it via its loop header, but encountered no IFN_GOACC_LOOP fns to transform.


Fixed by counting the IFN_GOACC_LOOP calls during loop detection, and not trying 
to transform those calls if they don't exist.
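
The idea can be sketched outside of GCC as a two-phase walk: count the loop abstraction calls while discovering the loop, then let the transform phase consume exactly that many and do nothing when the count is zero. The types and function names below are hypothetical and much simpler than the omp-low.c structures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy classification of the calls seen while walking a loop.  */
enum call_kind { CALL_OTHER, CALL_HEAD_MARK, CALL_LOOP_FN, CALL_TAIL_MARK };

struct loop_info
{
  unsigned mask;   /* partitioning mask */
  unsigned ifns;   /* number of loop abstraction calls found */
};

/* Discovery: count the loop abstraction calls.  If there are none, the
   loop body was optimized away, so do not partition it.  */
void
discover_loop (struct loop_info *loop, const enum call_kind *calls,
	       size_t n)
{
  loop->ifns = 0;
  for (size_t i = 0; i < n; ++i)
    if (calls[i] == CALL_LOOP_FN)
      loop->ifns++;

  if (!loop->ifns)
    loop->mask = 0;
}

/* Transform: visit exactly the counted calls and return how many were
   rewritten.  A loop without any such calls is skipped entirely.  */
unsigned
transform_loop (const struct loop_info *loop, const enum call_kind *calls,
		size_t n)
{
  unsigned remaining = loop->ifns;
  unsigned done = 0;

  if (!remaining)
    return 0;

  for (size_t i = 0; i < n && remaining; ++i)
    if (calls[i] == CALL_LOOP_FN)
      {
	remaining--;
	done++;
      }

  assert (remaining == 0);
  return done;
}
```

Stopping on a count rather than on a specific terminating call means the transform never goes looking for a call that the optimizers have already deleted.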


nathan
2016-02-24  Nathan Sidwell  

	gcc/
	PR other/69916
	* omp-low.c (struct oacc_loop): Add ifns.
	(new_oacc_loop_raw): Initialize it.
	(finish_oacc_loop): Clear mask & flags if no ifns.
	(oacc_loop_discover_walk): Count IFN_GOACC_LOOP calls.
	(oacc_loop_xform_loop): Add ifns arg & adjust.
	(oacc_loop_process): Adjust oacc_loop_xform_loop call.

	gcc/testsuite/
	PR other/69916
	* c-c++-common/goacc/pr69916.c: New.

Index: omp-low.c
===
--- omp-low.c	(revision 233648)
+++ omp-low.c	(working copy)
@@ -254,6 +254,7 @@ struct oacc_loop
   unsigned mask;   /* Partitioning mask.  */
   unsigned inner;  /* Partitioning of inner loops.  */
   unsigned flags;  /* Partitioning flags.  */
+  unsigned ifns;   /* Contained loop abstraction functions.  */
   tree chunk_size; /* Chunk size.  */
   gcall *head_end; /* Final marker of head sequence.  */
 };
@@ -20709,6 +20710,7 @@ new_oacc_loop_raw (oacc_loop *parent, lo
   loop->routine = NULL_TREE;
 
   loop->mask = loop->flags = loop->inner = 0;
+  loop->ifns = 0;
   loop->chunk_size = 0;
   loop->head_end = NULL;
 
@@ -20770,6 +20772,9 @@ new_oacc_loop_routine (oacc_loop *parent
 static oacc_loop *
 finish_oacc_loop (oacc_loop *loop)
 {
+  /* If the loop has been collapsed, don't partition it.  */
+  if (!loop->ifns)
+loop->mask = loop->flags = 0;
   return loop->parent;
 }
 
@@ -20900,43 +20905,54 @@ oacc_loop_discover_walk (oacc_loop *loop
   if (!gimple_call_internal_p (call))
 	continue;
 
-  if (gimple_call_internal_fn (call) != IFN_UNIQUE)
-	continue;
+  switch (gimple_call_internal_fn (call))
+	{
+	default:
+	  break;
 
-  enum ifn_unique_kind kind
-	= (enum ifn_unique_kind) TREE_INT_CST_LOW (gimple_call_arg (call, 0));
-  if (kind == IFN_UNIQUE_OACC_HEAD_MARK
-	  || kind == IFN_UNIQUE_OACC_TAIL_MARK)
-	{
-	  if (gimple_call_num_args (call) == 2)
-	{
-	  gcc_assert (marker && !remaining);
-	  marker = 0;
-	  if (kind == IFN_UNIQUE_OACC_TAIL_MARK)
-		loop = finish_oacc_loop (loop);
-	  else
-		loop->head_end = call;
-	}
-	  else
-	{
-	  int count = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+	case IFN_GOACC_LOOP:
+	  /* Count the goacc loop abstraction fns, to determine if the
+	 loop was collapsed already.  */
+	  loop->ifns++;
+	  break;
 
-	  if (!marker)
+	case IFN_UNIQUE:
+	  enum ifn_unique_kind kind
+	= (enum ifn_unique_kind) (TREE_INT_CST_LOW
+  (gimple_call_arg (call, 0)));
+	  if (kind == IFN_UNIQUE_OACC_HEAD_MARK
+	  || kind == IFN_UNIQUE_OACC_TAIL_MARK)
+	{
+	  if (gimple_call_num_args (call) == 2)
 		{
-		  if (kind == IFN_UNIQUE_OACC_HEAD_MARK)
-		loop = new_oacc_loop (loop, call);
-		  remaining = count;
+		  gcc_assert (marker && !remaining);
+		  marker = 0;
+		  if (kind == IFN_UNIQUE_OACC_TAIL_MARK)
+		loop = finish_oacc_loop (loop);
+		  else
+		loop->head_end = call;
 		}
-	  gcc_assert (count == remaining);
-	  if (remaining)
+	  else
 		{
-		  remaining--;
-		  if (kind == IFN_UNIQUE_OACC_HEAD_MARK)
-		loop->heads[marker] = call;
-		  else
-		loop->tails[remaining] = call;
+		  int count = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+
+		  if (!marker)
+		{
+		  if (kind == IFN_UNIQUE_OACC_HEAD_MARK)
+			loop = new_oacc_loop (loop, call);
+		  remaining = count;
+		}
+		  gcc_assert (count == remaining);
+		  if (remaining)
+		{
+		  remaining--;
+		  if (kind == IFN_UNIQUE_OACC_HEAD_MARK)
+			loop->heads[marker] = call;
+		  else
+			loop->tails[remaining] = call;
+		}
+		  marker++;
 		}
-	  marker++;
 	}
 	}
 }
@@ -21042,10 +21058,12 @@ oacc_loop_xform_head_tail (gcall *from,
determined partitioning mask and chunking argument.  */
 
 static void
-oacc_loop_xform_loop (gcall *end_marker, tree mask_arg, tree chunk_arg)
+oacc_loop_xform_loop (gcall *end_marker, unsigned ifns,
+		  tree mask_arg, tree chunk_arg)
 {
   gimple_stmt_iterator gsi = gsi_for_stmt (end_marker);
   
+  gcc_checking_assert (ifns);
   for (;;)
 {
   for (; !gsi_end_p (gsi); gsi_next (&gsi))
@@ -21065,13 +21083,13 @@ oacc_loop_xform_loop (gcall *end_marker,
 
 	  *gimple_call_arg_ptr (call, 5) = mask_arg;
 	  *gimple_call_arg_ptr (call, 4) = chunk_arg;
-	  if (TREE_INT_CST_LOW (gimple_call_arg (call, 0))
-	  == IFN_GOACC_LOOP_BOUND)
+	  ifns--;
+	  if (!ifns)
 	return;
 	}
 
-  /* If we didn't see LOOP_BOUND, it should be in the single
-	 successor block.  */
+

Re: [PATCH][ARM][RFC] PR target/65578 Fix gcc.dg/torture/stackalign/builtin-apply-4.c for single-precision fpus

2016-02-24 Thread Kyrill Tkachov


Ping*2

Thanks,
Kyrill

On 17/02/16 10:12, Kyrill Tkachov wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00634.html

As mentioned before, this is actually a fix for PR target/69538.
I got confused when writing the cover letter and ChangeLog...

Thanks,
Kyrill

On 09/02/16 17:24, Kyrill Tkachov wrote:


On 09/02/16 17:21, Kyrill Tkachov wrote:

Hi all,

In this wrong-code PR the builtin-apply-4.c test fails with -flto but only when 
targeting an fpu
with only single-precision capabilities.

bar is a function returning a double. For non-LTO compilation the caller of bar
reads the return value from the s0 and s1 VFP registers as expected, but for
-flto the caller seems to expect the return value in the r0 and r1 regs.
The RTL dumps show that too.

Debugging the calls to arm_function_value shows that in the -flto compilation
the function bar is deemed to be a local function call and assigned the
ARM_PCS_AAPCS_LOCAL PCS variant, whereas for the non-LTO (and non-breaking)
compilation it uses the ARM_PCS_AAPCS_VFP variant.

Further down in use_vfp_abi when deciding whether to use VFP registers for the 
result there is a bit of
logic that rejects VFP registers when handling the ARM_PCS_AAPCS_LOCAL variant 
with a double precision value
on an FPU that is not TARGET_VFP_DOUBLE.

This seems wrong for ARM_PCS_AAPCS_LOCAL to me. ARM_PCS_AAPCS_LOCAL means that 
the function doesn't escape
the translation unit and we can thus use whatever variant we want. From what I 
understand we want to use the
VFP regs when possible for FP values.

So this patch removes that restriction and for the testcase the caller of bar 
correctly reads the return
value of bar from the VFP registers and everything works.
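
The decision described above can be modelled roughly as follows. This is only
a simplified sketch, assuming a hard-float VFP target; the enum and function
names are hypothetical and are not the declarations used in arm.c.

```cpp
// Simplified, hypothetical model of the register choice discussed above.
// None of these names come from arm.c; they only illustrate the logic.
enum class PcsVariant { Aapcs, AapcsVfp, AapcsLocal };

// Before the patch: a local function returning a double on an FPU without
// double-precision support was denied VFP registers.
bool uses_vfp_regs_before(PcsVariant variant, bool is_double,
                          bool fpu_has_double) {
  if (variant == PcsVariant::AapcsVfp)
    return true;
  if (variant != PcsVariant::AapcsLocal)
    return false;
  return fpu_has_double || !is_double;
}

// After the patch: the double-precision restriction is gone, so the local
// variant makes the same choice as the VFP variant.
bool uses_vfp_regs_after(PcsVariant variant) {
  return variant == PcsVariant::AapcsVfp
         || variant == PcsVariant::AapcsLocal;
}
```

In this model the failing case is uses_vfp_regs_before (AapcsLocal, true,
false), which returns false while the VFP variant returns true for the same
value.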

This patch has been bootstrapped and tested on arm-none-linux-gnueabihf
configured with --with-fpu=fpv4-sp-d16.
The bootstrap was performed with LTO.
I didn't see any regressions.

It seems that this logic was put there in 2009 with r154034 as part of a large 
patch to enable support for half-precision
floating point.

I'm not very familiar with this part of the code, so is this a safe patch to do?
The patch should only ever change behaviour for single-precision-only fpus and 
only for static functions
that don't get called outside their translation units (or during LTO I suppose) 
so there shouldn't
be any ABI problems, I think.

Is this ok for trunk?

Thanks,
Kyrill



Huh, I just realised I wrote completely the wrong PR number on this.
The PR I'm talking about here is PR target/69538

Sorry for the confusion.

Kyrill



2016-02-09 Kyrylo Tkachov 

PR target/65578
* config/arm/arm.c (use_vfp_abi): Remove is_double argument.
Don't check for is_double and TARGET_VFP_DOUBLE.
(aapcs_vfp_is_call_or_return_candidate): Update callsite.
(aapcs_vfp_is_return_candidate): Likewise.
(aapcs_vfp_is_call_candidate): Likewise.
(aapcs_vfp_allocate_return_reg): Likewise.

Re: [PATCH] Fix PR c++/69736

2016-02-24 Thread Jason Merrill


On 02/23/2016 11:24 AM, Patrick Palka wrote:

1. making tsubst_copy_and_build retain the REF_PARENTHESIZED_P flag when
processing an INDIRECT_REF, or by


This should happen in any case.


2. moving the call to maybe_undo_parenthesized_ref in finish_call_expr before
the assignment of orig_fn so that orig_fn will be un-obfuscated as well, or by


This would also be OK; at this point we know the expression isn't the 
operand of decltype.


Jason

Re: [PATCH] 69912 - [6 regression] ICE in build_ctor_subob_ref initializing a flexible array member

2016-02-24 Thread Jason Merrill


OK.

Jason

[WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-24 Thread Richard Earnshaw (lists)

After discussion with the ARM port maintainers we have decided that now
is probably the right time to deprecate support for versions of the ARM
Architecture prior to ARMv4t.  This will allow us to clean up some of
the code base going forwards by being able to assume:
- Presence of half-word data accesses
- Presence of Thumb and therefore of interworking instructions.

This patch records the status change in the GCC-6 release notes.

I propose to commit this patch later this week.

R.
Index: htdocs/gcc-6/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.61
diff -u -r1.61 changes.html
--- htdocs/gcc-6/changes.html   19 Feb 2016 05:00:54 -  1.61
+++ htdocs/gcc-6/changes.html   19 Feb 2016 14:47:31 -
@@ -340,7 +340,14 @@
 ARM

  
-   The arm port now supports target attributes and pragmas.  Please
+   Support for revisions of the ARM architecture prior to ARMv4t has
+   been deprecated and will be removed in a future GCC release.
+   This affects ARM6, ARM7 (but not ARM7TDMI), ARM8, StrongARM, and
+   Faraday fa526 and fa626 devices, which do not have support for
+   the Thumb execution state.
+ 
+ 
+   The ARM port now supports target attributes and pragmas.  Please
refer to the https://gcc.gnu.org/onlinedocs/gcc/ARM-Function-Attributes.html#ARM-Function-Attributes";>
documentation for details of available attributes and
pragmas as well as usage instructions.

[patch] libstdc++/69939 Qualify get and forward

2016-02-24 Thread Jonathan Wakely


A trivial fix to .

Tested x86_64-linux, committed to trunk and gcc-5-branch.
commit 124d28afe0c515a5d60ba2bbe788c2029329dbc5
Author: Jonathan Wakely 
Date:   Wed Feb 24 13:15:53 2016 +

libstdc++/69939 Qualify get and forward

	PR libstdc++/69939
	* include/experimental/tuple (__apply_impl): Qualify get and forward.

diff --git a/libstdc++-v3/include/experimental/tuple b/libstdc++-v3/include/experimental/tuple
index e3896e4..81e91bd 100644
--- a/libstdc++-v3/include/experimental/tuple
+++ b/libstdc++-v3/include/experimental/tuple
@@ -58,7 +58,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
   using _Wrap = _Maybe_wrap_member_pointer>;
   return _Wrap::__do_wrap(std::forward<_Fn>(f))(
-	  get<_Idx>(forward<_Tuple>(t))...);
+	  std::get<_Idx>(std::forward<_Tuple>(t))...);
 }
 
   template
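
For readers unfamiliar with why the qualification matters, here is a minimal
sketch. It assumes a user namespace that declares its own greedy `get`
template; the `user` and `sketch` namespaces and all names in them are
hypothetical and are not part of libstdc++. With unqualified calls,
argument-dependent lookup would also consider `user::get` because `user::Tag`
is a template argument of the tuple; qualifying with `std::` avoids that.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

namespace user {
struct Tag {};

// A greedy overload that argument-dependent lookup can find for any type
// associated with namespace user, including std::tuple<user::Tag, ...>.
template <std::size_t I, typename T>
int get(T&&) { return -1; }
}  // namespace user

namespace sketch {
// Qualified std::get and std::forward: only the std versions are considered,
// regardless of what the tuple's element types bring in through ADL.
template <typename Fn, typename Tuple, std::size_t... Idx>
decltype(auto) apply_impl(Fn&& f, Tuple&& t, std::index_sequence<Idx...>) {
  return std::forward<Fn>(f)(std::get<Idx>(std::forward<Tuple>(t))...);
}

template <typename Fn, typename Tuple>
decltype(auto) apply(Fn&& f, Tuple&& t) {
  using Indices =
      std::make_index_sequence<std::tuple_size<std::decay_t<Tuple>>::value>;
  return sketch::apply_impl(std::forward<Fn>(f), std::forward<Tuple>(t),
                            Indices{});
}
}  // namespace sketch

// Callable used to show that the tuple elements reach the function intact.
int second_arg(user::Tag, int x) { return x; }
```

Calling `sketch::apply(second_arg, std::make_tuple(user::Tag{}, 7))` goes
through `std::get` only, so the presence of `user::get` does not affect it.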

Re: [PATCH][ARM] Tests for arm_restrict_it patterns in thumb2.md

2016-02-24 Thread Kyrill Tkachov


Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00396.html

Thanks,
Kyrill
On 05/02/16 10:00, Kyrill Tkachov wrote:

Hi all,

I've been auditing the patterns in the arm backend that were added/modified for 
-mrestrict-it
and I've come up with a few runtime tests that end up generating those patterns.
This patch contains 4 tests for 4 patterns that have paths specific to 
-mrestrict-it.

There were some patterns like thumb2_mov_negscc_strict_it and
thumb2_mov_notscc_strict_it that I could not generate at all, because the
earlier RTL optimisers always generated some equivalent but different (and at
least as good) sequence, so these splitters never matched.
I think we could safely remove them, but not at this stage.

These tests should give us a bit more test coverage into the -mrestrict-it 
functionality.

Ok for trunk?

Thanks,
Kyrill

2016-02-05  Kyrylo Tkachov  

* gcc.target/arm/cond_sub_restrict_it.c: New test.
* gcc.target/arm/condarith_restrict_it.c: Likewise.
* gcc.target/arm/movcond_restrict_it.c: Likewise.
* gcc.target/arm/negscc_restrict_it.c: Likewise.

Re: [PATCH 10/9] ENABLE_CHECKING refactoring: remove remaining occurrences

2016-02-24 Thread Martin Liška

On 02/23/2016 04:21 PM, Richard Biener wrote:
> On Wed, Nov 4, 2015 at 4:03 PM, Mikhail Maltsev  wrote:
>> On 11/03/2015 02:35 AM, Jeff Law wrote:
>>> This is good for the trunk too.  Please install.
>>>
>>> Thanks!
>>>
>>> jeff
>>
>> Committed as r229758.
> 
>> grep ENABLE_CHECKING *.[ch]
> dwarf2out.c:#if ENABLE_CHECKING
> dwarf2out.c:#if ENABLE_CHECKING
> dwarf2out.c:#if ENABLE_CHECKING
> dwarf2out.h:#if ENABLE_CHECKING

Hi Richi.

Removal in dwarf2out.c is not possible due to assignment (and read) of
a struct member that is conditional in dwarf2out.h:

struct GTY((chain_next ("%h.dw_loc_next"))) dw_loc_descr_node {
...
#if ENABLE_CHECKING
  /* When translating a function into a DWARF procedure, contains the frame
 offset *before* evaluating this operation.  It is -1 when not yet
 initialized.  */
  int dw_loc_frame_offset;
#endif
};

> hsa-gen.c:#ifdef ENABLE_CHECKING
> hsa-regalloc.c:#ifdef ENABLE_CHECKING
>> grep ENABLE_CHECKING ada/gcc-interface/*.[ch]
> ada/gcc-interface/utils.c:#ifdef ENABLE_CHECKING

I've just prepared patches for the remaining files.  By the way, is this
acceptable stage 4 material?

Thanks,
Martin

> 
> 
>> --
>> Regards,
>> Mikhail Maltsev

Re: [PATCH][ARM] Tests for arm_restrict_it patterns in thumb2.md

2016-02-24 Thread Ramana Radhakrishnan

On Fri, Feb 5, 2016 at 10:00 AM, Kyrill Tkachov
 wrote:
> Hi all,
>
> I've been auditing the patterns in the arm backend that were added/modified
> for -mrestrict-it
> and I've come up with a few runtime tests that end up generating those
> patterns.
> This patch contains 4 tests for 4 patterns that have paths specific to
> -mrestrict-it.
>
> There were some patterns like thumb2_mov_negscc_strict_it and
> thumb2_mov_notscc_strict_it that I could not generate at all, because the
> earlier RTL optimisers always generated some equivalent but different (and
> at least as good) sequence, so these splitters never matched.
> I think we could safely remove them, but not at this stage.
>
> These tests should give us a bit more test coverage into the -mrestrict-it
> functionality.
>
> Ok for trunk?

OK - thanks for the audit.

Ramana
>
> Thanks,
> Kyrill
>
> 2016-02-05  Kyrylo Tkachov  
>
> * gcc.target/arm/cond_sub_restrict_it.c: New test.
> * gcc.target/arm/condarith_restrict_it.c: Likewise.
> * gcc.target/arm/movcond_restrict_it.c: Likewise.
> * gcc.target/arm/negscc_restrict_it.c: Likewise.

Re: [PATCH 10/9] ENABLE_CHECKING refactoring: remove remaining occurrences

2016-02-24 Thread Michael Matz

Hi,

On Wed, 24 Feb 2016, Martin Liška wrote:

> >> grep ENABLE_CHECKING *.[ch]
> > dwarf2out.c:#if ENABLE_CHECKING
> > dwarf2out.c:#if ENABLE_CHECKING
> > dwarf2out.c:#if ENABLE_CHECKING
> > dwarf2out.h:#if ENABLE_CHECKING
> 
> Hi Richi.
> 
> Removal in dwarf2out.c is not possible due to assignment (and read) of
> a struct member that is conditional in dwarf2out.h:
> 
> struct GTY((chain_next ("%h.dw_loc_next"))) dw_loc_descr_node {
> ...
> #if ENABLE_CHECKING
>   /* When translating a function into a DWARF procedure, contains the frame
>  offset *before* evaluating this operation.  It is -1 when not yet
>  initialized.  */
>   int dw_loc_frame_offset;
> #endif
> };

But nothing can set ENABLE_CHECKING anymore (the macro is meanwhile called 
CHECKING_P), so all that code is dead anyway.  So either the new macro 
should be used or that code should be removed.


Ciao,
Michael.

Re: [PATCH] 69780 - [4.9/5/6 Regression] ICE on __builtin_alloca_with_align, with small alignment

2016-02-24 Thread Kyrill Tkachov



On 24/02/16 10:12, Dominique d'Humières wrote:

The test gcc.dg/builtins-68.c fails on x86_64-apple-darwin15:

FAIL: gcc.dg/builtins-68.c  (test for errors, line 79)
FAIL: gcc.dg/builtins-68.c (test for excess errors)
Excess errors:
/opt/gcc/work/gcc/testsuite/gcc.dg/builtins-68.c:107:40: warning: large integer 
implicitly truncated to unsigned type [-Woverflow]
/opt/gcc/work/gcc/testsuite/gcc.dg/builtins-68.c:11:19: warning: large integer 
implicitly truncated to unsigned type [-Woverflow]
/opt/gcc/work/gcc/testsuite/gcc.dg/builtins-68.c:109:40: warning: large integer 
implicitly truncated to unsigned type [-Woverflow]


Also on arm

Kyrill


TIA

Dominique

Re: [PATCH] 69780 - [4.9/5/6 Regression] ICE on __builtin_alloca_with_align, with small alignment

2016-02-24 Thread Martin Sebor


On 02/24/2016 03:12 AM, Dominique d'Humières wrote:

The test gcc.dg/builtins-68.c fails on x86_64-apple-darwin15:


Thanks for the heads up.  I see it also fails on i686-pc-linux-gnu
and likely other 32-bit targets for similar reasons.  Let me adjust
it today.

Martin



FAIL: gcc.dg/builtins-68.c  (test for errors, line 79)
FAIL: gcc.dg/builtins-68.c (test for excess errors)
Excess errors:
/opt/gcc/work/gcc/testsuite/gcc.dg/builtins-68.c:107:40: warning: large integer 
implicitly truncated to unsigned type [-Woverflow]
/opt/gcc/work/gcc/testsuite/gcc.dg/builtins-68.c:11:19: warning: large integer 
implicitly truncated to unsigned type [-Woverflow]
/opt/gcc/work/gcc/testsuite/gcc.dg/builtins-68.c:109:40: warning: large integer 
implicitly truncated to unsigned type [-Woverflow]

TIA

Dominique

Re: [PATCH] 69780 - [4.9/5/6 Regression] ICE on __builtin_alloca_with_align, with small alignment

2016-02-24 Thread Jakub Jelinek

On Wed, Feb 24, 2016 at 07:37:56AM -0700, Martin Sebor wrote:
> On 02/24/2016 03:12 AM, Dominique d'Humières wrote:
> >The test gcc.dg/builtins-68.c fails on x86_64-apple-darwin15:
> 
> Thanks for the heads up.  I see it also fails on i686-pc-linux-gnu
> and likely other 32-bit targets for similar reasons.  Let me adjust
> it today.

The last argument is size_t, which is unsigned int on some targets, unsigned
long on others, unsigned long long on others.
Thus perhaps you want to turn those
  p =  __builtin_alloca_with_align (n, LONG_MAX);    /* { dg-error "must be a constant integer" } */
  p =  __builtin_alloca_with_align (n, ~0LU);        /* { dg-error "must be a constant integer" } */
  p =  __builtin_alloca_with_align (n, 1LLU << 34);  /* { dg-error "must be a constant integer" } */
  p =  __builtin_alloca_with_align (n, LLONG_MAX);   /* { dg-error "must be a constant integer" } */
  p =  __builtin_alloca_with_align (n, ~0LLU);       /* { dg-error "must be a constant integer" } */
tests into
... (n, (size_t) (sizeof (size_t) >= sizeof (long) ? LONG_MAX : 0))
and similarly for the others?
I.e. if size_t is big enough, test it, otherwise still generate the error, but 
for some other reason?
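
As a rough illustration of the conditional above, the expression can be
evaluated on its own. This is only a sketch; the helper name is hypothetical
and the real test would pass the expression directly as the alignment
argument.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: yields LONG_MAX when size_t is wide enough to hold it
// without truncation, and 0 otherwise.  Either value is an invalid alignment,
// so a test built on it should still expect an error, but no -Woverflow
// warning is triggered on targets where size_t is narrower than long.
constexpr std::size_t long_max_if_representable() {
  return static_cast<std::size_t>(
      sizeof(std::size_t) >= sizeof(long) ? LONG_MAX : 0);
}
```

The same shape would apply to the LLONG_MAX and 1LLU << 34 cases, with
sizeof (long long) in the comparison.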

Jakub

[testsuite] Adapt gcc.dg/debug/dwarf2/prod-options.c for Solaris assembler

2016-02-24 Thread Rainer Orth

gcc.dg/debug/dwarf2/prod-options.c currently FAILs on Solaris when the
native assembler is used:

FAIL: gcc.dg/debug/dwarf2/prod-options.c scan-assembler DW_AT_producer: "GNU C

This happens because the DW_AT_producer line scanned for is formatted
slightly differently: on Solaris/x86 with as, I see

.ascii "GNU C11 6.0.0 20160205 (experimental) [trunk revision 233172] 
-mtune=generic -march=pentium4 -gdwarf -O2\0" / DW_AT_producer

while with gas, the following is produced:

.long   .LASF0  / DW_AT_producer: "GNU C11 6.0.0 20160205 
(experimental) [trunk revision 233172] -mtune=generic -march=pentium4 -gdwarf 
-O2"

Solaris/SPARC as is slightly different again due to the use of a
different comment character:

.ascii "GNU C11 6.0.0 20160219 (experimental) [trunk revision 233556] 
-mcpu=v9 -gdwarf -O2\0"   ! DW_AT_producer

The following patch allows for those variations, tested with the
appropriate runtest invocations on i386-pc-solaris2.12,
sparc-sun-solaris2.12, and x86_64-pc-linux-gnu, installed on mainline.
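
To see which pattern is meant to match which assembler output, the two
expressions can be tried against the sample lines quoted above. This is only
a sketch: it translates the Tcl patterns from the patch into std::regex
syntax, assumes the two engines agree for patterns this simple, and uses
hypothetical function names.

```cpp
#include <regex>
#include <string>

// Pattern for gas-style output: the producer string follows the
// "DW_AT_producer:" comment on the same line.
bool matches_gas_pattern(const std::string& line) {
  static const std::regex re(R"rx(DW_AT_producer: "GNU C)rx");
  return std::regex_search(line, re);
}

// Pattern for Solaris as output: the producer string comes first and the
// DW_AT_producer comment follows later on the same line, whatever the
// comment character is.
bool matches_solaris_as_pattern(const std::string& line) {
  static const std::regex re(R"rx("GNU C[^\n\r]+ DW_AT_producer)rx");
  return std::regex_search(line, re);
}
```

With the sample lines from this message, the gas line should satisfy only the
first function and the two Solaris as lines only the second.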

Rainer


2016-02-24  Rainer Orth  

* gcc.dg/debug/dwarf2/prod-options.c: Use different DW_AT_producer
pattern on Solaris with as.

# HG changeset patch
# Parent  4426c5674eb0f7a0b8dcc53bdb4372dcd66a1ef4
Adapt gcc.dg/debug/dwarf2/prod-options.c for Solaris assembler

diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2/prod-options.c b/gcc/testsuite/gcc.dg/debug/dwarf2/prod-options.c
--- a/gcc/testsuite/gcc.dg/debug/dwarf2/prod-options.c
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2/prod-options.c
@@ -4,7 +4,8 @@
as well.  */
 /* { dg-do compile } */
 /* { dg-options "-O2 -gdwarf -dA -fdebug-prefix-map=a=b" } */
-/* { dg-final { scan-assembler "DW_AT_producer: \"GNU C" } } */
+/* { dg-final { scan-assembler "DW_AT_producer: \"GNU C" { target { { ! *-*-solaris2* } || gas } } } } */
+/* { dg-final { scan-assembler "\"GNU C\[^\\n\\r\]+ DW_AT_producer" { target { *-*-solaris2* && { ! gas } } } } } */
 /* { dg-final { scan-assembler-not "debug-prefix-map" } } */
 
 void func (void)

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

[PATCH] Fix PR69760

2016-02-24 Thread Richard Biener


The following fixes bogus SCEV analysis for expressions that are only
executed conditionally [note: conditionally here doesn't include
after a taken exit].  Basically we have to make sure further analysis
does not attempt to use undefined overflow for expressions we don't
know whether they are computed in the original source (for all
loop iterations).  This would result in bogus CHRECs as can be seen
in this PR.

The solution is to re-write those expressions in a way so overflow
behavior is well-defined.
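
The idea of moving the arithmetic into an unsigned type can be shown at
source level. This is only an analogy, not the SCEV code itself; the function
name is hypothetical.

```cpp
#include <climits>

// Hypothetical source-level analogy: perform a signed addition in the
// corresponding unsigned type, where overflow wraps and is well-defined.
// The result is returned as unsigned to avoid relying on how an
// out-of-range value converts back to int.
unsigned wrapping_add_bits(int a, int b) {
  return static_cast<unsigned>(a) + static_cast<unsigned>(b);
}
```

For example, wrapping_add_bits (INT_MAX, 1) is well-defined, whereas
INT_MAX + 1 evaluated in int is undefined behavior that later analysis must
not be allowed to exploit if the original statement might never have run.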

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2016-02-24  Richard Biener  
Jakub Jelinek  

PR middle-end/69760
* tree-scalar-evolution.c (interpret_rhs_expr): Re-write
conditionally executed ops to well-defined overflow behavior.

* gcc.dg/torture/pr69760.c: New testcase.

Index: gcc/tree-scalar-evolution.c
===
--- gcc/tree-scalar-evolution.c (revision 233634)
+++ gcc/tree-scalar-evolution.c (working copy)
@@ -1703,7 +1703,7 @@ static tree
 interpret_rhs_expr (struct loop *loop, gimple *at_stmt,
tree type, tree rhs1, enum tree_code code, tree rhs2)
 {
-  tree res, chrec1, chrec2;
+  tree res, chrec1, chrec2, ctype;
   gimple *def;
 
   if (get_gimple_rhs_class (code) == GIMPLE_SINGLE_RHS)
@@ -1798,30 +1798,63 @@ interpret_rhs_expr (struct loop *loop, g
 case PLUS_EXPR:
   chrec1 = analyze_scalar_evolution (loop, rhs1);
   chrec2 = analyze_scalar_evolution (loop, rhs2);
-  chrec1 = chrec_convert (type, chrec1, at_stmt);
-  chrec2 = chrec_convert (type, chrec2, at_stmt);
+  ctype = type;
+  /* When the stmt is conditionally executed re-write the CHREC
+ into a form that has well-defined behavior on overflow.  */
+  if (at_stmt
+ && INTEGRAL_TYPE_P (type)
+ && ! TYPE_OVERFLOW_WRAPS (type)
+ && ! dominated_by_p (CDI_DOMINATORS, loop->latch,
+  gimple_bb (at_stmt)))
+   ctype = unsigned_type_for (type);
+  chrec1 = chrec_convert (ctype, chrec1, at_stmt);
+  chrec2 = chrec_convert (ctype, chrec2, at_stmt);
   chrec1 = instantiate_parameters (loop, chrec1);
   chrec2 = instantiate_parameters (loop, chrec2);
-  res = chrec_fold_plus (type, chrec1, chrec2);
+  res = chrec_fold_plus (ctype, chrec1, chrec2);
+  if (type != ctype)
+   res = chrec_convert (type, res, at_stmt);
   break;
 
 case MINUS_EXPR:
   chrec1 = analyze_scalar_evolution (loop, rhs1);
   chrec2 = analyze_scalar_evolution (loop, rhs2);
-  chrec1 = chrec_convert (type, chrec1, at_stmt);
-  chrec2 = chrec_convert (type, chrec2, at_stmt);
+  ctype = type;
+  /* When the stmt is conditionally executed re-write the CHREC
+ into a form that has well-defined behavior on overflow.  */
+  if (at_stmt
+ && INTEGRAL_TYPE_P (type)
+ && ! TYPE_OVERFLOW_WRAPS (type)
+ && ! dominated_by_p (CDI_DOMINATORS,
+  loop->latch, gimple_bb (at_stmt)))
+   ctype = unsigned_type_for (type);
+  chrec1 = chrec_convert (ctype, chrec1, at_stmt);
+  chrec2 = chrec_convert (ctype, chrec2, at_stmt);
   chrec1 = instantiate_parameters (loop, chrec1);
   chrec2 = instantiate_parameters (loop, chrec2);
-  res = chrec_fold_minus (type, chrec1, chrec2);
+  res = chrec_fold_minus (ctype, chrec1, chrec2);
+  if (type != ctype)
+   res = chrec_convert (type, res, at_stmt);
   break;
 
 case NEGATE_EXPR:
   chrec1 = analyze_scalar_evolution (loop, rhs1);
-  chrec1 = chrec_convert (type, chrec1, at_stmt);
+  ctype = type;
+  /* When the stmt is conditionally executed re-write the CHREC
+ into a form that has well-defined behavior on overflow.  */
+  if (at_stmt
+ && INTEGRAL_TYPE_P (type)
+ && ! TYPE_OVERFLOW_WRAPS (type)
+ && ! dominated_by_p (CDI_DOMINATORS,
+  loop->latch, gimple_bb (at_stmt)))
+   ctype = unsigned_type_for (type);
+  chrec1 = chrec_convert (ctype, chrec1, at_stmt);
   /* TYPE may be integer, real or complex, so use fold_convert.  */
   chrec1 = instantiate_parameters (loop, chrec1);
-  res = chrec_fold_multiply (type, chrec1,
-fold_convert (type, integer_minus_one_node));
+  res = chrec_fold_multiply (ctype, chrec1,
+fold_convert (ctype, integer_minus_one_node));
+  if (type != ctype)
+   res = chrec_convert (type, res, at_stmt);
   break;
 
 case BIT_NOT_EXPR:
@@ -1837,11 +1870,22 @@ interpret_rhs_expr (struct loop *loop, g
 case MULT_EXPR:
   chrec1 = analyze_scalar_evolution (loop, rhs1);
   chrec2 = analyze_scalar_evolution (loop, rhs2);
-  chrec1 = chrec_convert (type, chrec1, at_stmt);
-  chrec2 = chrec_convert (type, chrec2, at_stmt);
+

Re: [PATCH] 69780 - [4.9/5/6 Regression] ICE on __builtin_alloca_with_align, with small alignment

2016-02-24 Thread Martin Sebor


On 02/24/2016 07:46 AM, Jakub Jelinek wrote:

On Wed, Feb 24, 2016 at 07:37:56AM -0700, Martin Sebor wrote:

On 02/24/2016 03:12 AM, Dominique d'Humières wrote:

The test gcc.dg/builtins-68.c fails on x86_64-apple-darwin15:


Thanks for the heads up.  I see it also fails on i686-pc-linux-gnu
and likely other 32-bit targets for similar reasons.  Let me adjust
it today.


The last argument is size_t, which is unsigned int on some targets, unsigned
long on others, unsigned long long on others.
Thus perhaps you want to turn those
   p =  __builtin_alloca_with_align (n, LONG_MAX);    /* { dg-error "must be a constant integer" } */
   p =  __builtin_alloca_with_align (n, ~0LU);        /* { dg-error "must be a constant integer" } */
   p =  __builtin_alloca_with_align (n, 1LLU << 34);  /* { dg-error "must be a constant integer" } */
   p =  __builtin_alloca_with_align (n, LLONG_MAX);   /* { dg-error "must be a constant integer" } */
   p =  __builtin_alloca_with_align (n, ~0LLU);       /* { dg-error "must be a constant integer" } */
tests into
... (n, (size_t) (sizeof (size_t) >= sizeof (long) ? LONG_MAX : 0))
and similarly for the others?
I.e. if size_t is big enough, test it, otherwise still generate the error, but 
for some other reason?


Yes, something like that, thanks.  Some of the tests are redundant
given the builtin's signature and only there out of paranoia. They
can be removed.

Martin

Re: [PATCH 10/9] ENABLE_CHECKING refactoring: remove remaining occurrences

2016-02-24 Thread Martin Liška

On 02/24/2016 03:27 PM, Michael Matz wrote:
> But nothing can set ENABLE_CHECKING anymore (the macro is meanwhile called 
> CHECKING_P), so all that code is dead anyway.  So either the new macro 
> should be used or that code should be removed.
> 
> 
> Ciao,
> Michael.

Good point, well the change is quite recent (12-2015). I'm adding
the author of the code to make a decision about it.

Thanks,
Martin

libgo patch committed: Lock goroutine to thread during cgo call

2016-02-24 Thread Ian Lance Taylor

This patch from Xia Bin fixes libgo to lock a goroutine to a thread
during a cgo call.  That is how the master library works, and it
avoids confusion if the C code calls back into the Go code.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 233515)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-1c3747d20789c73447ff71cbc739f7423c4bdf67
+156f5f0152797ac2afe5f23803aeb3c7b8f8418e
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/runtime/go-cgo.c
===
--- libgo/runtime/go-cgo.c  (revision 232239)
+++ libgo/runtime/go-cgo.c  (working copy)
@@ -41,6 +41,8 @@ syscall_cgocall ()
   if (runtime_needextram && runtime_cas (&runtime_needextram, 1, 0))
 runtime_newextram ();
 
+  runtime_lockOSThread();
+
   m = runtime_m ();
   ++m->ncgocall;
   g = runtime_g ();
@@ -70,6 +72,8 @@ syscall_cgocalldone ()
  _cgo_panic will already have exited syscall mode.  */
   if (g->status == Gsyscall)
 runtime_exitsyscall ();
+
+  runtime_unlockOSThread();
 }
 
 /* Call back from C/C++ code to Go code.  */

PATCH to add -flifetime-dse=1

2016-02-24 Thread Jason Merrill

Various projects have been having trouble with the new -flifetime-dse 
clobber at the beginning of the constructor, so this patch allows them 
to pass -flifetime-dse=1 to disable that clobber while still keeping the 
one at the end of the destructor.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 4795b095eba78ee05f46548fbc9aa9343ff34a7e
Author: Jason Merrill 
Date:   Mon Feb 22 15:18:13 2016 -0500

	Add -flifetime-dse=1.

gcc/
	* common.opt (flifetime-dse): Add -flifetime-dse=1.
gcc/cp/
	* decl.c (start_preparsed_function): Condition ctor clobber on
	flag_lifetime_dse > 1.

diff --git a/gcc/common.opt b/gcc/common.opt
index bc5b4c4..e91f225 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1946,10 +1946,13 @@ Common Ignore
 Does nothing. Preserved for backward compatibility.
 
 flifetime-dse
-Common Report Var(flag_lifetime_dse) Init(1) Optimization
+Common Report Var(flag_lifetime_dse,2) Init(2) Optimization
 Tell DSE that the storage for a C++ object is dead when the constructor
 starts and when the destructor finishes.
 
+flifetime-dse=
+Common Joined RejectNegative UInteger Var(flag_lifetime_dse) Optimization
+
 flive-range-shrinkage
 Common Report Var(flag_live_range_shrinkage) Init(0) Optimization
 Relief of register pressure through live range shrinkage.
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 30eef5c..2df3398 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -14104,7 +14104,8 @@ start_preparsed_function (tree decl1, tree attrs, int flags)
   store_parm_decls (current_function_parms);
 
   if (!processing_template_decl
-  && flag_lifetime_dse && DECL_CONSTRUCTOR_P (decl1)
+  && (flag_lifetime_dse > 1)
+  && DECL_CONSTRUCTOR_P (decl1)
   /* We can't clobber safely for an implicitly-defined default constructor
 	 because part of the initialization might happen before we enter the
 	 constructor, via AGGR_INIT_ZERO_FIRST (c++/68006).  */
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9ca3793..b8b2e70 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -6809,7 +6809,10 @@ value, and any changes during the lifetime of the object are dead when
 the object is destroyed.  Normally dead store elimination will take
 advantage of this; if your code relies on the value of the object
 storage persisting beyond the lifetime of the object, you can use this
-flag to disable this optimization.
+flag to disable this optimization.  To preserve stores before the
+constructor starts (e.g. because your operator new clears the object
+storage) but still treat the object as dead after the destructor you,
+can use -flifetime-dse=1.
 
 @item -flive-range-shrinkage
 @opindex flive-range-shrinkage
diff --git a/gcc/testsuite/g++.dg/opt/flifetime-dse4.C b/gcc/testsuite/g++.dg/opt/flifetime-dse4.C
new file mode 100644
index 000..c72444a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/flifetime-dse4.C
@@ -0,0 +1,27 @@
+// { dg-options "-O3 -flifetime-dse=1" }
+// { dg-do run }
+
+typedef __SIZE_TYPE__ size_t;
+inline void * operator new (size_t, void *p) { return p; }
+
+struct A
+{
+  int i;
+  A() {}
+  ~A() {}
+};
+
+int main()
+{
+  int ar[1] = { 42 };
+  A* ap = new(ar) A;
+
+  // With -flifetime-dse=1 we retain the old value.
+  if (ap->i != 42) __builtin_abort();
+
+  ap->i = 42;
+  ap->~A();
+
+  // When the destructor ends the object no longer exists.
+  if (ar[0] == 42) __builtin_abort();
+}

C++ PATCH to make constexpr respect -fno-inline

2016-02-24 Thread Jason Merrill

While talking about constexpr issues it occurred to me that we ought not 
fold away calls to constexpr functions in regular expressions when the 
user has specified -fno-inline, so that they can debug those calls normally.
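
A small sketch of the kind of code affected, with hypothetical function
names. The computed values are the same either way; the difference is only
whether the call in foldable_use remains as a call in the generated code.

```cpp
// A constexpr function that a user may want to step into while debugging.
constexpr int square(int x) { return x * x; }

// The argument is not a constant, so this is an ordinary call in any case.
int runtime_use(int v) { return square(v); }

// The argument is a constant but the context does not require a constant
// expression.  The front end may fold this call away; with the patch it is
// left as a call when -fno-inline is given.
int foldable_use() { return square(7); }
```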


Tested x86_64-pc-linux-gnu, applying to trunk.
commit bf89c79569525294101eb024502fcc03011b16d2
Author: Jason Merrill 
Date:   Mon Feb 22 13:32:50 2016 -0500

	* cp-gimplify.c (cp_fold): Don't fold constexpr calls if -fno-inline.

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 34bdc82..c59cd90 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -2154,7 +2154,8 @@ cp_fold (tree x)
 	   TODO:
 	   Do constexpr expansion of expressions where the call itself is not
 	   constant, but the call followed by an INDIRECT_REF is.  */
-	if (callee && DECL_DECLARED_CONSTEXPR_P (callee))
+	if (callee && DECL_DECLARED_CONSTEXPR_P (callee)
+	&& !flag_no_inline)
   r = maybe_constant_value (x);
 	optimize = sv;

Re: PATCH to add -flifetime-dse=1

2016-02-24 Thread Jakub Jelinek

On Wed, Feb 24, 2016 at 10:15:35AM -0500, Jason Merrill wrote:
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -6809,7 +6809,10 @@ value, and any changes during the lifetime of the 
> object are dead when
>  the object is destroyed.  Normally dead store elimination will take
>  advantage of this; if your code relies on the value of the object
>  storage persisting beyond the lifetime of the object, you can use this
> -flag to disable this optimization.
> +flag to disable this optimization.  To preserve stores before the
> +constructor starts (e.g. because your operator new clears the object
> +storage) but still treat the object as dead after the destructor you,
> +can use -flifetime-dse=1.

That should be @option{-flifetime-dse=1} I think.  Shouldn't -flifetime-dse=
be also in @opindex at the beginning of the paragraph, and documented what
the values mean (0 equivalent of -fno-lifetime-dse (or document it vice
versa) and 2 full lifetime dse enabled?

Jakub

Re: [PATCH 10/9] ENABLE_CHECKING refactoring: remove remaining occurrences

2016-02-24 Thread Pierre-Marie de Rodat


On 02/24/2016 03:53 PM, Martin Liška wrote:

On 02/24/2016 03:27 PM, Michael Matz wrote:

But nothing can set ENABLE_CHECKING anymore (the macro is meanwhile called
CHECKING_P), so all that code is dead anyway.  So either the new macro
should be used or that code should be removed.


Good point, well the change is quite recent (12-2015). I'm adding
the author of the code to make a decision about it.


Thanks for the heads up! That’s kind of funny: the check associated with 
this dw_loc_frame_offset field revealed a bug to us (at AdaCore) very 
recently, so I think we should keep it in one form or another.


This field takes one int slot in the dw_loc_descr_node structure, so I 
guess not having it in release mode is important (that’s what Jason said 
in ). Here’s 
what I think:


  * This field is used only in the resolve_args_picking graph traversal
for a consistency check in already visited nodes.

  * resolve_args_picking has a hash set to remember the already visited
nodes.

  * The consistency check itself has almost no runtime cost: we’re
doing the graph traversal in release mode anyway.

So what about removing the field (in struct dw_loc_descr_node) and 
replacing the visited hash set with a frame_offset hash map (in 
resolve_args_picking)? This hash map would remember the information we 
currently store in the field.
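
A minimal sketch of that idea, assuming a simplified graph: the Node type,
its fields and the function name are hypothetical stand-ins, not the
dw_loc_descr_node structure or the actual resolve_args_picking code.

```cpp
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a location description operation.
struct Node {
  int stack_delta;                 // how this operation changes the offset
  std::vector<const Node*> next;   // operations that may follow it
};

// Walks the graph once.  The map plays both roles: membership means
// "already visited", and the stored value is the frame offset seen on the
// first visit, so a later visit can be checked for consistency.
bool frame_offsets_consistent(const Node* start, int initial_offset) {
  std::unordered_map<const Node*, int> frame_offsets;
  std::vector<std::pair<const Node*, int>> worklist;
  worklist.push_back(std::make_pair(start, initial_offset));

  while (!worklist.empty()) {
    const Node* node = worklist.back().first;
    const int offset = worklist.back().second;
    worklist.pop_back();

    auto result = frame_offsets.emplace(node, offset);
    if (!result.second) {
      // Already visited: the offset must match the recorded one.
      if (result.first->second != offset)
        return false;
      continue;
    }
    for (const Node* succ : node->next)
      worklist.push_back(std::make_pair(succ, offset + node->stack_delta));
  }
  return true;
}
```

Compared with a per-node field, the map costs memory only during the
traversal, which is the trade-off proposed above.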


This is a small change, and I can take care of it if you want. I’m a 
little bit out of sync with the development pace these days: would 
this be for stage 4 or GCC 7?


--
Pierre-Marie de Rodat

Re: [PATCH][RFC][Offloading] Fix PR68463

2016-02-24 Thread Thomas Schwinge

Hi!

On Tue, 23 Feb 2016 08:37:07 +0100, Tom de Vries  wrote:
> On 22/02/16 19:07, Ilya Verbin wrote:
> > 2016-02-22 18:13 GMT+03:00 Thomas Schwinge:
> >> >On Sat, 20 Feb 2016 13:54:20 +0300, Ilya Verbin  wrote:
> >>> >>On Fri, Feb 19, 2016 at 15:53:08 +0100, Jakub Jelinek wrote:
>  >> >On Wed, Feb 10, 2016 at 08:19:34PM +0300, Ilya Verbin wrote:
> > >> > >This patch adds crtoffload{begin,end}.o to all -fopenmp programs, 
> > >> > >if they exist.

> >>> >>Thomas, could you please test it using nvptx
> >> >
> >> >It mostly;-)  works.  With nvptx offloading enabled (which you don't
> >> >have, do you?), I'm seeing one test case regress:
> >> >
> >> > [-PASS:-]{+FAIL:+} 
> >> > libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims-2.c 
> >> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0  (test for errors, line 9)
> >> > [-PASS:-]{+FAIL:+} 
> >> > libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims-2.c 
> >> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0  (test for errors, line 13)
> >> > PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims-2.c 
> >> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
> >> > [-PASS:-]{+FAIL:+} 
> >> > libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims-2.c 
> >> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> >> >
> >> >(Same for C++.)  That testcase, just recently added by Tom in r233237
> >> >"Handle -fdiagnostics-color in lto", specifies 'dg-additional-options
> >> >"-flto -fno-use-linker-plugin"'.  Is that now an unsupported
> >> >combination/configuration?  (I have not yet looked in detail, but it
> >> >appears as if the offloading compilers are no longer being run for
> >> >-fno-use-linker-plugin.)
> > Yes, it's really hard to fix the "lto + non-lto objects" issue for
> > the no-use-linker-plugin LTO path. In this patch lto-plugin prepares a
> > list of object files with offloading and passes it to lto-wrapper, so
> > I believe we should consider offloading without lto-plugin as
> > unsupported. I'll update the wiki when the patch is committed.

Aha, I see.  I guess there's no point in keeping offloading supported for
the -fno-lto (default) with -fno-use-linker-plugin configuration?

Ilya, then please remove
libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims-2.c as part of
your patch, unless Tom thinks it should be changed to a -flto test, but
without -fno-use-linker-plugin?

> Shouldn't we error (or at least warn) then if we compile a file 
> containing an offload construct with fopenacc/fopenmp and 
> -fno-use-linker-plugin?

Yes, that makes sense to me, too.  (Note that, as I understand it,
-fno-use-linker-plugin may also be the default for certain GCC
configurations...)  Aside from spec stuff in gcc/gcc.c relating to
LINK_PLUGIN_SPEC, I see there's some code in
gcc/gcc.c:driver::maybe_run_linker evaluating the three possible values
of HAVE_LTO_PLUGIN, but I have not yet thought about how and where to
conditionalize the diagnostic if attempting to do offloading in an
unsupported (-fno-use-linker-plugin) configuration.


Grüße
 Thomas

Re: [pr 69666] No SRA default_def replacements for unscalarizable

2016-02-24 Thread Martin Jambor

On Tue, Feb 23, 2016 at 06:45:08AM -0800, H.J. Lu wrote:
> On Fri, Feb 19, 2016 at 8:21 AM, Martin Jambor  wrote:
> > Hi,
> >
> > in PR 69666, SRA attempts to turn a load from an aggregate that is
> > uninitialized into a load from a default definition SSA name (which is
> > something it does to generate an appropriate warning later) but
> > unfortunately it does so using an access structure which is
> > representable with __int128 when the load in question is smaller.  It
> > then attempts to fix it up only to create an invalid V_C_E.  In this
> > case, the correct thing to do is not to attempt the transformation,
> > when there are smaller accesses, which can be figured out by looking
> > at the unscalarizable_region flag of the access.
> >
> > Bootstrapped and tested on x86_64, OK for trunk and later for the 5
> > branch?
> >
> 
> This may have caused:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69920
> 

I have reverted the patch on the gcc-5 branch as Jakub asked me to.  I
do have a fix for the issue but I'd like to investigate one aspect of
this problem a bit more tomorrow (see below) before testing it and
formally proposing it here.

Sorry for the breakage,

Martin


--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -3504,7 +3504,8 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
   else
{
  if (access_has_children_p (racc)
- && !racc->grp_unscalarized_data)
+ && !racc->grp_unscalarized_data
+ && TREE_CODE (lhs) != SSA_NAME)
{
  if (dump_file)
{

Re: [PATCH][RFC][Offloading] Fix PR68463

2016-02-24 Thread Ilya Verbin

On Wed, Feb 24, 2016 at 17:13:35 +0100, Thomas Schwinge wrote:
> On Tue, 23 Feb 2016 08:37:07 +0100, Tom de Vries  
> wrote:
> > On 22/02/16 19:07, Ilya Verbin wrote:
> > > 2016-02-22 18:13 GMT+03:00 Thomas Schwinge:
> > >> >On Sat, 20 Feb 2016 13:54:20 +0300, Ilya Verbin  
> > >> >wrote:
> > >>> >>On Fri, Feb 19, 2016 at 15:53:08 +0100, Jakub Jelinek wrote:
> >  >> >On Wed, Feb 10, 2016 at 08:19:34PM +0300, Ilya Verbin wrote:
> > > >> > >This patch adds crtoffload{begin,end}.o to all -fopenmp 
> > > >> > >programs, if they exist.
> 
> > >>> >>Thomas, could you please test it using nvptx
> > >> >
> > >> >It mostly;-)  works.  With nvptx offloading enabled (which you don't
> > >> >have, do you?), I'm seeing one test case regress:
> > >> >
> > >> > [-PASS:-]{+FAIL:+} 
> > >> > libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims-2.c 
> > >> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0  (test for errors, line 
> > >> > 9)
> > >> > [-PASS:-]{+FAIL:+} 
> > >> > libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims-2.c 
> > >> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0  (test for errors, line 
> > >> > 13)
> > >> > PASS: 
> > >> > libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims-2.c 
> > >> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 (test for excess errors)
> > >> > [-PASS:-]{+FAIL:+} 
> > >> > libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims-2.c 
> > >> > -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> > >> >
> > >> >(Same for C++.)  That testcase, just recently added by Tom in r233237
> > >> >"Handle -fdiagnostics-color in lto", specifies 'dg-additional-options
> > >> >"-flto -fno-use-linker-plugin"'.  Is that now an unsupported
> > >> >combination/configuration?  (I have not yet looked in detail, but it
> > >> >appears as if the offloading compilers are no longer being run for
> > >> >-fno-use-linker-plugin.)
> > > Yes, it's really hard to fix the "lto + non-lto objects" issue for
> > > the no-use-linker-plugin LTO path. In this patch lto-plugin prepares a
> > > list of object files with offloading and passes it to lto-wrapper, so
> > > I believe we should consider offloading without lto-plugin as
> > > unsupported. I'll update the wiki when the patch is committed.
> 
> Aha, I see.  I guess there's no point in keeping offloading supported for
> the -fno-lto (default) with -fno-use-linker-plugin configuration?
> 
> Ilya, then please remove
> libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims-2.c as part of
> your patch, unless Tom thinks it should be changed to a -flto test, but
> without -fno-use-linker-plugin?

OK.

> > Shouldn't we error (or at least warn) then if we compile a file 
> > containing an offload construct with fopenacc/fopenmp and 
> > -fno-use-linker-plugin?
> 
> Yes, that makes sense to me, too.  (Note that, as I understand it,
> -fno-use-linker-plugin may also be the default for certain GCC
> configurations...)  Aside from spec stuff in gcc/gcc.c relating to
> LINK_PLUGIN_SPEC, I see there's some code in
> gcc/gcc.c:driver::maybe_run_linker evaluating the three possible values
> of HAVE_LTO_PLUGIN, but I have not yet thought about how and where to
> conditionalize the diagnostic if attempting to do offloading in an
> unsupported (-fno-use-linker-plugin) configuration.

To print this error someone has to detect that at least one object contains
offload sections; only the linker plugin and lto-wrapper can do that.  But if
the linker plugin is absent, lto-wrapper has to open all objects, scan all
their sections, etc.  That looks like too much overhead for a single diagnostic.

  -- Ilya

[PATCH] Fix missing warning with bool (PR c/67854)

2016-02-24 Thread Marek Polacek

The following is another issue with macros from system headers.  In this case
bool is defined in a system header to expand to _Bool and the "is promoted to"
warning didn't trigger because of that.  The fix is to use the expanded 
location.
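
For background, a small sketch of the promotion the warning is about.  It is written in C++, where bool is a keyword rather than a macro from a system header, so it only illustrates the promotion rule and not the macro-location problem; count_set is a hypothetical helper.  A bool passed through `...' arrives as an int, so va_arg has to name int.

```cpp
#include <cassert>
#include <cstdarg>

// Counts how many of the `count` variadic flags are true.
// Callers pass bool, but default argument promotion turns each one into
// int, so va_arg must name int; naming bool is what the warning flags.
int count_set(int count, ...) {
  assert(count >= 0);
  va_list ap;
  va_start(ap, count);
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += va_arg(ap, int) != 0;
  va_end(ap);
  return total;
}
```

Calling count_set (3, true, false, true) reads three ints and counts two of them as set.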

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-02-24  Marek Polacek  

PR c/67854
* gimplify.c (gimplify_va_arg_expr): Use expanded location for the
"is promoted to" warning.

* gcc.dg/pr67854.c: New test.

diff --git gcc/gimplify.c gcc/gimplify.c
index 7be6bd7..e7ea974 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -11573,24 +11573,28 @@ gimplify_va_arg_expr (tree *expr_p, gimple_seq *pre_p,
 {
   static bool gave_help;
   bool warned;
+  /* Use the expansion point to handle cases such as passing bool (defined
+in a system header) through `...'.  */
+  source_location xloc
+   = expansion_point_location_if_in_system_header (loc);
 
   /* Unfortunately, this is merely undefined, rather than a constraint
 violation, so we cannot make this an error.  If this call is never
 executed, the program is still strictly conforming.  */
-  warned = warning_at (loc, 0,
-  "%qT is promoted to %qT when passed through %<...%>",
+  warned = warning_at (xloc, 0,
+  "%qT is promoted to %qT when passed through %<...%>",
   type, promoted_type);
   if (!gave_help && warned)
{
  gave_help = true;
- inform (loc, "(so you should pass %qT not %qT to %<va_arg%>)",
+ inform (xloc, "(so you should pass %qT not %qT to %<va_arg%>)",
  promoted_type, type);
}
 
   /* We can, however, treat "undefined" any way we please.
 Call abort to encourage the user to fix the program.  */
   if (warned)
-   inform (loc, "if this code is reached, the program will abort");
+   inform (xloc, "if this code is reached, the program will abort");
   /* Before the abort, allow the evaluation of the va_list
 expression to exit or longjmp.  */
   gimplify_and_add (valist, pre_p);
diff --git gcc/testsuite/gcc.dg/pr67854.c gcc/testsuite/gcc.dg/pr67854.c
index e69de29..af994c6 100644
--- gcc/testsuite/gcc.dg/pr67854.c
+++ gcc/testsuite/gcc.dg/pr67854.c
@@ -0,0 +1,11 @@
+/* PR c/67854 */
+/* { dg-do compile } */
+
+#include <stdbool.h>
+#include <stdarg.h>
+
+void
+foo (va_list ap)
+{
+  va_arg (ap, bool); /* { dg-warning "is promoted to .int. when passed through" } */
+}

Marek

C PATCH for c/69819 (error-recovery ICE)

2016-02-24 Thread Marek Polacek

This is a minor issue in the C FE, where on invalid code it generated a bogus
FUNCTION_DECL with ARRAY_TYPE as its type -- and the gimplifier can't digest that.

I think the code in finish_decl, whereby we update the type of shadowed global
variables, should not overwrite the type of a totally unrelated decl.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-02-24  Marek Polacek  

PR c/69819
* c-decl.c (finish_decl): Don't update the copy of the type of a
different decl type.

* gcc.dg/pr69819.c: New test.

diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index 8e332f8..298036a 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -4743,7 +4743,7 @@ finish_decl (tree decl, location_t init_loc, tree init,
  struct c_binding *b_ext = I_SYMBOL_BINDING (DECL_NAME (decl));
  while (b_ext && !B_IN_EXTERNAL_SCOPE (b_ext))
b_ext = b_ext->shadowed;
- if (b_ext)
+ if (b_ext && TREE_CODE (decl) == TREE_CODE (b_ext->decl))
{
  if (b_ext->u.type && comptypes (b_ext->u.type, type))
b_ext->u.type = composite_type (b_ext->u.type, type);
diff --git gcc/testsuite/gcc.dg/pr69819.c gcc/testsuite/gcc.dg/pr69819.c
index e69de29..a9594dd 100644
--- gcc/testsuite/gcc.dg/pr69819.c
+++ gcc/testsuite/gcc.dg/pr69819.c
@@ -0,0 +1,5 @@
+/* PR c/69819 */
+/* { dg-do compile } */
+
+void foo () { }
+int foo[] = { 0 }; /* { dg-error ".foo. redeclared as different kind of symbol" } */

Marek

Re: [PATCH] 69780 - [4.9/5/6 Regression] ICE on __builtin_alloca_with_align, with small alignment

2016-02-24 Thread Martin Sebor


On 02/24/2016 07:52 AM, Martin Sebor wrote:

On 02/24/2016 07:46 AM, Jakub Jelinek wrote:

On Wed, Feb 24, 2016 at 07:37:56AM -0700, Martin Sebor wrote:

On 02/24/2016 03:12 AM, Dominique d'Humières wrote:

The test gcc.dg/builtins-68.c fails on x86_64-apple-darwin15:


Thanks for the heads up.  I see it also fails on i686-pc-linux-gnu
and likely other 32-bit targets for similar reasons.  Let me adjust
it today.


The last argument is size_t, which is unsigned int on some targets, unsigned
long on others, unsigned long long on others.
Thus perhaps you want to turn those
   p =  __builtin_alloca_with_align (n, LONG_MAX);    /* { dg-error "must be a constant integer" } */
   p =  __builtin_alloca_with_align (n, ~0LU);        /* { dg-error "must be a constant integer" } */
   p =  __builtin_alloca_with_align (n, 1LLU << 34);  /* { dg-error "must be a constant integer" } */
   p =  __builtin_alloca_with_align (n, LLONG_MAX);   /* { dg-error "must be a constant integer" } */
   p =  __builtin_alloca_with_align (n, ~0LLU);       /* { dg-error "must be a constant integer" } */
tests into
... (n, (size_t) (sizeof (size_t) >= sizeof (long) ? LONG_MAX : 0))
and similarly for the others?
I.e. if size_t is big enough, test it, otherwise still generate the
error, but for some other reason?
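
A sketch of how the suggested expression behaves, assuming nothing beyond the standard headers; align_long_max and align_llong_max are hypothetical names, and the sketch is C++ rather than the C of the test itself.  The large constant is used only when size_t is at least as wide as its type; otherwise the argument falls back to 0.

```cpp
#include <climits>
#include <cstddef>

// Use the large constant only when size_t is wide enough to hold it
// without truncation; otherwise fall back to 0.
constexpr std::size_t align_long_max = static_cast<std::size_t>(
    sizeof(std::size_t) >= sizeof(long) ? LONG_MAX : 0);

constexpr std::size_t align_llong_max = static_cast<std::size_t>(
    sizeof(std::size_t) >= sizeof(long long) ? LLONG_MAX : 0);
```

On a target where size_t is 32 bits and long long is 64 bits, align_llong_max is 0, so nothing is silently truncated on the way into the builtin.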


Yes, something like that, thanks.  Some of the tests are redundant
given the builtin's signature and only there out of paranoia. They
can be removed.


I committed r233677 after testing it with -m32 on x86_64 but without
posting it for review first (it seemed simple enough).  I'll keep
an eye on the test results on other targets but if someone notices
any outstanding or new failures please let me know.

Martin

Re: [WWWDocs] Deprecate support for non-thumb ARM devices

2016-02-24 Thread Joseph Myers

On Wed, 24 Feb 2016, Richard Earnshaw (lists) wrote:

> After discussion with the ARM port maintainers we have decided that now
> is probably the right time to deprecate support for versions of the ARM
> Architecture prior to ARMv4t.  This will allow us to clean up some of

Should this include -march=armv5 and -march=armv5e (the theoretical 
no-Thumb versions of v5, which may never have had any corresponding 
processors)?

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: C PATCH for c/69819 (error-recovery ICE)

2016-02-24 Thread Joseph Myers

On Wed, 24 Feb 2016, Marek Polacek wrote:

> This is a minor issue in the C FE, where on invalid code it generated a bogus
> FUNCTION_DECL with ARRAY_TYPE as its type -- and the gimplifier can't digest that.
> 
> I think the code in finish_decl, whereby we update the type of shadowed global
> variables, should not overwrite the type of a totally unrelated decl.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

[patch] libstdc++/69945 Add __gnu_cxx::__freeres hook

2016-02-24 Thread Jonathan Wakely


This adds a new function to libsupc++ which will free the memory still
in use by the pool used for allocating exceptions when malloc fails.

This is similar to glibc's __libc_freeres, which valgrind (and other
tools?) use to tell glibc to deallocate everything before exiting.

I initially called it __gnu_cxx::__free_eh_pool() but I figured we
might have other memory in use at some later date, and we wouldn't
want valgrind to have to start calling a second function, nor make a
function called __free_eh_pool() actually free other things.

I intentionally *didn't* lock the pool's mutex before freeing it,
because this should never be called in a context where multiple
threads are still active and accessing the pool.

Thoughts?

Is the new test a good idea, or will exposing it like that in the
testsuite just give users the idea that they can/should be doing the
same themselves?


[ Aside: should the actual pool be created with placement-new to
[ ensure it doesn't ever get destroyed?
[
[   alignas(pool) unsigned char poolbuf[sizeof(pool)];
[   pool& emergency_pool = *::new(poolbuf) pool;
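
To make the aside concrete, here is a self-contained sketch of the placement-new idea.  Pool is a hypothetical miniature and not the pool class in eh_alloc.cc; the destroyed counter exists only so the effect can be observed.

```cpp
#include <cstddef>
#include <new>

// Hypothetical miniature of the emergency pool; not the libsupc++ class.
struct Pool {
  static int destroyed;  // counts destructor runs, for illustration only
  std::size_t arena_size = 64;
  ~Pool() { ++destroyed; }
};
int Pool::destroyed = 0;

// Raw storage with static duration.  The Pool constructed inside it is not
// an object with static storage duration of its own, so no destructor call
// is registered for it and nothing tears it down at exit.
alignas(Pool) unsigned char poolbuf[sizeof(Pool)];
Pool &emergency_pool = *::new (poolbuf) Pool;
```

The trade-off is that the pool's memory is then only ever released by an explicit call such as the proposed __freeres hook.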



commit 40fed1db72b0a4852d5890c2c464a0baabb02b74
Author: Jonathan Wakely 
Date:   Wed Feb 24 18:22:30 2016 +

libstdc++/69945 Add __gnu_cxx::__freeres hook

	* config/abi/pre/gnu.ver: Add new symbol.
	* libsupc++/eh_alloc.cc (__gnu_cxx::__freeres): Define.
	* testsuite/18_support/free_eh_pool.cc: New test.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 41069d1..5c6b0fe 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -2148,6 +2148,8 @@ CXXABI_1.3.10 {
 _ZGTtNKSt13bad_exceptionD1Ev;
 _ZGTtNKSt13bad_exception4whatEv;
 
+_ZN9__gnu_cxx9__freeresEv;
+
 } CXXABI_1.3.9;
 
 # Symbols in the support library (libsupc++) supporting transactional memory.
diff --git a/libstdc++-v3/libsupc++/eh_alloc.cc b/libstdc++-v3/libsupc++/eh_alloc.cc
index 6973af3..d362e40 100644
--- a/libstdc++-v3/libsupc++/eh_alloc.cc
+++ b/libstdc++-v3/libsupc++/eh_alloc.cc
@@ -73,6 +73,10 @@ using namespace __cxxabiv1;
 # define EMERGENCY_OBJ_COUNT	4
 #endif
 
+namespace __gnu_cxx
+{
+  void __freeres();
+}
 
 namespace
 {
@@ -106,6 +110,8 @@ namespace
   // to implement in_pool.
   char *arena;
   std::size_t arena_size;
+
+  friend void __gnu_cxx::__freeres();
 };
 
   pool::pool()
@@ -244,6 +250,19 @@ namespace
   pool emergency_pool;
 }
 
+namespace __gnu_cxx
+{
+  void
+  __freeres()
+  {
+if (emergency_pool.arena)
+  {
+	::free(emergency_pool.arena);
+	emergency_pool.arena = 0;
+  }
+  }
+}
+
 extern "C" void *
 __cxxabiv1::__cxa_allocate_exception(std::size_t thrown_size) _GLIBCXX_NOTHROW
 {
diff --git a/libstdc++-v3/testsuite/18_support/free_eh_pool.cc b/libstdc++-v3/testsuite/18_support/free_eh_pool.cc
new file mode 100644
index 000..9712d3d
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/free_eh_pool.cc
@@ -0,0 +1,35 @@
+// Copyright (C) 2016 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run }
+
+namespace __gnu_cxx {
+  void __freeres();
+}
+
+struct X {
+  ~X() {
+__gnu_cxx::__freeres();
+  }
+};
+
+X x;
+
+int
+main()
+{
+}

[Patch, MIPS] Patch for PR 68273 (user aligned variable arguments)

2016-02-24 Thread Steve Ellcey

Here is a new patch for PR 68273.  The original problem with gsoap has
been fixed by changing GCC to not create overly-aligned variables in
the SRA pass but the MIPS specific problem of how user-aligned variables
are passed to functions remains.

Because MIPS GCC is internally inconsistent, it seems like changing
the passing convention for user aligned variables on MIPS is the best
option, even though this is an ABI change for MIPS.

For example:

typedef int alignedint __attribute__((aligned(8)));
int foo1 (int x, alignedint y) { return x+y; }
int foo2 (int x, int y) { return x+y; }
int foo3 (int x, alignedint y) { return x+y; }
int foo4 (int x, int y) { return x+y; }
int splat1(int a, alignedint b) { return foo1(a,b); }
int splat2(int a, alignedint b) { return foo2(a,b); }
int splat3(int a, int b) { return foo3(a,b); }
int splat4(int a, int b) { return foo4(a,b); }

In this case foo1 and foo3 would expect the second argument to be in
register $6, but foo2 and foo4 would expect it in register $5.

Likewise splat1 and splat2 would pass the second argument in $6, but
splat3 and splat4 would pass it in $5.

This means that the call from splat2 to foo2 and the call from splat3
to foo3 would be wrong in that the caller is putting the argument in
one register but the callee is getting out of a different register.
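
The mismatch can be modelled with a toy sketch.  This is an assumption-laden simplification (two int-sized arguments, integer registers $4..$7, an 8-byte-aligned argument starting at an even register) and not the logic in mips.c; second_arg_reg and call_is_consistent are hypothetical names.

```cpp
#include <cassert>

// Toy model: the first int argument goes in $4.  The second goes in the
// next free register, unless it is treated as doubleword-aligned, in which
// case it is bumped to the next even register.
int second_arg_reg(bool second_is_aligned8) {
  int reg = 4;  // first argument
  reg += 1;     // next free register is $5
  if (second_is_aligned8 && reg % 2 != 0)
    reg += 1;   // skip to $6 for 8-byte alignment
  return reg;
}

// True when caller and callee agree on where the second argument lives.
// The caller decides from the type of its own argument, the callee from
// the declared parameter type.
bool call_is_consistent(bool caller_sees_aligned, bool callee_sees_aligned) {
  return second_arg_reg(caller_sees_aligned)
         == second_arg_reg(callee_sees_aligned);
}
```

In this model splat1/foo1 and splat4/foo4 agree, while splat2/foo2 and splat3/foo3 disagree, matching the cases listed above.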

In none of these cases would GCC give a warning about the inconsistent
parameter passing.

Since this patch could cause a change in the user's program, I have
added a warning that will be emitted whenever a user passes an
aligned type or when a parameter is declared as an aligned type
and that alignment could cause a change in how the value is passed.

I also added a way to turn that warning off in case the user doesn't
want to see it (-mno-warn-aligned-args).  I did not add an option
to make GCC pass arguments in the old manner as I consider that method
of passing arguments to be a bug and I don't think we want to have an
option to propagate that incorrect behavior.

Steve Ellcey
sell...@imgtec.com


2016-02-24  Steve Ellcey  

PR target/68273
* config/mips/mips.opt (mwarn-aligned-args): New flag.
* config/mips/mips.h (mips_args): Add new field.
* config/mips/mips.c (mips_internal_function_arg): New function.
(mips_function_incoming_arg): New function.
(mips_old_function_arg_boundary): New function.
(mips_function_arg): Rewrite to use mips_internal_function_arg.
(mips_function_arg_boundary): Fix argument alignment.
(TARGET_FUNCTION_INCOMING_ARG): New define.


diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 697abc2..05465c1 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -1124,6 +1124,7 @@ static const struct mips_rtx_cost_data
 static rtx mips_find_pic_call_symbol (rtx_insn *, rtx, bool);
 static int mips_register_move_cost (machine_mode, reg_class_t,
reg_class_t);
+static unsigned int mips_old_function_arg_boundary (machine_mode, const_tree);
 static unsigned int mips_function_arg_boundary (machine_mode, const_tree);
 static machine_mode mips_get_reg_raw_mode (int regno);
 
@@ -5459,11 +5460,11 @@ mips_strict_argument_naming (cumulative_args_t ca ATTRIBUTE_UNUSED)
   return !TARGET_OLDABI;
 }
 
-/* Implement TARGET_FUNCTION_ARG.  */
+/* Used to implement TARGET_FUNCTION_ARG and TARGET_FUNCTION_INCOMING_ARG.  */
 
 static rtx
-mips_function_arg (cumulative_args_t cum_v, machine_mode mode,
-  const_tree type, bool named)
+mips_internal_function_arg (cumulative_args_t cum_v, machine_mode mode,
+   const_tree type, bool named)
 {
   CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
   struct mips_arg_info info;
@@ -5586,6 +5587,50 @@ mips_function_arg (cumulative_args_t cum_v, machine_mode mode,
   return gen_rtx_REG (mode, mips_arg_regno (&info, TARGET_HARD_FLOAT));
 }
 
+/* Implement TARGET_FUNCTION_ARG.  */
+
+static rtx
+mips_function_arg (cumulative_args_t cum_v, machine_mode mode,
+  const_tree type, bool named)
+{
+  bool doubleword_aligned_p = (mips_function_arg_boundary (mode, type)
+ > BITS_PER_WORD);
+  bool old_doubleword_aligned_p = (mips_old_function_arg_boundary (mode, type)
+ > BITS_PER_WORD);
+  CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
+
+  if (doubleword_aligned_p != old_doubleword_aligned_p
+  && mips_warn_aligned_args && !cum->aligned_arg_warning_given)
+{
+  warning (0, "argument %d in the call may be passed in a manner incompatible with previous GCC versions", cum->arg_number+1);
+  cum->aligned_arg_warning_given = 1;
+}
+
+  return mips_internal_function_arg (cum_v, mode, type, named);
+}
+
+/* Implement TARGET_FUNCTION_INCOMING_ARG.  */
+
+static rtx
+mips_function_incoming_arg (cumulative_args_t cum_v, machine_mode mode,
+   c

[PATCH] Further -Wnonnull-compare fixes (PR c++/69922)

2016-02-24 Thread Jakub Jelinek

Hi!

This patch adds TREE_NO_WARNING to two places that didn't have it before
and avoids folding those conditions at those spots (partly to match
C++ delayed folding intent, partly to avoid duplicating TREE_NO_WARNING
propagation).

The more controversial stuff is that we need to preserve TREE_NO_WARNING on
artificial comparisons with NULL, but some fold-const.c optimizations break
that, e.g. fold_binary_op_with_conditional_arg, which changes
(cond ? a : b) != NULL to
(cond ? a != NULL : b != NULL)
and can go arbitrarily deep.  We'd want to mark the NULL comparisons added
because of that as TREE_NO_WARNING, but this is called from
fold_binary_loc, which doesn't know it is for a TREE_NO_WARNING comparison.
Richi suggested on IRC to just punt on folding in that case, which is what the
patch below implements.  Basically, it will try to fold the comparison:
if we get error_mark_node or the comparison folds to a constant, we are
fine; if we get a comparison, we mark it TREE_NO_WARNING; otherwise we don't
fold the comparison itself (hopefully GIMPLE optimizations will optimize
it), but still make sure that at least the operands are folded.
Perhaps we can pattern recognize some important cases, or could invoke
e.g. fold_binary_op_with_conditional_arg with some extra argument from
cp_fold directly, whatever.
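
A toy illustration of why the flag gets lost, and of the punt.  Expr, make, distribute and fold_keeping_flag are hypothetical; this is not GCC's tree representation, and the punt is simplified to "do not distribute a flagged comparison" rather than the try-then-inspect logic of the patch.

```cpp
#include <cassert>
#include <memory>

// Hypothetical miniature expression tree; GCC's `tree` is far richer.
struct Expr {
  enum Kind { Leaf, Cond, NeNull } kind;
  bool no_warning = false;              // stands in for TREE_NO_WARNING
  std::shared_ptr<Expr> op0, op1, op2;  // Cond: test, then, else; NeNull: op0
};
using ExprP = std::shared_ptr<Expr>;

ExprP make(Expr::Kind k, ExprP a = nullptr, ExprP b = nullptr,
           ExprP c = nullptr) {
  auto e = std::make_shared<Expr>();
  e->kind = k;
  e->op0 = a;
  e->op1 = b;
  e->op2 = c;
  return e;
}

// Distributes `(c ? a : b) != NULL` into `c ? a != NULL : b != NULL`.
// The new comparisons are fresh nodes, so the flag on the original
// comparison is not carried over unless someone copies it explicitly.
ExprP distribute(const ExprP &ne) {
  assert(ne->kind == Expr::NeNull && ne->op0->kind == Expr::Cond);
  const ExprP &cond = ne->op0;
  return make(Expr::Cond, cond->op0, make(Expr::NeNull, cond->op1),
              make(Expr::NeNull, cond->op2));
}

// Simplified punt: keep the original, still-flagged comparison instead of
// distributing it; unflagged comparisons are transformed as before.
ExprP fold_keeping_flag(const ExprP &ne) {
  if (ne->no_warning || ne->op0->kind != Expr::Cond)
    return ne;
  return distribute(ne);
}
```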

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-02-24  Jakub Jelinek  

PR c++/69922
* class.c (build_base_path): Set TREE_NO_WARNING on the null_test.
Avoid folding it.
* init.c (build_vec_delete_1, build_delete): Don't fold the non-NULL
tests.
* cp-gimplify.c (cp_fold): For TREE_NO_WARNING comparisons with NULL,
unless they are folded into INTEGER_CST, error_mark_node or some
comparison with NULL, avoid folding them and use either the original
comparison or non-folded comparison of folded arguments.
* cp-ubsan.c (cp_ubsan_instrument_vptr): Set TREE_NO_WARNING on the
comparison, don't fold the comparison right away.

* g++.dg/warn/Wnonnull-compare-6.C: New test.
* g++.dg/warn/Wnonnull-compare-7.C: New test.
* g++.dg/ubsan/pr69922.C: New test.

--- gcc/cp/class.c.jj   2016-02-04 12:01:38.0 +0100
+++ gcc/cp/class.c  2016-02-24 11:21:59.490438431 +0100
@@ -392,8 +392,11 @@ build_base_path (enum tree_code code,
   if (null_test)
 {
   tree zero = cp_convert (TREE_TYPE (expr), nullptr_node, complain);
-  null_test = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
-  expr, zero);
+  null_test = build2_loc (input_location, NE_EXPR, boolean_type_node,
+ expr, zero);
+  /* This is a compiler generated comparison, don't emit
+e.g. -Wnonnull-compare warning for it.  */
+  TREE_NO_WARNING (null_test) = 1;
 }
 
   /* If this is a simple base reference, express it as a COMPONENT_REF.  */
--- gcc/cp/init.c.jj    2016-02-19 17:02:14.0 +0100
+++ gcc/cp/init.c   2016-02-24 12:12:49.889583371 +0100
@@ -3678,15 +3678,13 @@ build_vec_delete_1 (tree base, tree maxi
 body = integer_zero_node;
 
   /* Outermost wrapper: If pointer is null, punt.  */
-  tree cond
-= fold_build2_loc (input_location, NE_EXPR, boolean_type_node, base,
-  fold_convert (TREE_TYPE (base), nullptr_node));
+  tree cond = build2_loc (input_location, NE_EXPR, boolean_type_node, base,
+ fold_convert (TREE_TYPE (base), nullptr_node));
   /* This is a compiler generated comparison, don't emit
  e.g. -Wnonnull-compare warning for it.  */
-  if (TREE_CODE (cond) == NE_EXPR)
-TREE_NO_WARNING (cond) = 1;
-  body = fold_build3_loc (input_location, COND_EXPR, void_type_node,
- cond, body, integer_zero_node);
+  TREE_NO_WARNING (cond) = 1;
+  body = build3_loc (input_location, COND_EXPR, void_type_node,
+cond, body, integer_zero_node);
   body = build1 (NOP_EXPR, void_type_node, body);
 
   if (controller)
@@ -4523,9 +4521,8 @@ build_delete (tree otype, tree addr, spe
{
  /* Handle deleting a null pointer.  */
  warning_sentinel s (warn_address);
- ifexp = fold (cp_build_binary_op (input_location,
-   NE_EXPR, addr, nullptr_node,
-   complain));
+ ifexp = cp_build_binary_op (input_location, NE_EXPR, addr,
+ nullptr_node, complain);
  if (ifexp == error_mark_node)
return error_mark_node;
  /* This is a compiler generated comparison, don't emit
--- gcc/cp/cp-gimplify.c.jj 2016-02-24 12:06:15.0 +0100
+++ gcc/cp/cp-gimplify.c    2016-02-24 14:12:24.063438023 +0100
@@ -2069,8 +2069,28 @@ cp_fold (tree x)
x = fold (x);
 
   if (TREE_NO_WARNING (org_x)
- && TREE_CODE (x) == TREE_CODE (org_x))
-   TR

[PATCH] Workaround LTO debug info bugs in dwarf2out (PR debug/69705)

2016-02-24 Thread Jakub Jelinek

Hi!

This is something that I hope early debug for LTO will eventually
fix, but we aren't there yet and current trunk emits bogus debug info
for inlines - DW_TAG_subprogram of the inline doesn't contain any
parameters/variables/lexical blocks etc. in it, but in
DW_TAG_inlined_subroutine we add all those directly, without abstract
origins on children (except for one on the DW_TAG_inlined_subroutine
itself).  For Fortran DW_TAG_common_block we were expecting this doesn't
ever happen, and thus assumed that decl is always non-NULL, but LTO
breaks that.

Is the following workaround ok for GCC 6?  Bootstrapped/regtested on
x86_64-linux and i686-linux.

2016-02-24  Jakub Jelinek  

PR debug/69705
* dwarf2out.c (gen_variable_die): Work around buggy LTO
- allow NULL decl for Fortran DW_TAG_common_block variables.

--- gcc/dwarf2out.c.jj  2016-01-25 12:10:59.0 +0100
+++ gcc/dwarf2out.c 2016-02-24 16:00:54.811874481 +0100
@@ -21055,7 +21055,7 @@ gen_variable_die (tree decl, tree origin
 DW_TAG_common_block and DW_TAG_variable.  */
  loc = loc_list_from_tree (com_decl, 2, NULL);
}
-  else if (DECL_EXTERNAL (decl))
+ else if (DECL_EXTERNAL (decl_or_origin))
add_AT_flag (com_die, DW_AT_declaration, 1);
  if (want_pubnames ())
add_pubname_string (cnam, com_die); /* ??? needed? */
@@ -21070,8 +21070,9 @@ gen_variable_die (tree decl, tree origin
  remove_AT (com_die, DW_AT_declaration);
}
   var_die = new_die (DW_TAG_variable, com_die, decl);
-  add_name_and_src_coords_attributes (var_die, decl);
-  add_type_attribute (var_die, TREE_TYPE (decl), decl_quals (decl), false,
+  add_name_and_src_coords_attributes (var_die, decl_or_origin);
+  add_type_attribute (var_die, TREE_TYPE (decl_or_origin),
+ decl_quals (decl_or_origin), false,
  context_die);
   add_AT_flag (var_die, DW_AT_external, 1);
   if (loc)
@@ -21093,9 +21094,10 @@ gen_variable_die (tree decl, tree origin
}
  add_AT_location_description (var_die, DW_AT_location, loc);
}
-  else if (DECL_EXTERNAL (decl))
+  else if (DECL_EXTERNAL (decl_or_origin))
add_AT_flag (var_die, DW_AT_declaration, 1);
-  equate_decl_number_to_die (decl, var_die);
+  if (decl)
+   equate_decl_number_to_die (decl, var_die);
   return;
 }
 

Jakub

Re: [PATCH][RFC][Offloading] Fix PR68463

2016-02-24 Thread Ilya Verbin

On Mon, Feb 22, 2016 at 16:13:07 +0100, Thomas Schwinge wrote:
> (..., and similar for others.)  The if-exists spec function only works
> for absolute paths (I have not researched, why?), so it won't locate the
> files for relative -Bbuild-gcc/[...] prefixes, and linking will fail:
> 
> /tmp/ccGajPD4.crtoffloadtable.o:(.rodata+0x0): undefined reference to `__offload_func_table'
> /tmp/ccGajPD4.crtoffloadtable.o:(.rodata+0x8): undefined reference to `__offload_funcs_end'
> /tmp/ccGajPD4.crtoffloadtable.o:(.rodata+0x10): undefined reference to `__offload_var_table'
> /tmp/ccGajPD4.crtoffloadtable.o:(.rodata+0x18): undefined reference to `__offload_vars_end'
> 
> If I use the absolute -B$PWD/build-gcc/[...], it works.  (But there is no
> requirement for -B prefixes to be absolute, as far as I know.)  Why not
> make it a hard error, though, if these files are missing?  Can we use
> something like (untested pseudo-patch):
> 
> +#ifdef ENABLE_OFFLOADING
> +# define CRTOFFLOADBEGIN "%{fopenacc|fopenmp:%:crtoffloadbegin%O%s}"
> +#else
> +# define CRTOFFLOADBEGIN ""
> +#endif
> 
> @@ -49,14 +49,16 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> %{" NO_PIE_SPEC ":crtbegin.o%s}} \
> %{fvtable-verify=none:%s; \
>   fvtable-verify=preinit:vtv_start_preinit.o%s; \
> - fvtable-verify=std:vtv_start.o%s}"
> + fvtable-verify=std:vtv_start.o%s} \
> +   " CRTOFFLOADBEGIN ")}"

Fixed.  Actually ENABLE_OFFLOADING is always defined (to 0 or to 1).
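
To illustrate why the test has to look at the value rather than at whether the macro is defined, here is a minimal sketch.  It assumes a configuration header that always defines ENABLE_OFFLOADING, here to 0; the two helper functions are hypothetical and exist only for this illustration.

```c
/* Assumption for illustration: mimic a generated configuration header
   that always defines the macro, here with offloading disabled.  */
#define ENABLE_OFFLOADING 0

/* Returns 1 if an #ifdef-style test would enable the offloading code.  */
int
offload_enabled_by_ifdef (void)
{
#ifdef ENABLE_OFFLOADING
  return 1;	/* Taken even though the value is 0.  */
#else
  return 0;
#endif
}

/* Returns 1 if a value test would enable the offloading code.  */
int
offload_enabled_by_value (void)
{
#if ENABLE_OFFLOADING == 1
  return 1;
#else
  return 0;	/* Taken, because the macro expands to 0.  */
#endif
}
```

With the macro defined to 0, the #ifdef variant still reports the feature as enabled, while the value test does not.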

> To the casual reader, skipping the first offload_files looks like an
> off-by-one error, so I suggest you add a comment "Skip the dummy item at
> the start of the list.", or similar.

Done.

> Ilya, then please remove
> libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims-2.c as part of
> your patch, unless Tom thinks it should be changed to a -flto test, but
> without -fno-use-linker-plugin?

Done.
Here is a follow up patch.  OK for trunk?  Bootstrapped and regtested.
Unfortunately I'm unable to run bootstrap-lto:
libdecnumber/dpd/decimal32.c:53:0: error: type of ‘decDigitsFromDPD’ does not match original declaration [-Werror=lto-type-mismatch]
[...]


diff --git a/gcc/config/gnu-user.h b/gcc/config/gnu-user.h
index 2fdb63c..b0bf40a 100644
--- a/gcc/config/gnu-user.h
+++ b/gcc/config/gnu-user.h
@@ -35,6 +35,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
 #undef ASM_APP_OFF
 #define ASM_APP_OFF "#NO_APP\n"
 
+#if ENABLE_OFFLOADING == 1
+#define CRTOFFLOADBEGIN "%{fopenacc|fopenmp:crtoffloadbegin%O%s}"
+#define CRTOFFLOADEND "%{fopenacc|fopenmp:crtoffloadend%O%s}"
+#else
+#define CRTOFFLOADBEGIN ""
+#define CRTOFFLOADEND ""
+#endif
+
 /* Provide a STARTFILE_SPEC appropriate for GNU userspace.  Here we add
the GNU userspace magical crtbegin.o file (see crtstuff.c) which
provides part of the support for getting C++ file-scope static
@@ -50,7 +58,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
%{fvtable-verify=none:%s; \
  fvtable-verify=preinit:vtv_start_preinit.o%s; \
  fvtable-verify=std:vtv_start.o%s} \
-   %{fopenacc|fopenmp:%:if-exists(crtoffloadbegin%O%s)}"
+   " CRTOFFLOADBEGIN
 #else
 #define GNU_USER_TARGET_STARTFILE_SPEC \
   "%{!shared: %{pg|p|profile:gcrt1.o%s;:crt1.o%s}} \
@@ -58,7 +66,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
%{fvtable-verify=none:%s; \
  fvtable-verify=preinit:vtv_start_preinit.o%s; \
  fvtable-verify=std:vtv_start.o%s} \
-   %{fopenacc|fopenmp:%:if-exists(crtoffloadbegin%O%s)}"
+   " CRTOFFLOADBEGIN
 #endif
 #undef  STARTFILE_SPEC
 #define STARTFILE_SPEC GNU_USER_TARGET_STARTFILE_SPEC
@@ -76,14 +84,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
  fvtable-verify=std:vtv_end.o%s} \
%{shared:crtendS.o%s;: %{" PIE_SPEC ":crtendS.o%s} \
%{" NO_PIE_SPEC ":crtend.o%s}} crtn.o%s \
-   %{fopenacc|fopenmp:%:if-exists(crtoffloadend%O%s)}"
+   " CRTOFFLOADEND
 #else
 #define GNU_USER_TARGET_ENDFILE_SPEC \
   "%{fvtable-verify=none:%s; \
  fvtable-verify=preinit:vtv_end_preinit.o%s; \
  fvtable-verify=std:vtv_end.o%s} \
%{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s \
-   %{fopenacc|fopenmp:%:if-exists(crtoffloadend%O%s)}"
+   " CRTOFFLOADEND
 #endif
 #undef  ENDFILE_SPEC
 #define ENDFILE_SPEC GNU_USER_TARGET_ENDFILE_SPEC
diff --git a/libgcc/offloadstuff.c b/libgcc/offloadstuff.c
index a4ea3ac..4ab6397 100644
--- a/libgcc/offloadstuff.c
+++ b/libgcc/offloadstuff.c
@@ -40,7 +40,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
 #include "tm.h"
 #include "libgcc_tm.h"
 
-#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
+#if defined(HAVE_GAS_HIDDEN) && ENABLE_OFFLOADING == 1
 
 #define OFFLOAD_FUNC_TABLE_SECTION_NAME ".gnu.offload_funcs"
 #define OFFLOAD_VAR_TABLE_SECTION_NAME ".gnu.offload_vars"
diff --git a/lto-plugin/lto-pl

Re: [PATCH][RFC][Offloading] Fix PR68463

2016-02-24 Thread Jakub Jelinek

On Wed, Feb 24, 2016 at 10:29:47PM +0300, Ilya Verbin wrote:
> Done.
> Here is a follow up patch.  OK for trunk?  Bootstrapped and regtested.

Ok with appropriate ChangeLog entry.

Jakub

[PR fortran/60126] Internal compiler error with code using pointer reshaping (gfortran 4.8.2)

2016-02-24 Thread Harald Anlauf

Hello,

the above bug appears to have been fixed over 2.5 years ago.
It does not trigger with 4.9, 5 and 6 trunk, but does with 4.8.0 and
before.

I recommend closing the bug and adding a testcase to the trunk's
testsuite.  See e.g. the attached example.

Harald

2016-02-24  Harald Anlauf  

* gfortran.dg/pr60126.f90: New test.


Index: gcc/testsuite/gfortran.dg/pr60126.f90
===
--- gcc/testsuite/gfortran.dg/pr60126.f90   (revision 0)
+++ gcc/testsuite/gfortran.dg/pr60126.f90   (revision 0)
@@ -0,0 +1,18 @@
+! { dg-do compile }
+! PR fortran/60126 - ICE on pointer rank remapping
+! Based on testcase by Michel Valin 
+
+subroutine simple_bug_demo
+  implicit none
+  interface
+ function offset_ptr_R4(nelements) result (dest)
+   implicit none
+   real*4, pointer, dimension(:) :: dest
+   integer, intent(IN) :: nelements
+ end function offset_ptr_R4
+  end interface
+
+  real, dimension(:,:), pointer :: R2D
+
+  R2D(-2:2,-3:3) => offset_ptr_R4(100)
+end

Re: [PATCH] Fix PR c++/69736

2016-02-24 Thread Patrick Palka


On Wed, 24 Feb 2016, Jason Merrill wrote:


On 02/23/2016 11:24 AM, Patrick Palka wrote:

1. making tsubst_copy_and_build retain the REF_PARENTHESIZED_P flag when
processing an INDIRECT_REF, or by


This should happen in any case.

2. moving the call to maybe_undo_parenthesized_ref in finish_call_expr before
the assignment of orig_fn so that orig_fn will be un-obfuscated as well, or by


This would also be OK; at this point we know the expression isn't the operand
of decltype.


Here's an updated patch that does both these things.  Does it look OK to
commit after testing?

-- >8 --

gcc/cp/ChangeLog:

PR c++/69736
* cp-tree.h (REF_PARENTHESIZED_P): Adjust documentation.
(maybe_undo_parenthesized_ref): Declare.
* semantics.c (maybe_undo_parenthesized_ref): Split out from
check_return_expr.
(finish_call_expr): Use it.
* typeck.c (check_return_expr): Use it.
* pt.c (tsubst_copy_and_build) [INDIRECT_REF]: Retain the
REF_PARENTHESIZED_P flag.

gcc/testsuite/ChangeLog:

PR c++/69736
* g++.dg/cpp1y/paren2.C: New test.
---
 gcc/cp/cp-tree.h|  3 ++-
 gcc/cp/pt.c |  4 
 gcc/cp/semantics.c  | 28 
 gcc/cp/typeck.c | 12 +---
 gcc/testsuite/g++.dg/cpp1y/paren2.C | 31 +++
 5 files changed, 66 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/paren2.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 3c23a83a..88c6367 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -3393,7 +3393,7 @@ extern void decl_shadowed_for_var_insert (tree, tree);
   TREE_LANG_FLAG_0 (STRING_CST_CHECK (NODE))

 /* Indicates whether a COMPONENT_REF has been parenthesized, or an
-   INDIRECT_REF comes from parenthesizing a VAR_DECL.  Currently only set
+   INDIRECT_REF comes from parenthesizing a _DECL.  Currently only set
some of the time in C++14 mode.  */

 #define REF_PARENTHESIZED_P(NODE) \
@@ -6361,6 +6361,7 @@ extern tree finish_label_stmt (tree);
 extern void finish_label_decl  (tree);
 extern cp_expr finish_parenthesized_expr   (cp_expr);
 extern tree force_paren_expr   (tree);
+extern tree maybe_undo_parenthesized_ref   (tree);
 extern tree finish_non_static_data_member   (tree, tree, tree);
 extern tree begin_stmt_expr(void);
 extern tree finish_stmt_expr_expr  (tree, tree);
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 4785aa4..f74e9bb 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -15965,6 +15965,10 @@ tsubst_copy_and_build (tree t,
else
  r = build_x_indirect_ref (input_location, r, RO_UNARY_STAR,
complain|decltype_flag);
+
+   if (TREE_CODE (r) == INDIRECT_REF)
+ REF_PARENTHESIZED_P (r) = REF_PARENTHESIZED_P (t);
+
RETURN (r);
   }

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 38c7516..109f82a 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -1673,6 +1673,30 @@ force_paren_expr (tree expr)
   return expr;
 }

+/* If T is an id-expression obfuscated by force_paren_expr, undo the
+   obfuscation and return the underlying id-expression.  Otherwise
+   return T.  */
+
+tree
+maybe_undo_parenthesized_ref (tree t)
+{
+  if (cxx_dialect >= cxx14
+  && INDIRECT_REF_P (t)
+  && REF_PARENTHESIZED_P (t))
+{
+  t = TREE_OPERAND (t, 0);
+  while (TREE_CODE (t) == NON_LVALUE_EXPR
+|| TREE_CODE (t) == NOP_EXPR)
+   t = TREE_OPERAND (t, 0);
+
+  gcc_assert (TREE_CODE (t) == ADDR_EXPR
+ || TREE_CODE (t) == STATIC_CAST_EXPR);
+  t = TREE_OPERAND (t, 0);
+}
+
+  return t;
+}
+
 /* Finish a parenthesized expression EXPR.  */

 cp_expr
@@ -2263,6 +2287,10 @@ finish_call_expr (tree fn, vec **args, bool 
disallow_virtual,

   gcc_assert (!TYPE_P (fn));

+  /* If FN is a FUNCTION_DECL obfuscated by force_paren_expr, undo
+ it so that we can tell this is a call to a known function.  */
+  fn = maybe_undo_parenthesized_ref (fn);
+
   orig_fn = fn;

   if (processing_template_decl)
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index d7ce327..3da6ea1 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -8929,17 +8929,7 @@ check_return_expr (tree retval, bool *no_warning)

   /* If we had an id-expression obfuscated by force_paren_expr, we need
 to undo it so we can try to treat it as an rvalue below.  */
-  if (cxx_dialect >= cxx14
- && INDIRECT_REF_P (retval)
- && REF_PARENTHESIZED_P (retval))
-   {
- retval = TREE_OPERAND (retval, 0);
- while (TREE_CODE (retval) == NON_LVALUE_EXPR
-|| TREE_CODE (retval) == NOP_EXPR)
-   retval = TREE_OPERAND (retval, 0);
- gcc_assert (TREE_CODE (retval) == ADDR_EXPR);
- retval = TREE_OPERAND (retv

C++ PATCH for c++/69323 (ICE on template friend)

2016-02-24 Thread Jason Merrill

The testcase in the BZ is invalid, but there's a valid variant that also 
ICEd.  I've fixed the compiler to do the right thing with three 
different variants.


Tested x86_64-pc-linux-gnu, applying to trunk.  Changes for the valid 
variant also applied to 5.
commit e99dbeae7942d066e234b3e31ec73e24d6aa2917
Author: Jason Merrill 
Date:   Wed Feb 24 13:26:12 2016 -0500

	PR c++/69323 - valid

	* pt.c (instantiate_class_template_1): Set
	processing_template_decl before substituting friend_type.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 65edfa7..e9cdf6e 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10180,11 +10180,11 @@ instantiate_class_template_1 (tree type)
 		   template  friend class T::C;
 
 		 otherwise.  */
+		  /* Bump processing_template_decl in case this is something like
+		 template  friend struct A::B.  */
+		  ++processing_template_decl;
 		  friend_type = tsubst (friend_type, args,
 	tf_warning_or_error, NULL_TREE);
-		  /* Bump processing_template_decl for correct
-		 dependent_type_p calculation.  */
-		  ++processing_template_decl;
 		  if (dependent_type_p (friend_type))
 		adjust_processing_template_decl = true;
 		  --processing_template_decl;
diff --git a/gcc/testsuite/g++.dg/template/friend61.C b/gcc/testsuite/g++.dg/template/friend61.C
new file mode 100644
index 000..1604f5c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/friend61.C
@@ -0,0 +1,12 @@
+// PR c++/69323
+
+template
+struct Outer
+{
+  struct StupidValueTrick
+  {
+template friend struct Outer::StupidValueTrick;
+  };
+};
+typedef Outer<42>::StupidValueTrick GoodValue;
+GoodValue good;

commit 63aabcee8f5bc478488102628e432dd3cbc0bbdc
Author: Jason Merrill 
Date:   Wed Feb 24 14:05:13 2016 -0500

	PR c++/69323 - errors

	* friend.c (make_friend_class): Likewise.
	* decl.c (lookup_and_check_tag): Diagnose invalid dependent friend.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 2df3398..5ec6589 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -12590,6 +12590,20 @@ lookup_and_check_tag (enum tag_types tag_code, tree name,
 	   decl,
 	   template_header_p
 	   | DECL_SELF_REFERENCE_P (decl));
+  if (template_header_p && t && CLASS_TYPE_P (t)
+	  && (!CLASSTYPE_TEMPLATE_INFO (t)
+	  || (!PRIMARY_TEMPLATE_P (CLASSTYPE_TI_TEMPLATE (t)
+	{
+	  error ("%qT is not a template", t);
+	  inform (location_of (t), "previous declaration here");
+	  if (TYPE_CLASS_SCOPE_P (t)
+	  && CLASSTYPE_TEMPLATE_INFO (TYPE_CONTEXT (t)))
+	inform (input_location,
+		"perhaps you want to explicitly add %<%T::%>",
+		TYPE_CONTEXT (t));
+	  t = error_mark_node;
+	}
+
   return t;
 }
   else if (decl && TREE_CODE (decl) == TREE_LIST)
diff --git a/gcc/cp/friend.c b/gcc/cp/friend.c
index 36b000f..5e4b2d1 100644
--- a/gcc/cp/friend.c
+++ b/gcc/cp/friend.c
@@ -255,6 +255,18 @@ make_friend_class (tree type, tree friend_type, bool complain)
 		 friend_type);
 	  return;
 	}
+  if (TYPE_TEMPLATE_INFO (friend_type)
+	  && !PRIMARY_TEMPLATE_P (TYPE_TI_TEMPLATE (friend_type)))
+	{
+	  error ("%qT is not a template", friend_type);
+	  inform (location_of (friend_type), "previous declaration here");
+	  if (TYPE_CLASS_SCOPE_P (friend_type)
+	  && CLASSTYPE_TEMPLATE_INFO (TYPE_CONTEXT (friend_type))
+	  && currently_open_class (TYPE_CONTEXT (friend_type)))
+	inform (input_location, "perhaps you need explicit template "
+		"arguments in your nested-name-specifier");
+	  return;
+	}
 }
   else if (same_type_p (type, friend_type))
 {
diff --git a/gcc/testsuite/g++.dg/template/crash34.C b/gcc/testsuite/g++.dg/template/crash34.C
index ef4d21b..83dcc78 100644
--- a/gcc/testsuite/g++.dg/template/crash34.C
+++ b/gcc/testsuite/g++.dg/template/crash34.C
@@ -7,6 +7,6 @@
 
 class Foo;
 
-template  class Foo { }; // { dg-error "not a template type" }
+template  class Foo { }; // { dg-error "not a template" }
 
 Foo x; // { dg-error "not a template|incomplete type" }
diff --git a/gcc/testsuite/g++.dg/template/friend61a.C b/gcc/testsuite/g++.dg/template/friend61a.C
new file mode 100644
index 000..d38e53a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/friend61a.C
@@ -0,0 +1,12 @@
+// PR c++/69323
+
+template
+struct Outer
+{
+  struct StupidValueTrick
+  {
+template friend struct StupidValueTrick; // { dg-error "not a template" }
+  };
+};
+typedef Outer<42>::StupidValueTrick GoodValue;
+GoodValue good;
diff --git a/gcc/testsuite/g++.dg/template/friend61b.C b/gcc/testsuite/g++.dg/template/friend61b.C
new file mode 100644
index 000..2da5d60
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/friend61b.C
@@ -0,0 +1,12 @@
+// PR c++/69323
+
+template
+struct Outer
+{
+  struct StupidValueTrick
+  {
+template friend struct Outer::StupidValueTrick; // { dg-error "not a template" }
+  };
+};
+typedef Outer<42>::StupidValueTrick GoodValue;
+GoodValue good;

Re: [PATCH] Further -Wnonnull-compare fixes (PR c++/69922)

2016-02-24 Thread Jason Merrill


On 02/24/2016 02:05 PM, Jakub Jelinek wrote:

+ && integer_zerop (tree_strip_nop_conversions (TREE_OPERAND (org_x,
+ 1


Why check this?  I think we want this handling for all TREE_NO_WARNING 
comparisons.


Jason

Re: [PATCH] Further -Wnonnull-compare fixes (PR c++/69922)

2016-02-24 Thread Jakub Jelinek

On Wed, Feb 24, 2016 at 02:57:49PM -0500, Jason Merrill wrote:
> On 02/24/2016 02:05 PM, Jakub Jelinek wrote:
> >+  && integer_zerop (tree_strip_nop_conversions (TREE_OPERAND (org_x,
> >+  1
> 
> Why check this?  I think we want this handling for all TREE_NO_WARNING
> comparisons.

There probably aren't any other anyway, so it is really not needed.
Is the patch ok with those two lines removed and ) moved?

Jakub

[pr/69916] ICE with empty loops

2016-02-24 Thread Nathan Sidwell


Jakub,
this patch fixes the ICE reported in pr69916
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69916).  The loop is lowered at
omp-lowering, but subsequently determined to be dead before we get to
oacc-target-lower.  The loop CF is removed along with the (pure) IFN_GOACC_LOOP
function calls inserted during lowering.  However the IFN_UNIQUE loop head &
tail calls remain (because they are not pure).  Thus in the oacc-target-lower
pass we rediscover the loop structure.


Firstly we assign a specific axis for this loop -- as it's auto.  That's a
pessimization, but not wrong.  However, we then scan the loop to adjust the
expected OACC_LOOP calls with the determined partitioning information.  As
they're not there, we end up falling out of the function and die with a
single_succ_edge assert.  (In general we might end up finding OACC_LOOP calls
of an inner loop, or meeting a block with more than one successor.  Either would
be bad.)


This patch changes the loop transformation to count OACC_LOOP calls it 
encounters when rediscovering the loops, and uses that count for the OACC_LOOP 
adjustment scan (rather than expect OACC_LOOP_BOUND to be the last one).  That 
fixes the ICE.


While there it is trivial to mark the loop as not to be partitioned, if we
discover no OACC_LOOP calls, which addresses the pessimization mentioned above.


As the loop is no longer partitioned, the fork and join markers end up being
deleted.
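
The counting idea can be sketched in isolation as follows.  This is a simplified model under stated assumptions, not the omp-low.c code: the enum, the struct and the function name are hypothetical stand-ins for the marker and loop abstraction calls described above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kinds of internal calls seen while
   walking a rediscovered loop.  */
enum call_kind { CALL_OTHER, CALL_HEAD_MARK, CALL_TAIL_MARK, CALL_LOOP_IFN };

/* Hypothetical, reduced version of the per-loop bookkeeping.  */
struct loop_info
{
  unsigned mask;	/* Partitioning requested for the loop.  */
  unsigned ifns;	/* Loop abstraction calls found in the loop.  */
};

/* Count the loop abstraction calls among the N calls in CALLS.  If none
   survived, the loop body was removed earlier, so clear the partitioning
   mask instead of assigning an axis.  Returns the count.  */
unsigned
discover_loop (struct loop_info *loop, const enum call_kind *calls, size_t n)
{
  loop->ifns = 0;
  for (size_t i = 0; i < n; i++)
    if (calls[i] == CALL_LOOP_IFN)
      loop->ifns++;

  if (loop->ifns == 0)
    loop->mask = 0;

  return loop->ifns;
}
```

A later adjustment scan can then stop after exactly that many calls have been processed, rather than searching for one particular call that may no longer exist.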


ok for trunk?

nathan
2016-02-24  Nathan Sidwell  

	gcc/
	PR middle-end/69916
	* omp-low.c (struct oacc_loop): Add ifns.
	(new_oacc_loop_raw): Initialize it.
	(finish_oacc_loop): Clear mask & flags if no ifns.
	(oacc_loop_discover_walk): Count IFN_GOACC_LOOP calls.
	(oacc_loop_xform_loop): Add ifns arg & adjust.
	(oacc_loop_process): Adjust oacc_loop_xform_loop call.

	gcc/testsuite/
	PR middle-end/69916
	* c-c++-common/goacc/pr69916.c: New.

Index: omp-low.c
===
--- omp-low.c	(revision 233663)
+++ omp-low.c	(working copy)
@@ -241,8 +241,9 @@ struct oacc_loop
   tree routine;  /* Pseudo-loop enclosing a routine.  */
 
   unsigned mask;   /* Partitioning mask.  */
-  unsigned flags;   /* Partitioning flags.  */
-  tree chunk_size;   /* Chunk size.  */
+  unsigned flags;  /* Partitioning flags.  */
+  unsigned ifns;   /* Contained loop abstraction functions.  */
+  tree chunk_size; /* Chunk size.  */
   gcall *head_end; /* Final marker of head sequence.  */
 };
 
@@ -20393,6 +20394,7 @@ new_oacc_loop_raw (oacc_loop *parent, lo
   loop->routine = NULL_TREE;
 
   loop->mask = loop->flags = 0;
+  loop->ifns = 0;
   loop->chunk_size = 0;
   loop->head_end = NULL;
 
@@ -20454,6 +20456,9 @@ new_oacc_loop_routine (oacc_loop *parent
 static oacc_loop *
 finish_oacc_loop (oacc_loop *loop)
 {
+  /* If the loop has been collapsed, don't partition it.  */
+  if (!loop->ifns)
+loop->mask = loop->flags = 0;
   return loop->parent;
 }
 
@@ -20584,43 +20589,54 @@ oacc_loop_discover_walk (oacc_loop *loop
   if (!gimple_call_internal_p (call))
 	continue;
 
-  if (gimple_call_internal_fn (call) != IFN_UNIQUE)
-	continue;
+  switch (gimple_call_internal_fn (call))
+	{
+	default:
+	  break;
 
-  enum ifn_unique_kind kind
-	= (enum ifn_unique_kind) TREE_INT_CST_LOW (gimple_call_arg (call, 0));
-  if (kind == IFN_UNIQUE_OACC_HEAD_MARK
-	  || kind == IFN_UNIQUE_OACC_TAIL_MARK)
-	{
-	  if (gimple_call_num_args (call) == 2)
-	{
-	  gcc_assert (marker && !remaining);
-	  marker = 0;
-	  if (kind == IFN_UNIQUE_OACC_TAIL_MARK)
-		loop = finish_oacc_loop (loop);
-	  else
-		loop->head_end = call;
-	}
-	  else
-	{
-	  int count = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+	case IFN_GOACC_LOOP:
+	  /* Count the goacc loop abstraction fns, to determine if the
+	 loop was collapsed already.  */
+	  loop->ifns++;
+	  break;
 
-	  if (!marker)
+	case IFN_UNIQUE:
+	  enum ifn_unique_kind kind
+	= (enum ifn_unique_kind) (TREE_INT_CST_LOW
+  (gimple_call_arg (call, 0)));
+	  if (kind == IFN_UNIQUE_OACC_HEAD_MARK
+	  || kind == IFN_UNIQUE_OACC_TAIL_MARK)
+	{
+	  if (gimple_call_num_args (call) == 2)
 		{
-		  if (kind == IFN_UNIQUE_OACC_HEAD_MARK)
-		loop = new_oacc_loop (loop, call);
-		  remaining = count;
+		  gcc_assert (marker && !remaining);
+		  marker = 0;
+		  if (kind == IFN_UNIQUE_OACC_TAIL_MARK)
+		loop = finish_oacc_loop (loop);
+		  else
+		loop->head_end = call;
 		}
-	  gcc_assert (count == remaining);
-	  if (remaining)
+	  else
 		{
-		  remaining--;
-		  if (kind == IFN_UNIQUE_OACC_HEAD_MARK)
-		loop->heads[marker] = call;
-		  else
-		loop->tails[remaining] = call;
+		  int count = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+
+		  if (!marker)
+		{
+		  if (kind == IFN_UNIQUE_OACC_HEAD_MARK)
+			loop = new_oacc_loop (loop, call);
+		  remaining = count;
+		}
+		  gcc_assert (count == remaini

Re: [PR fortran/60126] Internal compiler error with code using pointer reshaping (gfortran 4.8.2)

2016-02-24 Thread Harald Anlauf

On 02/24/16 20:42, Harald Anlauf wrote:
> Hello,
> 
> the above bug appears to have been fixed over 2.5 years ago.
> It does not trigger with 4.9, 5 and 6 trunk, but does with 4.8.0 and
> before.
> 
> I recommend to close the bug, while adding a testcase to the trunk's
> testsuite.  See e.g. the attached example.

I was missing the fact that the testsuite uses -pedantic-errors
which rejects the real*4 in the original testcase.  This non-standard
construct was not needed for the demonstration.  Fixed in the new
version.  Sorry for that.

Whoever wants to take it.

Harald

Index: gcc/testsuite/gfortran.dg/pr60126.f90
===
--- gcc/testsuite/gfortran.dg/pr60126.f90   (revision 0)
+++ gcc/testsuite/gfortran.dg/pr60126.f90   (revision 0)
@@ -0,0 +1,18 @@
+! { dg-do compile }
+! PR fortran/60126 - ICE on pointer rank remapping
+! Based on testcase by Michel Valin 
+
+subroutine simple_bug_demo
+  implicit none
+  interface
+ function offset_ptr_R4(nelements) result (dest)
+   implicit none
+   real, pointer, dimension(:) :: dest
+   integer, intent(IN) :: nelements
+ end function offset_ptr_R4
+  end interface
+
+  real, dimension(:,:), pointer :: R2D
+
+  R2D(-2:2,-3:3) => offset_ptr_R4(100)
+end


> 
> Harald
> 
> 2016-02-24  Harald Anlauf  
> 
>   * gfortran.dg/pr60126.f90: New test.
> 
> 
> Index: gcc/testsuite/gfortran.dg/pr60126.f90
> ===
> --- gcc/testsuite/gfortran.dg/pr60126.f90 (revision 0)
> +++ gcc/testsuite/gfortran.dg/pr60126.f90 (revision 0)
> @@ -0,0 +1,18 @@
> +! { dg-do compile }
> +! PR fortran/60126 - ICE on pointer rank remapping
> +! Based on testcase by Michel Valin 
> +
> +subroutine simple_bug_demo
> +  implicit none
> +  interface
> + function offset_ptr_R4(nelements) result (dest)
> +   implicit none
> +   real*4, pointer, dimension(:) :: dest
> +   integer, intent(IN) :: nelements
> + end function offset_ptr_R4
> +  end interface
> +
> +  real, dimension(:,:), pointer :: R2D
> +
> +  R2D(-2:2,-3:3) => offset_ptr_R4(100)
> +end
> 


-- 
Harald Anlauf
Dieburger Str. 17
60386 Frankfurt
Tel.: (069) 4014 8318

Re: [PATCH] Workaround LTO debug info bugs in dwarf2out (PR debug/69705)

2016-02-24 Thread Jason Merrill


OK.

Jason

Re: [PATCH] Further -Wnonnull-compare fixes (PR c++/69922)

2016-02-24 Thread Jason Merrill


On 02/24/2016 03:05 PM, Jakub Jelinek wrote:

On Wed, Feb 24, 2016 at 02:57:49PM -0500, Jason Merrill wrote:

On 02/24/2016 02:05 PM, Jakub Jelinek wrote:

+ && integer_zerop (tree_strip_nop_conversions (TREE_OPERAND (org_x,
+ 1


Why check this?  I think we want this handling for all TREE_NO_WARNING
comparisons.


There probably aren't any other anyway, so it is really not needed.
Is the patch ok with those two lines removed and ) moved?


Yes.

Jason

Re: PATCH to add -flifetime-dse=1

2016-02-24 Thread Jason Merrill


On 02/24/2016 10:25 AM, Jakub Jelinek wrote:

That should be @option{-flifetime-dse=1} I think.  Shouldn't -flifetime-dse=
be also in @opindex at the beginning of the paragraph, and documented what
the values mean (0 equivalent of -fno-lifetime-dse (or document it vice
versa) and 2 full lifetime dse enabled?




commit ab0d46cc0eccaa31b91488f2a691e36443cd2992
Author: jason 
Date:   Wed Feb 24 19:55:57 2016 +

	* doc/invoke.texi: Adjust -flifetime-dse documentation.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@233680 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b8b2e70..18b2b8f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -6812,7 +6812,9 @@ storage persisting beyond the lifetime of the object, you can use this
 flag to disable this optimization.  To preserve stores before the
 constructor starts (e.g. because your operator new clears the object
 storage) but still treat the object as dead after the destructor you,
-can use -flifetime-dse=1.
+can use @option{-flifetime-dse=1}.  The default behavior can be
+explicitly selected with @option{-flifetime-dse=2}.
+@option{-flifetime-dse=0} is equivalent to @option{-fno-lifetime-dse}.
 
 @item -flive-range-shrinkage
 @opindex flive-range-shrinkage

Re: [PATCH] Fix PR c++/69736

2016-02-24 Thread Jason Merrill


OK.

Jason

RE: [Patch, MIPS] Patch for PR 68273 (user aligned variable arguments)

2016-02-24 Thread Matthew Fortune

Steve Ellcey   writes:
> Here is a new patch for PR 68273.  The original problem with gsoap has
> been fixed by changing GCC to not create overly-aligned variables in
> the SRA pass but the MIPS specific problem of how user-aligned variables
> are passed to functions remains.
> 
> Because MIPS GCC is internally inconsistent, it seems like changing
> the passing convention for user aligned variables on MIPS is the best
> option, even though this is an ABI change for MIPS.
> 
> For example:
> 
>   typedef int alignedint __attribute__((aligned(8)));
>   int foo1 (int x, alignedint y) { return x+y; }
>   int foo2 (int x, int y) { return x+y; }
>   int foo3 (int x, alignedint y) { return x+y; }
>   int foo4 (int x, int y) { return x+y; }
>   int splat1(int a, alignedint b) { foo1(a,b); }
>   int splat2(int a, alignedint b) { foo2(a,b); }
>   int splat3(int a, int b) { foo3(a,b); }
>   int splat4(int a, int b) { foo4(a,b); }
> 
> In this case foo1 and foo3 would expect the second argument to be in
> register $6, but foo2 and foo4 would expect it in register $5.
> 
> Likewise splat1 and splat2 would pass the second argument in $6, but
> splat3 and splat4 would pass it in $5.
> 
> This means that the call from splat2 to foo2 and the call from splat3
> to foo3 would be wrong in that the caller is putting the argument in
> one register but the callee is getting out of a different register.

Thanks for enumerating all the cases. I'd not looked at all of them. I
do agree that we need a fix given the existing inconsistencies.
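
As a small aside on the typedef in the quoted example, the following sketch (assuming a compiler that supports the GNU aligned attribute and __alignof__; the helper function names are made up for illustration) shows that the attribute raises only the alignment of the type and leaves its size unchanged.  It does not model the MIPS register assignment itself.

```c
#include <stddef.h>

/* The typedef from the quoted example; the aligned attribute is a GNU
   extension.  */
typedef int alignedint __attribute__((aligned(8)));

/* Alignment of the user-aligned type, as requested by the attribute.  */
size_t
user_alignment (void)
{
  return __alignof__ (alignedint);
}

/* Natural alignment of plain int on the host (target dependent).  */
size_t
plain_alignment (void)
{
  return __alignof__ (int);
}

/* Nonzero if the attribute left the size of the type unchanged.  */
int
same_size_as_int (void)
{
  return sizeof (alignedint) == sizeof (int);
}
```

So the two parameter types carry the same values in the same number of bytes; only the recorded alignment differs, and that is the property the argument-boundary code reacts to.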

One question I have is where does an over aligned argument get pushed
to the stack with the patch in place. I.e. when taking its address, is
the alignment achieved up to the limit of stack alignment or do they now
only get register-size alignment? If the former then the idea that
argument passing is defined as a structure in memory with the first
portion in registers will no longer hold true. Not sure if that is a
problem.

> In none of these cases would GCC give a warning about the inconsistent
> parameter passing.
> 
> Since this patch could cause a change in the users program, I have
> added a warning that will be emitted whenever a user passes an
> aligned type or when a parameter is declared as an aligned type
> and that alignment could cause a change in how the value is passed.
> 
> I also added a way to turn that warning off in case the user doesn't
> want to see it (-mno-warn-aligned-args).  I did not add an option
> to make GCC pass arguments in the old manner as I consider that method
> of passing arguments to be a bug and I don't think we want to have an
> option to propagate that incorrect behavior.

This sounds good to me.

I'd like to wait and see if anyone else has comments here given the
amount of discussion there has been on this PR.

I've pointed out a couple of style issues inline below.

Matthew

> 
> Steve Ellcey
> sell...@imgtec.com
> 
> 
> 2016-02-24  Steve Ellcey  
> 
>   PR target/68273
>   * config/mips/mips.opt (mwarn-aligned-args): New flag.

Needs doc/invoke.texi update.

>   * config/mips/mips.h (mips_args): Add new field.
>   * config/mips/mips.c (mips_internal_function_arg): New function.
>   (mips_function_incoming_arg): New function.
>   (mips_old_function_arg_boundary): New function.
>   (mips_function_arg): Rewrite to use mips_internal_function_arg.
>   (mips_function_arg_boundary): Fix argument alignment.
>   (TARGET_FUNCTION_INCOMING_ARG): New define.
> 
> 
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index 697abc2..05465c1 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -1124,6 +1124,7 @@ static const struct mips_rtx_cost_data
>  static rtx mips_find_pic_call_symbol (rtx_insn *, rtx, bool);
>  static int mips_register_move_cost (machine_mode, reg_class_t,
>   reg_class_t);
> +static unsigned int mips_old_function_arg_boundary (machine_mode,
> const_tree);
>  static unsigned int mips_function_arg_boundary (machine_mode,
> const_tree);
>  static machine_mode mips_get_reg_raw_mode (int regno);
>  
> @@ -5459,11 +5460,11 @@ mips_strict_argument_naming (cumulative_args_t
> ca ATTRIBUTE_UNUSED)
>return !TARGET_OLDABI;
>  }
> 
> -/* Implement TARGET_FUNCTION_ARG.  */
> +/* Used to implement TARGET_FUNCTION_ARG and
> TARGET_FUNCTION_INCOMING_ARG.  */
> 
>  static rtx
> -mips_function_arg (cumulative_args_t cum_v, machine_mode mode,
> -const_tree type, bool named)
> +mips_internal_function_arg (cumulative_args_t cum_v, machine_mode mode,
> + const_tree type, bool named)
>  {
>CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
>struct mips_arg_info info;
> @@ -5586,6 +5587,50 @@ mips_function_arg (cumulative_args_t cum_v,
> machine_mode mode,
>return gen_rtx_REG (mode, mips_arg_regno (&info, TARGET_HARD_FLOAT));
>  }
> 
> +/* Implement TARGET_FUNCTION_ARG.  */
> +
>

[PATCH v2] gcov: Runtime configurable destination output

2016-02-24 Thread Aaron Conole

The previous gcov behavior was to always output errors on the stderr channel.
This is fine for most uses, but some programs will require stderr to be
untouched by libgcov for certain tests. This change allows configuring
the gcov output via an environment variable which will be used to open
the appropriate file.
---
v2:
* Retitled subject
* Cleaned up whitespace in libgcov-driver-system.c diff
* Lazy error file opening
* non-static error file
* No warnings during compilation

 libgcc/libgcov-driver-system.c | 35 ++-
 libgcc/libgcov-driver.c|  6 ++
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/libgcc/libgcov-driver-system.c b/libgcc/libgcov-driver-system.c
index 4e3b244..0eb9755 100644
--- a/libgcc/libgcov-driver-system.c
+++ b/libgcc/libgcov-driver-system.c
@@ -23,6 +23,24 @@ a copy of the GCC Runtime Library Exception along with this 
program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .  */
 
+FILE *__gcov_error_file = NULL;
+
+static FILE *
+get_gcov_error_file(void)
+{
+  char *gcov_error_filename = getenv("GCOV_ERROR_FILE");
+  FILE *gcov_error_file = NULL;
+  if (gcov_error_filename)
+{
+  FILE *openfile = fopen(gcov_error_filename, "a");
+  if (openfile)
+gcov_error_file = openfile;
+}
+  if (!gcov_error_file)
+gcov_error_file = stderr;
+  return gcov_error_file;
+}
+
 /* A utility function for outputing errors.  */
 
 static int __attribute__((format(printf, 1, 2)))
@@ -30,12 +48,27 @@ gcov_error (const char *fmt, ...)
 {
   int ret;
   va_list argp;
+
+  if (!__gcov_error_file)
+__gcov_error_file = get_gcov_error_file();
+
   va_start (argp, fmt);
-  ret = vfprintf (stderr, fmt, argp);
+  ret = vfprintf (__gcov_error_file, fmt, argp);
   va_end (argp);
   return ret;
 }
 
+#if !IN_GCOV_TOOL
+static void
+gcov_error_exit(void)
+{
+  if (__gcov_error_file && __gcov_error_file != stderr)
+{
+  fclose(__gcov_error_file);
+}
+}
+#endif
+
 /* Make sure path component of the given FILENAME exists, create
missing directories. FILENAME must be writable.
Returns zero on success, or -1 if an error occurred.  */
diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
index 9c4eeca..83d84c5c 100644
--- a/libgcc/libgcov-driver.c
+++ b/libgcc/libgcov-driver.c
@@ -46,6 +46,10 @@ void __gcov_init (struct gcov_info *p __attribute__ 
((unused))) {}
 /* A utility function for outputing errors.  */
 static int gcov_error (const char *, ...);
 
+#if !IN_GCOV_TOOL
+static void gcov_error_exit(void);
+#endif
+
 #include "gcov-io.c"
 
 struct gcov_fn_buffer
@@ -878,6 +882,8 @@ gcov_exit (void)
 __gcov_root.prev->next = __gcov_root.next;
   else
 __gcov_master.root = __gcov_root.next;
+
+  gcov_error_exit();
 }
 
 /* Add a new object file onto the bb chain.  Invoked automatically
-- 
2.5.0

Re: [Patch, MIPS] Patch for PR 68273 (user aligned variable arguments)

2016-02-24 Thread Joseph Myers

On Wed, 24 Feb 2016, Steve Ellcey  wrote:

> I also added a way to turn that warning off in case the user doesn't
> want to see it (-mno-warn-aligned-args).  I did not add an option

Any command-line options need documenting in invoke.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Fix/work around PR57676, LRA terminates prematurely

2016-02-24 Thread Jeff Law


On 02/23/2016 04:56 AM, Richard Biener wrote:
> On Mon, Feb 22, 2016 at 4:34 PM, Jeff Law wrote:
>> On 02/22/2016 07:34 AM, Richard Biener wrote:
>>>
>>> Hum, but then you get to "infinite" compiles again when LRA is buggy
>>> or the user presents it with an impossible to handle asm.
>>
>> Neither should be happening in practice, even an impossible asm should cause
>> LRA to halt in some way or another.
>>
>> In practice looping has occurred due to bugs in machine descriptions and is
>> typically seen during development/porting.  Hence the desire to put it under
>> -fchecking for gcc-6 and possibly implement something smarter for gcc-7
>> (where we'd track more precisely whether or not we're making forward
>> progress).
>>
>>> I don't think that's a good idea - maybe bumping the limit is the way to
>>> go instead?
>>
>> No, because one just needs to build a longer chain of insns needing
>> reloading.
>>
>>> 30 constraint passes sounds excessive and a sign of a bug to me anyway.
>>
>> Not really.  If you look at the testcase and the chain of reloads, it's
>> legitimate.  Essentially each pass exposes a case where we spill a register in
>> an insn that previously had a register allocated.
>
> But requiring another full reload pass to handle such chains is pointing at
> a wrong algorithm IMHO.  Isn't this also quadratic in the length of the chain?

Note that reload behaves similarly.  Not for this exact case, but it's 
essentially a "keep iterating until nothing changes" algorithm.  It's 
just so poor in its ability to assign things to registers that these 
deep chains are rare.


As Vlad noted, the test is definitely a pathological case.  I think 
Bernd's patch is a very reasonable approach to address the current 
problem.  Namely that LRA can be making progress on a pathological 
testcase, but it gets terminated by the anti-looping clamp.  The clamp 
itself was put in place to catch bugs in LRA or ports that are in the 
process of converting to LRA.
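
As a rough illustration of the tension described above (a toy model, not
LRA's actual code), the sketch below iterates "until nothing changes" under
a hard cap.  The names run_one_pass, iterate_with_clamp and MAX_PASSES are
hypothetical; the value 30 is taken from the "30 constraint passes" remark
in the quoted discussion.  Each pass resolves one link of a reload chain and
exposes the next, so a sufficiently long but legitimate chain trips the
clamp even though every pass makes progress.

```c
#include <stdbool.h>

/* Hypothetical cap on passes, standing in for the anti-looping clamp.  */
#define MAX_PASSES 30

/* One toy "constraint pass": resolve the head of the pending chain,
   which exposes the next link.  Returns true if anything changed.  */
static bool
run_one_pass (int *pending)
{
  if (*pending == 0)
    return false;
  --*pending;
  return true;
}

/* Iterate until nothing changes.  Returns the number of passes that made
   progress, or -1 if the clamp fired while progress was still being made.  */
int
iterate_with_clamp (int chain_length)
{
  int pending = chain_length;
  int passes = 0;

  while (run_one_pass (&pending))
    {
      if (++passes > MAX_PASSES)
        return -1;
    }
  return passes;
}
```

A chain of 30 links completes, while a chain of 31 is cut off although it
would have terminated one pass later; raising the cap only moves that
boundary, which is the objection to "bumping the limit".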


Jeff

Re: [PATCH] hurd: align -p and -pg behavior on Linux

2016-02-24 Thread Thomas Schwinge

Hi!

Sorry for the late answer...

On Sat, 19 Sep 2015 14:00:23 +0200, Samuel Thibault  
wrote:
> On Linux, -p and -pg do not make gcc link against libc_p.a, only
> -profile does (as documented in r11246), and thus people expect -p

(Yo, 20 years ago...)

> and -pg to work without libc_p.a installed (it is actually even not
> available any more in Debian).  We should thus rather make the Hurd port
> do the same to avoid build failures.

Conceptually, ACK.


>   * gcc/config/gnu.h (LIB_SPEC) [-p|-pg]: Link with -lc instead of -lc_p.

> --- gcc/config/gnu.h.orig 2015-09-16 00:43:09.785570853 +0200
> +++ gcc/config/gnu.h  2015-09-16 00:43:12.513550418 +0200
> @@ -25,7 +25,7 @@
>  
>  /* Default C library spec.  */
>  #undef LIB_SPEC
> -#define LIB_SPEC "%{pthread:-lpthread} %{pg|p|profile:-lc_p;:-lc}"
> +#define LIB_SPEC "%{pthread:-lpthread} %{profile:-lc_p;:-lc}"

I guess, we can just drop that custom LIB_SPEC altogether, and will then
use the default (which is also used for x86 GNU/Linux):

gcc/config/gnu-user.h:#define GNU_USER_TARGET_NO_PTHREADS_LIB_SPEC \
gcc/config/gnu-user.h-  "%{shared:-lc} \
gcc/config/gnu-user.h-   %{!shared:%{mieee-fp:-lieee} %{profile:-lc_p}%{!profile:-lc}}"
gcc/config/gnu-user.h-
gcc/config/gnu-user.h:#define GNU_USER_TARGET_LIB_SPEC \
gcc/config/gnu-user.h-  "%{pthread:-lpthread} " \
gcc/config/gnu-user.h:  GNU_USER_TARGET_NO_PTHREADS_LIB_SPEC
gcc/config/gnu-user.h-
gcc/config/gnu-user.h:#undef  LIB_SPEC
gcc/config/gnu-user.h:#define LIB_SPEC GNU_USER_TARGET_LIB_SPEC

I have not tested the -mieee-fp thingy, but I don't expect any issues
there; looks like this on both x86 GNU/Linux and GNU/Hurd:

$ nm /usr/lib/i386-*gnu/libieee.a
 D _LIB_VERSION


That said, I think we can also drop our custom CPP_SPEC:

gcc/config/gnu.h:#define CPP_SPEC "%{posix:-D_POSIX_SOURCE}"

..., and instead go with the default:

gcc/config/i386/gnu-user-common.h:#define CPP_SPEC "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"

(It doesn't matter for GNU/Hurd which is x86-only, but I don't know why
CPP_SPEC is defined in gcc/config/i386/gnu-user-common.h -- and repeated
in a number of other GNU-user and */linux.h files -- instead of putting
it into the generic gcc/config/gnu-user.h?)

I guess getting -D_REENTRANT for -pthread won't do us any harm?


> * gcc/config/i386/gnu.h (STARTFILE_SPEC) [-p|-pg]: Use gcrt1.o
> instead of gcrt0.o.

> --- gcc/config/i386/gnu.h.orig2015-09-17 21:41:13.0 +
> +++ gcc/config/i386/gnu.h 2015-09-17 23:03:57.0 +
> @@ -27,11 +27,11 @@
>  #undef   STARTFILE_SPEC
>  #if defined HAVE_LD_PIE
>  #define STARTFILE_SPEC \
> -  "%{!shared: 
> %{pg|p|profile:gcrt0.o%s;pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}} \
> +  "%{!shared: 
> %{pg|p:gcrt1.o%s;profile:gcrt0.o%s;pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}} \
> crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
>  #else
>  #define STARTFILE_SPEC \
> -  "%{!shared: %{pg|p|profile:gcrt0.o%s;static:crt0.o%s;:crt1.o%s}} \
> +  "%{!shared: %{pg|p:gcrt1.o%s;profile:gcrt0.o%s;static:crt0.o%s;:crt1.o%s}} 
> \
> crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
>  #endif

I think I understand what you're trying to do (avoid gcrt0.o being used
for -pg or -p, and instead use gcrt1.o), but I'm not sure this is
completely correct.  Which of the several configurations, that is, flags
and their combinations, did you actually test?

In my understanding, the Hurd needs crt0.o for static linking, and crt1.o
for dynamic linking.  Likewise, for -pg or -p, I would assume that we
still need gcrt0.o for static linking, and gcrt1.o for dynamic linking.
Instead you're now suggesting to always use gcrt1.o for -pg or -p, and
gcrt0.o for -profile.
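
The selection I have in mind can be sketched in C roughly as follows.  This
is only a model of the spec logic, not driver code: struct link_flags,
enum startfile and pick_startfile are hypothetical names, and the "pie"
branch mirrors the HAVE_LD_PIE variant of STARTFILE_SPEC.  The point is that
the profiling flags and -static are independent choices, so -pg/-p/-profile
select the "g" variant of whichever startup object the link mode needs.

```c
#include <stdbool.h>

/* Hypothetical summary of the relevant driver flags.  */
struct link_flags
{
  bool shared;     /* -shared */
  bool is_static;  /* -static */
  bool pie;        /* -pie */
  bool profiling;  /* any of -pg, -p, -profile */
};

/* Startup objects named in the specs under discussion.  */
enum startfile
{
  START_NONE,   /* -shared: no crt0/crt1 at all */
  START_CRT0,   /* crt0.o: static link */
  START_CRT1,   /* crt1.o: dynamic link */
  START_GCRT0,  /* gcrt0.o: static link with profiling */
  START_GCRT1,  /* gcrt1.o: dynamic link with profiling */
  START_SCRT1   /* Scrt1.o: PIE */
};

/* Model of:
   %{!shared: %{pg|p|profile:%{static:gcrt0.o;:gcrt1.o};
                pie:Scrt1.o;static:crt0.o;:crt1.o}}  */
enum startfile
pick_startfile (struct link_flags f)
{
  if (f.shared)
    return START_NONE;
  if (f.profiling)
    return f.is_static ? START_GCRT0 : START_GCRT1;
  if (f.pie)
    return START_SCRT1;
  if (f.is_static)
    return START_CRT0;
  return START_CRT1;
}
```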


I'm now testing the following patch:

--- gcc/config/gnu.h
+++ gcc/config/gnu.h
@@ -19,14 +19,6 @@ You should have received a copy of the GNU General Public License
 along with GCC.  If not, see .
 */
 
-/* Provide GCC options for standard feature-test macros.  */
-#undef CPP_SPEC
-#define CPP_SPEC "%{posix:-D_POSIX_SOURCE}"
-
-/* Default C library spec.  */
-#undef LIB_SPEC
-#define LIB_SPEC "%{pthread:-lpthread} %{pg|p|profile:-lc_p;:-lc}"
-
 #undef GNU_USER_TARGET_OS_CPP_BUILTINS
 #define GNU_USER_TARGET_OS_CPP_BUILTINS()  \
 do {   \
--- gcc/config/i386/gnu.h
+++ gcc/config/i386/gnu.h
@@ -27,11 +27,11 @@ along with GCC.  If not, see .
 #undef STARTFILE_SPEC
 #if defined HAVE_LD_PIE
 #define STARTFILE_SPEC \
-  "%{!shared: 
%{pg|p|profile:gcrt0.o%s;pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}} \
+  "%{!shared: 
%{pg|p|profile:%{static:gcrt0.o%s;:gcrt1.o%s};pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}}
 \
crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
 #else
 #define STARTFILE_SPEC \
-  "%{!shared: %{p

[PATCH] powerpc: Handle DImode rotatert implemented with rlwinm (PR69946)

2016-02-24 Thread Segher Boessenkool

Some DImode rotate-right-and-mask can be implemented best with a rlwinm
instruction: those that could be a lshiftrt instead of a rotatert, while
the mask is not right-aligned.  Why the rotate in the testcase is not
optimised to a plain shift is another question, but we need to handle
it here anyway.  We compute the shift amount for a 64-bit rotate, which
is 32 too high in this case; printing it with %h masks that out.  (This
doesn't silently let through invalid instructions: everything is
checked by rs6000_is_valid_shift_mask, which is much more thorough.)
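
The arithmetic can be checked with a small C sketch.  It assumes that %h
keeps the low five bits of the constant (which is how the description above
reads), and it models rlwinm simply as a 32-bit rotate-left of the low word
followed by a mask; the helper names are made up for illustration.  The
numbers 44, 12 and the mask 0xf00 are the ones from the testcase comment.

```c
#include <stdint.h>

/* Reduce a 64-bit rotate count to the 5-bit field a 32-bit rotate uses
   (the assumed effect of printing the operand with %h).  */
unsigned
low5 (unsigned shift)
{
  return shift & 31;
}

/* Rotate left; callers only pass counts strictly between 0 and the width.  */
uint64_t
rotl64 (uint64_t x, unsigned n)
{
  return (x << n) | (x >> (64 - n));
}

uint32_t
rotl32 (uint32_t x, unsigned n)
{
  return (x << n) | (x >> (32 - n));
}

/* What the RTL asks for: rotate:DI by 44, then mask with 0xf00.  */
uint64_t
di_rotate_and_mask (uint64_t x)
{
  return rotl64 (x, 44) & 0xf00;
}

/* What the rlwinm computes in this model: rotate the low word by 12,
   then apply the same mask.  */
uint64_t
rlwinm_equivalent (uint64_t x)
{
  return rotl32 ((uint32_t) x, low5 (44)) & 0xf00;
}
```

Both functions pick up bits 28..31 of the input and place them at bits
8..11, which is why the two forms agree for this mask.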

Built and tested on powerpc64-linux, -m32,-m64 and -mlra,-mno-lra.  Also
tested the new test on powerpc64le-linux (where the test is skipped).
Is this okay for trunk?


Segher


2016-02-24  Segher Boessenkool  

PR target/69946
* config/rs6000/rs6000.c (rs6000_insn_for_shift_mask): Print rlwinm
shift amount using %h.

gcc/testsuite/
* pr69946.c: New file.

---
 gcc/config/rs6000/rs6000.c |  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr69946.c | 38 ++
 2 files changed, 40 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr69946.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 92cc9ee..d5abb98 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -17439,8 +17439,8 @@ rs6000_insn_for_shift_mask (machine_mode mode, rtx *operands, bool dot)
   operands[3] = GEN_INT (31 - nb);
   operands[4] = GEN_INT (31 - ne);
   if (dot)
-   return "rlw%I2nm. %0,%1,%2,%3,%4";
-  return "rlw%I2nm %0,%1,%2,%3,%4";
+   return "rlw%I2nm. %0,%1,%h2,%3,%4";
+  return "rlw%I2nm %0,%1,%h2,%3,%4";
 }
 
   gcc_unreachable ();
diff --git a/gcc/testsuite/gcc.target/powerpc/pr69946.c b/gcc/testsuite/gcc.target/powerpc/pr69946.c
new file mode 100644
index 000..d5f8bf2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr69946.c
@@ -0,0 +1,38 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc_elfv2 } } */
+/* { dg-options "-O2" } */
+
+/* This generates a rotate:DI by 44, with mask 0xf00, which is implemented
+   using a rlwinm instruction.  We used to write 44 for the shift count
+   there; it should be 12.  */
+
+struct A
+{
+  int a : 4;
+  int : 2;
+  int b : 2;
+  int : 2;
+  int c : 2;
+  int d : 1;
+  int e;
+};
+struct B
+{
+  int a : 4;
+} *a;
+void bar (struct A);
+
+void
+foo (void)
+{
+  struct B b = a[0];
+  struct A c;
+  c.a = b.a;
+  c.b = 1;
+  c.c = 1;
+  c.d = 0;
+  bar (c);
+}
+
+/* { dg-final { scan-assembler-not {(?n)rlwinm.*,44,20,23} } } */
+/* { dg-final { scan-assembler-times {(?n)rlwinm.*,12,20,23} 1 } } */
-- 
1.9.3

[ARM] Add support for overflow add, sub, and neg operations

2016-02-24 Thread Michael Collison

This patch adds support for builtin overflow of add, subtract and 
negate. This patch is targeted for gcc 7 stage 1. It was tested with no 
regressions in arm and thumb modes on the following targets:


arm-none-linux-gnueabi
arm-none-linux-gnueabihf
armeb-none-linux-gnueabihf
arm-none-eabi
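
For reference, the user-visible side of this is GCC's generic overflow
builtins.  The sketch below is only an illustration of the operations
involved; the wrapper names are hypothetical, and which expander handles
which builtin is my reading of the ChangeLog rather than something the code
shows.  Negation is written as a subtraction from zero.

```c
#include <limits.h>
#include <stdbool.h>

/* Signed add: returns true if a + b overflows; *sum gets the wrapped result.  */
bool
add_overflows (int a, int b, int *sum)
{
  return __builtin_add_overflow (a, b, sum);
}

/* Unsigned add: overflow here means a carry out of the top bit.  */
bool
uadd_overflows (unsigned a, unsigned b, unsigned *sum)
{
  return __builtin_add_overflow (a, b, sum);
}

/* Signed subtract.  */
bool
sub_overflows (int a, int b, int *diff)
{
  return __builtin_sub_overflow (a, b, diff);
}

/* Signed negate, expressed as 0 - a; only INT_MIN overflows.  */
bool
neg_overflows (int a, int *res)
{
  return __builtin_sub_overflow (0, a, res);
}
```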

2016-02-24  Michael Collison  

* config/arm/arm-modes.def: Add new condition code mode CC_V
to represent the overflow bit.
* config/arm/arm.c (maybe_get_arm_condition_code):
Add support for CC_Vmode.
* config/arm/arm.md (addv4, add3_compareV,
addsi3_compareV_upper): New patterns to support signed
builtin overflow add operations.
(uaddv4, add3_compareC, addsi3_compareV_upper):
New patterns to support unsigned builtin add overflow operations.
(subv4, sub3_compare1): New patterns to support signed
builtin overflow subtract operations.
(usubv4): New patterns to support unsigned builtin subtract
overflow operations.
(negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New patterns
to support builtin overflow negate operations.


--
Michael Collison
Linaro Toolchain Working Group
michael.colli...@linaro.org

diff --git a/gcc/config/arm/arm-modes.def b/gcc/config/arm/arm-modes.def
index 1819553..69231f2 100644
--- a/gcc/config/arm/arm-modes.def
+++ b/gcc/config/arm/arm-modes.def
@@ -59,6 +59,7 @@ CC_MODE (CC_DGEU);
 CC_MODE (CC_DGTU);
 CC_MODE (CC_C);
 CC_MODE (CC_N);
+CC_MODE (CC_V);
 
 /* Vector modes.  */
 VECTOR_MODES (INT, 4);/*V4QI V2HI */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index d8a2745..e0fbb6f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -22854,6 +22854,8 @@ maybe_get_arm_condition_code (rtx comparison)
 	{
 	case LTU: return ARM_CS;
 	case GEU: return ARM_CC;
+	case NE: return ARM_CS;
+	case EQ: return ARM_CC;
 	default: return ARM_NV;
 	}
 
@@ -22879,6 +22881,15 @@ maybe_get_arm_condition_code (rtx comparison)
 	default: return ARM_NV;
 	}
 
+case CC_Vmode:
+  switch (comp_code)
+	{
+	case NE: return ARM_VS;
+	case EQ: return ARM_VC;
+	default: return ARM_NV;
+
+	}
+
 case CCmode:
   switch (comp_code)
 	{
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 64873a2..705fe0b 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -539,6 +539,42 @@
(set_attr "type" "multiple")]
 )
 
+(define_expand "addv4"
+  [(match_operand:SIDI 0 "register_operand")
+   (match_operand:SIDI 1 "register_operand")
+   (match_operand:SIDI 2 "register_operand")
+   (match_operand 3 "")]
+  "TARGET_ARM"
+{
+  emit_insn (gen_add3_compareV (operands[0], operands[1], operands[2]));
+
+  rtx x;
+  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Vmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			gen_rtx_LABEL_REF (VOIDmode, operands[3]),
+			pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+  DONE;
+})
+
+(define_expand "uaddv4"
+  [(match_operand:SIDI 0 "register_operand")
+   (match_operand:SIDI 1 "register_operand")
+   (match_operand:SIDI 2 "register_operand")
+   (match_operand 3 "")]
+  "TARGET_ARM"
+{
+  emit_insn (gen_add3_compareC (operands[0], operands[1], operands[2]));
+
+  rtx x;
+  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Cmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			gen_rtx_LABEL_REF (VOIDmode, operands[3]),
+			pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+  DONE;
+})
+
 (define_expand "addsi3"
   [(set (match_operand:SI  0 "s_register_operand" "")
 	(plus:SI (match_operand:SI 1 "s_register_operand" "")
@@ -616,6 +652,163 @@
  ]
 )
 
+(define_insn_and_split "adddi3_compareV"
+  [(set (reg:CC_V CC_REGNUM)
+	(ne:CC_V
+	  (plus:TI
+	(sign_extend:TI (match_operand:DI 1 "register_operand" "r"))
+	(sign_extend:TI (match_operand:DI 2 "register_operand" "r")))
+	  (sign_extend:TI (plus:DI (match_dup 1) (match_dup 2)
+   (set (match_operand:DI 0 "register_operand" "=r")
+	(plus:DI (match_dup 1) (match_dup 2)))]
+  "TARGET_ARM"
+  "#"
+  "TARGET_ARM && reload_completed"
+  [(parallel [(set (reg:CC_C CC_REGNUM)
+		   (compare:CC_C (plus:SI (match_dup 1) (match_dup 2))
+ (match_dup 1)))
+	  (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))])
+   (parallel [(set (reg:CC_V CC_REGNUM)
+		   (ne:CC_V
+		(plus:DI (plus:DI
+			  (sign_extend:DI (match_dup 4))
+			  (sign_extend:DI (match_dup 5)))
+			 (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
+		(plus:DI (sign_extend:DI
+			  (plus:SI (match_dup 4) (match_dup 5)))
+			 (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)
+	 (set (match_dup 3) (plus:SI (plus:SI
+	  (match_dup 4) (match_dup 5))
+	 (ltu:SI (reg:CC_C CC_REGNUM)
+		 (const_int 0])]
+  "
+  {
+operands[3] = gen_highpart (SImode, operands[0]);
+operands[0] = gen_lowpart (SImode, operands[0]);
+operands[4] = gen_highpart (SImode, operands[1]);
+operands[1] = gen_lowpart (SImode, operands[1]);
+

RE: [Patch, MIPS] Patch for PR 68273 (user aligned variable arguments)

2016-02-24 Thread Steve Ellcey

On Wed, 2016-02-24 at 13:46 -0800, Matthew Fortune wrote:

> Thanks for enumerating all the cases. I'd not looked at all of them. I
> do agree that we need a fix given the existing inconsistencies.
> 
> One question I have is where does an over aligned argument get pushed
> to the stack with the patch in place. I.e. when taking its address, is
> the alignment achieved up to the limit of stack alignment or do they now
> only get register-size alignment? If the former then the idea that
> argument passing is defined as a structure in memory with the first
> portion in registers will no longer hold true. Not sure if that is a
> problem.

Passing arguments on the stack is not affected by this change, except
that we may start using memory at a different argument because of
changes in the number of registers we use to pass arguments.  The extra
alignment was already being ignored for arguments passed on the stack
and was only affecting register arguments.  So we are now more
consistent in terms of argument passing looking like a structure.  Those
structure 'fields' now do not include the user defined alignment for
register args or memory args.  An example:

typedef int alignedint __attribute__((aligned(8)));
int foo1 (int a, int b, int c, int d, int e, int x, alignedint y, int z)
{ return a+b+c+d+e+x+y+z; }

e, x, y, and z are passed at offsets 16, 20, 24, and 28 both before
and after this change.
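
The "argument list looks like a structure" view of that example can be
written down as a sketch.  struct foo1_args is a hypothetical type used only
for illustration, and the offsets assume a 4-byte int; y is declared as a
plain int because the user-specified alignment is ignored for argument
layout.

```c
#include <stddef.h>

/* Hypothetical memory image of foo1's argument list, with the aligned(8)
   attribute on y deliberately dropped.  */
struct foo1_args
{
  int a, b, c, d;  /* offsets 0..12 */
  int e;           /* offset 16 */
  int x;           /* offset 20 */
  int y;           /* offset 24, declared alignedint in foo1 */
  int z;           /* offset 28 */
};

/* Compile-time checks of the offsets quoted above (4-byte int assumed).  */
_Static_assert (offsetof (struct foo1_args, e) == 16, "e at 16");
_Static_assert (offsetof (struct foo1_args, x) == 20, "x at 20");
_Static_assert (offsetof (struct foo1_args, y) == 24, "y at 24");
_Static_assert (offsetof (struct foo1_args, z) == 28, "z at 28");
```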

> > diff --git a/gcc/testsuite/gcc.dg/pr48335-2.c
> > b/gcc/testsuite/gcc.dg/pr48335-2.c
> > index a37c079..f56b6fd 100644
> > --- a/gcc/testsuite/gcc.dg/pr48335-2.c
> > +++ b/gcc/testsuite/gcc.dg/pr48335-2.c
> > @@ -1,6 +1,7 @@
> >  /* PR middle-end/48335 */
> >  /* { dg-do compile } */
> >  /* { dg-options "-O2 -fno-tree-sra" } */
> > +/* { dg-options "-O2 -fno-tree-sra -mno-warn-aligned-args" { target
> > mips*-*-* } } */
> 
> Presumably this means that both under alignment of 64-bit types and over
> alignment of < 64-bit types are affected? The explicit test cases do not
> cover an under aligned long long case which would be good to cover.

Yes, the under alignment of long long's is affected.  I have added a new
testcase (pr68273-3.c) to check that.

Here is a new version of the patch with the invoke.texi text, the new
testcase, and the style changes you pointed out.

Steve Ellcey
sell...@imgtec.com


2016-02-24  Steve Ellcey  

PR target/68273
* config/mips/mips.opt (mwarn-aligned-args): New flag.
* config/mips/mips.h (mips_args): Add new field.
* config/mips/mips.c (mips_internal_function_arg): New function.
(mips_function_incoming_arg): New function.
(mips_old_function_arg_boundary): New function.
(mips_function_arg): Rewrite to use mips_internal_function_arg.
(mips_function_arg_boundary): Fix argument alignment.
(TARGET_FUNCTION_INCOMING_ARG): New define.
* doc/invoke.texi (MIPS Options): Document new flag.


diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 5af3d1e..ea4733b 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -1124,6 +1124,7 @@ static const struct mips_rtx_cost_data
 static rtx mips_find_pic_call_symbol (rtx_insn *, rtx, bool);
 static int mips_register_move_cost (machine_mode, reg_class_t,
reg_class_t);
+static unsigned int mips_old_function_arg_boundary (machine_mode, const_tree);
 static unsigned int mips_function_arg_boundary (machine_mode, const_tree);
 static machine_mode mips_get_reg_raw_mode (int regno);
 
@@ -5459,11 +5460,11 @@ mips_strict_argument_naming (cumulative_args_t ca ATTRIBUTE_UNUSED)
   return !TARGET_OLDABI;
 }
 
-/* Implement TARGET_FUNCTION_ARG.  */
+/* Used to implement TARGET_FUNCTION_ARG and TARGET_FUNCTION_INCOMING_ARG.  */
 
 static rtx
-mips_function_arg (cumulative_args_t cum_v, machine_mode mode,
-  const_tree type, bool named)
+mips_internal_function_arg (cumulative_args_t cum_v, machine_mode mode,
+   const_tree type, bool named)
 {
   CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
   struct mips_arg_info info;
@@ -5586,6 +5587,54 @@ mips_function_arg (cumulative_args_t cum_v, machine_mode mode,
   return gen_rtx_REG (mode, mips_arg_regno (&info, TARGET_HARD_FLOAT));
 }
 
+/* Implement TARGET_FUNCTION_ARG.  */
+
+static rtx
+mips_function_arg (cumulative_args_t cum_v, machine_mode mode,
+  const_tree type, bool named)
+{
+  bool doubleword_aligned_p = (mips_function_arg_boundary (mode, type)
+ > BITS_PER_WORD);
+  bool old_doubleword_aligned_p = (mips_old_function_arg_boundary (mode, type)
+ > BITS_PER_WORD);
+  CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
+
+  if (doubleword_aligned_p != old_doubleword_aligned_p
+  && mips_warn_aligned_args && !cum->aligned_arg_warning_given)
+{
+  warning (0, "argument %d in the call may be passed in a manner"
+  " incom

Re: [PATCH] hurd: align -p and -pg behavior on Linux

2016-02-24 Thread Samuel Thibault

Thomas Schwinge, on Wed 24 Feb 2016 23:46:36 +0100, wrote:
> I guess getting -D_REENTRANT for -pthread won't do us any harm?

It won't.

> > --- gcc/config/i386/gnu.h.orig  2015-09-17 21:41:13.0 +
> > +++ gcc/config/i386/gnu.h   2015-09-17 23:03:57.0 +
> > @@ -27,11 +27,11 @@
> >  #undef STARTFILE_SPEC
> >  #if defined HAVE_LD_PIE
> >  #define STARTFILE_SPEC \
> > -  "%{!shared: 
> > %{pg|p|profile:gcrt0.o%s;pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}} \
> > +  "%{!shared: 
> > %{pg|p:gcrt1.o%s;profile:gcrt0.o%s;pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}}
> >  \
> > crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
> >  #else
> >  #define STARTFILE_SPEC \
> > -  "%{!shared: %{pg|p|profile:gcrt0.o%s;static:crt0.o%s;:crt1.o%s}} \
> > +  "%{!shared: 
> > %{pg|p:gcrt1.o%s;profile:gcrt0.o%s;static:crt0.o%s;:crt1.o%s}} \
> > crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
> >  #endif
> 
> I think I understand what you're trying to do (avoid gcrt0.o being used
> for -pg or -p, and instead use gcrt1.o),

Yes.

> Likewise, for -pg or -p, I would assume that we
> still need gcrt0.o for static linking, and gcrt1.o for dynamic linking.

Mmm, probably indeed.

> -  "%{!shared: 
> %{pg|p|profile:gcrt0.o%s;pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}} \
> +  "%{!shared: 
> %{pg|p|profile:%{static:gcrt0.o%s;:gcrt1.o%s};pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}}
>  \

Yes, that looks reasonable.

Samuel

[patch, fortran] PR69110 ICE with NEWUNIT

2016-02-24 Thread Jerry DeLisle

This patch from Steve on c.l.f fixes the segfault from attempting a string
compare where there is no string yet.

Regression tested on x86-64.  New test case.

OK for trunk.

Regards,

Jerry

2016-02-24  Jerry DeLisle  
Steven G. Kargl  

PR fortran/69110
* io.c (gfc_match_open): Check that open status is an expression
constant before comparing string to 'scratch' with NEWUNIT.
diff --git a/gcc/fortran/io.c b/gcc/fortran/io.c
index fddd36b..da0e1c5 100644
--- a/gcc/fortran/io.c
+++ b/gcc/fortran/io.c
@@ -1890,13 +1890,16 @@ gfc_match_open (void)
 	  goto cleanup;
 	}
 
-  if (!(open->file || (open->status
-  && gfc_wide_strncasecmp (open->status->value.character.string,
-   "scratch", 7) == 0)))
-	{
-	  gfc_error ("NEWUNIT specifier must have FILE= "
-		 "or STATUS='scratch' at %C");
-	  goto cleanup;
+  if (!open->file && open->status)
+{
+	  if (open->status->expr_type == EXPR_CONSTANT
+	 && gfc_wide_strncasecmp (open->status->value.character.string,
+   "scratch", 7) != 0)
+	   {
+	 gfc_error ("NEWUNIT specifier must have FILE= "
+			"or STATUS='scratch' at %C");
+	 goto cleanup;
+	   }
 	}
 }
   else if (!open->unit)
diff --git a/gcc/testsuite/gfortran.dg/newunit_4.f90 b/gcc/testsuite/gfortran.dg/newunit_4.f90
new file mode 100644
index 000..4d7d738
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/newunit_4.f90
@@ -0,0 +1,7 @@
+! { dg-do compile }
+! PR69110 ICE with NEWUNIT 
+subroutine open_file_safe(fname, fstatus, faction, fposition, funit)
+  character(*), intent(in)  :: fname, fstatus, faction, fposition
+  integer, intent(out)  :: funit
+  open(newunit=funit, status=fstatus)
+end subroutine open_file_safe

[PATCH] [RFA] [PR tree-optimization/69740] Schedule loop fixups when needed

2016-02-24 Thread Jeff Law



PR69740 shows two instances where one or more transformations ultimately 
lead to the removal of a basic block.


In both cases, removal of the basic block removes a path into an 
irreducible region and turns the irreducible region into a natural loop.


When that occurs we need to be requesting loops to be fixed up.

My first patch to handle this was in tree-ssa-dce.c, and that 
fixed the initial problem report.  As I was cobbling the patch together, 
I pondered putting the changes into delete_basic_block because that 
would capture other instances of this problem.


When I looked at the second instance, it came via a completely different 
path (tail merging).  Again it was a case where we called 
delete_basic_block which in turn changed an irreducible region into a 
natural loop.  So I tossed my original patch and put the test into 
delete_basic_block as you see here.


Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the 
trunk and the gcc-5 branch after a suitable soak time?




Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 913abc8..42e5b4f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2016-02-24  Jeff Law  
+
+   PR tree-optimization/69740
+   * cfghooks.c (delete_basic_block): Request loop fixups if we delete
+   a block with an outgoing edge to a block marked as being in an
+   irreducible region.
+
 2016-02-24  Jason Merrill  
 
* common.opt (flifetime-dse): Add -flifetime-dse=1.
diff --git a/gcc/cfghooks.c b/gcc/cfghooks.c
index bbb1017..4d31aa9 100644
--- a/gcc/cfghooks.c
+++ b/gcc/cfghooks.c
@@ -574,6 +574,14 @@ delete_basic_block (basic_block bb)
   if (!cfg_hooks->delete_basic_block)
 internal_error ("%s does not support delete_basic_block", cfg_hooks->name);
 
+  /* Look at BB's successors, if any are marked as BB_IRREDUCIBLE_LOOP, then
+ removing BB (and its outgoing edges) may make the loop a natural
+ loop.  In which case we need to schedule loop fixups.  */
+  if (current_loops)
+    for (edge_iterator ei = ei_start (bb->succs); !ei_end_p (ei); ei_next (&ei))
+  if (ei_edge (ei)->dest->flags & BB_IRREDUCIBLE_LOOP)
+   loops_state_set (LOOPS_NEED_FIXUP);
+
   cfg_hooks->delete_basic_block (bb);
 
   if (current_loops != NULL)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 311232f..b0df819 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2016-02-04  Jeff Law  
+
+   PR tree-optimization/69740
+   * gcc.c-torture/compile/pr69740-1.c: New test.
+   * gcc.c-torture/compile/pr69740-2.c: New test.
+
 2016-02-24  Martin Sebor  
 
PR c++/69912
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr69740-1.c b/gcc/testsuite/gcc.c-torture/compile/pr69740-1.c
new file mode 100644
index 000..ac867d8
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr69740-1.c
@@ -0,0 +1,12 @@
+char a;
+short b;
+void fn1() {
+  if (b)
+;
+  else {
+int c[1] = {0};
+  l1:;
+  }
+  if (a)
+goto l1;
+}
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr69740-2.c b/gcc/testsuite/gcc.c-torture/compile/pr69740-2.c
new file mode 100644
index 000..a89c9a0
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr69740-2.c
@@ -0,0 +1,19 @@
+inline int foo(int *p1, int p2) {
+  int z = *p1;
+  while (z > p2)
+p2 = 2;
+  return z;
+}
+int main() {
+  int i;
+  for (;;) {
+int j, k;
+i = foo(&k, 7);
+if (k)
+  j = i;
+else
+  k = j;
+if (2 != j)
+  __builtin_abort();
+  }
+}

Re: [PATCH] Fix regcprop noop move handling (PR rtl-optimization/69896)

2016-02-24 Thread Jeff Law


On 02/22/2016 02:26 PM, Jakub Jelinek wrote:

Hi!

The following testcase is miscompiled, because prepare_shrink_wrap
attempts to copyprop_hardreg_forward_1 the first bb.  We see
DImode rbx being copied to DImode r11, and then we have (dead since
postreload) an assignment of SImode r11d to SImode ebx, and later on
some uses of DImode r11.  copyprop_hardreg_forward_1 notes the oldest
reg is rbx, and replaces in the SImode move r11d with ebx (that is fine),
and later on the DImode uses of r11 with rbx.  The problem is that
we now have a DImode rbx setter, then SImode noop move of ebx to ebx
(which by definition means the upper bits are undefined), and then DImode
rbx use.  If the noop move is DCEd soon enough, that wouldn't be a problem,
but before that happens, regrename is performed and we get the last
use of DImode rbx replaced with rcx and the ebx, ebx noop move changed into an
ecx = ebx SImode move.  That of course doesn't work.

The problem is that copyprop_hardreg_forward_1 ignores noop_p moves (most
likely in the hope that they will be soon optimized away).  That is fine
if they use as wide mode as we have recorded for the register, but if it is
narrower, we can choose to either remove the noop move, or adjust the
regcprop data structure.  The patch below chooses to do both of these,
the first one if DCE would remove the noop move, the latter otherwise.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-02-22  Jakub Jelinek  

PR rtl-optimization/69896
* regcprop.c: Include cfgrtl.h.
(copyprop_hardreg_forward_1): If noop_p insn uses narrower
than remembered mode, either delete it (if noop_move_p), or
treat like copy_p but not noop_p instruction.

* gcc.dg/pr69896.c: New test.

One could argue we should have DCE'd the nop move before we get here, 
but even if we did, making regcprop safe WRT narrower nop-moves is the 
right thing to do IMHO.


OK.

Jeff

Re: [PATCH PR69052]Check if loop inv can be propagated into mem ref with additional addr expr canonicalization

2016-02-24 Thread Jeff Law


On 02/22/2016 02:22 AM, Bin.Cheng wrote:


My only question is why didn't you use FOR_EACH_SUBRTX_VAR from rtl-iter.h
to walk the RTX expressions in collect_address_parts and
canonicalize_address_mult?

Hi Jeff,
Nothing special, just I haven't used this before, also
canonicalize_address_mult is alphabetically copied from fwprop.c.  One
question is when rewriting SHIFT to MULT, we need to modify rtl
expression in place, does FOR_EACH_SUBRTX iterator support this?  If
yes, what is the behavior for modified sub-expression?

Hmm.  The question of semantics when we change the underlying 
sub-expressions is an interesting one.


While I think we're OK in practice using the iterators, I think that's 
more of an accident than by design -- if MULT/ASHIFT had a different 
underlying RTL structure then I'm much less sure using the iterators 
would be safe.


Let's go with your original patch that didn't use the iterators.  Sorry 
for making you do the additional work/testing to build the iterator 
version.  But after pondering the issue you raised, I think your 
original patch is safer.



jeff

Re: [patch, fortran] PR69110 ICE with NEWUNIT

2016-02-24 Thread Thomas Koenig


Hi Jerry and Steve,


> OK for trunk.


OK.

Regards

Thomas
