date:20170731

Re: [PATCH] Fix when -lssp is added by driver (PR middle-end/81400).

2017-07-31 Thread Martin Liška

On 07/26/2017 07:45 PM, Jeff Law wrote:
> On 07/12/2017 07:38 AM, Martin Liška wrote:
>> Hi.
>>
>> Following patch adds -lspp when one uses -mstack-protector-guard=global.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
>>
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2017-07-12  Martin Liska  
>>
>> PR middle-end/81400
>> * gcc.c: Add -lssp when one uses -mstack-protector-guard=global.
> Isn't the -m option target specific?  And doesn't that imply this change
> should somehow be done in the target's specs?

Hi.

I think it's the right place as it's needed for all targets that support that 
(x86 family and ppc family).
Both have -mstack-protector-guard option.

Martin

> 
> jeff
>

[PATCH] Fix a pasto in gfc_check_num_images

2017-07-31 Thread Jakub Jelinek

Hi!

bootstrap-ubsan reported taking address of a &distance->where when
distance is NULL.  The function has one block guarded with if (distance)
and another guarded with if (failed), so I think this is just a pasto
from the earlier if (distance) block.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-07-31  Jakub Jelinek  

* check.c (gfc_check_num_images): Fix a pasto.

--- gcc/fortran/check.c.jj  2017-03-05 22:39:52.0 +0100
+++ gcc/fortran/check.c 2017-07-28 14:30:40.503511275 +0200
@@ -5149,7 +5149,7 @@ gfc_check_num_images (gfc_expr *distance
return false;
 
   if (!gfc_notify_std (GFC_STD_F2008_TS, "FAILED= argument to "
-  "NUM_IMAGES at %L", &distance->where))
+  "NUM_IMAGES at %L", &failed->where))
return false;
 }
 

Jakub

[PATCH] Fix UB in ipa-polymorphic-call.c (PR tree-optimization/81603)

2017-07-31 Thread Jakub Jelinek

Hi!

ipa_polymorphic_call_context offset field is in bits (for whatever reason),
and the following computations can overflow and invoke UB.
This patch computes it in offset_int instead and punts if there is overflow.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-07-31  Jakub Jelinek  

PR tree-optimization/81603
* ipa-polymorphic-call.c
(ipa_polymorphic_call_context::ipa_polymorphic_call_context): Perform
offset arithmetic in offset_int, bail out if the resulting bit offset
doesn't fit into shwi.

--- gcc/ipa-polymorphic-call.c.jj   2017-06-12 12:41:55.0 +0200
+++ gcc/ipa-polymorphic-call.c  2017-07-28 15:02:43.747354910 +0200
@@ -921,9 +921,13 @@ ipa_polymorphic_call_context::ipa_polymo
 and MEM_REF is meaningless, but we can look futher.  */
  if (TREE_CODE (base) == MEM_REF)
{
+ offset_int o = mem_ref_offset (base) * BITS_PER_UNIT;
+ o += offset;
+ o += offset2;
+ if (!wi::fits_shwi_p (o))
+   break;
  base_pointer = TREE_OPERAND (base, 0);
- offset
-   += offset2 + mem_ref_offset (base).to_short_addr () * 
BITS_PER_UNIT;
+ offset = o.to_shwi ();
  outer_type = NULL;
}
  /* We found base object.  In this case the outer_type
@@ -961,10 +965,15 @@ ipa_polymorphic_call_context::ipa_polymo
break;
}
   else if (TREE_CODE (base_pointer) == POINTER_PLUS_EXPR
-  && tree_fits_uhwi_p (TREE_OPERAND (base_pointer, 1)))
+  && TREE_CODE (TREE_OPERAND (base_pointer, 1)) == INTEGER_CST)
{
- offset += tree_to_shwi (TREE_OPERAND (base_pointer, 1))
-   * BITS_PER_UNIT;
+ offset_int o = offset_int::from (TREE_OPERAND (base_pointer, 1),
+  SIGNED);
+ o *= BITS_PER_UNIT;
+ o += offset;
+ if (!wi::fits_shwi_p (o))
+   break;
+ offset = o.to_shwi ();
  base_pointer = TREE_OPERAND (base_pointer, 0);
}
   else

Jakub

Re: [RFC] Remaining references of Java

2017-07-31 Thread Martin Liška

On 07/25/2017 01:55 PM, Richard Biener wrote:
> On Tue, Jul 11, 2017 at 4:35 PM, Martin Liška  wrote:
>> Hi.
>>
>> There's a small follow-up with remaining occurrences:
>>
>> 1) dwarf2out.c:
>>
>>  20213  origin_die = lookup_type_die (origin);
>>  20214else if (TREE_CODE (origin) == BLOCK)
>>  20215  origin_die = BLOCK_DIE (origin);
>>  20216
>>  20217/* XXX: Functions that are never lowered don't always have correct
>> block
>>  20218   trees (in the case of java, they simply have no block tree, in
>> some other
>>  20219   languages).  For these functions, there is nothing we can
>> really do to
>>  20220   output correct debug info for inlined functions in all cases.
>> Rather
>>  20221   than die, we'll just produce deficient debug info now, in that
>> we will
>>  20222   have variables without a proper abstract origin.  In the
>> future, when all
>>  20223   functions are lowered, we should re-add a gcc_assert
>> (origin_die)
>>
>> Probably Jakub can help with that?
> 
> Can you check whether all-language bootstrap passes with the suggested
> gcc_assert?

No, following simple test-case can break it:

$ cat ice.i
__attribute__ ((__always_inline__)) int a ()
{
  int b = 0;
}
void
c ()
{
  a ();
}

Thus maybe a small adjustment of the comment can be done.

> 
>> 2) fold-const.c:
>>
>>   1882/* The following code implements the floating point to integer
>>   1883   conversion rules required by the Java Language Specification,
>>   1884   that IEEE NaNs are mapped to zero and values that overflow
>>   1885   the target precision saturate, i.e. values greater than
>>   1886   INT_MAX are mapped to INT_MAX, and values less than INT_MIN
>>   1887   are mapped to INT_MIN.  These semantics are allowed by the
>>   1888   C and C++ standards that simply state that the behavior of
>>   1889   FP-to-integer conversion is unspecified upon overflow.  */
>>   1890
>>   1891wide_int val;
>>   1892REAL_VALUE_TYPE r;
>>   1893REAL_VALUE_TYPE x = TREE_REAL_CST (arg1);
>>
>> Can we somehow remove that Richi?
> 
> Remove what?  The comment?  It still matches the code and we have to
> do sth.  So I don't see why we should change this.

Yep, I just mean comment which says that we do such transformation just
because of Java. Which is probably not true I guess.

> 
> Maybe Joseph knows whether future standards / TRs recommend sth different
> or exactly this.
> 
>>
>> 3) gimplify.c:
>>
>>   2771   Java requires that we elaborated nodes in source order.  That
>>   2772   means we must gimplify the inner expression followed by each of
>>   2773   the indices, in order.  But we can't gimplify the inner
>>   2774   expression until we deal with any variable bounds, sizes, or
>>   2775   positions in order to deal with PLACEHOLDER_EXPRs.
>>   2776
>>   2777   So we do this in three steps.  First we deal with the
>> annotations
>>   2778   for any variables in the components, then we gimplify the base,
>>   2779   then we gimplify any indices, from left to right.  */
>>   2780for (i = expr_stack.length () - 1; i >= 0; i--)
>>
>> Richi?
> 
> Just change the comment to "We elaborate nodes in source order. [...]"
> 
>> 4) tree.c:
>>
>>  13535if (RECORD_OR_UNION_TYPE_P (t) && TYPE_BINFO (t) && TYPE_BINFO
>> (tv)
>>  13536&& TYPE_BINFO (t) != TYPE_BINFO (tv)
>>  13537/* FIXME: Java sometimes keep dump TYPE_BINFOs on variant
>> types.
>>  13538   Since there is no cheap way to tell C++/Java type w/o LTO,
>> do checking
>>  13539   at LTO time only.  */
>>  13540&& (in_lto_p && odr_type_p (t)))
>>  13541  {
>>  13542error ("type variant has different TYPE_BINFO");
>>  13543debug_tree (tv);
>>  13544error ("type variant's TYPE_BINFO");
>>  13545debug_tree (TYPE_BINFO (tv));
>>  13546error ("type's TYPE_BINFO");
>>  13547debug_tree (TYPE_BINFO (t));
>>  13548return false;
>>
>> Can we Honza remove that?
> 
> Try it? (remove in_lto_p &&)
> 
>> Thanks,
>> Martin

For the rest, I've been testing patch.

Martin

Re: [PATCH 4/N] Recover GOTO predictor.

2017-07-31 Thread Martin Liška

Richi?

Thanks

On 06/30/2017 03:48 PM, Martin Liška wrote:
> On 06/22/2017 12:27 PM, Richard Biener wrote:
>> On Wed, Jun 21, 2017 at 3:06 PM, Martin Liška  wrote:
>>> Hello.
>>>
>>> There's one additional predictor enhancement that is GOTO predict that
>>> used to working. Following patch adds expect statement for C and C++ family
>>> languages.
>>>
>>> There's one fallout which is vrp24.c test-case, where only 'Simplified 
>>> relational'
>>> appears just once. Adding Richi and Patrick who can probably help how to 
>>> fix the
>>> test-case.
>>
>> Happens to be optimized better now, there's only one predicate to simplify
>> left in the IL input to VRP1.  I suggest to change it to match 1 times and 
>> add
>> -fdump-tree-optimized to dg-options and
>>
>> /* { dg-final { scan-tree-dump-times "if " 3 "optimized" } } */
>>
>> to verify we have 3 conditions left.
> 
> One small note, I see 4 conditions in optimized dump:
> 
> sss (struct rtx_def * insn, int code1, int code2, int code3)
> {
>   int D1544;
>   struct rtx_def * body;
>   _Bool D1562;
> 
>[100.00%] [count: INV]:
>   body_5 = insn_4(D)->u.fld[5].rt_rtx;
>   D1544_6 = body_5->code;
>   if (D1544_6 == 55)
> goto  (L7); [34.00%] [count: INV]
>   else
> goto ; [66.00%] [count: INV]
> 
>[66.00%] [count: INV]:
>   if (code3_7(D) == 99)
> goto ; [34.00%] [count: INV]
>   else
> goto  (L16); [66.00%] [count: INV]
> 
>[22.44%] [count: INV]:
>   D1562_9 = code1_8(D) == 10;
>   if (D1562_9 == 1)
> goto  (L7); [64.00%] [count: INV]
>   else
> goto  (L16); [36.00%] [count: INV]
> 
>[9.82%] [count: INV]:
>   arf ();
> 
>[46.68%] [count: INV]:
>   frob (); [tail call]
>   goto  (L16); [100.00%] [count: INV]
> 
> L7 [48.36%] [count: INV]:
>   if (code2_12(D) == 42)
> goto ; [20.24%] [count: INV]
>   else
> goto ; [79.76%] [count: INV]
> 
> L16 [100.00%] [count: INV]:
>   return;
> 
> }
> 
> Is it a problem or adjusting to 4 is fine?
> 
> Martin
> 
>>
>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>
>>> Ready to be installed?
>>> Martin
>

Re: [RFC] [PATCH] Introduce configure flag --with-stage1-cflags.

2017-07-31 Thread Martin Liška

I would like to ping this. Input from other people will be appreciated ;)

Thanks,
Martin

On 06/19/2017 02:30 PM, Richard Biener wrote:
> On Mon, Jun 19, 2017 at 12:51 PM, Martin Liška  wrote:
>> PING^1
>>
>> Richi are you fine with the suggested change? I basically followed your 
>> advises :)
> 
> Well, I am but as Eric disagrees I think we need input from other
> people on this.
> I'm comfortably setting STAGE1_CFLAGS here.
> 
> Richard.
> 
>> Martin
>>
>> On 05/26/2017 03:00 PM, Martin Liška wrote:
>>> On 05/26/2017 01:55 PM, Richard Biener wrote:
 On Fri, May 26, 2017 at 1:51 PM, Jakub Jelinek  wrote:
> On Fri, May 26, 2017 at 01:46:47PM +0200, Richard Biener wrote:
>> On Thu, May 25, 2017 at 11:23 AM, Martin Liška  wrote:
>>> Hello.
>>>
>>> After a discussion with Richi, using adding "-O2" to STAGE1 cflags with 
>>> a recent
>>> enough compiler can significantly speed up bootstrap. Thus I'm 
>>> suggesting to
>>> introduce --with-stage1-cflags where one can provide such options.
>>
>> I don't think this is necessary -- you can always override with 
>> STAGE1_CFLAGS.
>>
>>> Apart from that, maybe it would be handy to automatically enable "-O2" 
>>> when
>>> one has a recent compiler? Do we have an example where we detect host 
>>> compiler
>>> and it's version?
>>
>> Don't know about version but configury already detects that we use GCC, 
>> so that
>> knowledge should be readily available.
>
> Well, it certainly shouldn't be -O2 by default for any system GCC, more
> something like if it is major of the configured configure minus 1 or newer
> (or minus 2?), then use -O2, otherwise default to -O0 as before.

 I'd still default to -O0 on release branches regardless of version and then
 for development we can probably simply use "any GCC" when people have
 the chance to override.
>>>
>>> Ok, sending new patch that does that on experimental branches for ${CC} 
>>> --version
>>> being a GCC newer than 4.9.
>>>
>>> Martin
>>>

 At least for me host GCC 4.8 works quite well with -O2.

 Richard.

> Jakub
>>>
>>

[PATCH] Fix ubsan_type_descriptor

2017-07-31 Thread Jakub Jelinek

Hi!

Last week I've finally decided to find time to look into sometimes very
weird signed-integer-overflow runtime messages, which report that say
signed integer overflow: 1234567890123456 \\* 1234567890123456 cannot be 
represented in type 'long int[10]'
Obviously the integer overflow is not happening in an array type.

The problem is that ubsan_type_descriptor needs for a couple of spots the
element type and changes the type parameter to the element type, but then
it adds the result into the hash table using that type, instead of the
TYPE_MAIN_VARIANT of the type with which the function has been originally
called, so if we first record say bounds violation on long int[10],
we actually create "long int[10]" object and record it for long int type,
and then when we look up long int, we get this (and lookup for long int[10]
next time isn't successful and creates another object).

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk and release branches?

2017-07-31  Jakub Jelinek  

PR sanitizer/81604
* ubsan.c (ubsan_type_descriptor): For UBSAN_PRINT_ARRAY don't
change type to the element type, instead add eltype variable and
use it where we are interested in the element type.

* c-c++-common/ubsan/pr81604.c: New test.

--- gcc/ubsan.c.jj  2017-07-28 12:35:27.0 +0200
+++ gcc/ubsan.c 2017-07-28 18:11:21.686219421 +0200
@@ -402,6 +402,7 @@ ubsan_type_descriptor (tree type, enum u
 /* We weren't able to determine the type name.  */
 tname = "";
 
+  tree eltype = type;
   if (pstyle == UBSAN_PRINT_POINTER)
 {
   pp_printf (&pretty_name, "'%s%s%s%s%s%s%s",
@@ -452,12 +453,12 @@ ubsan_type_descriptor (tree type, enum u
   pp_quote (&pretty_name);
 
   /* Save the tree with stripped types.  */
-  type = t;
+  eltype = t;
 }
   else
 pp_printf (&pretty_name, "'%s'", tname);
 
-  switch (TREE_CODE (type))
+  switch (TREE_CODE (eltype))
 {
 case BOOLEAN_TYPE:
 case ENUMERAL_TYPE:
@@ -467,9 +468,9 @@ ubsan_type_descriptor (tree type, enum u
 case REAL_TYPE:
   /* FIXME: libubsan right now only supports float, double and
 long double type formats.  */
-  if (TYPE_MODE (type) == TYPE_MODE (float_type_node)
- || TYPE_MODE (type) == TYPE_MODE (double_type_node)
- || TYPE_MODE (type) == TYPE_MODE (long_double_type_node))
+  if (TYPE_MODE (eltype) == TYPE_MODE (float_type_node)
+ || TYPE_MODE (eltype) == TYPE_MODE (double_type_node)
+ || TYPE_MODE (eltype) == TYPE_MODE (long_double_type_node))
tkind = 0x0001;
   else
tkind = 0x;
@@ -478,7 +479,7 @@ ubsan_type_descriptor (tree type, enum u
   tkind = 0x;
   break;
 }
-  tinfo = get_ubsan_type_info_for_type (type);
+  tinfo = get_ubsan_type_info_for_type (eltype);
 
   /* Create a new VAR_DECL of type descriptor.  */
   const char *tmp = pp_formatted_text (&pretty_name);
--- gcc/testsuite/c-c++-common/ubsan/pr81604.c.jj   2017-07-28 
18:20:52.471352034 +0200
+++ gcc/testsuite/c-c++-common/ubsan/pr81604.c  2017-07-28 18:25:54.398733100 
+0200
@@ -0,0 +1,31 @@
+/* PR sanitizer/81604 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=bounds,signed-integer-overflow" } */
+
+long a[10];
+
+__attribute__((noinline, noclone)) long *
+foo (int i)
+{
+  return &a[i];
+}
+
+__attribute__((noinline, noclone)) long
+bar (long x, long y)
+{
+  return x * y;
+}
+
+int
+main ()
+{
+  volatile int i = -1;
+  volatile long l = __LONG_MAX__;
+  long *volatile p;
+  p = foo (i);
+  l = bar (l, l);
+  return 0;
+}
+
+/* { dg-output "index -1 out of bounds for type 'long int 
\\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*signed integer overflow: \[0-9]+ \\* \[0-9]+ cannot 
be represented in type 'long int'" } */

Jakub

[PATCH] Introduce TARGET_SUPPORTS_ALIASES

2017-07-31 Thread Martin Liška

Hi.

Doing the transformation suggested by Honza.

Patch can bootstrap on ppc64le-redhat-linux and x86_64-linux-gnu and survives 
regression tests.
And I also verified that works on hppa2.0w-hp-hpux11.11 (target w/o aliasing 
support).

Ready to be installed?
Martin
>From f8584bcb678dba92241c20b3e81895a52fc865f3 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 25 Jul 2017 13:11:28 +0200
Subject: [PATCH] Introduce TARGET_SUPPORTS_ALIASES

gcc/c-family/ChangeLog:

2017-07-25  Martin Liska  

	* c-opts.c (c_common_post_options): Replace ASM_OUTPUT_DEF with
	TARGET_SUPPORTS_ALIASES.

gcc/ChangeLog:

2017-07-25  Martin Liska  

	* asan.c (asan_protect_global): Replace ASM_OUTPUT_DEF with
	TARGET_SUPPORTS_ALIASES.
	* cgraph.c (cgraph_node::create_same_body_alias): Likewise.
	* ipa-visibility.c (can_replace_by_local_alias): Likewise.
	(optimize_weakref): Likewise.
	* symtab.c (symtab_node::noninterposable_alias): Likewise.
	* varpool.c (varpool_node::create_extra_name_alias): Likewise.
	* defaults.h: Introduce TARGET_SUPPORTS_ALIASES.

gcc/cp/ChangeLog:

2017-07-25  Martin Liska  

	* decl2.c (get_tls_init_fn): Replace ASM_OUTPUT_DEF with
	TARGET_SUPPORTS_ALIASES.
	(handle_tls_init): Likewise.
	(note_mangling_alias): Likewise.
	* optimize.c (can_alias_cdtor): Likewise.
---
 gcc/asan.c|  4 +---
 gcc/c-family/c-opts.c | 22 --
 gcc/cgraph.c  |  7 ---
 gcc/cp/decl2.c| 23 ++-
 gcc/cp/optimize.c |  6 +++---
 gcc/defaults.h|  9 +
 gcc/ipa-visibility.c  | 15 +--
 gcc/symtab.c  |  6 +++---
 gcc/varpool.c |  6 +++---
 9 files changed, 50 insertions(+), 48 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 5f9275f6425..d8cb2b52c8b 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1663,10 +1663,8 @@ asan_protect_global (tree decl)
   if (lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
 return false;
 
-#ifndef ASM_OUTPUT_DEF
-  if (asan_needs_local_alias (decl))
+  if (!TARGET_SUPPORTS_ALIASES && asan_needs_local_alias (decl))
 return false;
-#endif
 
   return true;
 }
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 1657e7a4390..0b13a188c1b 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -957,16 +957,18 @@ c_common_post_options (const char **pfilename)
 
   if (flag_extern_tls_init)
 {
-#if !defined (ASM_OUTPUT_DEF) || !SUPPORTS_WEAK
-  /* Lazy TLS initialization for a variable in another TU requires
-	 alias and weak reference support. */
-  if (flag_extern_tls_init > 0)
-	sorry ("external TLS initialization functions not supported "
-	   "on this target");
-  flag_extern_tls_init = 0;
-#else
-  flag_extern_tls_init = 1;
-#endif
+  if (!TARGET_SUPPORTS_ALIASES || !SUPPORTS_WEAK)
+	{
+	  /* Lazy TLS initialization for a variable in another TU requires
+	 alias and weak reference support.  */
+	  if (flag_extern_tls_init > 0)
+	sorry ("external TLS initialization functions not supported "
+		   "on this target");
+
+	  flag_extern_tls_init = 0;
+	}
+  else
+	flag_extern_tls_init = 1;
 }
 
   if (num_in_fnames > 1)
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 2f820f1bb67..356bcd311e1 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -582,10 +582,11 @@ cgraph_node *
 cgraph_node::create_same_body_alias (tree alias, tree decl)
 {
   cgraph_node *n;
-#ifndef ASM_OUTPUT_DEF
+
   /* If aliases aren't supported by the assembler, fail.  */
-  return NULL;
-#endif
+  if (!TARGET_SUPPORTS_ALIASES)
+return NULL;
+
   /* Langhooks can create same body aliases of symbols not defined.
  Those are useless. Drop them on the floor.  */
   if (symtab->global_info_ready)
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 2a52f8ca3e2..5119d4e62cc 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -3156,11 +3156,9 @@ get_tls_init_fn (tree var)
   if (!flag_extern_tls_init && DECL_EXTERNAL (var))
 return NULL_TREE;
 
-#ifdef ASM_OUTPUT_DEF
   /* If the variable is internal, or if we can't generate aliases,
  call the local init function directly.  */
-  if (!TREE_PUBLIC (var))
-#endif
+  if (!TREE_PUBLIC (var) || !TARGET_SUPPORTS_ALIASES)
 return get_local_tls_init_fn ();
 
   tree sname = mangle_tls_init_fn (var);
@@ -4241,9 +4239,8 @@ handle_tls_init (void)
   tree init = TREE_PURPOSE (vars);
   one_static_initialization_or_destruction (var, init, true);
 
-#ifdef ASM_OUTPUT_DEF
   /* Output init aliases even with -fno-extern-tls-init.  */
-  if (TREE_PUBLIC (var))
+  if (TARGET_SUPPORTS_ALIASES && TREE_PUBLIC (var))
 	{
   tree single_init_fn = get_tls_init_fn (var);
 	  if (single_init_fn == NULL_TREE)
@@ -4253,7 +4250,6 @@ handle_tls_init (void)
 		(single_init_fn, fn);
 	  gcc_assert (alias != NULL);
 	}
-#endif
 }
 
   finish_then_clause (if_stmt);
@@ -4300,15 +4296,16 @@ generate_mangling_alias (tree decl, tree id2)
 void
 note_mangling_alias (tree decl ATTRIBUTE_UNUSED, tree

Re: [PATCH] Fix UB in ipa-polymorphic-call.c (PR tree-optimization/81603)

2017-07-31 Thread Richard Biener

On Mon, 31 Jul 2017, Jakub Jelinek wrote:

> Hi!
> 
> ipa_polymorphic_call_context offset field is in bits (for whatever reason),
> and the following computations can overflow and invoke UB.
> This patch computes it in offset_int instead and punts if there is overflow.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.  As said in the PR the machinery should probably use byte offsets
throughout (do we support any target where HWI < size of ptr_mode?  That 
is, 128bit pointers?)

Richard.

> 2017-07-31  Jakub Jelinek  
> 
>   PR tree-optimization/81603
>   * ipa-polymorphic-call.c
>   (ipa_polymorphic_call_context::ipa_polymorphic_call_context): Perform
>   offset arithmetic in offset_int, bail out if the resulting bit offset
>   doesn't fit into shwi.
> 
> --- gcc/ipa-polymorphic-call.c.jj 2017-06-12 12:41:55.0 +0200
> +++ gcc/ipa-polymorphic-call.c2017-07-28 15:02:43.747354910 +0200
> @@ -921,9 +921,13 @@ ipa_polymorphic_call_context::ipa_polymo
>and MEM_REF is meaningless, but we can look futher.  */
> if (TREE_CODE (base) == MEM_REF)
>   {
> +   offset_int o = mem_ref_offset (base) * BITS_PER_UNIT;
> +   o += offset;
> +   o += offset2;
> +   if (!wi::fits_shwi_p (o))
> + break;
> base_pointer = TREE_OPERAND (base, 0);
> -   offset
> - += offset2 + mem_ref_offset (base).to_short_addr () * 
> BITS_PER_UNIT;
> +   offset = o.to_shwi ();
> outer_type = NULL;
>   }
> /* We found base object.  In this case the outer_type
> @@ -961,10 +965,15 @@ ipa_polymorphic_call_context::ipa_polymo
>   break;
>   }
>else if (TREE_CODE (base_pointer) == POINTER_PLUS_EXPR
> -&& tree_fits_uhwi_p (TREE_OPERAND (base_pointer, 1)))
> +&& TREE_CODE (TREE_OPERAND (base_pointer, 1)) == INTEGER_CST)
>   {
> -   offset += tree_to_shwi (TREE_OPERAND (base_pointer, 1))
> - * BITS_PER_UNIT;
> +   offset_int o = offset_int::from (TREE_OPERAND (base_pointer, 1),
> +SIGNED);
> +   o *= BITS_PER_UNIT;
> +   o += offset;
> +   if (!wi::fits_shwi_p (o))
> + break;
> +   offset = o.to_shwi ();
> base_pointer = TREE_OPERAND (base_pointer, 0);
>   }
>else
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Fix ubsan_type_descriptor

2017-07-31 Thread Richard Biener

On Mon, 31 Jul 2017, Jakub Jelinek wrote:

> Hi!
> 
> Last week I've finally decided to find time to look into sometimes very
> weird signed-integer-overflow runtime messages, which report that say
> signed integer overflow: 1234567890123456 \\* 1234567890123456 cannot be 
> represented in type 'long int[10]'
> Obviously the integer overflow is not happening in an array type.
> 
> The problem is that ubsan_type_descriptor needs for a couple of spots the
> element type and changes the type parameter to the element type, but then
> it adds the result into the hash table using that type, instead of the
> TYPE_MAIN_VARIANT of the type with which the function has been originally
> called, so if we first record say bounds violation on long int[10],
> we actually create "long int[10]" object and record it for long int type,
> and then when we look up long int, we get this (and lookup for long int[10]
> next time isn't successful and creates another object).
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk and release branches?

Ok.

Thanks,
Richard.

> 2017-07-31  Jakub Jelinek  
> 
>   PR sanitizer/81604
>   * ubsan.c (ubsan_type_descriptor): For UBSAN_PRINT_ARRAY don't
>   change type to the element type, instead add eltype variable and
>   use it where we are interested in the element type.
> 
>   * c-c++-common/ubsan/pr81604.c: New test.
> 
> --- gcc/ubsan.c.jj2017-07-28 12:35:27.0 +0200
> +++ gcc/ubsan.c   2017-07-28 18:11:21.686219421 +0200
> @@ -402,6 +402,7 @@ ubsan_type_descriptor (tree type, enum u
>  /* We weren't able to determine the type name.  */
>  tname = "";
>  
> +  tree eltype = type;
>if (pstyle == UBSAN_PRINT_POINTER)
>  {
>pp_printf (&pretty_name, "'%s%s%s%s%s%s%s",
> @@ -452,12 +453,12 @@ ubsan_type_descriptor (tree type, enum u
>pp_quote (&pretty_name);
>  
>/* Save the tree with stripped types.  */
> -  type = t;
> +  eltype = t;
>  }
>else
>  pp_printf (&pretty_name, "'%s'", tname);
>  
> -  switch (TREE_CODE (type))
> +  switch (TREE_CODE (eltype))
>  {
>  case BOOLEAN_TYPE:
>  case ENUMERAL_TYPE:
> @@ -467,9 +468,9 @@ ubsan_type_descriptor (tree type, enum u
>  case REAL_TYPE:
>/* FIXME: libubsan right now only supports float, double and
>long double type formats.  */
> -  if (TYPE_MODE (type) == TYPE_MODE (float_type_node)
> -   || TYPE_MODE (type) == TYPE_MODE (double_type_node)
> -   || TYPE_MODE (type) == TYPE_MODE (long_double_type_node))
> +  if (TYPE_MODE (eltype) == TYPE_MODE (float_type_node)
> +   || TYPE_MODE (eltype) == TYPE_MODE (double_type_node)
> +   || TYPE_MODE (eltype) == TYPE_MODE (long_double_type_node))
>   tkind = 0x0001;
>else
>   tkind = 0x;
> @@ -478,7 +479,7 @@ ubsan_type_descriptor (tree type, enum u
>tkind = 0x;
>break;
>  }
> -  tinfo = get_ubsan_type_info_for_type (type);
> +  tinfo = get_ubsan_type_info_for_type (eltype);
>  
>/* Create a new VAR_DECL of type descriptor.  */
>const char *tmp = pp_formatted_text (&pretty_name);
> --- gcc/testsuite/c-c++-common/ubsan/pr81604.c.jj 2017-07-28 
> 18:20:52.471352034 +0200
> +++ gcc/testsuite/c-c++-common/ubsan/pr81604.c2017-07-28 
> 18:25:54.398733100 +0200
> @@ -0,0 +1,31 @@
> +/* PR sanitizer/81604 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds,signed-integer-overflow" } */
> +
> +long a[10];
> +
> +__attribute__((noinline, noclone)) long *
> +foo (int i)
> +{
> +  return &a[i];
> +}
> +
> +__attribute__((noinline, noclone)) long
> +bar (long x, long y)
> +{
> +  return x * y;
> +}
> +
> +int
> +main ()
> +{
> +  volatile int i = -1;
> +  volatile long l = __LONG_MAX__;
> +  long *volatile p;
> +  p = foo (i);
> +  l = bar (l, l);
> +  return 0;
> +}
> +
> +/* { dg-output "index -1 out of bounds for type 'long int 
> \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*signed integer overflow: \[0-9]+ \\* \[0-9]+ cannot 
> be represented in type 'long int'" } */
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Do UBSAN sanitization just when current_function_decl != NULL_TREE (PR sanitize/81530).

2017-07-31 Thread Marek Polacek

On Fri, Jul 28, 2017 at 01:20:57PM +0200, Martin Liška wrote:
> Hi.
> 
> In r249158 I added new sanitize_flags_p function. However I removed various 
> calls of
> do_ubsan_in_current_function:
> 
> /* True if we want to play UBSan games in the current function.  */
> 
> bool
> do_ubsan_in_current_function ()
> {
>   return (current_function_decl != NULL_TREE
>  && !lookup_attribute ("no_sanitize_undefined",
>DECL_ATTRIBUTES (current_function_decl)));
> }
> 
> Where we also checked for current_function_decl. This putting that condition 
> back to conditions
> that used to call do_ubsan_in_current_function is necessary.
> 
> May I ask you Marek (or Jakub) to go through and verify that the check is 
> really necessary?
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin
> 
> gcc/cp/ChangeLog:
> 
> 2017-07-28  Martin Liska  
> 
>   PR sanitize/81530
>   * cp-gimplify.c (cp_genericize): Guard condition with flag_sanitize_p
>   also with current_function_decl non-null equality.
>   * cp-ubsan.c (cp_ubsan_instrument_vptr_p): Likewise.
>   * decl.c (compute_array_index_type): Likewise.
>   * init.c (finish_length_check): Likewise.
>   * typeck.c (cp_build_binary_op): Likewise.
> 
> gcc/c/ChangeLog:
> 
> 2017-07-28  Martin Liska  
> 
>   PR sanitize/81530
>   * c-convert.c (convert): Guard condition with flag_sanitize_p
>   also with current_function_decl non-null equality.
>   * c-decl.c (grokdeclarator): Likewise.
>   * c-typeck.c (build_binary_op): Likewise.
> 
> gcc/ChangeLog:
> 
> 2017-07-28  Martin Liska  
> 
>   PR sanitize/81530
>   * convert.c (convert_to_integer_1): Guard condition with flag_sanitize_p
>   also with current_function_decl non-null equality.
> 
> gcc/c-family/ChangeLog:
> 
> 2017-07-28  Martin Liska  
> 
>   PR sanitize/81530
>   * c-ubsan.c (ubsan_maybe_instrument_array_ref):
>   Guard condition with flag_sanitize_p also with current_function_decl
>   non-null equality.
>   (ubsan_maybe_instrument_reference_or_call): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-07-28  Martin Liska  
> 
>   PR sanitize/81530
>   * g++.dg/ubsan/pr81530.C: New test.
> ---
>  gcc/c-family/c-ubsan.c   | 6 --
>  gcc/c/c-convert.c| 1 +
>  gcc/c/c-decl.c   | 1 +
>  gcc/c/c-typeck.c | 1 +
>  gcc/convert.c| 3 ++-
>  gcc/cp/cp-gimplify.c | 3 ++-
>  gcc/cp/cp-ubsan.c| 3 +++
>  gcc/cp/decl.c| 3 ++-
>  gcc/cp/init.c| 3 ++-
>  gcc/cp/typeck.c  | 1 +
>  gcc/testsuite/g++.dg/ubsan/pr81530.C | 6 ++
>  11 files changed, 25 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ubsan/pr81530.C
> 
> 

> diff --git a/gcc/c-family/c-ubsan.c b/gcc/c-family/c-ubsan.c
> index a072d19eda6..541b53009c2 100644
> --- a/gcc/c-family/c-ubsan.c
> +++ b/gcc/c-family/c-ubsan.c
> @@ -373,7 +373,8 @@ void
>  ubsan_maybe_instrument_array_ref (tree *expr_p, bool ignore_off_by_one)
>  {
>if (!ubsan_array_ref_instrumented_p (*expr_p)
> -  && sanitize_flags_p (SANITIZE_BOUNDS | SANITIZE_BOUNDS_STRICT))
> +  && sanitize_flags_p (SANITIZE_BOUNDS | SANITIZE_BOUNDS_STRICT)
> +  && current_function_decl != NULL_TREE)
>  {
>tree op0 = TREE_OPERAND (*expr_p, 0);
>tree op1 = TREE_OPERAND (*expr_p, 1);
> @@ -393,7 +394,8 @@ static tree
>  ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree 
> ptype,
> enum ubsan_null_ckind ckind)
>  {
> -  if (!sanitize_flags_p (SANITIZE_ALIGNMENT | SANITIZE_NULL))
> +  if (!sanitize_flags_p (SANITIZE_ALIGNMENT | SANITIZE_NULL)
> +  || current_function_decl == NULL_TREE)
>  return NULL_TREE;
>  
>tree type = TREE_TYPE (ptype);
> diff --git a/gcc/c/c-convert.c b/gcc/c/c-convert.c
> index 33c9143e354..07862543334 100644
> --- a/gcc/c/c-convert.c
> +++ b/gcc/c/c-convert.c
> @@ -108,6 +108,7 @@ convert (tree type, tree expr)
>  case INTEGER_TYPE:
>  case ENUMERAL_TYPE:
>if (sanitize_flags_p (SANITIZE_FLOAT_CAST)
> +   && current_function_decl != NULL

Should be NULL_TREE.

It's non-obvious to prove that some checks might not be necessary, but I
verifed that gcc7 had do_ubsan_in_current_function where you're adding
the current_function_decl checks, so I think this should be good to go.

Marek

Re: [Patch] Testsuite fixes for failures caused by patch for PR 80925 - loop peeling and alignment

2017-07-31 Thread Richard Biener

On Fri, Jul 28, 2017 at 8:22 PM, Steve Ellcey  wrote:
> On Fri, 2017-07-28 at 09:47 +0200, Richard Biener wrote:
>> On Fri, Jul 28, 2017 at 12:16 AM, Steve Ellcey  wrote:
>> >
>> > Any comments from the power and/or vectorizer folks?
>> On one side I'm inclined to simplify the testsuite by adding
>> --param vect-max-peeling-for-alignment=0 in addition to
>> -fno-vect-cost-model we already pass and override that in the
>> tests that specifically exercise peeling for alignment (do we have
>> any?).
>> OTOH that would remove quite some testing coverage of prologue
>> peeling.
>>
>> So ideally testresults would be clean with both no such --param
>> and that --param added...
>>
>> I think most of the testcases you needed to adjust have nothing
>> to do with peeling for alignment thus adding this --param just for
>> those (and simplifying their dump scanning accordingly) is another
>> pragmatic option.
>>
>> Adding yet another target (vect_peel_align) is IMHO not good,
>> especially as this one depends on cost tuning and not HW
>> features, so it's impossible(?) to dynamically compute it
>> with a test compile for example (we _do_ want a clean
>> vect.exp with any vector HW / tuning switch you add).
>
> How about something like the following.  I only fixed two of the tests,
> I can follow up with more if this approach seems reasonable.  I tested
> this on aarch64 and x86_64.

Looks good to me.

Richard.

> Steve Ellcey
> sell...@cavium.com
>
>
> 2017-07-28  Steve Ellcey  
>
> PR tree-optimization/80925
> * gcc.dg/vect/no-section-anchors-vect-69.c: Add
> --param vect-max-peeling-for-alignment=0 option.
> Remove unaligned access and peeling checks.
> * gcc.dg/vect/section-anchors-vect-69.c: Ditto.

Re: [PATCH] Handle BIT_INSERT_EXPR in hashable_expr_equal_p

2017-07-31 Thread Richard Biener

On Sat, Jul 29, 2017 at 8:48 PM, Andrew Pinski  wrote:
> Hi,
>   When I was playing around where lowering of bit-field accesses go in
> the pass order, I found that DOM had the same issue as PRE had when it
> came to comparing BIT_INSERT_EXPR for equality.  The same exact
> testcase was showing the wrong code; gcc.dg/tree-ssa/20040324-1.c.
>
> This fixes DOM the same way as I had fixed PRE, by special casing
> BIT_INSERT_EXPR due to the implicit operand.
>
> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.

Watch excess vertical space:

+return false;
+
+
   if (operand_equal_p (expr0->ops.ternary.opnd0,


Ok with that fixed.

Thanks,
Richard.

> Thanks,
> Andrew Pinski
>
> ChangeLog:
> * tree-ssa-scopedtables.c (hashable_expr_equal_p): Check
> BIT_INSERT_EXPR's operand 1
> to see if the types precision matches.

[PING*3][PATCH] [PR79542][Ada] Fix ICE in dwarf2out.c with nested func. inlining

2017-07-31 Thread Pierre-Marie de Rodat


Hello,

Ping for the updated patch, originally submitted at
. Thank you in 
advance!


--
Pierre-Marie de Rodat

Re: [PATCH v2] Dump BB number when dumping a BB with label.

2017-07-31 Thread Richard Biener

On Mon, Jul 31, 2017 at 8:43 AM, Martin Liška  wrote:
> On 07/28/2017 01:21 PM, Richard Biener wrote:
>> On Fri, Jul 28, 2017 at 12:52 PM, Martin Liška  wrote:
>>> On 07/28/2017 09:58 AM, Richard Biener wrote:
 On Fri, Jul 28, 2017 at 9:50 AM, Martin Liška  wrote:
> On 07/28/2017 09:21 AM, Richard Biener wrote:
>> On Thu, Jul 27, 2017 at 4:24 PM, Martin Liška  wrote:
>>> Hi.
>>>
>>> Following simple patch adds support for dumping of BBs when it's a BB
>>> that contains a label. That makes it easier for debugging as one can
>>> find destination for an edge in dump file.
>>>
>>> Sample, before:
>>>
>>> foo (int a)
>>> {
>>>   int D.1821;
>>>   int _1;
>>>   int _4;
>>>   int _5;
>>>
>>>[0.00%] [count: INV]:
>>>   switch (a_2(D))  [INV] [count: INV], case 0:  [INV] 
>>> [count: INV], case 1:  [INV] [count: INV]>
>>>
>>>  [0.00%] [count: INV]:
>>>   a_3 = a_2(D) + 2;
>>>
>>>  [0.00%] [count: INV]:
>>>   _4 = 2;
>>>   goto  (); [INV] [count: INV]
>>>
>>>  [0.00%] [count: INV]:
>>>   _5 = 123;
>>>
>>>   # _1 = PHI <_4(4), _5(5)>
>>>  [0.00%] [count: INV]:
>>>   return _1;
>>>
>>> }
>>>
>>> After:
>>>
>>> foo (int a)
>>> {
>>>   int D.1821;
>>>   int _1;
>>>   int _4;
>>>   int _5;
>>>
>>>[0.00%] [count: INV]:
>>>   switch (a_2(D))  [INV] [count: INV], case 0:  [INV] 
>>> [count: INV], case 1:  [INV] [count: INV]>
>>>
>>>  () [0.00%] [count: INV]:
>>>   a_3 = a_2(D) + 2;
>>>
>>>  () [0.00%] [count: INV]:
>>>   _4 = 2;
>>>   goto  (); [INV] [count: INV]
>>>
>>>  () [0.00%] [count: INV]:
>>>   _5 = 123;
>>>
>>>   # _1 = PHI <_4(4), _5(5)>
>>>  () [0.00%] [count: INV]:
>>>   return _1;
>>>
>>> }
>>>
>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression 
>>> tests.
>>>
>>> Thoughts?
>>
>> I think I prefer to always see
>>
>>:
>>
>> and if there's a label just dump that as well, thus
>>
>>:
>>   L0:
>>
>> I think that's how we dump the case with multiple labels.  And always 
>> use the
>> implicit bb N when dumping destinations (in gotos, switches, etc).
>>
>> That is, what we have now is IMHO premature prettifying losing BB
>> indices in the dumps
>> unnecessarily.
>>
>> Richard.
>
> Hi.
>
> I like your ideas, there's difference in between 7.1 and modified trunk:
>
> foo (int a)
> {
>   int D.1824;
>   int _1;
>   int _4;
>   int _6;
>
>[0.00%] [count: INV]:
>   switch (a_2(D))  [INV] [count: INV], case 0:  [INV] 
> [count: INV], case 1:  [INV] [count: INV]>
>
>  [0.00%] [count: INV]:
>   a_3 = a_2(D) + 2;
>
>  [0.00%] [count: INV]:
>   _4 = 2;
>   goto  (); [INV] [count: INV]
>
>  [0.00%] [count: INV]:
>
>[0.00%] [count: INV]:
>   a_5 = a_2(D) + 2;
>
> label_XXX [0.00%] [count: INV]:
> label_YYY [0.00%] [count: INV]:
>   _6 = 101;
>
>   # _1 = PHI <_4(4), _6(7)>
>  [0.00%] [count: INV]:
>   return _1;
>
> }
>
> after:
>
> foo (int a)
> {
>   int D.1824;
>   int _1;
>   int _4;
>   int _6;
>
>[0.00%] [count: INV]:
>   switch (a_2(D))  [INV] [count: INV], case 0:  
> [INV] [count: INV], case 1:  [INV] [count: INV]>
>
>[0.00%] [count: INV]:
> :
>   a_3 = a_2(D) + 2;
>
>[0.00%] [count: INV]:
> :
>   _4 = 2;
>   goto ; [INV] [count: INV]
>
>[0.00%] [count: INV]:
> :
>
>[0.00%] [count: INV]:
>   a_5 = a_2(D) + 2;
>
>[0.00%] [count: INV]:
> label_XXX:
> label_YYY:
>   _6 = 101;
>
>[0.00%] [count: INV]:
>   # _1 = PHI <_4(4), _6(7)>
> :
>   return _1;
>
> }
>
> Do you like it? What about indentation of labels, should I increase it or 
> leave it?

 Leave it.

> I guess there will be some tests that will need to be adjusted.

 I guess so.

 I think : and friends are all DECL_ARTIFICIAL -- maybe we can avoid 
 dumping
 them?  Hmm, I guess doing it like above, while it preserves BB numbering, 
 does
 reflect the actual IL a bit less so I guess I'd leave the s in
 switches (those
 have labels) and gotos if there's still the label args (not in case of
 we are just
 dumping CFG edges).
>>>
>>> Good, thus said there's how it will look like:
>>>
>>> $ cat /tmp/switch.c
>>> int c;
>>>
>>> int foo(int a)
>>> {
>>>   switch (a)
>>>   {
>>>   case 0:
>>> a += 2;
>>>   case 1:
>>> if (c)
>>>   goto label_XXX;
>>> return 2;
>>>   default:
>>> break;
>>>   }
>>>
>>>   a += 2;
>>>
>>> label_XXX:
>>> label_YYY:
>>>

Re: [PATCH] Fix a pasto in gfc_check_num_images

2017-07-31 Thread Richard Biener

On Mon, Jul 31, 2017 at 9:39 AM, Jakub Jelinek  wrote:
> Hi!
>
> bootstrap-ubsan reported taking address of a &distance->where when
> distance is NULL.  The function has one block guarded with if (distance)
> and another guarded with if (failed), so I think this is just a pasto
> from the earlier if (distance) block.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Richard.

> 2017-07-31  Jakub Jelinek  
>
> * check.c (gfc_check_num_images): Fix a pasto.
>
> --- gcc/fortran/check.c.jj  2017-03-05 22:39:52.0 +0100
> +++ gcc/fortran/check.c 2017-07-28 14:30:40.503511275 +0200
> @@ -5149,7 +5149,7 @@ gfc_check_num_images (gfc_expr *distance
> return false;
>
>if (!gfc_notify_std (GFC_STD_F2008_TS, "FAILED= argument to "
> -  "NUM_IMAGES at %L", &distance->where))
> +  "NUM_IMAGES at %L", &failed->where))
> return false;
>  }
>
>
> Jakub

Re: [PATCH v2] Dump BB number when dumping a BB with label.

2017-07-31 Thread Martin Liška

On 07/31/2017 10:50 AM, Richard Biener wrote:
> On Mon, Jul 31, 2017 at 8:43 AM, Martin Liška  wrote:
>> On 07/28/2017 01:21 PM, Richard Biener wrote:
>>> On Fri, Jul 28, 2017 at 12:52 PM, Martin Liška  wrote:
 On 07/28/2017 09:58 AM, Richard Biener wrote:
> On Fri, Jul 28, 2017 at 9:50 AM, Martin Liška  wrote:
>> On 07/28/2017 09:21 AM, Richard Biener wrote:
>>> On Thu, Jul 27, 2017 at 4:24 PM, Martin Liška  wrote:
 Hi.

 Following simple patch adds support for dumping of BBs when it's a BB
 that contains a label. That makes it easier for debugging as one can
 find destination for an edge in dump file.

 Sample, before:

 foo (int a)
 {
   int D.1821;
   int _1;
   int _4;
   int _5;

[0.00%] [count: INV]:
   switch (a_2(D))  [INV] [count: INV], case 0:  
 [INV] [count: INV], case 1:  [INV] [count: INV]>

  [0.00%] [count: INV]:
   a_3 = a_2(D) + 2;

  [0.00%] [count: INV]:
   _4 = 2;
   goto  (); [INV] [count: INV]

  [0.00%] [count: INV]:
   _5 = 123;

   # _1 = PHI <_4(4), _5(5)>
  [0.00%] [count: INV]:
   return _1;

 }

 After:

 foo (int a)
 {
   int D.1821;
   int _1;
   int _4;
   int _5;

[0.00%] [count: INV]:
   switch (a_2(D))  [INV] [count: INV], case 0:  
 [INV] [count: INV], case 1:  [INV] [count: INV]>

  () [0.00%] [count: INV]:
   a_3 = a_2(D) + 2;

  () [0.00%] [count: INV]:
   _4 = 2;
   goto  (); [INV] [count: INV]

  () [0.00%] [count: INV]:
   _5 = 123;

   # _1 = PHI <_4(4), _5(5)>
  () [0.00%] [count: INV]:
   return _1;

 }

 Patch can bootstrap on ppc64le-redhat-linux and survives regression 
 tests.

 Thoughts?
>>>
>>> I think I prefer to always see
>>>
>>>:
>>>
>>> and if there's a label just dump that as well, thus
>>>
>>>:
>>>   L0:
>>>
>>> I think that's how we dump the case with multiple labels.  And always 
>>> use the
>>> implicit bb N when dumping destinations (in gotos, switches, etc).
>>>
>>> That is, what we have now is IMHO premature prettifying losing BB
>>> indices in the dumps
>>> unnecessarily.
>>>
>>> Richard.
>>
>> Hi.
>>
>> I like your ideas, there's difference in between 7.1 and modified trunk:
>>
>> foo (int a)
>> {
>>   int D.1824;
>>   int _1;
>>   int _4;
>>   int _6;
>>
>>[0.00%] [count: INV]:
>>   switch (a_2(D))  [INV] [count: INV], case 0:  [INV] 
>> [count: INV], case 1:  [INV] [count: INV]>
>>
>>  [0.00%] [count: INV]:
>>   a_3 = a_2(D) + 2;
>>
>>  [0.00%] [count: INV]:
>>   _4 = 2;
>>   goto  (); [INV] [count: INV]
>>
>>  [0.00%] [count: INV]:
>>
>>[0.00%] [count: INV]:
>>   a_5 = a_2(D) + 2;
>>
>> label_XXX [0.00%] [count: INV]:
>> label_YYY [0.00%] [count: INV]:
>>   _6 = 101;
>>
>>   # _1 = PHI <_4(4), _6(7)>
>>  [0.00%] [count: INV]:
>>   return _1;
>>
>> }
>>
>> after:
>>
>> foo (int a)
>> {
>>   int D.1824;
>>   int _1;
>>   int _4;
>>   int _6;
>>
>>[0.00%] [count: INV]:
>>   switch (a_2(D))  [INV] [count: INV], case 0:  
>> [INV] [count: INV], case 1:  [INV] [count: INV]>
>>
>>[0.00%] [count: INV]:
>> :
>>   a_3 = a_2(D) + 2;
>>
>>[0.00%] [count: INV]:
>> :
>>   _4 = 2;
>>   goto ; [INV] [count: INV]
>>
>>[0.00%] [count: INV]:
>> :
>>
>>[0.00%] [count: INV]:
>>   a_5 = a_2(D) + 2;
>>
>>[0.00%] [count: INV]:
>> label_XXX:
>> label_YYY:
>>   _6 = 101;
>>
>>[0.00%] [count: INV]:
>>   # _1 = PHI <_4(4), _6(7)>
>> :
>>   return _1;
>>
>> }
>>
>> Do you like it? What about indentation of labels, should I increase it 
>> or leave it?
>
> Leave it.
>
>> I guess there will be some tests that will need to be adjusted.
>
> I guess so.
>
> I think : and friends are all DECL_ARTIFICIAL -- maybe we can avoid 
> dumping
> them?  Hmm, I guess doing it like above, while it preserves BB numbering, 
> does
> reflect the actual IL a bit less so I guess I'd leave the s in
> switches (those
> have labels) and gotos if there's still the label args (not in case of
> we are just
> dumping CFG edges).

 Good, thus said there's how it will look like:

 $ cat /tmp/switch.c
 int c;

>>>

Re: [PATCH] Do UBSAN sanitization just when current_function_decl != NULL_TREE (PR sanitize/81530).

2017-07-31 Thread Martin Liška

On 07/31/2017 10:27 AM, Marek Polacek wrote:
> On Fri, Jul 28, 2017 at 01:20:57PM +0200, Martin Liška wrote:
>> Hi.
>>
>> In r249158 I added new sanitize_flags_p function. However I removed various 
>> calls of
>> do_ubsan_in_current_function:
>>
>> /* True if we want to play UBSan games in the current function.  */
>>
>> bool
>> do_ubsan_in_current_function ()
>> {
>>   return (current_function_decl != NULL_TREE
>>  && !lookup_attribute ("no_sanitize_undefined",
>>DECL_ATTRIBUTES (current_function_decl)));
>> }
>>
>> Where we also checked for current_function_decl. This putting that condition 
>> back to conditions
>> that used to call do_ubsan_in_current_function is necessary.
>>
>> May I ask you Marek (or Jakub) to go through and verify that the check is 
>> really necessary?
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed?
>> Martin
>>
>> gcc/cp/ChangeLog:
>>
>> 2017-07-28  Martin Liska  
>>
>>  PR sanitize/81530
>>  * cp-gimplify.c (cp_genericize): Guard condition with flag_sanitize_p
>>  also with current_function_decl non-null equality.
>>  * cp-ubsan.c (cp_ubsan_instrument_vptr_p): Likewise.
>>  * decl.c (compute_array_index_type): Likewise.
>>  * init.c (finish_length_check): Likewise.
>>  * typeck.c (cp_build_binary_op): Likewise.
>>
>> gcc/c/ChangeLog:
>>
>> 2017-07-28  Martin Liska  
>>
>>  PR sanitize/81530
>>  * c-convert.c (convert): Guard condition with flag_sanitize_p
>>  also with current_function_decl non-null equality.
>>  * c-decl.c (grokdeclarator): Likewise.
>>  * c-typeck.c (build_binary_op): Likewise.
>>
>> gcc/ChangeLog:
>>
>> 2017-07-28  Martin Liska  
>>
>>  PR sanitize/81530
>>  * convert.c (convert_to_integer_1): Guard condition with flag_sanitize_p
>>  also with current_function_decl non-null equality.
>>
>> gcc/c-family/ChangeLog:
>>
>> 2017-07-28  Martin Liska  
>>
>>  PR sanitize/81530
>>  * c-ubsan.c (ubsan_maybe_instrument_array_ref):
>>  Guard condition with flag_sanitize_p also with current_function_decl
>>  non-null equality.
>>  (ubsan_maybe_instrument_reference_or_call): Likewise.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2017-07-28  Martin Liska  
>>
>>  PR sanitize/81530
>>  * g++.dg/ubsan/pr81530.C: New test.
>> ---
>>  gcc/c-family/c-ubsan.c   | 6 --
>>  gcc/c/c-convert.c| 1 +
>>  gcc/c/c-decl.c   | 1 +
>>  gcc/c/c-typeck.c | 1 +
>>  gcc/convert.c| 3 ++-
>>  gcc/cp/cp-gimplify.c | 3 ++-
>>  gcc/cp/cp-ubsan.c| 3 +++
>>  gcc/cp/decl.c| 3 ++-
>>  gcc/cp/init.c| 3 ++-
>>  gcc/cp/typeck.c  | 1 +
>>  gcc/testsuite/g++.dg/ubsan/pr81530.C | 6 ++
>>  11 files changed, 25 insertions(+), 6 deletions(-)
>>  create mode 100644 gcc/testsuite/g++.dg/ubsan/pr81530.C
>>
>>
> 
>> diff --git a/gcc/c-family/c-ubsan.c b/gcc/c-family/c-ubsan.c
>> index a072d19eda6..541b53009c2 100644
>> --- a/gcc/c-family/c-ubsan.c
>> +++ b/gcc/c-family/c-ubsan.c
>> @@ -373,7 +373,8 @@ void
>>  ubsan_maybe_instrument_array_ref (tree *expr_p, bool ignore_off_by_one)
>>  {
>>if (!ubsan_array_ref_instrumented_p (*expr_p)
>> -  && sanitize_flags_p (SANITIZE_BOUNDS | SANITIZE_BOUNDS_STRICT))
>> +  && sanitize_flags_p (SANITIZE_BOUNDS | SANITIZE_BOUNDS_STRICT)
>> +  && current_function_decl != NULL_TREE)
>>  {
>>tree op0 = TREE_OPERAND (*expr_p, 0);
>>tree op1 = TREE_OPERAND (*expr_p, 1);
>> @@ -393,7 +394,8 @@ static tree
>>  ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree 
>> ptype,
>>enum ubsan_null_ckind ckind)
>>  {
>> -  if (!sanitize_flags_p (SANITIZE_ALIGNMENT | SANITIZE_NULL))
>> +  if (!sanitize_flags_p (SANITIZE_ALIGNMENT | SANITIZE_NULL)
>> +  || current_function_decl == NULL_TREE)
>>  return NULL_TREE;
>>  
>>tree type = TREE_TYPE (ptype);
>> diff --git a/gcc/c/c-convert.c b/gcc/c/c-convert.c
>> index 33c9143e354..07862543334 100644
>> --- a/gcc/c/c-convert.c
>> +++ b/gcc/c/c-convert.c
>> @@ -108,6 +108,7 @@ convert (tree type, tree expr)
>>  case INTEGER_TYPE:
>>  case ENUMERAL_TYPE:
>>if (sanitize_flags_p (SANITIZE_FLOAT_CAST)
>> +  && current_function_decl != NULL
> 
> Should be NULL_TREE.

Yes, fixed.

> 
> It's non-obvious to prove that some checks might not be necessary, but I
> verifed that gcc7 had do_ubsan_in_current_function where you're adding
> the current_function_decl checks, so I think this should be good to go.
> 
>   Marek
> 

Good, I installed the patch as r250730.

Martin

Re: [PATCH 4/N] Recover GOTO predictor.

2017-07-31 Thread Richard Biener

On Mon, Jul 31, 2017 at 9:46 AM, Martin Liška  wrote:
> Richi?

4 is fine.

> Thanks
>
> On 06/30/2017 03:48 PM, Martin Liška wrote:
>> On 06/22/2017 12:27 PM, Richard Biener wrote:
>>> On Wed, Jun 21, 2017 at 3:06 PM, Martin Liška  wrote:
 Hello.

 There's one additional predictor enhancement that is GOTO predict that
 used to working. Following patch adds expect statement for C and C++ family
 languages.

 There's one fallout which is vrp24.c test-case, where only 'Simplified 
 relational'
 appears just once. Adding Richi and Patrick who can probably help how to 
 fix the
 test-case.
>>>
>>> Happens to be optimized better now, there's only one predicate to simplify
>>> left in the IL input to VRP1.  I suggest to change it to match 1 times and 
>>> add
>>> -fdump-tree-optimized to dg-options and
>>>
>>> /* { dg-final { scan-tree-dump-times "if " 3 "optimized" } } */
>>>
>>> to verify we have 3 conditions left.
>>
>> One small note, I see 4 conditions in optimized dump:
>>
>> sss (struct rtx_def * insn, int code1, int code2, int code3)
>> {
>>   int D1544;
>>   struct rtx_def * body;
>>   _Bool D1562;
>>
>>[100.00%] [count: INV]:
>>   body_5 = insn_4(D)->u.fld[5].rt_rtx;
>>   D1544_6 = body_5->code;
>>   if (D1544_6 == 55)
>> goto  (L7); [34.00%] [count: INV]
>>   else
>> goto ; [66.00%] [count: INV]
>>
>>[66.00%] [count: INV]:
>>   if (code3_7(D) == 99)
>> goto ; [34.00%] [count: INV]
>>   else
>> goto  (L16); [66.00%] [count: INV]
>>
>>[22.44%] [count: INV]:
>>   D1562_9 = code1_8(D) == 10;
>>   if (D1562_9 == 1)
>> goto  (L7); [64.00%] [count: INV]
>>   else
>> goto  (L16); [36.00%] [count: INV]
>>
>>[9.82%] [count: INV]:
>>   arf ();
>>
>>[46.68%] [count: INV]:
>>   frob (); [tail call]
>>   goto  (L16); [100.00%] [count: INV]
>>
>> L7 [48.36%] [count: INV]:
>>   if (code2_12(D) == 42)
>> goto ; [20.24%] [count: INV]
>>   else
>> goto ; [79.76%] [count: INV]
>>
>> L16 [100.00%] [count: INV]:
>>   return;
>>
>> }
>>
>> Is it a problem or adjusting to 4 is fine?
>>
>> Martin
>>
>>>
 Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

 Ready to be installed?
 Martin
>>
>

Re: [PATCH] Fix PR81354 (rewrite gimple_split_edge)

2017-07-31 Thread Richard Biener

On Sun, Jul 30, 2017 at 8:04 PM, Bill Schmidt
 wrote:
> Hi,
>
> PR81354 identifies a latent bug that can happen in SLSR since the
> conditional candidate support was first added.  SLSR relies on the
> address of a GIMPLE PHI remaining constant during the course of the
> optimization pass, but it needs to split edges.  The use of
> make_single_succ_edge and reinstall_phi_args in gimple_split_edge
> causes GIMPLE PHI statements to be temporarily expanded to add a
> predecessor, and then rebuilt to have the original number of
> predecessors.  The expansion usually, if not always, causes the PHI
> statement to change address.  Thus gimple_split_edge needs to be
> rewritten to perform in-situ replacement of PHI arguments.
>
> The required pieces of make_single_succ_edge have been extracted into
> two places:  make_replacement_pred_edge, and some fixup code at the
> end of gimple_split_edge.  The division is necessary because the
> destination of the original edge must remember its original
> predecessors for the switch processing in
> gimple_redirect_edge_and_branch_1 to work properly.
>
> The function gimple_redirect_edge_and_branch was factored into two
> pieces so that most of it can be used by gimple_split_edge without
> calling ssa_redirect_edge, which also interferes with PHIs.  The
> useful bits of ssa_redirect_edge are factored out into the next three
> lines of gimple_split_edge.
>
> Similarly, redirect_eh_edge had already been similarly factored into
> redirect_eh_edge_1 and ssa_redirect_edge.  I took advantage of that
> and exposed redirect_eh_edge_1 for use in gimple_redirect_edge_and_branch_1.
>
> I've added the test from PR81354 as a torture test, but as we've seen,
> small changes elsewhere in the optimizer can easily hide the problem.
>
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
> Is this ok for trunk?  Eventually this needs to be backported to GCC 5,
> 6, and 7 if that's acceptable, since PR81354 was observed on
> gcc-5-branch.  I haven't yet prepared the backports.

I don't like make_replacement_pred_edge too much.  Wouldn't it work
to make sure we first shrink and then re-grow like if we simply do the
redirect_edge_and_branch before the make_single_succ_edge call?
At least quick testing shows it fixes the testcase on the GCC 6 branch for me.

Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 250732)
+++ gcc/tree-cfg.c  (working copy)
@@ -2753,12 +2753,16 @@ gimple_split_edge (edge edge_in)
   new_bb = create_empty_bb (after_bb);
   new_bb->frequency = EDGE_FREQUENCY (edge_in);
   new_bb->count = edge_in->count;
+
+  /* First redirect the existing edge to avoid reallocating
+ PHI nodes in dest.  */
+  e = redirect_edge_and_branch (edge_in, new_bb);
+  gcc_assert (e == edge_in);
+
   new_edge = make_edge (new_bb, dest, EDGE_FALLTHRU);
   new_edge->probability = REG_BR_PROB_BASE;
   new_edge->count = edge_in->count;

-  e = redirect_edge_and_branch (edge_in, new_bb);
-  gcc_assert (e == edge_in);
   reinstall_phi_args (new_edge, e);

   return new_bb;

Sorry for misleading you to a complex solution.

Thanks,
Richard.

> Thanks,
> Bill
>
>
> [gcc]
>
> 2017-07-30  Bill Schmidt  
>
> PR tree-optimization/81354
> * tree-cfg.c (gimple_redirect_edge_and_branch_1): New decl.
> (reinstall_phi_args): Delete function.
> (make_replacement_pred_edge): New function.
> (gimple_split_edge): Rewrite.
> (gimple_redirect_edge_and_branch_1): New function, factored
> from...
> (gimple_redirect_edge_and_branch): ...here.
> (split_critical_edges): Don't re-split already split edges.
> * tree-eh.c (redirect_eh_edge_1): Make visible.
> * tree-eh.h (redirect_eh_edge_1): Likewise.
>
> [gcc/testsuite]
>
> 2017-07-30  Bill Schmidt  
>
> PR tree-optimization/81354
> * g++.dg/torture/pr81354.C: New file.
>
>
> Index: gcc/testsuite/g++.dg/torture/pr81354.C
> ===
> --- gcc/testsuite/g++.dg/torture/pr81354.C  (nonexistent)
> +++ gcc/testsuite/g++.dg/torture/pr81354.C  (working copy)
> @@ -0,0 +1,24 @@
> +// PR81354 reported this test as crashing in a limited range of revisions.
> +// { dg-do compile }
> +
> +struct T { double a; double b; };
> +
> +void foo(T Ad[], int As[2])
> +{
> +  int j;
> +  int i;
> +  int Bs[2] = {0,0};
> +  T Bd[16];
> +
> +  for (j = 0; j < 4; j++) {
> +for (i = 0; i + 1 <= j + 1; i++) {
> +  Ad[i + As[0] * j] = Bd[i + Bs[0] * j];
> +}
> +
> +i = j + 1;  // <- comment out this line and it does not crash
> +for (; i + 1 < 5; i++) {
> +  Ad[i + As[0] * j].a = 0.0;
> +  Ad[i + As[0] * j].b = 0.0;
> +}
> +  }
> +}
> Index: gcc/tree-cfg.c
> ===
> --- gcc/tree-cfg.c  (revision 250721)
> +++ gcc/tree-cfg.c  (working copy)
>

Re: [ARM, VXworks] Fix build

2017-07-31 Thread Richard Earnshaw (lists)

On 25/07/17 11:31, Olivier Hainque wrote:
> Hi Richard,
> 
> (back from a few days away)
> 
>> On Jul 17, 2017, at 12:01 , Eric Botcazou  wrote:
>>
>>> That's good news.  Does that mean we'll be able to drop the old stuff
>>> though?  I'd really like to make progress towards removing the old ABI
>>> support from GCC.
>>
>> Yes, I'd think so, but Olivier knows better.
>>
>>> Note that considering a port for deprecation is only the first step.  It
>>> does not mean that it has been deprecated.  Sometimes the only way to
>>> find out if a port really is wanted is to make such a threat...
>>
>> No disagreement. ;-)
> 
> In this instance, no doubt we want to keep the port in. As I mentioned
> off-list, the port is still very active from our (AdaCore) perspective,
> I have just recently been enlisted as maintainer and plan to contribute
> significant updates.
> 
> Part of that is support for VxWorks 7, which has switched to the new
> ABI.
> 
> Regarding removal of old ABI support, which release were you
> targeting ?
> 
> On the VxWorks front, where we adapt to what the system toolchains
> do, it will mean dropping support for VxWorks versions prior to 7,
> which is not so old - couple of years I think. Not the end of the
> world, but an extra release cycle can make a difference.
> 
> Is VxWorks alone in this category ?

Pretty close, I think.  The only other ARM port using the old ABI that
I'm aware of is NetBSD.  That doesn't use GCC as it's default compiler
these days and even there an EABI port is in use (I suspect Clang
requires it).

So until I became aware of the VXworks port using the old ABI, I thought
there was only one remaining port - VXworks makes that 2 but both seem
to have a transition plan of sorts.

I think deprecating in gcc-8 with removal in GCC-9 is probably viable on
that basis, so that's my opening bid.  Given that gcc-8 will have a
2-year support window that means support for the old ABI through to
~2020, at which point v2 of the EABI will itself be 15 years old.

R.

> 
> Olivier
> 
> 
> 
> 
>

Re: [PATCH] Introduce TARGET_SUPPORTS_ALIASES

2017-07-31 Thread Yuri Gribov

On Mon, Jul 31, 2017 at 9:04 AM, Martin Liška  wrote:
> Hi.
>
> Doing the transformation suggested by Honza.
>
> Patch can bootstrap on ppc64le-redhat-linux and x86_64-linux-gnu and survives 
> regression tests.
> And I also verified that works on hppa2.0w-hp-hpux11.11 (target w/o aliasing 
> support).
>
> Ready to be installed?

A nit - you can probly get rid of ATTRIBUTE_UNUSED in note_mangling_alias now.

-Y

Re: [PING] [PATCH v4 0/12] [i386] Improve 64-bit Microsoft to System V ABI pro/epilogues

2017-07-31 Thread Daniel Santos


On 07/28/2017 09:41 AM, H.J. Lu wrote:

On Fri, Jul 28, 2017 at 6:57 AM, Daniel Santos  wrote:

On 07/26/2017 02:03 PM, H.J. Lu wrote:

This patch caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81563

Hello.  I've rebased my patch set and I'm now retesting.  I'm afraid that
your changes are wrong because my my sp_valid_at and fp_valid_at functions
are wrong -- these are supposed to be for the base offset and not the CFA
offset, sorry about that.  This means that the check in choose_basereg (and
thus choose_baseaddr) have been wrong as well.  I'm retesting now.

Please check your change with gcc.target/i386/pr81563.c.

Thanks.


I'm still getting used to x86 stack math and and briefly I thought that 
my understanding of the CFA was wrong and that I had messed up 
sp_valid_at and fp_valid_at, but I was mistaken, so sorry for the false 
alarm.  My rebased patches pass all tests, so it's OK.

C++ PATCH to fix ICE with bit-fields and ?: (PR c++/81607)

2017-07-31 Thread Marek Polacek

We crash in the gimplifier because it sees a shift expression with different
types of op1/op2.  The problem arises in cp_fold: it gets "1 ? d.c : 0" of type
int which is folded to "d.c", but it then sees that d.c is a bit-field with
declared type long int, so if produces "(long int) d.c" -- but that differs
from the original COND_EXPR type!  Folding should not change the type of the
CODE_EXPR, thus this fix.

Bootstrapped/regtested on x86_64-linux, ok for trunk/7/6?

2017-07-31  Marek Polacek  

PR c++/81607
* cp-gimplify.c (cp_fold): If folding exposed a branch of
a COND_EXPR, convert it to the original type of the COND_EXPR, if
they differ.   

* g++.dg/other/bitfield6.C: New test.

diff --git gcc/cp/cp-gimplify.c gcc/cp/cp-gimplify.c
index a9563b1a8cd..9fc4ab18207 100644
--- gcc/cp/cp-gimplify.c
+++ gcc/cp/cp-gimplify.c
@@ -2314,9 +2314,9 @@ cp_fold (tree x)
 
   /* A COND_EXPR might have incompatible types in branches if one or both
 arms are bitfields.  If folding exposed such a branch, fix it up.  */
-  if (TREE_CODE (x) != code)
-   if (tree type = is_bitfield_expr_with_lowered_type (x))
- x = fold_convert (type, x);
+  if (TREE_CODE (x) != code
+ && !useless_type_conversion_p (TREE_TYPE (org_x), TREE_TYPE (x)))
+   x = fold_convert (TREE_TYPE (org_x), x);
 
   break;
 
diff --git gcc/testsuite/g++.dg/other/bitfield6.C 
gcc/testsuite/g++.dg/other/bitfield6.C
index e69de29bb2d..c1e8a17989b 100644
--- gcc/testsuite/g++.dg/other/bitfield6.C
+++ gcc/testsuite/g++.dg/other/bitfield6.C
@@ -0,0 +1,9 @@
+// PR c++/81607
+
+int a;
+
+struct b {
+  long c : 32;
+} d;
+
+char f = (903092 ? int(d.c) : 0) << a;

Marek

[PATCH 0/6] [i386] PR80969 Fix ICE with -mabi=ms -mavx512f

2017-07-31 Thread Daniel Santos

When working on the Wine64 project to use aligned SSE MOVs after SP 
realignment and adding -mcall-ms2sysv-xlogues, I overlooked the fact 
that the function body may require a stack alignment greater than 
16-bytes.  This can result in an ICE with -mabi=ms -mavx512f and some 
other cases.  This patch set reworks the strategy for calculating the 
frame layout following normal (inline) integral register saves (at 
frame.reg_save_offset) to the start of the frame for the local function 
(frame.frame_pointer_offset).


I've completed a bootstrap and full regression test with no additional 
failures, but I don't have access to a machine with avx512 extensions.  
I have manually run the tests that need it using the Intel SDE, but I 
haven't been able to validate that my 
check_effective_target_avx512f_runtime code in 
gcc/testsuite/lib/target-supports.exp is correctly enabling the tests 
for pr80969-4*.c.


As an aside note, I still have some rework of the ms-sysv.exp tests that 
I haven't yet to submitted and in which I'm adding more tests for cases 
with uncommon stacks, as in PR 81563.


Thanks,
Daniel
2017-07-23  Daniel Santos  

* config/i386/i386.h (ix86_frame::outlined_save_offset): Remove field.
(ix86_frame::stack_realign_allocate_offset): Likewise.
(ix86_frame::stack_realign_allocate): New field.
(struct machine_frame_state): Modify comments.
(machine_frame_state::sp_realigned_fp_end): New field.
(machine_function::call_ms2sysv_pad_out): Remove field.
* config/i386/i386.c (xlogue_layout::get_stack_space_used): Modify.
(ix86_compute_frame_layout): Likewise.
(sp_valid_at): Likewise.
(fp_valid_at): Likewise.
(choose_baseaddr): Modify comments.
(ix86_emit_outlined_ms2sysv_save): Modify.
(ix86_expand_prologue): Likewise.
(ix86_expand_epilogue): Modify comments.
2017-07-23  Daniel Santos  
* gcc.target/i386/pr80969-1.c: New testcase.
* gcc.target/i386/pr80969-2a.c: Likewise.
* gcc.target/i386/pr80969-2.c: Likewise.
* gcc.target/i386/pr80969-3.c: Likewise.
* gcc.target/i386/pr80969-4a.c: Likewise.
* gcc.target/i386/pr80969-4b.c: Likewise.
* gcc.target/i386/pr80969-4.c: Likewise.

[PATCH 1/6] [i386] Correct comments, add assertions to sp_valid_at and fp_valid_at

2017-07-31 Thread Daniel Santos

When we realign the stack frame (without DRAP), there may be a range of
CFA offsets that should never be touched because they are alignment
padding and any reference to them is almost certainly an error.
Previously, only the offset of where the realigned stack frame starts
was recorded and checked in sp_valid_at and fp_valid_at.

This change adds sp_realigned_fp_last to struct machine_frame_state to
record the last valid offset from which the frame pointer can be used
when the stack pointer is realigned and modifies sp_valid_at and
fp_valid_at to fail an assertion when passed an offset in the "no-man's
land" between these two values.

Comments for struct machine_frame_state incorrectly stated that a
realigned stack pointer could be used to access offsets equal to or
greater than sp_realigned_offset, but it is only valid for offsets that
are greater.  This was the (incorrect) behaviour of sp_valid_at and
fp_valid_at prior to r250587 and this change now corrects the
documentation and adds clarification of the CFA-relative calculation.

Signed-off-by: Daniel Santos 
---
 gcc/config/i386/i386.c | 45 ++---
 gcc/config/i386/i386.h | 18 +-
 2 files changed, 43 insertions(+), 20 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f1486ff3750..690631dfe43 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13102,26 +13102,36 @@ choose_baseaddr_len (unsigned int regno, 
HOST_WIDE_INT offset)
   return len;
 }
 
-/* Determine if the stack pointer is valid for accessing the cfa_offset.
-   The register is saved at CFA - CFA_OFFSET.  */
+/* Determine if the stack pointer is valid for accessing the CFA_OFFSET in
+   the frame save area.  The register is saved at CFA - CFA_OFFSET.  */
 
-static inline bool
+static bool
 sp_valid_at (HOST_WIDE_INT cfa_offset)
 {
   const struct machine_frame_state &fs = cfun->machine->fs;
-  return fs.sp_valid && !(fs.sp_realigned
- && cfa_offset <= fs.sp_realigned_offset);
+  if (fs.sp_realigned && cfa_offset <= fs.sp_realigned_offset)
+{
+  /* Validate that the cfa_offset isn't in a "no-man's land".  */
+  gcc_assert (cfa_offset <= fs.sp_realigned_fp_last);
+  return false;
+}
+  return fs.sp_valid;
 }
 
-/* Determine if the frame pointer is valid for accessing the cfa_offset.
-   The register is saved at CFA - CFA_OFFSET.  */
+/* Determine if the frame pointer is valid for accessing the CFA_OFFSET in
+   the frame save area.  The register is saved at CFA - CFA_OFFSET.  */
 
 static inline bool
 fp_valid_at (HOST_WIDE_INT cfa_offset)
 {
   const struct machine_frame_state &fs = cfun->machine->fs;
-  return fs.fp_valid && !(fs.sp_valid && fs.sp_realigned
- && cfa_offset > fs.sp_realigned_offset);
+  if (fs.sp_realigned && cfa_offset > fs.sp_realigned_fp_last)
+{
+  /* Validate that the cfa_offset isn't in a "no-man's land".  */
+  gcc_assert (cfa_offset >= fs.sp_realigned_offset);
+  return false;
+}
+  return fs.fp_valid;
 }
 
 /* Choose a base register based upon alignment requested, speed and/or
@@ -14560,6 +14570,9 @@ ix86_expand_prologue (void)
   int align_bytes = crtl->stack_alignment_needed / BITS_PER_UNIT;
   gcc_assert (align_bytes > MIN_STACK_BOUNDARY / BITS_PER_UNIT);
 
+  /* Record last valid frame pointer offset.  */
+  m->fs.sp_realigned_fp_last = m->fs.sp_offset;
+
   /* The computation of the size of the re-aligned stack frame means
 that we must allocate the size of the register save area before
 performing the actual alignment.  Otherwise we cannot guarantee
@@ -14573,13 +14586,15 @@ ix86_expand_prologue (void)
   insn = emit_insn (ix86_gen_andsp (stack_pointer_rtx,
stack_pointer_rtx,
GEN_INT (-align_bytes)));
-  /* For the purposes of register save area addressing, the stack
-pointer can no longer be used to access anything in the frame
-below m->fs.sp_realigned_offset and the frame pointer cannot be
-used for anything at or above.  */
   m->fs.sp_offset = ROUND_UP (m->fs.sp_offset, align_bytes);
   m->fs.sp_realigned = true;
   m->fs.sp_realigned_offset = m->fs.sp_offset - frame.nsseregs * 16;
+  /* The stack pointer may no longer be equal to CFA - m->fs.sp_offset.
+Beyond this point, stack access should be done via choose_baseaddr or
+by using sp_valid_at and fp_valid_at to determine the correct base
+register.  Henceforth, any CFA offset should be thought of as logical
+and not physical.  */
+  gcc_assert (m->fs.sp_realigned_offset >= m->fs.sp_realigned_fp_last);
   gcc_assert (m->fs.sp_realigned_offset == frame.stack_realign_offset);
   /* SEH unwind emit doesn't currently support REG_CFA_EXPRESSION, which
 is needed to describe where a register is saved using a rea

[PATCH 2/6] [i386] Remove ix86_frame::outlined_save_offset

2017-07-31 Thread Daniel Santos

This value was used in an earlier incarnation of the
-mcall-ms2sysv-xlogues patch set but is now set and never read.  The
value of ix86_frame::sse_reg_save_offset serves the same purpose.

Signed-off-by: Daniel Santos 
---
 gcc/config/i386/i386.c | 1 -
 gcc/config/i386/i386.h | 4 +---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 690631dfe43..47c5608c3cd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12966,7 +12966,6 @@ ix86_compute_frame_layout (void)
 
   offset += xlogue.get_stack_space_used ();
   gcc_assert (!(offset & 0xf));
-  frame->outlined_save_offset = offset;
 }
 
   /* Align and set SSE register save area.  */
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index ce5bb7f6677..1648bdf1556 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2477,8 +2477,7 @@ enum avx_u128_state
<- end of stub-saved/restored regs
  [padding1]
]
-   <- outlined_save_offset
-   <- sse_regs_save_offset
+   <- sse_reg_save_offset
[padding2]
   |<- FRAME_POINTER
[va_arg registers]  |
@@ -2504,7 +2503,6 @@ struct GTY(()) ix86_frame
   HOST_WIDE_INT reg_save_offset;
   HOST_WIDE_INT stack_realign_allocate_offset;
   HOST_WIDE_INT stack_realign_offset;
-  HOST_WIDE_INT outlined_save_offset;
   HOST_WIDE_INT sse_reg_save_offset;
 
   /* When save_regs_using_mov is set, emit prologue using
-- 
2.13.3

[PATCH 3/6] [i386] Remove machine_function::call_ms2sysv_pad_out

2017-07-31 Thread Daniel Santos

The -mcall-ms2sysv-xlogues project added the boolean fields
call_ms2sysv_pad_in and call_ms2sysv_pad_out to struct machine_function
to track rather or not an additional 8 bytes of padding was needed for
stack alignment prior to and after the stub save area.  This design was
based upon the faulty assumption the function body would not require a
stack alignment greater than 16 bytes.  This continues to work well for
managing padding prior to the stub save area, but will not work for the
outgoing alignment.

Rather than changing machine_function::call_ms2sysv_pad_out to a larger
type, this patch removes it, thus transferring responsibility for stack
alignment following the stub save area from class xlogue_layout to the
body of ix86_compute_frame_layout.  Since the 64-bit va_arg register
save area is always a multiple of 16-bytes in size (176 for System V ABI
and 96 for Microsoft ABI), the ROUND_UP calculation for the stack offset
at the start of the function body (frame.frame_pointer_offset) will
assure there is enough room for any padding needed to keep the save area
for SSE va_args 16-byte aligned, so no modification is needed for that
calculation.

Signed-off-by: Daniel Santos 
---
 gcc/config/i386/i386.c | 18 --
 gcc/config/i386/i386.h |  8 ++--
 2 files changed, 6 insertions(+), 20 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 47c5608c3cd..e2e9546a27c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2491,9 +2491,7 @@ public:
 unsigned last_reg = m->call_ms2sysv_extra_regs + MIN_REGS - 1;
 
 gcc_assert (m->call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
-return m_regs[last_reg].offset
-  + (m->call_ms2sysv_pad_out ? 8 : 0)
-  + STUB_INDEX_OFFSET;
+return m_regs[last_reg].offset + STUB_INDEX_OFFSET;
   }
 
   /* Returns the offset for the base pointer used by the stub.  */
@@ -12849,13 +12847,12 @@ ix86_compute_frame_layout (void)
{
  unsigned count = xlogue_layout::count_stub_managed_regs ();
  m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS;
+ m->call_ms2sysv_pad_in = 0;
}
 }
 
   frame->nregs = ix86_nsaved_regs ();
   frame->nsseregs = ix86_nsaved_sseregs ();
-  m->call_ms2sysv_pad_in = 0;
-  m->call_ms2sysv_pad_out = 0;
 
   /* 64-bit MS ABI seem to require stack alignment to be always 16,
  except for function prologues, leaf functions and when the defult
@@ -12957,15 +12954,7 @@ ix86_compute_frame_layout (void)
   gcc_assert (!frame->nsseregs);
 
   m->call_ms2sysv_pad_in = !!(offset & UNITS_PER_WORD);
-
-  /* Select an appropriate layout for incoming stack offset.  */
-  const struct xlogue_layout &xlogue = xlogue_layout::get_instance ();
-
-  if ((offset + xlogue.get_stack_space_used ()) & UNITS_PER_WORD)
-   m->call_ms2sysv_pad_out = 1;
-
-  offset += xlogue.get_stack_space_used ();
-  gcc_assert (!(offset & 0xf));
+  offset += xlogue_layout::get_instance ().get_stack_space_used ();
 }
 
   /* Align and set SSE register save area.  */
@@ -12993,6 +12982,7 @@ ix86_compute_frame_layout (void)
 
   /* Align start of frame for local function.  */
   if (stack_realign_fp
+  || m->call_ms2sysv
   || offset != frame->sse_reg_save_offset
   || size != 0
   || !crtl->is_leaf
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 1648bdf1556..b08e45f68d4 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2646,17 +2646,13 @@ struct GTY(()) machine_function {
   BOOL_BITFIELD arg_reg_available : 1;
 
   /* If true, we're out-of-lining reg save/restore for regs clobbered
- by ms_abi functions calling a sysv function.  */
+ by 64-bit ms_abi functions calling a sysv_abi function.  */
   BOOL_BITFIELD call_ms2sysv : 1;
 
   /* If true, the incoming 16-byte aligned stack has an offset (of 8) and
- needs padding.  */
+ needs padding prior to out-of-line stub save/restore area.  */
   BOOL_BITFIELD call_ms2sysv_pad_in : 1;
 
-  /* If true, the size of the stub save area plus inline int reg saves will
- result in an 8 byte offset, so needs padding.  */
-  BOOL_BITFIELD call_ms2sysv_pad_out : 1;
-
   /* This is the number of extra registers saved by stub (valid range is
  0-6). Each additional register is only saved/restored by the stubs
  if all successive ones are. (Will always be zero when using a hard
-- 
2.13.3

[PATCH 4/6] [i386] Modify ix86_compute_frame_layout

2017-07-31 Thread Daniel Santos

These changes affect how the stack frame is calculated from the region
starting at frame.reg_save_offset until frame.frame_pointer_offset,
which includes either the stub save area or the (inline) SSE register
save area and the va_args register save area.

The calculation used when not realigning the stack pointer is the same,
but when when realigning we calculate the 16-byte aligned space needed
in reverse so that the stack realignment boundary at
frame.stack_realign_offset may not necessarily be a multiple of
stack_alignment_needed, but the value of frame.frame_pointer_offset
will. This results in a properly aligned stack for the function body and
avoids wasting stack space.

Signed-off-by: Daniel Santos 
---
 gcc/config/i386/i386.c | 116 +
 gcc/config/i386/i386.h |   2 +-
 2 files changed, 80 insertions(+), 38 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e2e9546a27c..e92f322de0c 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12874,6 +12874,14 @@ ix86_compute_frame_layout (void)
   gcc_assert (preferred_alignment >= STACK_BOUNDARY / BITS_PER_UNIT);
   gcc_assert (preferred_alignment <= stack_alignment_needed);
 
+  /* The only ABI saving SSE regs should be 64-bit ms_abi.  */
+  gcc_assert (TARGET_64BIT || !frame->nsseregs);
+  if (TARGET_64BIT && m->call_ms2sysv)
+{
+  gcc_assert (stack_alignment_needed >= 16);
+  gcc_assert (!frame->nsseregs);
+}
+
   /* For SEH we have to limit the amount of code movement into the prologue.
  At present we do this via a BLOCKAGE, at which point there's very little
  scheduling that can be done, which means that there's very little point
@@ -12936,54 +12944,88 @@ ix86_compute_frame_layout (void)
   if (TARGET_SEH)
 frame->hard_frame_pointer_offset = offset;
 
-  /* When re-aligning the stack frame, but not saving SSE registers, this
- is the offset we want adjust the stack pointer to.  */
-  frame->stack_realign_allocate_offset = offset;
+  /* Calculate the size of the va-arg area (not including padding, if any).  */
+  frame->va_arg_size = ix86_varargs_gpr_size + ix86_varargs_fpr_size;
 
-  /* The re-aligned stack starts here.  Values before this point are not
- directly comparable with values below this point.  Use sp_valid_at
- to determine if the stack pointer is valid for a given offset and
- fp_valid_at for the frame pointer.  */
   if (stack_realign_fp)
-offset = ROUND_UP (offset, stack_alignment_needed);
-  frame->stack_realign_offset = offset;
-
-  if (TARGET_64BIT && m->call_ms2sysv)
 {
-  gcc_assert (stack_alignment_needed >= 16);
-  gcc_assert (!frame->nsseregs);
+  /* We may need a 16-byte aligned stack for the remainder of the
+register save area, but the stack frame for the local function
+may require a greater alignment if using AVX/2/512.  In order
+to avoid wasting space, we first calculate the space needed for
+the rest of the register saves, add that to the stack pointer,
+and then realign the stack to the boundary of the start of the
+frame for the local function.  */
+  HOST_WIDE_INT space_needed = 0;
+  HOST_WIDE_INT sse_reg_space_needed = 0;
 
-  m->call_ms2sysv_pad_in = !!(offset & UNITS_PER_WORD);
-  offset += xlogue_layout::get_instance ().get_stack_space_used ();
-}
+  if (TARGET_64BIT)
+   {
+ if (m->call_ms2sysv)
+   {
+ m->call_ms2sysv_pad_in = 0;
+ space_needed = xlogue_layout::get_instance 
().get_stack_space_used ();
+   }
 
-  /* Align and set SSE register save area.  */
-  else if (frame->nsseregs)
-{
-  /* The only ABI that has saved SSE registers (Win64) also has a
-16-byte aligned default stack.  However, many programs violate
-the ABI, and Wine64 forces stack realignment to compensate.
+ else if (frame->nsseregs)
+   /* The only ABI that has saved SSE registers (Win64) also has a
+  16-byte aligned default stack.  However, many programs violate
+  the ABI, and Wine64 forces stack realignment to compensate.  */
+   space_needed = frame->nsseregs * 16;
+
+ sse_reg_space_needed = space_needed = ROUND_UP (space_needed, 16);
+
+ /* 64-bit frame->va_arg_size should always be a multiple of 16, but
+rounding to be pedantic.  */
+ space_needed = ROUND_UP (space_needed + frame->va_arg_size, 16);
+   }
+  else
+   space_needed = frame->va_arg_size;
+
+  /* Record the allocation size required prior to the realignment AND.  */
+  frame->stack_realign_allocate = space_needed;
+
+  /* The re-aligned stack starts at frame->stack_realign_offset.  Values
+before this point are not directly comparable with values below
+this point.  Use sp_valid_at to determine if the stack pointer is
+valid for a given offse

[PATCH 5/6] [i386] Modify SP realignment in ix86_expand_prologue, et. al.

2017-07-31 Thread Daniel Santos

The SP allocation calculation is now done in ix86_compute_frame_layout
and the result stored in ix86_frame::stack_realign_allocate.  This
change also updates comments for choose_baseaddr to clarify that the
alignment returned doesn't necessarily reflect the alignment of the
cfa_offset passed (e.g., you can pass cfa_offset 48 and it can return an
alignment of 64 bytes).

Since the alignment required may be more than 16-bytes, we cannot defer
SP allocation to ix86_emit_outlined_ms2sysv_save (when it's enabled), so
that function needs to be updated as well.

Signed-off-by: Daniel Santos 
---
 gcc/config/i386/i386.c | 54 +++---
 1 file changed, 29 insertions(+), 25 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e92f322de0c..7e1fc4dfbf5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13273,10 +13273,13 @@ choose_basereg (HOST_WIDE_INT cfa_offset, rtx 
&base_reg,
 }
 
 /* Return an RTX that points to CFA_OFFSET within the stack frame and
-   the alignment of address.  If align is non-null, it should point to
+   the alignment of address.  If ALIGN is non-null, it should point to
an alignment value (in bits) that is preferred or zero and will
-   recieve the alignment of the base register that was selected.  The
-   valid base registers are taken from CFUN->MACHINE->FS.  */
+   recieve the alignment of the base register that was selected,
+   irrespective of rather or not CFA_OFFSET is a multiple of that
+   alignment value.
+
+   The valid base registers are taken from CFUN->MACHINE->FS.  */
 
 static rtx
 choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned int *align)
@@ -14322,35 +14325,35 @@ ix86_emit_outlined_ms2sysv_save (const struct 
ix86_frame &frame)
   rtx sym, addr;
   rtx rax = gen_rtx_REG (word_mode, AX_REG);
   const struct xlogue_layout &xlogue = xlogue_layout::get_instance ();
-  HOST_WIDE_INT rax_offset = xlogue.get_stub_ptr_offset () + m->fs.sp_offset;
-  HOST_WIDE_INT stack_alloc_size = frame.stack_pointer_offset - 
m->fs.sp_offset;
-  HOST_WIDE_INT stack_align_off_in = xlogue.get_stack_align_off_in ();
+  HOST_WIDE_INT allocate = frame.stack_pointer_offset - m->fs.sp_offset;
+
+  /* AL should only be live with sysv_abi.  */
+  gcc_assert (!ix86_eax_live_at_start_p ());
+
+  /* Setup RAX as the stub's base pointer.  We use stack_realign_offset rather
+ we've actually realigned the stack or not.  */
+  align = GET_MODE_ALIGNMENT (V4SFmode);
+  addr = choose_baseaddr (frame.stack_realign_offset
+ + xlogue.get_stub_ptr_offset (), &align);
+  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
+  emit_insn (gen_rtx_SET (rax, addr));
 
-  /* Verify that the incoming stack 16-byte alignment offset matches the
- layout we're using.  */
-  gcc_assert (stack_align_off_in == (m->fs.sp_offset & UNITS_PER_WORD));
+  /* Allocate stack if not already done.  */
+  if (allocate > 0)
+  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+   GEN_INT (-allocate), -1, false);
 
   /* Get the stub symbol.  */
   sym = xlogue.get_stub_rtx (frame_pointer_needed ? XLOGUE_STUB_SAVE_HFP
  : XLOGUE_STUB_SAVE);
   RTVEC_ELT (v, vi++) = gen_rtx_USE (VOIDmode, sym);
 
-  /* Setup RAX as the stub's base pointer.  */
-  align = GET_MODE_ALIGNMENT (V4SFmode);
-  addr = choose_baseaddr (rax_offset, &align);
-  gcc_assert (align >= GET_MODE_ALIGNMENT (V4SFmode));
-  insn = emit_insn (gen_rtx_SET (rax, addr));
-
-  gcc_assert (stack_alloc_size >= xlogue.get_stack_space_used ());
-  pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
-GEN_INT (-stack_alloc_size), -1,
-m->fs.cfa_reg == stack_pointer_rtx);
   for (i = 0; i < ncregs; ++i)
 {
   const xlogue_layout::reginfo &r = xlogue.get_reginfo (i);
   rtx reg = gen_rtx_REG ((SSE_REGNO_P (r.regno) ? V4SFmode : word_mode),
 r.regno);
-  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);;
+  RTVEC_ELT (v, vi++) = gen_frame_store (reg, rax, -r.offset);
 }
 
   gcc_assert (vi == (unsigned)GET_NUM_ELEM (v));
@@ -14608,8 +14611,8 @@ ix86_expand_prologue (void)
 that we must allocate the size of the register save area before
 performing the actual alignment.  Otherwise we cannot guarantee
 that there's enough storage above the realignment point.  */
-  allocate = frame.stack_realign_allocate_offset - m->fs.sp_offset;
-  if (allocate && !m->call_ms2sysv)
+  allocate = frame.stack_realign_allocate;
+  if (allocate)
 pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
   GEN_INT (-allocate), -1, false);
 
@@ -14618,8 +14621,7 @@ ix86_expand_prologue (void)
stack_pointer_rtx,

[PATCH 6/6] [i386, testsuite] Add tests, fix bug in check_avx2_hw_available

2017-07-31 Thread Daniel Santos

The testcase in the PR is used as a base and relevant variants are added
to test other factors affected by the patch set.

pr80969-1.c   Base test case.
pr80969-2.c   With ms to sysv call.
pr80969-2a.c  With ms to sysv call using stubs.
pr80969-3.c   With alloca (for DRAP test).
pr80969-4.c   With va_args passed via va_list
pr80969-4a.c  With va_args passed via va_list and ms to sysv call.
pr80969-4b.c  With va_args passed via va_list and ms to sysv call using
  stubs.

Signed-off-by: Daniel Santos 
---
 gcc/testsuite/gcc.target/i386/pr80969-1.c  |  16 
 gcc/testsuite/gcc.target/i386/pr80969-2.c  |  26 ++
 gcc/testsuite/gcc.target/i386/pr80969-2a.c |  26 ++
 gcc/testsuite/gcc.target/i386/pr80969-3.c  |  31 
 gcc/testsuite/gcc.target/i386/pr80969-4.c  | 123 
 gcc/testsuite/gcc.target/i386/pr80969-4a.c | 124 +
 gcc/testsuite/gcc.target/i386/pr80969-4b.c | 124 +
 gcc/testsuite/lib/target-supports.exp  |  66 +++
 8 files changed, 536 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4b.c

diff --git a/gcc/testsuite/gcc.target/i386/pr80969-1.c 
b/gcc/testsuite/gcc.target/i386/pr80969-1.c
new file mode 100644
index 000..eb8d767a778
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-1.c
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+int a[56];
+int b;
+int main (int argc, char *argv[]) {
+  int c;
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-2.c 
b/gcc/testsuite/gcc.target/i386/pr80969-2.c
new file mode 100644
index 000..e868d6c7e5c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-2.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test when calling a sysv func.  */
+
+int a[56];
+int b;
+
+static void __attribute__((sysv_abi)) sysv ()
+{
+}
+
+void __attribute__((sysv_abi)) (*volatile const sysv_noinfo)() = sysv;
+
+int main (int argc, char *argv[]) {
+  int c;
+  sysv_noinfo ();
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-2a.c 
b/gcc/testsuite/gcc.target/i386/pr80969-2a.c
new file mode 100644
index 000..071a90534a4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-2a.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f -mcall-ms2sysv-xlogues" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test when calling a sysv func using save/restore stubs.  */
+
+int a[56];
+int b;
+
+static void __attribute__((sysv_abi)) sysv ()
+{
+}
+
+void __attribute__((sysv_abi)) (*volatile const sysv_noinfo)() = sysv;
+
+int main (int argc, char *argv[]) {
+  int c;
+  sysv_noinfo ();
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-3.c 
b/gcc/testsuite/gcc.target/i386/pr80969-3.c
new file mode 100644
index 000..5982981b55c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-3.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test with alloca (and DRAP).  */
+
+#include 
+
+int a[56];
+volatile int b = -12345;
+volatile const int d = 42;
+
+void foo (int *x, int y, int z)
+{
+}
+
+void (*volatile const foo_noinfo)(int *, int, int) = foo;
+
+int main (int argc, char *argv[]) {
+  int c;
+  int *e = alloca (d);
+  foo_noinfo (e, d, 0);
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+foo_noinfo (e, d, c);
+a[-(b % 56)] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-4.c 
b/gcc/testsuite/gcc.target/i386/pr80969-4.c
new file mode 100644
index 000..1ec54d081cd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-4.c
@@ -0,0 +1,123 @@
+/* { dg-do run { target avx512f_runtime } } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test with avx512 and va_args.  */
+
+#include 
+#include 
+
+#include "avx-check.h"
+
+int a[56];
+int b;
+
+__m128 n1 = { -283.3, -23.3, 213.4, 1119.03 };
+__m512d n2 = { -93.83, 893.318, 3994.3, -39484.0, 830.32, -328.32, 3.14159, 
2.99792 };
+__m128i n3 = { 893, -3180 } ;
+int n4 = 324;
+double n5 = 1

[PATCH][v2] Introduce TARGET_SUPPORTS_ALIASES

2017-07-31 Thread Martin Liška

On 07/31/2017 11:57 AM, Yuri Gribov wrote:
> On Mon, Jul 31, 2017 at 9:04 AM, Martin Liška  wrote:
>> Hi.
>>
>> Doing the transformation suggested by Honza.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and x86_64-linux-gnu and 
>> survives regression tests.
>> And I also verified that works on hppa2.0w-hp-hpux11.11 (target w/o aliasing 
>> support).
>>
>> Ready to be installed?
> 
> A nit - you can probly get rid of ATTRIBUTE_UNUSED in note_mangling_alias now.
> 
> -Y
> 

Sure.

Done in v2.

Martin
>From 78ee08b25d22125cb1fa248bac98ef1e84504761 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 25 Jul 2017 13:11:28 +0200
Subject: [PATCH] Introduce TARGET_SUPPORTS_ALIASES

gcc/c-family/ChangeLog:

2017-07-25  Martin Liska  

	* c-opts.c (c_common_post_options): Replace ASM_OUTPUT_DEF with
	TARGET_SUPPORTS_ALIASES.

gcc/ChangeLog:

2017-07-25  Martin Liska  

	* asan.c (asan_protect_global): Replace ASM_OUTPUT_DEF with
	TARGET_SUPPORTS_ALIASES.
	* cgraph.c (cgraph_node::create_same_body_alias): Likewise.
	* ipa-visibility.c (can_replace_by_local_alias): Likewise.
	(optimize_weakref): Likewise.
	* symtab.c (symtab_node::noninterposable_alias): Likewise.
	* varpool.c (varpool_node::create_extra_name_alias): Likewise.
	* defaults.h: Introduce TARGET_SUPPORTS_ALIASES.

gcc/cp/ChangeLog:

2017-07-25  Martin Liska  

	* decl2.c (get_tls_init_fn): Replace ASM_OUTPUT_DEF with
	TARGET_SUPPORTS_ALIASES.
	(handle_tls_init): Likewise.
	(note_mangling_alias): Likewise.  Remove ATTRIBUTE_UNUSED for
	both arguments.
	* optimize.c (can_alias_cdtor): Likewise.
---
 gcc/asan.c|  4 +---
 gcc/c-family/c-opts.c | 22 --
 gcc/cgraph.c  |  7 ---
 gcc/cp/decl2.c| 25 +++--
 gcc/cp/optimize.c |  6 +++---
 gcc/defaults.h|  9 +
 gcc/ipa-visibility.c  | 15 +--
 gcc/symtab.c  |  6 +++---
 gcc/varpool.c |  6 +++---
 9 files changed, 51 insertions(+), 49 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 5f9275f6425..d8cb2b52c8b 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1663,10 +1663,8 @@ asan_protect_global (tree decl)
   if (lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
 return false;
 
-#ifndef ASM_OUTPUT_DEF
-  if (asan_needs_local_alias (decl))
+  if (!TARGET_SUPPORTS_ALIASES && asan_needs_local_alias (decl))
 return false;
-#endif
 
   return true;
 }
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 1657e7a4390..0b13a188c1b 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -957,16 +957,18 @@ c_common_post_options (const char **pfilename)
 
   if (flag_extern_tls_init)
 {
-#if !defined (ASM_OUTPUT_DEF) || !SUPPORTS_WEAK
-  /* Lazy TLS initialization for a variable in another TU requires
-	 alias and weak reference support. */
-  if (flag_extern_tls_init > 0)
-	sorry ("external TLS initialization functions not supported "
-	   "on this target");
-  flag_extern_tls_init = 0;
-#else
-  flag_extern_tls_init = 1;
-#endif
+  if (!TARGET_SUPPORTS_ALIASES || !SUPPORTS_WEAK)
+	{
+	  /* Lazy TLS initialization for a variable in another TU requires
+	 alias and weak reference support.  */
+	  if (flag_extern_tls_init > 0)
+	sorry ("external TLS initialization functions not supported "
+		   "on this target");
+
+	  flag_extern_tls_init = 0;
+	}
+  else
+	flag_extern_tls_init = 1;
 }
 
   if (num_in_fnames > 1)
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index d7c9ba61795..849989443b1 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -582,10 +582,11 @@ cgraph_node *
 cgraph_node::create_same_body_alias (tree alias, tree decl)
 {
   cgraph_node *n;
-#ifndef ASM_OUTPUT_DEF
+
   /* If aliases aren't supported by the assembler, fail.  */
-  return NULL;
-#endif
+  if (!TARGET_SUPPORTS_ALIASES)
+return NULL;
+
   /* Langhooks can create same body aliases of symbols not defined.
  Those are useless. Drop them on the floor.  */
   if (symtab->global_info_ready)
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 2a52f8ca3e2..29a2e3cf02d 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -3156,11 +3156,9 @@ get_tls_init_fn (tree var)
   if (!flag_extern_tls_init && DECL_EXTERNAL (var))
 return NULL_TREE;
 
-#ifdef ASM_OUTPUT_DEF
   /* If the variable is internal, or if we can't generate aliases,
  call the local init function directly.  */
-  if (!TREE_PUBLIC (var))
-#endif
+  if (!TREE_PUBLIC (var) || !TARGET_SUPPORTS_ALIASES)
 return get_local_tls_init_fn ();
 
   tree sname = mangle_tls_init_fn (var);
@@ -4241,9 +4239,8 @@ handle_tls_init (void)
   tree init = TREE_PURPOSE (vars);
   one_static_initialization_or_destruction (var, init, true);
 
-#ifdef ASM_OUTPUT_DEF
   /* Output init aliases even with -fno-extern-tls-init.  */
-  if (TREE_PUBLIC (var))
+  if (TARGET_SUPPORTS_ALIASES && TREE_PUBLIC (var))
 	{
   tree single_init_fn = get_tls_init_fn (var);
 	  if (single_init_fn == NULL_TR

Re: C PATCH to display types when printing a conversion warning (PR c/81233)

2017-07-31 Thread Marek Polacek

Ping.

On Thu, Jul 20, 2017 at 12:53:10PM +0200, Marek Polacek wrote:
> On Wed, Jul 19, 2017 at 10:51:33AM -0400, David Malcolm wrote:
> > The changes to diagnostic-core.h and diagnostic.c are OK.
> 
> Thanks.
>  
> > > Also,
> > > PEDWARN_FOR_ASSIGNMENT didn't work with the addition of printing TYPE
> > > and
> > > RHSTYPE so I just decided to unroll the macro instead of making it
> > > even more
> > > ugly.
> > > This patch is long but it's mainly because of the testsuite fallout.
> > 
> > The comment by PEDWARN_FOR_ASSIGNMENT says:
> > 
> > 
> >   /* This macro is used to emit diagnostics to ensure that all format
> >  strings are complete sentences, visible to gettext and checked
> > at
> >  compile time.  */
> > 
> > I wonder if it's possible to convert it to an inline function to get
> > the same test coverage, without unrolling the macro?
> 
> Yeah, I tried, but the resulting inline function would have to have 12
> parameters if I count well and that didn't seem like a win.  Perhaps
> splitting convert_for_assignment would make sense, but likely not as
> part of this patch.

Marek

[Committed] S/390: Support z14 as CPU name.

2017-07-31 Thread Andreas Krebbel

With IBM z14 officially announced we can add support for z14 as
preferred CPU name.  We still pass arch12 to Binutils in order to keep
older Binutils versions supported.

Bootstrapped and regression-tested on s390x.

Committed to mainline and GCC 7 branch.

Bye,

-Andreas-


gcc/ChangeLog:

2017-07-31  Andreas Krebbel  

* config.gcc: Add z14.
* config/s390/driver-native.c (s390_host_detect_local_cpu): Add
CPU model numbers for z13s and z14.
* config/s390/s390-c.c (s390_resolve_overloaded_builtin): Replace
arch12 with z14.
* config/s390/s390-opts.h (enum processor_type): Rename
PROCESSOR_ARCH12 to PROCESSOR_3906_Z14.
* config/s390/s390.c (processor_table): Add field for CPU name to
be passed to Binutils.
(s390_asm_output_machine_for_arch): Use the new field in
processor_table for Binutils.
(s390_expand_builtin): Replace arch12 with z14.
(s390_issue_rate): Rename PROCESSOR_ARCH12 to PROCESSOR_3906_Z14.
(s390_get_sched_attrmask): Likewise.
(s390_get_unit_mask): Likewise.
* config/s390/s390.opt: Add z14 to processor_type enum.
---
 gcc/config.gcc  |  2 +-
 gcc/config/s390/driver-native.c |  6 +-
 gcc/config/s390/s390-c.c|  4 ++--
 gcc/config/s390/s390-opts.h |  2 +-
 gcc/config/s390/s390.c  | 38 +-
 gcc/config/s390/s390.opt|  5 -
 6 files changed, 34 insertions(+), 23 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 9196843..a9196cd 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4336,7 +4336,7 @@ case "${target}" in
for which in arch tune; do
eval "val=\$with_$which"
case ${val} in
-   "" | native | g5 | g6 | z900 | z990 | z9-109 | z9-ec | 
z10 | z196 | zEC12 | z13 | arch3 | arch5 | arch6 | arch7 | arch8 | arch9 | 
arch10 | arch11 | arch12)
+   "" | native | g5 | g6 | z900 | z990 | z9-109 | z9-ec | 
z10 | z196 | zEC12 | z13 | z14 | arch3 | arch5 | arch6 | arch7 | arch8 | arch9 
| arch10 | arch11 | arch12)
# OK
;;
*)
diff --git a/gcc/config/s390/driver-native.c b/gcc/config/s390/driver-native.c
index 4bcddb4..acb9836 100644
--- a/gcc/config/s390/driver-native.c
+++ b/gcc/config/s390/driver-native.c
@@ -112,10 +112,14 @@ s390_host_detect_local_cpu (int argc, const char **argv)
  cpu = "zEC12";
  break;
case 0x2964:
+   case 0x2965:
  cpu = "z13";
  break;
+   case 0x3906:
+ cpu = "z14";
+ break;
default:
- cpu = "arch12";
+ cpu = "z14";
  break;
}
}
diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
index 35c3545..07224ad 100644
--- a/gcc/config/s390/s390-c.c
+++ b/gcc/config/s390/s390-c.c
@@ -886,7 +886,7 @@ s390_resolve_overloaded_builtin (location_t loc,
 
   if (!TARGET_VXE && (ob_flags & B_VXE))
 {
-  error_at (loc, "%qF requires -march=arch12 or higher", ob_fndecl);
+  error_at (loc, "%qF requires z14 or higher", ob_fndecl);
   return error_mark_node;
 }
 
@@ -963,7 +963,7 @@ s390_resolve_overloaded_builtin (location_t loc,
   if (!TARGET_VXE
   && bflags_overloaded_builtin_var[last_match_index] & B_VXE)
 {
-  error_at (loc, "%qs matching variant requires -march=arch12 or higher",
+  error_at (loc, "%qs matching variant requires z14 or higher",
IDENTIFIER_POINTER (DECL_NAME (ob_fndecl)));
   return error_mark_node;
 }
diff --git a/gcc/config/s390/s390-opts.h b/gcc/config/s390/s390-opts.h
index 65ac4f8..6d506e2 100644
--- a/gcc/config/s390/s390-opts.h
+++ b/gcc/config/s390/s390-opts.h
@@ -38,7 +38,7 @@ enum processor_type
   PROCESSOR_2817_Z196,
   PROCESSOR_2827_ZEC12,
   PROCESSOR_2964_Z13,
-  PROCESSOR_ARCH12,
+  PROCESSOR_3906_Z14,
   PROCESSOR_NATIVE,
   PROCESSOR_max
 };
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 7be22d9..c408d59 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -318,24 +318,27 @@ struct processor_costs zEC12_cost =
 
 static struct
 {
+  /* The preferred name to be used in user visible output.  */
   const char *const name;
+  /* CPU name as it should be passed to Binutils via .machine  */
+  const char *const binutils_name;
   const enum processor_type processor;
   const struct processor_costs *cost;
 }
 const processor_table[] =
 {
-  { "g5", PROCESSOR_9672_G5, &z900_cost },
-  { "g6", PROCESSOR_9672_G6, &z900_cost },
-  { "z900",   PROCESSOR_2064_Z900,   &z900_cost },
-  { "z990",   PROCESSOR_2084_Z990,   &z990_cost },
-  { "z9-109", PROCESSOR_2094_Z9_109, &z9_109_cost },
-  { "z9-ec",  PROCESSOR_2094_Z9_EC,  &z9_109_cost },
-  { "z10",PROCESSOR_2097_Z10,

Re: [PATCH] Make inlining consistent in LTO and non-LTO mode (PR target/71991).

2017-07-31 Thread Martin Liška

Honza?

Thanks,
Martin

On 06/30/2017 03:50 PM, Martin Liška wrote:
> On 06/28/2017 05:18 PM, Jan Hubicka wrote:
>>> On 06/28/2017 04:24 PM, Jan Hubicka wrote:
> -  /* If callee has no option attributes, then it is ok to inline.  */
> -  if (!callee_tree)
> +  /* If callee has no option attributes (or default),
> + then it is ok to inline.  */
> +  if (!callee_tree || callee_tree == target_option_default_node)

 I am not sure this actually makes sense, because 
 target_option_default_node is not very
 meaningful for LTO (it contains whatever was passed to LTO driver). 
>>>
>>> I see!
>>>
>>>  Perhaps one can check
 for explicit optimization/machine attribute and whether caller and callee 
 come from
> 
> I'm not sure what you mean by 'for explicit optimization/machine attribute' ?
> 
> I'm attaching a new patch, is it closer?
> 
> Martin
> 
 same compilation unit, though this is quite hackish and will do unexpected 
 things with COMDATs.
>>>
>>> That's quite cumbersome. Any other idea than marking the PR as won't fix?
>>
>> Yep, it is not prettiest. The problem is that the concept that callee can 
>> change semantics
>> when no explicit attribute is present is sloppy.  I am not sure how many 
>> programs rely on it
>> (it is kind of surprising to see functions not being inlined into your 
>> target attribute annotated
>> function I guess).
>> Note that we check for original file in inliner already - this can be done 
>> by comparing lto_file_data
>> of corresponding cgraph nodes.
>>
>> Honza
>>
>

Re: [PATCH v12] add -fpatchable-function-entry=N,M option

2017-07-31 Thread Maxim Kuvyrkov

On Jul 26, 2017, at 5:33 PM, Andreas Schwab  wrote:
> 
> On Jul 26 2017, Torsten Duwe  wrote:
> 
>> On Wed, Jul 26, 2017 at 04:16:25PM +0200, Andreas Schwab wrote:
>>> On Jul 07 2017, Torsten Duwe  wrote:
>>> 
 diff --git a/gcc/testsuite/c-c++-common/patchable_function_entry-decl.c 
 b/gcc/testsuite/c-c++-common/patchable_function_entry-decl.c
 new file mode 100644
 index 000..8514b10e820
 --- /dev/null
 +++ b/gcc/testsuite/c-c++-common/patchable_function_entry-decl.c
 @@ -0,0 +1,16 @@
 +/* { dg-do compile } */
 +/* { dg-options "-O2 -fpatchable-function-entry=3,1" } */
 +/* { dg-final { scan-assembler-times "nop" 2 } } */
>>> 
>>> This fails on ia64.
>> 
>> The solution is fairly obvious: on architectures where the nop is not called
>> "nop" provide a custom, cpu-specific test, or document the failure.
> 
> But on ia64, a nop _is_ called nop.

The problem here is that ia64 backend emits "nop" instructions to pad IA64 
bundles.  The 2 nops at the beginning are [as expected] from the patchable 
attribute, but [unexpected] nops after ld8.mov and before "add r8" are 
generated by ia64 bundle packing.

nop 0
nop 0
.prologue
.body
.mmi
addl r14 = @ltoffx(a#), r1
;;
ld8.mov r14 = [r14], a#
nop 0
;;
.mmi
ld4 r14 = [r14]
;;
shladd r8 = r14, 2, r0
nop 0
;;
.mib
nop 0
add r8 = r8, r14
br.ret.sptk.many b0

I don't see an easy way to correctly differentiate between "attribute" nops and 
"bundle" nops, so XFAILing these tests on ia64 seems like a valid approach.

I speculate that other tests fail on ia64 for the same reason, but I didn't 
check.

--
Maxim Kuvyrkov
www.linaro.org

[PATCH, committed] Add myself to MAINTAINERS

2017-07-31 Thread Robin Dapp

ChangeLog:

2017-07-31  Robin Dapp  

* MAINTAINERS (write after approval): Add myself.
Index: MAINTAINERS
===
--- MAINTAINERS	(revision 250740)
+++ MAINTAINERS	(working copy)
@@ -356,6 +356,7 @@
 Lawrence Crowl	
 Ian Dall	
 David Daney	
+Robin Dapp	
 Simon Dardis	
 Bud Davis	
 Chris Demetriou

Re: [PATCH, i386]: Implement attribute ((naked))

2017-07-31 Thread Uros Bizjak

On Sun, Jul 30, 2017 at 10:14 PM, Uros Bizjak  wrote:
> Hello!
>
> attribute ((naked)) generates function body without function frame,
> and as shown in PR 25967 [1], users are looking for this feature also
> for x86 targets. Recently, Daniel introduced a testcase that would
> benefit from this attribute.

Following additional patch enables passing arguments to a naked
function. Testcases gcc.target/i386/naked-3.c and
gcc.target/i386/naked-4.c show necessary decorations (i.e. mregparm
for -m32 and volatile "ret" for function result) to reliably pass
function arguments to and function result from naked functions.

2017-07-31  Uros Bizjak  

PR target/25967
* config/i386/i386.c (ix86_allocate_stack_slots_for_args):
New function.
(TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS): Define.

testsuite/ChangeLog:

2017-07-31  Uros Bizjak  

PR target/25967
* gcc.target/i386/naked-3.c (dg-options): Use -O0.
(naked): Add attribute regparm(1) for x86_32 targets.
Add integer argument.  Remove global "data" variable.
(main): Pass integer argument to naked function.
* gcc.target/i386/naked-4.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 250736)
+++ config/i386/i386.c  (working copy)
@@ -31676,6 +31676,13 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rt
 }
 
 static bool
+ix86_allocate_stack_slots_for_args (void)
+{
+  /* Naked functions should not allocate stack slots for arguments.  */
+  return !ix86_function_naked (current_function_decl);
+}
+
+static bool
 ix86_warn_func_return (tree decl)
 {
   /* Naked functions are implemented entirely in assembly, including the
@@ -52727,6 +52734,8 @@ ix86_run_selftests (void)
 #define TARGET_SETUP_INCOMING_VARARGS ix86_setup_incoming_varargs
 #undef TARGET_MUST_PASS_IN_STACK
 #define TARGET_MUST_PASS_IN_STACK ix86_must_pass_in_stack
+#undef TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS
+#define TARGET_ALLOCATE_STACK_SLOTS_FOR_ARGS ix86_allocate_stack_slots_for_args
 #undef TARGET_FUNCTION_ARG_ADVANCE
 #define TARGET_FUNCTION_ARG_ADVANCE ix86_function_arg_advance
 #undef TARGET_FUNCTION_ARG
Index: testsuite/gcc.target/i386/naked-3.c
===
--- testsuite/gcc.target/i386/naked-3.c (revision 250736)
+++ testsuite/gcc.target/i386/naked-3.c (working copy)
@@ -1,17 +1,18 @@
 /* { dg-do run { target *-*-linux* *-*-gnu* } } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O0" } */
 
 #include 
 #include 
 #include 
 
-int data;
-
 /* Verify that naked function traps at the end.  */
 
 void
 __attribute__((naked, noinline, noclone))
-naked (void)
+#ifdef __i386__
+__attribute__((regparm(1)))
+#endif
+naked (int data)
 {
   if (data == 0x12345678)
 return;
@@ -32,8 +33,7 @@ int main ()
   s.sa_flags = 0;
   sigaction (SIGILL, &s, NULL);
 
-  data = 0x12345678;
-  naked ();
+  naked (0x12345678);
 
   abort ();
 }
Index: testsuite/gcc.target/i386/naked-4.c
===
--- testsuite/gcc.target/i386/naked-4.c (nonexistent)
+++ testsuite/gcc.target/i386/naked-4.c (working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O0" } */
+/* { dg-additional-options "-mregparm=3" { target ia32 } } */
+
+/* Verify that __attribute__((naked)) produces a naked function 
+   that does not allocate stack slots for args.  */
+extern void bar (int);
+
+int
+__attribute__((naked))
+foo (int a, int b, int c)
+{
+  bar (c);
+  asm volatile ("ret" :: "a" (b));
+}
+
+/* { dg-final { scan-assembler-not "%\[re\]bp" } } */

Re: [PATCH] Fix PR81354 (rewrite gimple_split_edge)

2017-07-31 Thread Bill Schmidt

That would certainly be much simpler!  I'll regstrap it and test it on the other
occurrence I've found to be certain.

-- Bill

> On Jul 31, 2017, at 4:15 AM, Richard Biener  
> wrote:
> 
> On Sun, Jul 30, 2017 at 8:04 PM, Bill Schmidt
>  wrote:
>> Hi,
>> 
>> PR81354 identifies a latent bug that can happen in SLSR since the
>> conditional candidate support was first added.  SLSR relies on the
>> address of a GIMPLE PHI remaining constant during the course of the
>> optimization pass, but it needs to split edges.  The use of
>> make_single_succ_edge and reinstall_phi_args in gimple_split_edge
>> causes GIMPLE PHI statements to be temporarily expanded to add a
>> predecessor, and then rebuilt to have the original number of
>> predecessors.  The expansion usually, if not always, causes the PHI
>> statement to change address.  Thus gimple_split_edge needs to be
>> rewritten to perform in-situ replacement of PHI arguments.
>> 
>> The required pieces of make_single_succ_edge have been extracted into
>> two places:  make_replacement_pred_edge, and some fixup code at the
>> end of gimple_split_edge.  The division is necessary because the
>> destination of the original edge must remember its original
>> predecessors for the switch processing in
>> gimple_redirect_edge_and_branch_1 to work properly.
>> 
>> The function gimple_redirect_edge_and_branch was factored into two
>> pieces so that most of it can be used by gimple_split_edge without
>> calling ssa_redirect_edge, which also interferes with PHIs.  The
>> useful bits of ssa_redirect_edge are factored out into the next three
>> lines of gimple_split_edge.
>> 
>> Similarly, redirect_eh_edge had already been similarly factored into
>> redirect_eh_edge_1 and ssa_redirect_edge.  I took advantage of that
>> and exposed redirect_eh_edge_1 for use in gimple_redirect_edge_and_branch_1.
>> 
>> I've added the test from PR81354 as a torture test, but as we've seen,
>> small changes elsewhere in the optimizer can easily hide the problem.
>> 
>> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
>> Is this ok for trunk?  Eventually this needs to be backported to GCC 5,
>> 6, and 7 if that's acceptable, since PR81354 was observed on
>> gcc-5-branch.  I haven't yet prepared the backports.
> 
> I don't like make_replacement_pred_edge too much.  Wouldn't it work
> to make sure we first shrink and then re-grow like if we simply do the
> redirect_edge_and_branch before the make_single_succ_edge call?
> At least quick testing shows it fixes the testcase on the GCC 6 branch for me.
> 
> Index: gcc/tree-cfg.c
> ===
> --- gcc/tree-cfg.c  (revision 250732)
> +++ gcc/tree-cfg.c  (working copy)
> @@ -2753,12 +2753,16 @@ gimple_split_edge (edge edge_in)
>   new_bb = create_empty_bb (after_bb);
>   new_bb->frequency = EDGE_FREQUENCY (edge_in);
>   new_bb->count = edge_in->count;
> +
> +  /* First redirect the existing edge to avoid reallocating
> + PHI nodes in dest.  */
> +  e = redirect_edge_and_branch (edge_in, new_bb);
> +  gcc_assert (e == edge_in);
> +
>   new_edge = make_edge (new_bb, dest, EDGE_FALLTHRU);
>   new_edge->probability = REG_BR_PROB_BASE;
>   new_edge->count = edge_in->count;
> 
> -  e = redirect_edge_and_branch (edge_in, new_bb);
> -  gcc_assert (e == edge_in);
>   reinstall_phi_args (new_edge, e);
> 
>   return new_bb;
> 
> Sorry for misleading you to a complex solution.
> 
> Thanks,
> Richard.
> 
>> Thanks,
>> Bill
>> 
>> 
>> [gcc]
>> 
>> 2017-07-30  Bill Schmidt  
>> 
>>PR tree-optimization/81354
>>* tree-cfg.c (gimple_redirect_edge_and_branch_1): New decl.
>>(reinstall_phi_args): Delete function.
>>(make_replacement_pred_edge): New function.
>>(gimple_split_edge): Rewrite.
>>(gimple_redirect_edge_and_branch_1): New function, factored
>>from...
>>(gimple_redirect_edge_and_branch): ...here.
>>(split_critical_edges): Don't re-split already split edges.
>>* tree-eh.c (redirect_eh_edge_1): Make visible.
>>* tree-eh.h (redirect_eh_edge_1): Likewise.
>> 
>> [gcc/testsuite]
>> 
>> 2017-07-30  Bill Schmidt  
>> 
>>PR tree-optimization/81354
>>* g++.dg/torture/pr81354.C: New file.
>> 
>> 
>> Index: gcc/testsuite/g++.dg/torture/pr81354.C
>> ===
>> --- gcc/testsuite/g++.dg/torture/pr81354.C  (nonexistent)
>> +++ gcc/testsuite/g++.dg/torture/pr81354.C  (working copy)
>> @@ -0,0 +1,24 @@
>> +// PR81354 reported this test as crashing in a limited range of revisions.
>> +// { dg-do compile }
>> +
>> +struct T { double a; double b; };
>> +
>> +void foo(T Ad[], int As[2])
>> +{
>> +  int j;
>> +  int i;
>> +  int Bs[2] = {0,0};
>> +  T Bd[16];
>> +
>> +  for (j = 0; j < 4; j++) {
>> +for (i = 0; i + 1 <= j + 1; i++) {
>> +  Ad[i + As[0] * j] = Bd[i + Bs[0] * j];
>> +}
>> +
>> +i = j + 1;  // <- commen

[PATCH] Fix typo in std::stack (PR libstdc++/81599)

2017-07-31 Thread Marek Polacek

The documentation of std::stack says that the underlying container must support
pop_front, but that is wrong, it meant to say pop_back, so this patch fixes
that.

Ok for trunk?

2017-07-31  Marek Polacek  

PR libstdc++/81599
* include/bits/stl_stack.h: Fix typo.

diff --git gcc/include/bits/stl_stack.h gcc/include/bits/stl_stack.h
index ac59ec715cf..5f2b4ab4486 100644
--- gcc/include/bits/stl_stack.h
+++ gcc/include/bits/stl_stack.h
@@ -86,7 +86,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*
*  The second template parameter defines the type of the underlying
*  sequence/container.  It defaults to std::deque, but it can be
-   *  any type that supports @c back, @c push_back, and @c pop_front,
+   *  any type that supports @c back, @c push_back, and @c pop_back,
*  such as std::list, std::vector, or an appropriate user-defined
*  type.
*

Marek

[WWW PATCH]: Mention that x86 now supports "naked" function attribute.

2017-07-31 Thread Uros Bizjak

One liner that mentions new addition to x86 port.

OK for wwwdocs?

Uros.
Index: htdocs/gcc-8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-8/changes.html,v
retrieving revision 1.7
diff -u -r1.7 changes.html
--- htdocs/gcc-8/changes.html   3 Jul 2017 16:37:04 -   1.7
+++ htdocs/gcc-8/changes.html   31 Jul 2017 13:26:13 -
@@ -119,7 +119,8 @@
 
 IA-32/x86-64
 
-  
+  
+The x86 port now supports the naked function attribute.

Re: [PATCH] Fix typo in std::stack (PR libstdc++/81599)

2017-07-31 Thread Ville Voutilainen

On 31 July 2017 at 16:25, Marek Polacek  wrote:
> The documentation of std::stack says that the underlying container must 
> support
> pop_front, but that is wrong, it meant to say pop_back, so this patch fixes
> that.

Indeed, the documentation has a copy-pasto originating from bits/stl_queue.h.

> Ok for trunk?

I can't approve the patch, but I suggest committing it as obvious.

Re: [PATCH] Fix typo in std::stack (PR libstdc++/81599)

2017-07-31 Thread Marek Polacek

On Mon, Jul 31, 2017 at 04:37:19PM +0300, Ville Voutilainen wrote:
> On 31 July 2017 at 16:25, Marek Polacek  wrote:
> > The documentation of std::stack says that the underlying container must 
> > support
> > pop_front, but that is wrong, it meant to say pop_back, so this patch fixes
> > that.
> 
> Indeed, the documentation has a copy-pasto originating from bits/stl_queue.h.
> 
> > Ok for trunk?
> 
> 
> I can't approve the patch, but I suggest committing it as obvious.

Yea, will do.  Thanks,

Marek

[PATCH] Compile pr79793-[12].c with -mtune=generic

2017-07-31 Thread H.J. Lu

pr79793-1.c and pr79793-2.c are failed when GCC is configured with
--with-cpu=slm since lea is used to adjust stack, instead of sub/add.
This patch uses -mtune=generic to always generate sub and add.

OK for trunk?

H.J.

* gcc.target/i386/pr79793-1.c: Compile with -mtune=generic.
* gcc.target/i386/pr79793-2.c: Likewise.
---
 gcc/testsuite/gcc.target/i386/pr79793-1.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr79793-2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr79793-1.c 
b/gcc/testsuite/gcc.target/i386/pr79793-1.c
index a382fe9c5e2..1cc67a83ba3 100644
--- a/gcc/testsuite/gcc.target/i386/pr79793-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr79793-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { *-*-linux* && { ! ia32 } } } } */
-/* { dg-options "-O2 -mgeneral-regs-only" } */
+/* { dg-options "-O2 -mgeneral-regs-only -mtune=generic" } */
 
 void
  __attribute__ ((interrupt))
diff --git a/gcc/testsuite/gcc.target/i386/pr79793-2.c 
b/gcc/testsuite/gcc.target/i386/pr79793-2.c
index f6ae5aed33a..e1e6463e120 100644
--- a/gcc/testsuite/gcc.target/i386/pr79793-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr79793-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { *-*-linux* && { ! ia32 } } } } */
-/* { dg-options "-O2 -mgeneral-regs-only" } */
+/* { dg-options "-O2 -mgeneral-regs-only -mtune=generic" } */
 
 typedef unsigned int uword_t __attribute__ ((mode (__word__)));
 
-- 
2.13.3

Re: [PATCH] Compile pr79793-[12].c with -mtune=generic

2017-07-31 Thread Uros Bizjak

On Mon, Jul 31, 2017 at 3:47 PM, H.J. Lu  wrote:
> pr79793-1.c and pr79793-2.c are failed when GCC is configured with
> --with-cpu=slm since lea is used to adjust stack, instead of sub/add.
> This patch uses -mtune=generic to always generate sub and add.
>
> OK for trunk?

OK.

Thanks,
Uros.

> H.J.
> 
> * gcc.target/i386/pr79793-1.c: Compile with -mtune=generic.
> * gcc.target/i386/pr79793-2.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/i386/pr79793-1.c | 2 +-
>  gcc/testsuite/gcc.target/i386/pr79793-2.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr79793-1.c 
> b/gcc/testsuite/gcc.target/i386/pr79793-1.c
> index a382fe9c5e2..1cc67a83ba3 100644
> --- a/gcc/testsuite/gcc.target/i386/pr79793-1.c
> +++ b/gcc/testsuite/gcc.target/i386/pr79793-1.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile { target { *-*-linux* && { ! ia32 } } } } */
> -/* { dg-options "-O2 -mgeneral-regs-only" } */
> +/* { dg-options "-O2 -mgeneral-regs-only -mtune=generic" } */
>
>  void
>   __attribute__ ((interrupt))
> diff --git a/gcc/testsuite/gcc.target/i386/pr79793-2.c 
> b/gcc/testsuite/gcc.target/i386/pr79793-2.c
> index f6ae5aed33a..e1e6463e120 100644
> --- a/gcc/testsuite/gcc.target/i386/pr79793-2.c
> +++ b/gcc/testsuite/gcc.target/i386/pr79793-2.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile { target { *-*-linux* && { ! ia32 } } } } */
> -/* { dg-options "-O2 -mgeneral-regs-only" } */
> +/* { dg-options "-O2 -mgeneral-regs-only -mtune=generic" } */
>
>  typedef unsigned int uword_t __attribute__ ((mode (__word__)));
>
> --
> 2.13.3
>

Re: [PATCH 2/6] [i386] Remove ix86_frame::outlined_save_offset

2017-07-31 Thread Uros Bizjak

On Mon, Jul 31, 2017 at 1:24 PM, Daniel Santos  wrote:
> This value was used in an earlier incarnation of the
> -mcall-ms2sysv-xlogues patch set but is now set and never read.  The
> value of ix86_frame::sse_reg_save_offset serves the same purpose.

OK as obvious patch.

Thanks,
Uros.

> Signed-off-by: Daniel Santos 
> ---
>  gcc/config/i386/i386.c | 1 -
>  gcc/config/i386/i386.h | 4 +---
>  2 files changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 690631dfe43..47c5608c3cd 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12966,7 +12966,6 @@ ix86_compute_frame_layout (void)
>
>offset += xlogue.get_stack_space_used ();
>gcc_assert (!(offset & 0xf));
> -  frame->outlined_save_offset = offset;
>  }
>
>/* Align and set SSE register save area.  */
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index ce5bb7f6677..1648bdf1556 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -2477,8 +2477,7 @@ enum avx_u128_state
> <- end of stub-saved/restored regs
>   [padding1]
> ]
> -   <- outlined_save_offset
> -   <- sse_regs_save_offset
> +   <- sse_reg_save_offset
> [padding2]
>|<- FRAME_POINTER
> [va_arg registers]  |
> @@ -2504,7 +2503,6 @@ struct GTY(()) ix86_frame
>HOST_WIDE_INT reg_save_offset;
>HOST_WIDE_INT stack_realign_allocate_offset;
>HOST_WIDE_INT stack_realign_offset;
> -  HOST_WIDE_INT outlined_save_offset;
>HOST_WIDE_INT sse_reg_save_offset;
>
>/* When save_regs_using_mov is set, emit prologue using
> --
> 2.13.3
>

Re: [PATCH 3/6] [i386] Remove machine_function::call_ms2sysv_pad_out

2017-07-31 Thread Uros Bizjak

On Mon, Jul 31, 2017 at 1:24 PM, Daniel Santos  wrote:
> The -mcall-ms2sysv-xlogues project added the boolean fields
> call_ms2sysv_pad_in and call_ms2sysv_pad_out to struct machine_function
> to track rather or not an additional 8 bytes of padding was needed for
> stack alignment prior to and after the stub save area.  This design was
> based upon the faulty assumption the function body would not require a
> stack alignment greater than 16 bytes.  This continues to work well for
> managing padding prior to the stub save area, but will not work for the
> outgoing alignment.
>
> Rather than changing machine_function::call_ms2sysv_pad_out to a larger
> type, this patch removes it, thus transferring responsibility for stack
> alignment following the stub save area from class xlogue_layout to the
> body of ix86_compute_frame_layout.  Since the 64-bit va_arg register
> save area is always a multiple of 16-bytes in size (176 for System V ABI
> and 96 for Microsoft ABI), the ROUND_UP calculation for the stack offset
> at the start of the function body (frame.frame_pointer_offset) will
> assure there is enough room for any padding needed to keep the save area
> for SSE va_args 16-byte aligned, so no modification is needed for that
> calculation.
>
> Signed-off-by: Daniel Santos 

LGTM.

OK for mainline.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.c | 18 --
>  gcc/config/i386/i386.h |  8 ++--
>  2 files changed, 6 insertions(+), 20 deletions(-)
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 47c5608c3cd..e2e9546a27c 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -2491,9 +2491,7 @@ public:
>  unsigned last_reg = m->call_ms2sysv_extra_regs + MIN_REGS - 1;
>
>  gcc_assert (m->call_ms2sysv_extra_regs <= MAX_EXTRA_REGS);
> -return m_regs[last_reg].offset
> -  + (m->call_ms2sysv_pad_out ? 8 : 0)
> -  + STUB_INDEX_OFFSET;
> +return m_regs[last_reg].offset + STUB_INDEX_OFFSET;
>}
>
>/* Returns the offset for the base pointer used by the stub.  */
> @@ -12849,13 +12847,12 @@ ix86_compute_frame_layout (void)
> {
>   unsigned count = xlogue_layout::count_stub_managed_regs ();
>   m->call_ms2sysv_extra_regs = count - xlogue_layout::MIN_REGS;
> + m->call_ms2sysv_pad_in = 0;
> }
>  }
>
>frame->nregs = ix86_nsaved_regs ();
>frame->nsseregs = ix86_nsaved_sseregs ();
> -  m->call_ms2sysv_pad_in = 0;
> -  m->call_ms2sysv_pad_out = 0;
>
>/* 64-bit MS ABI seem to require stack alignment to be always 16,
>   except for function prologues, leaf functions and when the defult
> @@ -12957,15 +12954,7 @@ ix86_compute_frame_layout (void)
>gcc_assert (!frame->nsseregs);
>
>m->call_ms2sysv_pad_in = !!(offset & UNITS_PER_WORD);
> -
> -  /* Select an appropriate layout for incoming stack offset.  */
> -  const struct xlogue_layout &xlogue = xlogue_layout::get_instance ();
> -
> -  if ((offset + xlogue.get_stack_space_used ()) & UNITS_PER_WORD)
> -   m->call_ms2sysv_pad_out = 1;
> -
> -  offset += xlogue.get_stack_space_used ();
> -  gcc_assert (!(offset & 0xf));
> +  offset += xlogue_layout::get_instance ().get_stack_space_used ();
>  }
>
>/* Align and set SSE register save area.  */
> @@ -12993,6 +12982,7 @@ ix86_compute_frame_layout (void)
>
>/* Align start of frame for local function.  */
>if (stack_realign_fp
> +  || m->call_ms2sysv
>|| offset != frame->sse_reg_save_offset
>|| size != 0
>|| !crtl->is_leaf
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 1648bdf1556..b08e45f68d4 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -2646,17 +2646,13 @@ struct GTY(()) machine_function {
>BOOL_BITFIELD arg_reg_available : 1;
>
>/* If true, we're out-of-lining reg save/restore for regs clobbered
> - by ms_abi functions calling a sysv function.  */
> + by 64-bit ms_abi functions calling a sysv_abi function.  */
>BOOL_BITFIELD call_ms2sysv : 1;
>
>/* If true, the incoming 16-byte aligned stack has an offset (of 8) and
> - needs padding.  */
> + needs padding prior to out-of-line stub save/restore area.  */
>BOOL_BITFIELD call_ms2sysv_pad_in : 1;
>
> -  /* If true, the size of the stub save area plus inline int reg saves will
> - result in an 8 byte offset, so needs padding.  */
> -  BOOL_BITFIELD call_ms2sysv_pad_out : 1;
> -
>/* This is the number of extra registers saved by stub (valid range is
>   0-6). Each additional register is only saved/restored by the stubs
>   if all successive ones are. (Will always be zero when using a hard
> --
> 2.13.3
>

Re: [PATCH] Fix PR81354 (rewrite gimple_split_edge)

2017-07-31 Thread Bill Schmidt


> On Jul 31, 2017, at 8:19 AM, Bill Schmidt  wrote:
> 
> That would certainly be much simpler!  I'll regstrap it and test it on the 
> other
> occurrence I've found to be certain.

Unfortunately, this fails bootstrap:  

/home/wschmidt/gcc/gcc-mainline-test/gcc/calls.c: In function 'rtx_def* 
emit_library_call_value_1(int, rtx, rtx, libcall_type, machine_mode, int, 
va_list)':
/home/wschmidt/gcc/gcc-mainline-test/gcc/calls.c:4362:1: error: definition in 
block 214 does not dominate use in block 14
 emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
 ^
for SSA_NAME: slsr_772 in statement:
slsr_749 = PHI <_17(26), slsr_772(14), slsr_334(214)>
PHI argument
slsr_772
for PHI node
slsr_749 = PHI <_17(26), slsr_772(14), slsr_334(214)>
during GIMPLE pass: slsr
/home/wschmidt/gcc/gcc-mainline-test/gcc/calls.c:4362:1: internal compiler 
error: verify_ssa failed
0x11567cf3 verify_ssa(bool, bool)
/home/wschmidt/gcc/gcc-mainline-test/gcc/tree-ssa.c:1186
0x10fc3fff execute_function_todo
/home/wschmidt/gcc/gcc-mainline-test/gcc/passes.c:1997
0x10fc277f do_per_function
/home/wschmidt/gcc/gcc-mainline-test/gcc/passes.c:1655
0x10fc42a3 execute_todo
/home/wschmidt/gcc/gcc-mainline-test/gcc/passes.c:2044
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

Not terribly surprising given how sensitive this stuff is.  I can look into why
this fails, but looks like it can't be quite this simple, sadly.

Bill

> 
> -- Bill
> 
>> On Jul 31, 2017, at 4:15 AM, Richard Biener  
>> wrote:
>> 
>> On Sun, Jul 30, 2017 at 8:04 PM, Bill Schmidt
>>  wrote:
>>> Hi,
>>> 
>>> PR81354 identifies a latent bug that can happen in SLSR since the
>>> conditional candidate support was first added.  SLSR relies on the
>>> address of a GIMPLE PHI remaining constant during the course of the
>>> optimization pass, but it needs to split edges.  The use of
>>> make_single_succ_edge and reinstall_phi_args in gimple_split_edge
>>> causes GIMPLE PHI statements to be temporarily expanded to add a
>>> predecessor, and then rebuilt to have the original number of
>>> predecessors.  The expansion usually, if not always, causes the PHI
>>> statement to change address.  Thus gimple_split_edge needs to be
>>> rewritten to perform in-situ replacement of PHI arguments.
>>> 
>>> The required pieces of make_single_succ_edge have been extracted into
>>> two places:  make_replacement_pred_edge, and some fixup code at the
>>> end of gimple_split_edge.  The division is necessary because the
>>> destination of the original edge must remember its original
>>> predecessors for the switch processing in
>>> gimple_redirect_edge_and_branch_1 to work properly.
>>> 
>>> The function gimple_redirect_edge_and_branch was factored into two
>>> pieces so that most of it can be used by gimple_split_edge without
>>> calling ssa_redirect_edge, which also interferes with PHIs.  The
>>> useful bits of ssa_redirect_edge are factored out into the next three
>>> lines of gimple_split_edge.
>>> 
>>> Similarly, redirect_eh_edge had already been similarly factored into
>>> redirect_eh_edge_1 and ssa_redirect_edge.  I took advantage of that
>>> and exposed redirect_eh_edge_1 for use in gimple_redirect_edge_and_branch_1.
>>> 
>>> I've added the test from PR81354 as a torture test, but as we've seen,
>>> small changes elsewhere in the optimizer can easily hide the problem.
>>> 
>>> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
>>> Is this ok for trunk?  Eventually this needs to be backported to GCC 5,
>>> 6, and 7 if that's acceptable, since PR81354 was observed on
>>> gcc-5-branch.  I haven't yet prepared the backports.
>> 
>> I don't like make_replacement_pred_edge too much.  Wouldn't it work
>> to make sure we first shrink and then re-grow like if we simply do the
>> redirect_edge_and_branch before the make_single_succ_edge call?
>> At least quick testing shows it fixes the testcase on the GCC 6 branch for 
>> me.
>> 
>> Index: gcc/tree-cfg.c
>> ===
>> --- gcc/tree-cfg.c  (revision 250732)
>> +++ gcc/tree-cfg.c  (working copy)
>> @@ -2753,12 +2753,16 @@ gimple_split_edge (edge edge_in)
>>  new_bb = create_empty_bb (after_bb);
>>  new_bb->frequency = EDGE_FREQUENCY (edge_in);
>>  new_bb->count = edge_in->count;
>> +
>> +  /* First redirect the existing edge to avoid reallocating
>> + PHI nodes in dest.  */
>> +  e = redirect_edge_and_branch (edge_in, new_bb);
>> +  gcc_assert (e == edge_in);
>> +
>>  new_edge = make_edge (new_bb, dest, EDGE_FALLTHRU);
>>  new_edge->probability = REG_BR_PROB_BASE;
>>  new_edge->count = edge_in->count;
>> 
>> -  e = redirect_edge_and_branch (edge_in, new_bb);
>> -  gcc_assert (e == edge_in);
>>  reinstall_phi_args (new_edge, e);
>> 
>>  return new_bb;
>> 
>> Sorry for mislea

[PATCH PR81620]Don't set has_max_use_after flag for store-store chain

2017-07-31 Thread Bin Cheng

Hi,
This simple patch fixes the ICE by not setting has_max_use_after flag for
store-store chain because there is no use at all.
Bootstrap and test on x86_64 and AArch64 ongoing.  Is it OK if no failure?

Thanks,
bin
2017-07-31  Bin Cheng  

PR tree-optimization/81620
* tree-predcom.c (add_ref_to_chain): Don't set has_max_use_after
for store-store chain.

gcc/testsuite/ChangeLog
2017-07-31  Bin Cheng  

PR tree-optimization/81620
* gcc.dg/tree-ssa/pr81620-1.c: New.
* gcc.dg/tree-ssa/pr81620-2.c: New.From 4e8f67bb1cc09ef475f9cfbb8e847f9f422c3e44 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Mon, 31 Jul 2017 10:24:07 +0100
Subject: [PATCH 1/2] pr81620-20170731.txt

---
 gcc/testsuite/gcc.dg/tree-ssa/pr81620-1.c | 20 
 gcc/testsuite/gcc.dg/tree-ssa/pr81620-2.c | 25 +
 gcc/tree-predcom.c|  4 +++-
 3 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr81620-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr81620-2.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81620-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr81620-1.c
new file mode 100644
index 000..f8f2dd8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81620-1.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-tree-loop-vectorize -fdump-tree-pcom-details" } */
+
+int a[7];
+char b;
+void abort (void);
+
+int main() {
+  b = 4;
+  for (; b; b--) {
+a[b] = b;
+a[b + 2] = 1;
+  }
+  if (a[0] != 0 || a[1] != 1 || a[2] != 2
+  || a[3] != 1 || a[4] != 1 || a[5] != 1 || a[6] != 1)
+abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Store-stores chain" 1 "pcom" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81620-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr81620-2.c
new file mode 100644
index 000..85a8e35
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81620-2.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-tree-loop-vectorize -fdump-tree-pcom-details" } */
+
+int a[200];
+char b;
+void abort (void);
+
+int main() {
+  int i;
+  b = 100;
+  for (; b; b--) {
+a[b] = 2;
+a[b + 2] = 1;
+  }
+
+  if (a[0] != 0 || a[1] != 2 || a[2] != 2)
+abort ();
+  for (i = 3; i < 103; i++)
+if (a[i] != 1)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Store-stores chain" 1 "pcom" } } */
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index a4011bf..f7a57a4 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -1069,7 +1069,9 @@ add_ref_to_chain (chain_p chain, dref ref)
   chain->has_max_use_after = false;
 }
 
-  if (ref->distance == chain->length
+  /* Don't set the flag for store-store chain since there is no use.  */
+  if (chain->type != CT_STORE_STORE
+  && ref->distance == chain->length
   && ref->pos > root->pos)
 chain->has_max_use_after = true;
 
-- 
1.9.1

[PATCH PR81267]Rewrite into loop closed ssa form in case of any store-store chain

2017-07-31 Thread Bin Cheng

Hi,
This simple patch fixes the ICE by rewriting into loop closed ssa form in case
of any store-store chain.  We maybe able to avoid that for some cases that
eliminated stores only store loop invariant values, but only with more checks
when inserting final store instructions.
Bootstrap and test on x86_64 and AArch64 ongoing.  Is it OK?

Thanks,
bin
2017-07-31  Bin Cheng  

PR tree-optimization/81627
* tree-predcom.c (prepare_finalizers): Always rewrite into loop
closed ssa form for store-store chain.

gcc/testsuite/ChangeLog
2017-07-31  Bin Cheng  

PR tree-optimization/81627
* gcc.dg/tree-ssa/pr81627.c: New.From d366015187de926a8fe3248325b229bed99b27b5 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Mon, 31 Jul 2017 11:16:44 +0100
Subject: [PATCH 2/2] pr81627-20170731.txt

---
 gcc/testsuite/gcc.dg/tree-ssa/pr81627.c | 28 
 gcc/tree-predcom.c  | 10 +-
 2 files changed, 33 insertions(+), 5 deletions(-)
 create mode 100755 gcc/testsuite/gcc.dg/tree-ssa/pr81627.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81627.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr81627.c
new file mode 100755
index 000..7421c49
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81627.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-tree-loop-vectorize -fdump-tree-pcom-details" } */
+
+int a, b, c, d[6], e = 3, f;
+
+void abort (void);
+void fn1 ()
+{
+  for (b = 1; b < 5; b++)
+{
+  for (c = 0; c < 5; c++)
+d[b] = e;
+  if (a)
+f++;
+  d[b + 1] = 1;
+}
+}
+
+int main ()
+{
+  fn1 ();
+  if (d[0] != 0 || d[1] != 3 || d[2] != 3
+  || d[3] != 3 || d[4] != 3 || d[5] != 1)
+abort ();
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-times "Store-stores chain" 1 "pcom" } } */
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index f7a57a4..4538773 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -2983,11 +2983,11 @@ prepare_finalizers (struct loop *loop, vec 
chains)
   if (prepare_finalizers_chain (loop, chain))
{
  i++;
- /* We don't corrupt loop closed ssa form for store elimination
-chain if eliminated stores only store loop invariant values
-into memory.  */
- if (!chain->inv_store_elimination)
-   loop_closed_ssa |= (!chain->inv_store_elimination);
+ /* Be conservative, assume loop closed ssa form is corrupted
+by store-store chain.  Though it's not always the case if
+eliminated stores only store loop invariant values into
+memory.  */
+ loop_closed_ssa = true;
}
   else
{
-- 
1.9.1

C PATCH to further improve diagnostic for -Wsign-compare (PR c/81417)

2017-07-31 Thread Marek Polacek

This patch improves the diagnostic of -Wsign-compare for ?: by also printing
the types, similarly to my recent patch.  But we can do even better here if we
actually point to the operand in question, so I passed the locations of the
operands from the parser.  So instead of 

x.c:8:16: warning: signed and unsigned type in conditional expression 
[-Wsign-compare]
   return x ? y : -1;
^
you'll now see:

x.c:8:18: warning: operand of conditional expression changes signedness: 'int' 
to 'unsigned int' [-Wsign-compare]
   return x ? y : -1;
  ^

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-07-31  Marek Polacek  

PR c/81417
* c-array-notation.c (fix_builtin_array_notation_fn): Update calls to
build_conditional_expr. 
* c-parser.c (c_parser_conditional_expression): Pass the locations of
OP1 and OP2 down to build_conditional_expr.
* c-tree.h (build_conditional_expr): Update declaration.
* c-typeck.c (build_conditional_expr): Add location_t parameters.
For -Wsign-compare, also print the types.

* objc-next-runtime-abi-02.c (build_v2_build_objc_method_call): Update
a call to build_conditional_expr.

* Wsign-compare-1.c: New test.
* gcc.dg/compare1.c: Adjust dg-bogus.
* gcc.dg/compare2.c: Likewise.
* gcc.dg/compare3.c: Likewise.
* gcc.dg/compare7.c: Likewise.
* gcc.dg/compare8.c: Likewise.
* gcc.dg/compare9.c: Likewise.
* gcc.dg/pr11492.c: Likewise.

diff --git gcc/c/c-array-notation.c gcc/c/c-array-notation.c
index e430f5c681b..40f1cfdabb8 100644
--- gcc/c/c-array-notation.c
+++ gcc/c/c-array-notation.c
@@ -355,8 +355,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (NE_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (new_no_expr), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_ALL_NONZERO:
   new_var_init = build_modify_expr
@@ -375,8 +376,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (EQ_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (new_no_expr), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_ZERO:
   new_var_init = build_modify_expr
@@ -394,8 +396,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (EQ_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));   
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (new_no_expr), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_NONZERO:
   new_var_init = build_modify_expr
@@ -413,8 +416,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (NE_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));   
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (new_no_expr), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_MAX:
   if (TYPE_MIN_VALUE (new_var_type))
@@ -434,7 +438,8 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_expr = build_conditional_expr
(location,
 build2 (LT_EXPR, TREE_TYPE (*new_var), *new_var, func_parm), false,
-new_yes_expr, TREE_TYPE (*new_var), new_no_expr, TREE_TYPE (*new_var));
+new_yes_expr, TREE_TYPE (*new_var), location,
+new_no_expr, TREE_TYPE (*new_var), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_MIN:
   if (TYPE_MAX_VALUE (new_var_type))
@@ -454,7 +459,8 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_expr = build_co

Re: [PATCH] libstdc++: Support std::is_aggregate on clang++ (was [cfe-dev] clang++: std::is_aggregate unusable with clang-5.0/libstdc++-7)

2017-07-31 Thread Jonathan Wakely


On 27/07/17 16:27 +0900, Katsuhiko Nishimra wrote:

From 56c4a18d0d8c8ce7aa1239880138775e4db06645 Mon Sep 17 00:00:00 2001
From: Katsuhiko Nishimra 
Date: Thu, 27 Jul 2017 16:03:54 +0900
Subject: [PATCH] libstdc++: Support std::is_aggregate on clang++

Currently, libstdc++ tries to detect __is_aggregate built-in macro using
__has_builtin, but this fails on clang++ because __has_builtin on
clang++ detects only built-in functions, not built-in macros. This patch
adds a test using __is_identifier. Tested on clang++
5.0.0-svn308422-1~exp1 and g++ 7.1.0-10 from Debian unstable.
---
libstdc++-v3/include/std/type_traits | 5 +
1 file changed, 5 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 390b6f40a..e7ec402fb 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2894,6 +2894,11 @@ template 

#if __GNUC__ >= 7
# define _GLIBCXX_HAVE_BUILTIN_IS_AGGREGATE 1
+#elif defined(__is_identifier)
+// For clang
+# if ! __is_identifier(__is_aggregate)
+#  define _GLIBCXX_HAVE_BUILTIN_IS_AGGREGATE 1
+# endif
#elif defined __has_builtin
// For non-GNU compilers:
# if __has_builtin(__is_aggregate)


This __has_bultin check only exists for Clang, so should be replaced
by the correct __is_identifier check, not left there in addition to
it.

PING: [PATCH] PR driver/81523: Make -static override -pie

2017-07-31 Thread H.J. Lu

On Mon, Jul 24, 2017 at 10:24 AM, H.J. Lu  wrote:
> On Sun, Jul 23, 2017 at 8:14 AM, H.J. Lu  wrote:
>> -static and -pie together behave differently depending on whether GCC is
>> configured with --enable-default-pie.  On x86, "-static -pie" fails to
>> create executable when --enable-default-pie isn't used, but creates a
>> static executable when --enable-default-pie is used.  This patch makes
>> -static completely override -pie to create a static executable, regardless
>> if --enable-default-pie is used to configure GCC.
>>
>> OK for master?
>>
>> H.J.
>> --
>> 2017-07-23  Alan Modra  
>> H.J. Lu  
>>
>> gcc/
>>
>> PR driver/81523
>> * gcc.c (NO_PIE_SPEC): Delete.
>> (PIE_SPEC): Define as !no-pie/pie.  Move static|shared|r
>> exclusion..
>> (LINK_PIE_SPEC): ..to here.
>> * config/gnu-user.h (GNU_USER_TARGET_STARTFILE_SPEC): Correct
>> chain of crtbegin*.o selection, update for PIE_SPEC changes and
>> format.
>> (GNU_USER_TARGET_ENDFILE_SPEC): Similarly.
>> * config/sol2.h (STARTFILE_CRTBEGIN_SPEC): Similarly.
>> (ENDFILE_CRTEND_SPEC): Similarly.
>>
>
> We need to add %{no-pie:} to LINK_COMMAND_SPEC to prevent an error
> message when PIE isn't enabled by default.   Here is the updated patch with
> a testcase.
>

PING.  I am enclosing the patch here.

Thanks.


-- 
H.J.
From ea702c99286ab92a4b94f676d2340ce55fd173c3 Mon Sep 17 00:00:00 2001
From: Alan Modra 
Date: Thu, 20 Jul 2017 09:57:36 -0700
Subject: [PATCH] PR driver/81523: Make -static override -pie

-static and -pie together behave differently depending on whether GCC is
configured with --enable-default-pie.  On x86, "-static -pie" fails to
create executable when --enable-default-pie isn't used, but creates a
static executable when --enable-default-pie is used.  This patch makes
-static completely override -pie to create a static executable, regardless
if --enable-default-pie is used to configure GCC.

2017-07-24  Alan Modra  
	H.J. Lu  

gcc/

	PR driver/81523
	* gcc.c (NO_PIE_SPEC): Delete.
	(PIE_SPEC): Define as !no-pie/pie.  Move static|shared|r
	exclusion..
	(LINK_PIE_SPEC): ..to here.
	(LINK_COMMAND_SPEC): Support -no-pie.
	* config/gnu-user.h (GNU_USER_TARGET_STARTFILE_SPEC): Correct
	chain of crtbegin*.o selection, update for PIE_SPEC changes and
	format.
	(GNU_USER_TARGET_ENDFILE_SPEC): Similarly.
	* config/sol2.h (STARTFILE_CRTBEGIN_SPEC): Similarly.
	(ENDFILE_CRTEND_SPEC): Similarly.

gcc/testsuite/

	PR driver/81523
	* gcc.dg/pie-7.c: New test.
	* gcc.dg/pie-static-1.c: Likewise.
	* gcc.dg/pie-static-2.c: Likewise.
---
 gcc/config/gnu-user.h   | 34 --
 gcc/config/sol2.h   | 12 ++--
 gcc/gcc.c   | 10 +-
 gcc/testsuite/gcc.dg/pie-7.c|  7 +++
 gcc/testsuite/gcc.dg/pie-static-1.c |  7 +++
 gcc/testsuite/gcc.dg/pie-static-2.c |  7 +++
 6 files changed, 56 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pie-7.c
 create mode 100644 gcc/testsuite/gcc.dg/pie-static-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pie-static-2.c

diff --git a/gcc/config/gnu-user.h b/gcc/config/gnu-user.h
index 2787a3d16be..de605b0c466 100644
--- a/gcc/config/gnu-user.h
+++ b/gcc/config/gnu-user.h
@@ -50,19 +50,28 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 
 #if defined HAVE_LD_PIE
 #define GNU_USER_TARGET_STARTFILE_SPEC \
-  "%{!shared: %{pg|p|profile:gcrt1.o%s;: \
-%{" PIE_SPEC ":Scrt1.o%s} %{" NO_PIE_SPEC ":crt1.o%s}}} \
-   crti.o%s %{static:crtbeginT.o%s;: %{shared:crtbeginS.o%s} \
-	  %{" PIE_SPEC ":crtbeginS.o%s} \
-	  %{" NO_PIE_SPEC ":crtbegin.o%s}} \
+  "%{shared:; \
+ pg|p|profile:gcrt1.o%s; \
+ static:crt1.o%s; \
+ " PIE_SPEC ":Scrt1.o%s; \
+ :crt1.o%s} \
+   crti.o%s \
+   %{static:crtbeginT.o%s; \
+ shared|" PIE_SPEC ":crtbeginS.o%s; \
+ :crtbegin.o%s} \
%{fvtable-verify=none:%s; \
  fvtable-verify=preinit:vtv_start_preinit.o%s; \
  fvtable-verify=std:vtv_start.o%s} \
" CRTOFFLOADBEGIN
 #else
 #define GNU_USER_TARGET_STARTFILE_SPEC \
-  "%{!shared: %{pg|p|profile:gcrt1.o%s;:crt1.o%s}} \
-   crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s} \
+  "%{shared:; \
+ pg|p|profile:gcrt1.o%s; \
+ :crt1.o%s} \
+   crti.o%s \
+   %{static:crtbeginT.o%s; \
+ shared|pie:crtbeginS.o%s; \
+ :crtbegin.o%s} \
%{fvtable-verify=none:%s; \
  fvtable-verify=preinit:vtv_start_preinit.o%s; \
  fvtable-verify=std:vtv_start.o%s} \
@@ -82,15 +91,20 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
   "%{fvtable-verify=none:%s; \
  fvtable-verify=preinit:vtv_end_preinit.o%s; \
  fvtable-verify=std:vtv_end.o%s} \
-   %{shared:crtendS.o%s;: %{" PIE_SPEC ":crtendS.o%s} \
-   %{" NO_PIE_SPEC ":crtend.o%s}} crtn.o%s \
+   %{static:crtend.o%s; \
+ shared|" PIE_SPEC ":crtendS.o%s;

Re: [PATCH] libstdc++: Support std::is_aggregate on clang++ (was [cfe-dev] clang++: std::is_aggregate unusable with clang-5.0/libstdc++-7)

2017-07-31 Thread Tim Song

On Mon, Jul 31, 2017 at 10:53 AM, Jonathan Wakely  wrote:
> On 27/07/17 16:27 +0900, Katsuhiko Nishimra wrote:
>>
>> From 56c4a18d0d8c8ce7aa1239880138775e4db06645 Mon Sep 17 00:00:00 2001
>> From: Katsuhiko Nishimra 
>> Date: Thu, 27 Jul 2017 16:03:54 +0900
>> Subject: [PATCH] libstdc++: Support std::is_aggregate on clang++
>>
>> Currently, libstdc++ tries to detect __is_aggregate built-in macro using
>> __has_builtin, but this fails on clang++ because __has_builtin on
>> clang++ detects only built-in functions, not built-in macros. This patch
>> adds a test using __is_identifier. Tested on clang++
>> 5.0.0-svn308422-1~exp1 and g++ 7.1.0-10 from Debian unstable.
>> ---
>> libstdc++-v3/include/std/type_traits | 5 +
>> 1 file changed, 5 insertions(+)
>>
>> diff --git a/libstdc++-v3/include/std/type_traits
>> b/libstdc++-v3/include/std/type_traits
>> index 390b6f40a..e7ec402fb 100644
>> --- a/libstdc++-v3/include/std/type_traits
>> +++ b/libstdc++-v3/include/std/type_traits
>> @@ -2894,6 +2894,11 @@ template 
>>
>> #if __GNUC__ >= 7
>> # define _GLIBCXX_HAVE_BUILTIN_IS_AGGREGATE 1
>> +#elif defined(__is_identifier)
>> +// For clang
>> +# if ! __is_identifier(__is_aggregate)
>> +#  define _GLIBCXX_HAVE_BUILTIN_IS_AGGREGATE 1
>> +# endif
>> #elif defined __has_builtin
>> // For non-GNU compilers:
>> # if __has_builtin(__is_aggregate)
>
>
> This __has_bultin check only exists for Clang, so should be replaced
> by the correct __is_identifier check, not left there in addition to
> it.
>
>

https://clang.llvm.org/docs/LanguageExtensions.html#checks-for-type-trait-primitives
seems to suggest using __has_extension instead.

Re: C PATCH to further improve diagnostic for -Wsign-compare (PR c/81417)

2017-07-31 Thread David Malcolm

On Mon, 2017-07-31 at 16:14 +0200, Marek Polacek wrote:
> This patch improves the diagnostic of -Wsign-compare for ?: by also
> printing
> the types, similarly to my recent patch.  But we can do even better
> here if we
> actually point to the operand in question, so I passed the locations
> of the
> operands from the parser.

Thanks for updating the patch.

>   So instead of 
> 
> x.c:8:16: warning: signed and unsigned type in conditional expression
> [-Wsign-compare]
>return x ? y : -1;
> ^
> you'll now see:
> 
> x.c:8:18: warning: operand of conditional expression changes
> signedness: 'int' to 'unsigned int' [-Wsign-compare]
>return x ? y : -1;
>   ^

That's an improvement, but I would have expected it to underline the
whole of the pertinent subexpression e.g.:

   return x ? y : -1;
  ^~

rather than just:

   return x ? y : -1;
  ^

>From my reading of the patch, it's capturing just the location of the
first token within the subexpression (hence e.g. just the minus token
in the example above, which happens to make some sense for this case,
but wouldn't in general).

Hopefully you can get at the location_t for the whole of the
subexpression using c_expr, rather than peeking at the first token.

> Bootstrapped/regtested on x86_64-linux, ok for trunk?

The patch doesn't have a testcase for the location information; please
add one, using -fdiagnostics-show-caret and dg-begin-multiline-output/
dg-end-multiline-output.  Please ensure that the pertinent expressions
are more than one character wide, so that the test properly verifies
the underlining.

Thanks
Dave


> 2017-07-31  Marek Polacek  
> 
>   PR c/81417
>   * c-array-notation.c (fix_builtin_array_notation_fn): Update
> calls to
>   build_conditional_expr. 
>   * c-parser.c (c_parser_conditional_expression): Pass the
> locations of
>   OP1 and OP2 down to build_conditional_expr.
>   * c-tree.h (build_conditional_expr): Update declaration.
>   * c-typeck.c (build_conditional_expr): Add location_t
> parameters.
>   For -Wsign-compare, also print the types.
> 
>   * objc-next-runtime-abi-02.c (build_v2_build_objc_method_call):
> Update
>   a call to build_conditional_expr.
> 
>   * Wsign-compare-1.c: New test.
>   * gcc.dg/compare1.c: Adjust dg-bogus.
>   * gcc.dg/compare2.c: Likewise.
>   * gcc.dg/compare3.c: Likewise.
>   * gcc.dg/compare7.c: Likewise.
>   * gcc.dg/compare8.c: Likewise.
>   * gcc.dg/compare9.c: Likewise.
>   * gcc.dg/pr11492.c: Likewise.
> 
> diff --git gcc/c/c-array-notation.c gcc/c/c-array-notation.c
> index e430f5c681b..40f1cfdabb8 100644
> --- gcc/c/c-array-notation.c
> +++ gcc/c/c-array-notation.c
> @@ -355,8 +355,9 @@ fix_builtin_array_notation_fn (tree
> an_builtin_fn, tree *new_var)
>new_cond_expr = build2 (NE_EXPR, TREE_TYPE (func_parm),
> func_parm,
> build_zero_cst (TREE_TYPE
> (func_parm)));
>new_expr = build_conditional_expr
> - (location, new_cond_expr, false, new_yes_expr,
> -  TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE
> (new_no_expr));
> + (location, new_cond_expr, false,
> +  new_yes_expr, TREE_TYPE (new_yes_expr), location,
> +  new_no_expr, TREE_TYPE (new_no_expr), location);
>break;
>  case BUILT_IN_CILKPLUS_SEC_REDUCE_ALL_NONZERO:
>new_var_init = build_modify_expr
> @@ -375,8 +376,9 @@ fix_builtin_array_notation_fn (tree
> an_builtin_fn, tree *new_var)
>new_cond_expr = build2 (EQ_EXPR, TREE_TYPE (func_parm),
> func_parm,
> build_zero_cst (TREE_TYPE
> (func_parm)));
>new_expr = build_conditional_expr
> - (location, new_cond_expr, false, new_yes_expr,
> -  TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE
> (new_no_expr));
> + (location, new_cond_expr, false,
> +  new_yes_expr, TREE_TYPE (new_yes_expr), location,
> +  new_no_expr, TREE_TYPE (new_no_expr), location);
>break;
>  case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_ZERO:
>new_var_init = build_modify_expr
> @@ -394,8 +396,9 @@ fix_builtin_array_notation_fn (tree
> an_builtin_fn, tree *new_var)
>new_cond_expr = build2 (EQ_EXPR, TREE_TYPE (func_parm),
> func_parm,
> build_zero_cst (TREE_TYPE
> (func_parm)));
>new_expr = build_conditional_expr
> - (location, new_cond_expr, false, new_yes_expr,
> -  TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE
> (new_no_expr));   
> + (location, new_cond_expr, false,
> +  new_yes_expr, TREE_TYPE (new_yes_expr), location,
> +  new_no_expr, TREE_TYPE (new_no_expr), location);
>break;
>  case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_NONZERO:
>new_var_init = build_modify_expr
> @@ -413,8 +416,9 @@ fix_builtin_array_notation_fn (tree
> an_builtin_fn, tree *new_var)
>new_cond_expr = build2 (NE_EXPR, TREE_TYPE (func_parm),
> func_parm,
>

Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

2017-07-31 Thread Jeff Law

On 07/26/2017 02:02 PM, Alexander Monakov wrote:
> On Wed, 26 Jul 2017, Alexander Monakov wrote:
> 
>> On Wed, 26 Jul 2017, Jeff Law wrote:
>>> I'm not sure what you mean by extraneous compiler barriers -- isn't the
>>> worst case scenario here that the target emits them as well?  So there
>>> would be an extraneous one in that case, but that ought to be a "don't
>>> care".
>>
>> Yes, exactly this.
> 
> I've just realized that we can detect if the backend produced empty RTL
> sequence by looking at get_last_insn () before/after gen_mem_thread_fence.
> This way we can emit a compiler barrier iff the backend didn't emit a
> machine barrier.
We could.  But I suspect just emitting the barriers in the generic code
is the way to go.  If we have a redundant barrier, the compile time cost
is minimal and I don't think the additional code to avoid creating the
exxtra barrier is worth it.

jeff

Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

2017-07-31 Thread Jeff Law

On 07/26/2017 12:13 PM, Alexander Monakov wrote:
> On Wed, 26 Jul 2017, Jeff Law wrote:
>> I'm not sure what you mean by extraneous compiler barriers -- isn't the
>> worst case scenario here that the target emits them as well?  So there
>> would be an extraneous one in that case, but that ought to be a "don't
>> care".
> 
> Yes, exactly this.
> 
>> In the middle end patch, do we need a barrier before the fence as well?
>> The post-fence barrier prevents reordering the fence with anything which
>> follows the fence.  But do we have to also prevent reordering the fence
>> with prior instructions with any of the memory models?  This isn't my
>> area of expertise, so if it's dumb question, don't hesitate to let me
>> know :-)
> 
> That depends on how pessimistic we want to be with respect to backend
> getting it wrong.  My expectation here is that if a backend emits non-empty
> RTL, the produced sequence for the fence itself acts as a compiler memory
> barrier.
Perhaps. But do we really want to rely on that?  EMitting a scheduling
barrier prior to these atomics is virtually free.

Jeff

Re: [PATCH][RFA/RFC] Stack clash mitigation patch 05/08

2017-07-31 Thread Jeff Law

On 07/21/2017 02:25 PM, Segher Boessenkool wrote:
> On Thu, Jul 20, 2017 at 08:20:52AM -0600, Jeff Law wrote:
>>> Can only combine-stack-adjustments do this?  It seems like something
>>> many passes could do, and then your new note doesn't help.
>> SO far it's only been observed with c-s-a, but further auditing is
>> certainly desirable here, particularly with the upcoming changes to the
>> generic dynamic alloca handling.
>>
>> In the V2 patch only backends would emit unrolled inline alloca/probe
>> sequences like what you see above and only for prologues.  Thus there
>> were a very limited number of passes to be concerned about.
>>
>> In the V3 patch we have unrolled inline probing for the dynamic space as
>> well, so this kind of sequence is exposed to everything after
>> gimple->rtl expansion.
>>
>> Unfortunately, the most thorough checker we have is x86 and on that
>> target, because of stack alignment issues, we'll never see a constant
>> size in the dynamic space and thus no unrolled inlined alloca/probe
>> sequences.
>>
>> In reality I suspect that with teh hard register references, most passes
>> are going to leave those insns alone, but some auditing is necessary.
> 
> This is similar to what rs6000 uses stack_tie for.  You want the
> prevent a store to the stack (the probe) from being moved after a
> later stack pointer update.  By pretending (in the insn pattern)
> there is a store to stack with that stack pointer update, nothing
> can move stores after it.
FWIW, we found a case where the scheduler would muck things up.
Essentially it has code rewrites address computations and memory
references in the hopes of breaking dependency chains.  In many ways
it's a lot like c-s-a.

Anyway, it broke the dependency chain and as a result we ended up with
unsafe sequences on aarch64.   THe V3 patch posted last night addresses
that in two ways.  First the scheduler knows about the magic note in the
same manner as c-s-a.  And we emit scheduling barriers in the probing
code to prevent undesirable movement.

Jeff

Re: C PATCH to further improve diagnostic for -Wsign-compare (PR c/81417)

2017-07-31 Thread Martin Sebor


On 07/31/2017 08:14 AM, Marek Polacek wrote:

This patch improves the diagnostic of -Wsign-compare for ?: by also printing
the types, similarly to my recent patch.  But we can do even better here if we
actually point to the operand in question, so I passed the locations of the
operands from the parser.  So instead of

x.c:8:16: warning: signed and unsigned type in conditional expression 
[-Wsign-compare]
   return x ? y : -1;
^
you'll now see:

x.c:8:18: warning: operand of conditional expression changes signedness: 'int' 
to 'unsigned int' [-Wsign-compare]
   return x ? y : -1;
  ^


I like that this is more informative than the last warning you
committed for this bug: it says what type the operand is converted
to.  The last one only shows what the types of the operands are but
leaves users guessing as to what that might mean (integer promotion
rules are often poorly understood).  Where the last warning prints

  comparison of integer expressions of different signedness: ‘int’ and 
‘unsigned int’


it would be nice to go back and add this detail to it as well, and
have it print something like this instead:

  comparison of integer expressions of different signedness changes 
type of the second operand from ‘int’ to ‘unsigned int’


Where constant expressions are involved it would also be helpful
to show the result of the conversion.  For instance:

  comparison between ‘int’ and ‘unsigned int’ changes the value of the 
second operand from ‘-1’ to ‘4294967296’


Martin



Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-07-31  Marek Polacek  

PR c/81417
* c-array-notation.c (fix_builtin_array_notation_fn): Update calls to
build_conditional_expr. 
* c-parser.c (c_parser_conditional_expression): Pass the locations of
OP1 and OP2 down to build_conditional_expr.
* c-tree.h (build_conditional_expr): Update declaration.
* c-typeck.c (build_conditional_expr): Add location_t parameters.
For -Wsign-compare, also print the types.

* objc-next-runtime-abi-02.c (build_v2_build_objc_method_call): Update
a call to build_conditional_expr.

* Wsign-compare-1.c: New test.
* gcc.dg/compare1.c: Adjust dg-bogus.
* gcc.dg/compare2.c: Likewise.
* gcc.dg/compare3.c: Likewise.
* gcc.dg/compare7.c: Likewise.
* gcc.dg/compare8.c: Likewise.
* gcc.dg/compare9.c: Likewise.
* gcc.dg/pr11492.c: Likewise.

diff --git gcc/c/c-array-notation.c gcc/c/c-array-notation.c
index e430f5c681b..40f1cfdabb8 100644
--- gcc/c/c-array-notation.c
+++ gcc/c/c-array-notation.c
@@ -355,8 +355,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (NE_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (new_no_expr), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_ALL_NONZERO:
   new_var_init = build_modify_expr
@@ -375,8 +376,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (EQ_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (new_no_expr), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_ZERO:
   new_var_init = build_modify_expr
@@ -394,8 +396,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (EQ_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));
+   (location, new_cond_expr, false,
+new_yes_expr, TREE_TYPE (new_yes_expr), location,
+new_no_expr, TREE_TYPE (new_no_expr), location);
   break;
 case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_NONZERO:
   new_var_init = build_modify_expr
@@ -413,8 +416,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
*new_var)
   new_cond_expr = build2 (NE_EXPR, TREE_TYPE (func_parm), func_parm,
  build_zero_cst (TREE_TYPE (func_parm)));
   new_expr = build_conditional_expr
-   (location, new_cond_expr, false, new_yes_expr,
-TREE_

[PATCH][AArch64] PR71951: Fix unwinding with -fomit-frame-pointer

2017-07-31 Thread Wilco Dijkstra

As described in PR71951, if libgcc is built with -fomit-frame-pointer,
unwinding crashes, for example while doing a backtrace.  The underlying
reason is the Dwarf unwinder does not setup the frame pointer register
in the initialization code.  When later unwinding a function that uses
the frame pointer, it tries to read FP using _Unwind_GetGR, and this
crashes if has never restored FP.  To unwind correctly the first frame
must save and restore FP (it is unwound in a special way so that it
uses SP instead of FP).  This is done by adding -fno-omit-frame-pointer.

OK for commit and backport to GCC6/7?

ChangeLog:
2017-07-31  Wilco Dijkstra  

PR target/71951
* config/aarch64/aarch64.h (LIBGCC2_UNWIND_ATTRIBUTE): Define.

--
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
7f91edb5713d7e8eda2f0a024a0f97b4e111c4b0..03fd93046bdbdb03bd7d0c4573928f504640f7e1
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -971,4 +971,12 @@ extern const char *host_detect_local_cpu (int argc, const 
char **argv);
 extern tree aarch64_fp16_type_node;
 extern tree aarch64_fp16_ptr_type_node;
 
+/* The generic unwind code in libgcc does not initialize the frame pointer.
+   So in order to unwind a function using a frame pointer, the very first
+   function that is unwound must save the frame pointer.  That way the frame
+   pointer is restored and its value is now valid - otherwise _Unwind_GetGR
+   crashes.  Libgcc can now be safely built with -fomit-frame-pointer.  */
+#define LIBGCC2_UNWIND_ATTRIBUTE \
+  __attribute__((optimize ("no-omit-frame-pointer")))
+
 #endif /* GCC_AARCH64_H */

Re: C PATCH to further improve diagnostic for -Wsign-compare (PR c/81417)

2017-07-31 Thread Marek Polacek

On Mon, Jul 31, 2017 at 09:54:03AM -0600, Martin Sebor wrote:
> On 07/31/2017 08:14 AM, Marek Polacek wrote:
> > This patch improves the diagnostic of -Wsign-compare for ?: by also printing
> > the types, similarly to my recent patch.  But we can do even better here if 
> > we
> > actually point to the operand in question, so I passed the locations of the
> > operands from the parser.  So instead of
> > 
> > x.c:8:16: warning: signed and unsigned type in conditional expression 
> > [-Wsign-compare]
> >return x ? y : -1;
> > ^
> > you'll now see:
> > 
> > x.c:8:18: warning: operand of conditional expression changes signedness: 
> > 'int' to 'unsigned int' [-Wsign-compare]
> >return x ? y : -1;
> >   ^
> 
> I like that this is more informative than the last warning you
> committed for this bug: it says what type the operand is converted
> to.  The last one only shows what the types of the operands are but
> leaves users guessing as to what that might mean (integer promotion
> rules are often poorly understood).  Where the last warning prints
> 
>   comparison of integer expressions of different signedness: ‘int’ and
> ‘unsigned int’
> 
> it would be nice to go back and add this detail to it as well, and
> have it print something like this instead:
> 
>   comparison of integer expressions of different signedness changes type of
> the second operand from ‘int’ to ‘unsigned int’
> 
> Where constant expressions are involved it would also be helpful
> to show the result of the conversion.  For instance:
> 
>   comparison between ‘int’ and ‘unsigned int’ changes the value of the
> second operand from ‘-1’ to ‘4294967296’

Hmm, interesting.  I could do that.  How do other people feel about this?

Marek

Re: [WWW PATCH]: Mention that x86 now supports "naked" function attribute.

2017-07-31 Thread Gerald Pfeifer

On Mon, 31 Jul 2017, Uros Bizjak wrote:
> One liner that mentions new addition to x86 port.
> 
> OK for wwwdocs?

Sure.  (Even without asking. ;-)

Thanks,
Gerald

[PATCH] Fix reassoc var_bound range test optimization (PR tree-optimization/81588)

2017-07-31 Thread Jakub Jelinek

Hi!

This patch fixes a couple of issues in optimize_range_tests_var_bound.
One is that if we have ranges[i].in_p, we should be inverting the comparison
rather than just swapping or not swapping operands.  Then because
we want to handle only LT/LE, we want to swap operands and the comparison
code if we have GT/GE after the optional inversion.  And finally, using
ranges[i].in_p to determine if we should swap at the end is wrong,
in the pr81588 testcase we have a negation in between and thus
ranges[i].in_p doesn't match the opcode - so, we should use opcode and if
opcode is ERROR_MARK (i.e. the inter-bb range test optimization), we should
check oe->rank) and finally shouldn't swap comparison operands, but invert
the comparison code if we need to invert.

In the earlier version of the patch, I made a mistake and miscompiled
trip_count_to_ahead_ratio_too_small_p in tree-ssa-loop-prefetch.c,
so that simplified is also included in another testcase.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/7.2?

2017-07-31  Jakub Jelinek  

PR tree-optimization/81588
* tree-ssa-reassoc.c (optimize_range_tests_var_bound): If
ranges[i].in_p, invert comparison code ccode.  For >/>=,
swap rhs1 and rhs2 and comparison code unconditionally,
for rank is BIT_IOR_EXPR.

* gcc.dg/tree-ssa/pr81588.c: New test.
* gcc.dg/pr81588.c: New test.
* gcc.c-torture/execute/pr81588.c: New test.

--- gcc/tree-ssa-reassoc.c.jj   2017-07-31 10:19:22.777332269 +0200
+++ gcc/tree-ssa-reassoc.c  2017-07-31 13:14:33.488618172 +0200
@@ -2958,17 +2958,26 @@ optimize_range_tests_var_bound (enum tre
{
case GT_EXPR:
case GE_EXPR:
- if (!ranges[i].in_p)
-   std::swap (rhs1, rhs2);
+   case LT_EXPR:
+   case LE_EXPR:
+ break;
+   default:
+ continue;
+   }
+  if (ranges[i].in_p)
+   ccode = invert_tree_comparison (ccode, false);
+  switch (ccode)
+   {
+   case GT_EXPR:
+   case GE_EXPR:
+ std::swap (rhs1, rhs2);
  ccode = swap_tree_comparison (ccode);
  break;
case LT_EXPR:
case LE_EXPR:
- if (ranges[i].in_p)
-   std::swap (rhs1, rhs2);
  break;
default:
- continue;
+ gcc_unreachable ();
}
 
   int *idx = map->get (rhs1);
@@ -3015,8 +3024,14 @@ optimize_range_tests_var_bound (enum tre
  fprintf (dump_file, "\n");
}
 
-  if (ranges[i].in_p)
-   std::swap (rhs1, rhs2);
+  operand_entry *oe = (*ops)[ranges[i].idx];
+  ranges[i].in_p = 0;
+  if (opcode == BIT_IOR_EXPR
+ || (opcode == ERROR_MARK && oe->rank == BIT_IOR_EXPR))
+   {
+ ranges[i].in_p = 1;
+ ccode = invert_tree_comparison (ccode, false);
+   }
 
   unsigned int uid = gimple_uid (stmt);
   gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
@@ -3043,7 +3058,6 @@ optimize_range_tests_var_bound (enum tre
}
   else
{
- operand_entry *oe = (*ops)[ranges[i].idx];
  tree ctype = oe->op ? TREE_TYPE (oe->op) : boolean_type_node;
  if (!INTEGRAL_TYPE_P (ctype)
  || (TREE_CODE (ctype) != BOOLEAN_TYPE
@@ -3065,7 +3079,7 @@ optimize_range_tests_var_bound (enum tre
  ranges[i].high = ranges[i].low;
}
   ranges[i].strict_overflow_p = false;
-  operand_entry *oe = (*ops)[ranges[*idx].idx];
+  oe = (*ops)[ranges[*idx].idx];
   /* Now change all the other range test immediate uses, so that
 those tests will be optimized away.  */
   if (opcode == ERROR_MARK)
--- gcc/testsuite/gcc.dg/tree-ssa/pr81588.c.jj  2017-07-31 12:30:16.917723926 
+0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr81588.c 2017-07-31 12:30:16.917723926 
+0200
@@ -0,0 +1,15 @@
+/* PR tree-optimization/81588 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-reassoc1-details" } */
+   
+extern long long int a, c;
+extern unsigned short b;
+
+/* { dg-final { scan-tree-dump-times "Optimizing range test \[^\n\r]* and 
comparison" 1 "reassoc1" } } */
+
+__attribute__((noinline, noclone)) void
+foo (void)
+{
+  if ((b > a) != (1 + (a < 0)))
+c = 0;
+}
--- gcc/testsuite/gcc.dg/pr81588.c.jj   2017-07-31 12:30:16.917723926 +0200
+++ gcc/testsuite/gcc.dg/pr81588.c  2017-07-31 12:30:16.917723926 +0200
@@ -0,0 +1,26 @@
+/* PR tree-optimization/81588 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+long long int a = 5011877430933453486LL, c = 1;
+unsigned short b = 24847;
+
+#include "tree-ssa/pr81588.c"
+
+int
+main ()
+{
+  foo ();
+  if (c != 0)
+__builtin_abort ();
+  a = 24846;
+  c = 1;
+  foo ();
+  if (c != 1)
+__builtin_abort ();
+  a = -5;
+  foo ();
+  if (c != 0)
+__builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.c-torture/execute/pr81588.c.jj2017-07-31 
13:15:11.832181205 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr81588.c   2017-07-31 
13:15:3

[PATCH, rs6000] Fix PR81622

2017-07-31 Thread Bill Schmidt

Hi,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81622 reports an ICE-on-invalid
for the vec_ld and vec_st built-in functions.  This fires when the last
argument of the built-in is not a pointer or array type, as is required.
We break on this during early expansion of the built-ins into tree code
during parsing.  The solution, as with other ill-formed uses of these
built-ins, is to avoid the early expansion when the argument has an invalid
type, so that normal error handling can kick in later.

(The long-term solution is to move the vec_ld and vec_st built-ins to the
gimple folding work that Will Schmidt has been doing, but that hasn't
happened yet.)

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this ok for trunk and GCC 7?  I'd like to get it into 7.2 since it
is a 7 regression.

Thanks,
Bill


[gcc]

2017-07-31  Bill Schmidt  

PR target/81622
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Don't expand vec_ld or vec_st early if the last argument isn't a
pointer or array type.

[gcc/testsuite]

2017-07-31  Bill Schmidt  

PR target/81622
* gcc.target/powerpc/pr81622.c: New file.


Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c(revision 250721)
+++ gcc/config/rs6000/rs6000-c.c(working copy)
@@ -6454,10 +6454,13 @@ altivec_resolve_overloaded_builtin (location_t loc
  consider this for all memory access built-ins.
 
  When -maltivec=be is specified, or the wrong number of arguments
- is provided, simply punt to existing built-in processing.  */
+ is provided, or the second argument isn't a pointer, simply punt
+ to existing built-in processing.  */
   if (fcode == ALTIVEC_BUILTIN_VEC_LD
   && (BYTES_BIG_ENDIAN || !VECTOR_ELT_ORDER_BIG)
-  && nargs == 2)
+  && nargs == 2
+  && (POINTER_TYPE_P ((*arglist)[1])
+ || TREE_CODE (TREE_TYPE ((*arglist)[1])) == ARRAY_TYPE))
 {
   tree arg0 = (*arglist)[0];
   tree arg1 = (*arglist)[1];
@@ -6528,7 +6531,9 @@ altivec_resolve_overloaded_builtin (location_t loc
   /* Similarly for stvx.  */
   if (fcode == ALTIVEC_BUILTIN_VEC_ST
   && (BYTES_BIG_ENDIAN || !VECTOR_ELT_ORDER_BIG)
-  && nargs == 3)
+  && nargs == 3
+  && (POINTER_TYPE_P ((*arglist)[2])
+ || TREE_CODE (TREE_TYPE ((*arglist)[2])) == ARRAY_TYPE))
 {
   tree arg0 = (*arglist)[0];
   tree arg1 = (*arglist)[1];
Index: gcc/testsuite/gcc.target/powerpc/pr81622.c
===
--- gcc/testsuite/gcc.target/powerpc/pr81622.c  (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/pr81622.c  (working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+
+void a()
+{
+  __builtin_vec_ld (1, 2); /* { dg-error "invalid parameter combination for 
AltiVec intrinsic __builtin_vec_ld" } */
+}
+
+void b()
+{
+  __builtin_vec_st (0, 1, 2); /* { dg-error "invalid parameter combination for 
AltiVec intrinsic __builtin_vec_st" } */
+}

Re: [PATCH, rs6000] Fix PR81622

2017-07-31 Thread Jakub Jelinek

On Mon, Jul 31, 2017 at 11:19:26AM -0500, Bill Schmidt wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81622 reports an ICE-on-invalid
> for the vec_ld and vec_st built-in functions.  This fires when the last
> argument of the built-in is not a pointer or array type, as is required.
> We break on this during early expansion of the built-ins into tree code
> during parsing.  The solution, as with other ill-formed uses of these
> built-ins, is to avoid the early expansion when the argument has an invalid
> type, so that normal error handling can kick in later.
> 
> (The long-term solution is to move the vec_ld and vec_st built-ins to the
> gimple folding work that Will Schmidt has been doing, but that hasn't
> happened yet.)
> 
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
> Is this ok for trunk and GCC 7?  I'd like to get it into 7.2 since it
> is a 7 regression.

See the patch I've attached in the PR, this isn't sufficient
(and for the ARRAY_TYPE I wonder if you can ever see there ARRAY_TYPE),
the function has various other issues, including e.g. ICE on
vec_cmpne (1, 2) with -mpower9.

Jakub

Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

2017-07-31 Thread Alexander Monakov

On Mon, 31 Jul 2017, Jeff Law wrote:
> >> In the middle end patch, do we need a barrier before the fence as well?
> >> The post-fence barrier prevents reordering the fence with anything which
> >> follows the fence.  But do we have to also prevent reordering the fence
> >> with prior instructions with any of the memory models?  This isn't my
> >> area of expertise, so if it's dumb question, don't hesitate to let me
> >> know :-)
> > 
> > That depends on how pessimistic we want to be with respect to backend
> > getting it wrong.  My expectation here is that if a backend emits non-empty
> > RTL, the produced sequence for the fence itself acts as a compiler memory
> > barrier.
> Perhaps. But do we really want to rely on that?  EMitting a scheduling
> barrier prior to these atomics is virtually free.

Please consider that expand_mem_thread_fence is used to place fences around
seq-cst atomic loads&stores when the backend doesn't provide a direct pattern.
With compiler barriers on both sides of the machine barrier, the generated
sequence for a seq-cst atomic load will be 7 insns:

  asm volatile ("":::"memory");
  machine_seq_cst_fence ();
  asm volatile ("":::"memory");
  dst = mem[src];
  asm volatile ("":::"memory");
  machine_seq_cst_fence ();
  asm volatile ("":::"memory");

I can easily imagine people looking at RTL dumps with this overkill fencing
being unhappy about this.

I'd be more happy with detecting empty expansion via get_last_insn ().

Thanks.
Alexander

Re: C PATCH to further improve diagnostic for -Wsign-compare (PR c/81417)

2017-07-31 Thread Martin Sebor


On 07/31/2017 10:05 AM, Marek Polacek wrote:

On Mon, Jul 31, 2017 at 09:54:03AM -0600, Martin Sebor wrote:

On 07/31/2017 08:14 AM, Marek Polacek wrote:

This patch improves the diagnostic of -Wsign-compare for ?: by also printing
the types, similarly to my recent patch.  But we can do even better here if we
actually point to the operand in question, so I passed the locations of the
operands from the parser.  So instead of

x.c:8:16: warning: signed and unsigned type in conditional expression 
[-Wsign-compare]
   return x ? y : -1;
^
you'll now see:

x.c:8:18: warning: operand of conditional expression changes signedness: 'int' 
to 'unsigned int' [-Wsign-compare]
   return x ? y : -1;
  ^


I like that this is more informative than the last warning you
committed for this bug: it says what type the operand is converted
to.  The last one only shows what the types of the operands are but
leaves users guessing as to what that might mean (integer promotion
rules are often poorly understood).  Where the last warning prints

  comparison of integer expressions of different signedness: ‘int’ and
‘unsigned int’

it would be nice to go back and add this detail to it as well, and
have it print something like this instead:

  comparison of integer expressions of different signedness changes type of
the second operand from ‘int’ to ‘unsigned int’

Where constant expressions are involved it would also be helpful
to show the result of the conversion.  For instance:

  comparison between ‘int’ and ‘unsigned int’ changes the value of the
second operand from ‘-1’ to ‘4294967296’


Hmm, interesting.  I could do that.  How do other people feel about this?


Since you ask for input from others, let me mention that I also
would find it helpful if we could come up with some sort of high
level direction on what information to include in diagnostics and
how to present it (e.g., a section on the GCC Diagnostic Guidelines
page on the Wiki).  Not only would it help guide us in implementing
new diagnostics but it would also result in a more consistent and
uniform user experience.

For what it's worth, my own preference is to err on the side of
providing more detail rather than less so the changes I made in
r248431 to -Wconversion reflect that.  So for example there, GCC
might now print:

  unsigned conversion from ‘int‘ to ‘unsigned char‘ changes value from 
‘-128‘ to ‘128‘"


My suggestions above were based on the style I chose here.

Martin

[PATCH, Fortran] Correctly set -fd-lines-as-comments with -fdec

2017-07-31 Thread Fritz Reese

All,

This is a simple patch. The original intent was for -fdec to set
-fd-lines-as-comments by default if flag_d_lines was unspecified by
the user. However, currently flag_d_lines is interrogated in
set_dec_flags(), usually before its final value is determined. The
attached patch fixes this behavior to work as intended.

Any objections for trunk? If not I think it is safe to commit.

---
Fritz Reese

>From e2761d73e818a5095bcc130ddbafe27022e25ba6 Mon Sep 17 00:00:00 2001
From: Fritz Reese 
Date: Mon, 31 Jul 2017 12:46:10 -0400
Subject: [PATCH] Correctly set -fd-lines-as-comments with -fdec

gcc/fortran/
* options.c (set_dec_flags, gfc_post_options): Set flag_d_lines
with -fdec when not set by user.

gcc/testsuite/gfortran.dg/
* dec_d_lines_1.f, dec_d_lines_2.f: New.
From e2761d73e818a5095bcc130ddbafe27022e25ba6 Mon Sep 17 00:00:00 2001
From: Fritz Reese 
Date: Mon, 31 Jul 2017 12:46:10 -0400
Subject: [PATCH] Don't override -fd-lines-as-code with -fdec.

gcc/fortran/
	* options.c (set_dec_flags, gfc_post_options): Only set flag_d_lines
	with -fdec when not set by user.

gcc/testsuite/gfortran.dg/
	* dec_d_lines_1.f, dec_d_lines_2.f: New.
---
 gcc/fortran/options.c | 14 +-
 gcc/testsuite/gfortran.dg/dec_d_lines_1.f |  9 +
 gcc/testsuite/gfortran.dg/dec_d_lines_2.f |  8 
 3 files changed, 26 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/dec_d_lines_1.f
 create mode 100644 gcc/testsuite/gfortran.dg/dec_d_lines_2.f

diff --git a/gcc/fortran/options.c b/gcc/fortran/options.c
index 1af76aa81ec..8cb861bf513 100644
--- a/gcc/fortran/options.c
+++ b/gcc/fortran/options.c
@@ -57,9 +57,6 @@ set_dec_flags (int value)
 | GFC_STD_GNU | GFC_STD_LEGACY;
   gfc_option.warn_std &= ~(GFC_STD_LEGACY | GFC_STD_F95_DEL);
 
-  /* Set -fd-lines-as-comments by default.  */
-  if (value && gfc_current_form != FORM_FREE && gfc_option.flag_d_lines == -1)
-gfc_option.flag_d_lines = 0;
 
   /* Set other DEC compatibility extensions.  */
   flag_dollar_ok |= value;
@@ -337,8 +334,15 @@ gfc_post_options (const char **pfilename)
 	diagnostic_classify_diagnostic (global_dc, OPT_Wline_truncation,
 	DK_ERROR, UNKNOWN_LOCATION);
 }
-  else if (warn_line_truncation == -1)
-warn_line_truncation = 0;
+  else
+{
+  /* With -fdec, set -fd-lines-as-comments by default in fixed form.  */
+  if (flag_dec && gfc_option.flag_d_lines == -1)
+	gfc_option.flag_d_lines = 0;
+
+  if (warn_line_truncation == -1)
+	warn_line_truncation = 0;
+}
 
   /* If -pedantic, warn about the use of GNU extensions.  */
   if (pedantic && (gfc_option.allow_std & GFC_STD_GNU) != 0)
diff --git a/gcc/testsuite/gfortran.dg/dec_d_lines_1.f b/gcc/testsuite/gfortran.dg/dec_d_lines_1.f
new file mode 100644
index 000..2cc7a01daff
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/dec_d_lines_1.f
@@ -0,0 +1,9 @@
+! { dg-do compile }
+! { dg-options "-ffixed-form -fd-lines-as-code -fdec" }
+!
+! Ensure -fd-lines-as-code is not overridden by -fdec.
+!
+  i = 0
+d end
+  subroutine s
+D end
diff --git a/gcc/testsuite/gfortran.dg/dec_d_lines_2.f b/gcc/testsuite/gfortran.dg/dec_d_lines_2.f
new file mode 100644
index 000..31eaf5f2328
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/dec_d_lines_2.f
@@ -0,0 +1,8 @@
+! { dg-do compile }
+! { dg-options "-ffixed-form -fdec" }
+!
+! Ensure -fd-lines-as-comments is enabled by default with -fdec.
+!
+d This is a comment.
+D This line, too.
+  end
-- 
2.12.2

[PATCH, Fortran] Slight cleanup in set_dec_flags

2017-07-31 Thread Fritz Reese

All,

Attached is a trivial patch which exists for posterity: currently
set_dec_flags(int value) enables support for legacy standards and
disables warnings for them etc... Unfortunately it does this
irrespective of the int value parameter. Since set_dec_flags(0) is
called on initialization, the legacy standards are enabled even when
-fdec is not specified on the command-line. However, since
set_dec_flags is called *just before* set_default_std_flags() in
gfc_init_options() the standards are correctly reset to defaults. I
believe a guard around setting the legacy standards from set_dec_flags
to ensure they are only set with value != 0 is appropriate for clarity
and correctness.

I plan to commit to trunk soon if there no objections since the patch
is trivial and does not effectively change compiler behavior at
present.

---
Fritz Reese


>From 805b4909deb450216c1dc522d834173455baca69 Mon Sep 17 00:00:00 2001
From: Fritz Reese 
Date: Mon, 31 Jul 2017 12:58:30 -0400
Subject: [PATCH] Only set legacy std in set_dec_flags when value!=0 for
 posterity

gcc/fortran/
* options.c (set_dec_flags): Only set legacy standards when value
is not zero.
From 805b4909deb450216c1dc522d834173455baca69 Mon Sep 17 00:00:00 2001
From: Fritz Reese 
Date: Mon, 31 Jul 2017 12:58:30 -0400
Subject: [PATCH] Only set legacy std in set_dec_flags when value!=0 for
 posterity

gcc/fortran/
	* options.c (set_dec_flags): Only set legacy standards when value
	is not zero.
---
 gcc/fortran/options.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/fortran/options.c b/gcc/fortran/options.c
index 8cb861bf513..c2f8b8a9398 100644
--- a/gcc/fortran/options.c
+++ b/gcc/fortran/options.c
@@ -52,11 +52,13 @@ set_default_std_flags (void)
 static void
 set_dec_flags (int value)
 {
-  /* Allow legacy code without warnings.  */
-  gfc_option.allow_std |= GFC_STD_F95_OBS | GFC_STD_F95_DEL
-| GFC_STD_GNU | GFC_STD_LEGACY;
-  gfc_option.warn_std &= ~(GFC_STD_LEGACY | GFC_STD_F95_DEL);
-
+  if (value)
+{
+  /* Allow legacy code without warnings.  */
+  gfc_option.allow_std |= GFC_STD_F95_OBS | GFC_STD_F95_DEL
+| GFC_STD_GNU | GFC_STD_LEGACY;
+  gfc_option.warn_std &= ~(GFC_STD_LEGACY | GFC_STD_F95_DEL);
+}
 
   /* Set other DEC compatibility extensions.  */
   flag_dollar_ok |= value;
-- 
2.12.2

Re: [PATCH 0/6] [i386] PR80969 Fix ICE with -mabi=ms -mavx512f

2017-07-31 Thread Daniel Santos

Well I just learned how to test 32-bit earlier and I've uncovered a 
problem when running 32-bit tests.  Do you want me to commit the the two 
patches (squashed together) in the mean time?


Thanks,
Daniel

Re: [PATCH] libstdc++: Support std::is_aggregate on clang++ (was [cfe-dev] clang++: std::is_aggregate unusable with clang-5.0/libstdc++-7)

2017-07-31 Thread Tim Song

On Mon, Jul 31, 2017 at 11:13 AM, Tim Song  wrote:
>
> https://clang.llvm.org/docs/LanguageExtensions.html#checks-for-type-trait-primitives
> seems to suggest using __has_extension instead.

Hmm, that doesn't work. Oh well.

Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-07-31 Thread Jeff Law

On 07/08/2017 02:45 PM, Martin Sebor wrote:
> PR 81117 asks for improved detection of common misuses(*) of
> strncpy and strncat.  The attached patch is my solution.  It
> consists of three related sets of changes:
> 
> 1) Adds a new option, -Wstringop-truncation, that diagnoses calls
> to strncpy, and stpncpy (and also strncat) that truncate the copy.
> This helps highlight the common but incorrect assumption that
> the first two functions NUL-terminate the copy (see, for example,
> CWE-170)  For strncat, it helps detect cases of inadvertent
> truncation of the source string by passing in a bound that's
> less than or equal to its length.
> 
> 2) Enhances -Wstringon-overflow to diagnose calls of the form
> strncpy(D, S, N) where the bound N depends on a call to strlen(S).
> This misuse is common in legacy code when, often in response to
> the adoption of a secure coding initiative, while replacing uses
> of strcpy with strncpy, the engineer either makes a mistake, or
> doesn't have a good enough understanding of how the function works,
> or does only the bare minimum to satisfy the requirement to avoid
> using strcpy without actually improving anything.
> 
> 3) Enhances -Wsizeof-pointer-memaccess to also warn about uses of
> the functions to copy an array to a destination of an unknown size
> that specify the size of the array as the bound.  Given the
> pervasive [mis]use of strncpy to bound the copy to the size of
> the destination, instances like this suggest a bug: a possible
> buffer overflow due to an excessive bound (see, for example,
> CWE-806).  In cases when the call is safe, it's equivalent to
> the corresponding call to memcpy which is clearer and can be
> more efficient.
> 
> Martin
> 
> PS By coincidence rather than by intent, the strncat warnings
> are a superset of Clang's -Wstrncat-size.  AFAICS, Clang only
> warns when the destination is an array of known size and
> doesn't have a corresponding warning for strncpy.
> 
> [*] Here's some background into these misuses.
> 
> The purpose of the historical strncpy function introduced in V7
> UNIX was to completely fill an array of chars with data, either
> by copying an initial portion of a source string, or by clearing
> it.  I.e., its purpose wasn't to create NUL-terminated strings.
> An example of its use was to fill the directory entry d_name
> array (dirent::d_name) with the name of a file.
> 
> The original purpose of the strncat function, on the other hand,
> was to append a not necessarily NUL-terminated array of chars
> to a string to form a NUL-terminated concatenation of the two.
> An example use case is appending a directory entry (struct
> dirent::d_name) that need not be NUL-terminated, to form
> a pathname which does.
> 
> Largely due to a misunderstanding of the functions' purpose they
> have become commonly used (and misused) to make "safe," bounded
> string copies by safeguarding against accidentally overflowing
> the destination.  This has led to great many bugs and security
> vulnerabilities.
> 
> 
> gcc-81117.diff
> 
> 
> PR c/81117 - Improve buffer overflow checking in strncpy
> 
> gcc/ChangeLog:
> 
>   PR c/81117
>   * builtins.c (compute_objsize): Handle arrays that
>   compute_builtin_object_size likes to fail for.
>   (check_strncpy_sizes): New function.
>   (expand_builtin_strncpy): Call check_strncpy_sizes.
>   * doc/invoke.texi (-Wsizeof-pointer-memaccess): Document enhancement.
>   (-Wstringop-truncation): Document new option.
>   * gimple-fold.c (gimple_fold_builtin_strncpy): Implement
>   -Wstringop-truncation.
>   (gimple_fold_builtin_strncat): Same.
>   * gimple.c (gimple_build_call_from_tree): Set call location.
>   * tree-ssa-strlen.c (strlen_to_stridx): New global variable.
>   (is_strlen_related_p): New function.
>   (check_builtin_strxncpy, check_builtin_strncat): Same.
>   handle_builtin_strlen): Use strlen_to_stridx.
>   (strlen_optimize_stmt): Handle flavors of strncat, strncpy, and
>   stpncpy.
>   Use strlen_to_stridx.
>   (pass_strlen::execute): Release strlen_to_stridx.
> 
> gcc/ada/ChangeLog:
> 
>   PR c/81117
>   * adadecode.c (__gnat_decode): Replace pointless strncpy with
>   memcpy.
>   * argv.c (__gnat_fill_arg): Same.
> 
> gcc/c-family/ChangeLog:
> 
>   PR c/81117
>   * c-common.c (resort_sorted_fields): Replace pointless strncpy
>   with memcpy.
>   * c-warn.c (sizeof_pointer_memaccess_warning): Handle arrays.
>   * c.opt (-Wstriingop-truncation): New option.
> 
> gcc/fortran/ChangeLog:
> 
>   PR c/81117
>   * decl.c (build_sym): Replace pointless strncpy with strcpy.
> 
> gcc/objc/ChangeLog:
> 
>   PR c/81117
>   * objc-encoding.c (encode_type): Replace pointless strncpy with
>   memcpy.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/81117
>   * c-c++-common/Wsizeof-pointer-memaccess3.c: New test.
>   * c-c++-common/Wstringop-overflow.c: New test.
>

Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

2017-07-31 Thread Jeff Law

On 07/31/2017 11:02 AM, Alexander Monakov wrote:
> On Mon, 31 Jul 2017, Jeff Law wrote:
 In the middle end patch, do we need a barrier before the fence as well?
 The post-fence barrier prevents reordering the fence with anything which
 follows the fence.  But do we have to also prevent reordering the fence
 with prior instructions with any of the memory models?  This isn't my
 area of expertise, so if it's dumb question, don't hesitate to let me
 know :-)
>>>
>>> That depends on how pessimistic we want to be with respect to backend
>>> getting it wrong.  My expectation here is that if a backend emits non-empty
>>> RTL, the produced sequence for the fence itself acts as a compiler memory
>>> barrier.
>> Perhaps. But do we really want to rely on that?  EMitting a scheduling
>> barrier prior to these atomics is virtually free.
> 
> Please consider that expand_mem_thread_fence is used to place fences around
> seq-cst atomic loads&stores when the backend doesn't provide a direct pattern.
> With compiler barriers on both sides of the machine barrier, the generated
> sequence for a seq-cst atomic load will be 7 insns:
> 
>   asm volatile ("":::"memory");
>   machine_seq_cst_fence ();
>   asm volatile ("":::"memory");
>   dst = mem[src];
>   asm volatile ("":::"memory");
>   machine_seq_cst_fence ();
>   asm volatile ("":::"memory");
> 
> I can easily imagine people looking at RTL dumps with this overkill fencing
> being unhappy about this.
But the extra fences aren't actually going to impact anything except
perhaps an unmeasurable compile-time hit.  ie, it may look bad, but I'd
have a hard time believing it matters in practice.

> 
> I'd be more happy with detecting empty expansion via get_last_insn ().
That seems like an unnecessary complication to me.  I'd prefer instead
to just emit the necessary fencing in the generic code and update the MD
docs so that maintainers know they don't have to emit the RTL fences
themselves.

Jeff

Re: [PATCH], PR target/81593, Optimize PowerPC vector sets coming from a vector extracts

2017-07-31 Thread Michael Meissner

On Fri, Jul 28, 2017 at 04:08:50PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jul 27, 2017 at 07:21:14PM -0400, Michael Meissner wrote:
> > This patches optimizes the PowerPC vector set operation for 64-bit doubles 
> > and
> > longs where the elements in the vector set may have been extracted from 
> > another
> > vector (PR target/81593):
> 
> > * config/rs6000/rs6000.c (rs6000_emit_xxpermdi): New function to
> > emit XXPERMDI accessing either double word in either vector
> > register inputs.
> 
> "emit" is not a good name for this: that is generally used for something
> that does emit_insn, i.e. put an insn in the instruction stream.  This
> function returns a string a define_insn can return.  For the rl* insns
> I called the similar functions rs6000_insn_for_*, maybe something like
> that is better here?

Yeah, I should have used rs6000_output_xxpermdi or similar (or output_xxpermdi,
etc.), which is what the other functions used.

> > +/* Emit a XXPERMDI instruction that can extract from either double word of 
> > the
> > +   two arguments.  ELEMENT1 and ELEMENT2 are either NULL or they are 0/1 
> > giving
> > +   which double word to be used for the operand.  */
> > +
> > +const char *
> > +rs6000_emit_xxpermdi (rtx operands[], rtx element1, rtx element2)
> > +{
> > +  int op1_dword = (!element1) ? 0 : INTVAL (element1);
> > +  int op2_dword = (!element2) ? 0 : INTVAL (element2);
> > +
> > +  gcc_assert (IN_RANGE (op1_dword | op2_dword, 0, 1));
> > +
> > +  if (BYTES_BIG_ENDIAN)
> > +{
> > +  operands[3] = GEN_INT (2*op1_dword + op2_dword);
> > +  return "xxpermdi %x0,%x1,%x2,%3";
> > +}
> > +  else
> > +{
> > +  if (element1)
> > +   op1_dword = 1 - op1_dword;
> > +
> > +  if (element2)
> > +   op2_dword = 1 - op2_dword;
> > +
> > +  operands[3] = GEN_INT (op1_dword + 2*op2_dword);
> > +  return "xxpermdi %x0,%x2,%x1,%3";
> > +}
> > +}
> 
> I think calling this with the rtx elementN args makes this only more
> complicated (the function comment doesn't say what they are or what
> NULL means, btw).

Ok, let me think on it.

> 
> >  (define_insn "vsx_concat_"
> > -  [(set (match_operand:VSX_D 0 "gpc_reg_operand" "=,we")
> > +  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa,we")
> > (vec_concat:VSX_D
> > -(match_operand: 1 "gpc_reg_operand" ",b")
> > -(match_operand: 2 "gpc_reg_operand" ",b")))]
> > +(match_operand: 1 "gpc_reg_operand" "wa,b")
> > +(match_operand: 2 "gpc_reg_operand" "wa,b")))]
> >"VECTOR_MEM_VSX_P (mode)"
> >  {
> >if (which_alternative == 0)
> > -return (BYTES_BIG_ENDIAN
> > -   ? "xxpermdi %x0,%x1,%x2,0"
> > -   : "xxpermdi %x0,%x2,%x1,0");
> > +return rs6000_emit_xxpermdi (operands, NULL_RTX, NULL_RTX);
> >  
> >else if (which_alternative == 1)
> > -return (BYTES_BIG_ENDIAN
> > +return (VECTOR_ELT_ORDER_BIG
> > ? "mtvsrdd %x0,%1,%2"
> > : "mtvsrdd %x0,%2,%1");
> 
> This one could be
> 
> {
>   if (!BYTES_BIG_ENDIAN)
> std::swap (operands[1], operands[2]);
> 
>   switch (which_alternative)
> {
> case 0:
>   return "xxpermdi %x0,%x1,%x2,0";
> case 1:
>   return "mtvsrdd %x0,%1,%2";
> default:
>   gcc_unreachable ();
> }
> }

> (Could/should we use xxmrghd instead?  Do all supported assemblers know
> that extended mnemonic, is it actually more readable?)

For me no, xxpermdi is clearer.  But if you want xxmrghd, I can do it.

> > --- gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c
> > (svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c)
> >  (revision 0)
> > +++ gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c
> > (.../gcc/testsuite/gcc.target/powerpc/vsx-extract-7.c)  (revision 250640)
> > @@ -0,0 +1,15 @@
> > +/* { dg-do compile { target { powerpc*-*-* } } } */
> > +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> > +/* { dg-require-effective-target powerpc_vsx_ok } */
> > +/* { dg-options "-O2 -mvsx" } */
> > +
> > +vector double
> > +test_vpasted (vector double high, vector double low)
> > +{
> > +  vector double res;
> > +  res[1] = high[1];
> > +  res[0] = low[0];
> > +  return res;
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {\mxxpermdi\M} 1 } } */
> 
> In this and the other testcase, should you test no other insns at all
> are generated?

It is kind of hard to test a negative, without trying to guess what possible
instructions could be generated.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH 4/6] lra-assigns.c: give up on qsort checking in assign_by_spills

2017-07-31 Thread Jeff Law

On 07/15/2017 02:47 PM, Alexander Monakov wrote:
> The reload_pseudo_compare_func comparator, when used from assign_by_spills,
> can be non-transitive, indicating A < B < C < A if both A and C satisfy
> !bitmap_bit_p (&non_reload_pseudos, rAC), but B does not.
> 
> This function was originally a proper comparator, and the problematic
> clause was added to fix PR 57878:
> https://gcc.gnu.org/ml/gcc-patches/2013-07/msg00732.html
> 
> That the comparator is invalid implies that that PR, if it still exists,
> can reappear (but probably under more complicated circumstances).
> 
> This looks like a sensitive area, so disabling checking is the only
> obvious approach.
> 
>   * lra-assigns.c (reload_pseudo_compare_func): Add a FIXME.
> (assign_by_spills): Use non-checking qsort.
This is OK once the final comparator consistency checking patch is approved.

jeff

Re: [PATCH], PR target/81593, Optimize PowerPC vector sets coming from a vector extracts

2017-07-31 Thread Michael Meissner

On Sun, Jul 30, 2017 at 09:00:58AM -0500, Bill Schmidt wrote:
> >> (define_insn "vsx_concat_"
> >> -  [(set (match_operand:VSX_D 0 "gpc_reg_operand" "=,we")
> >> +  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa,we")
> >>(vec_concat:VSX_D
> >> -   (match_operand: 1 "gpc_reg_operand" ",b")
> >> -   (match_operand: 2 "gpc_reg_operand" ",b")))]
> >> +   (match_operand: 1 "gpc_reg_operand" "wa,b")
> >> +   (match_operand: 2 "gpc_reg_operand" "wa,b")))]
> >>   "VECTOR_MEM_VSX_P (mode)"
> >> {
> >>   if (which_alternative == 0)
> >> -return (BYTES_BIG_ENDIAN
> >> -  ? "xxpermdi %x0,%x1,%x2,0"
> >> -  : "xxpermdi %x0,%x2,%x1,0");
> >> +return rs6000_emit_xxpermdi (operands, NULL_RTX, NULL_RTX);
> >> 
> >>   else if (which_alternative == 1)
> >> -return (BYTES_BIG_ENDIAN
> >> +return (VECTOR_ELT_ORDER_BIG
> >>? "mtvsrdd %x0,%1,%2"
> >>: "mtvsrdd %x0,%2,%1");
> > 
> > This one could be
> > 
> > {
> >  if (!BYTES_BIG_ENDIAN)
> 
> !VECTOR_ELT_ORDER_BIG (to accommodate -maltivec=be).  (We have some general 
> bitrot associated with -maltivec=be that needs to be addressed, or we need to 
> give up on it altogether.  Still of two minds about this.)
> 
> Bill

In this case, I believe I tested -maltivec=be, and BYTES_BIG_ENDIAN is correct
(I originally had it using VECTOR_ELT_ORDER_BIG and got failures).  But I need
to look at it again.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH], PR target/81593, Optimize PowerPC vector sets coming from a vector extracts

2017-07-31 Thread Bill Schmidt


> On Jul 31, 2017, at 12:40 PM, Michael Meissner  
> wrote:
> 
> On Sun, Jul 30, 2017 at 09:00:58AM -0500, Bill Schmidt wrote:
 (define_insn "vsx_concat_"
 -  [(set (match_operand:VSX_D 0 "gpc_reg_operand" "=,we")
 +  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa,we")
(vec_concat:VSX_D
 -   (match_operand: 1 "gpc_reg_operand" ",b")
 -   (match_operand: 2 "gpc_reg_operand" ",b")))]
 +   (match_operand: 1 "gpc_reg_operand" "wa,b")
 +   (match_operand: 2 "gpc_reg_operand" "wa,b")))]
  "VECTOR_MEM_VSX_P (mode)"
 {
  if (which_alternative == 0)
 -return (BYTES_BIG_ENDIAN
 -  ? "xxpermdi %x0,%x1,%x2,0"
 -  : "xxpermdi %x0,%x2,%x1,0");
 +return rs6000_emit_xxpermdi (operands, NULL_RTX, NULL_RTX);
 
  else if (which_alternative == 1)
 -return (BYTES_BIG_ENDIAN
 +return (VECTOR_ELT_ORDER_BIG
? "mtvsrdd %x0,%1,%2"
: "mtvsrdd %x0,%2,%1");
>>> 
>>> This one could be
>>> 
>>> {
>>> if (!BYTES_BIG_ENDIAN)
>> 
>> !VECTOR_ELT_ORDER_BIG (to accommodate -maltivec=be).  (We have some general 
>> bitrot associated with -maltivec=be that needs to be addressed, or we need 
>> to give up on it altogether.  Still of two minds about this.)
>> 
>> Bill
> 
> In this case, I believe I tested -maltivec=be, and BYTES_BIG_ENDIAN is correct
> (I originally had it using VECTOR_ELT_ORDER_BIG and got failures).  But I need
> to look at it again.

Hi Mike,

You misunderstand me, I think you had it right (you did move to 
VECTOR_ELT_ORDER_BIG here)
but I just wanted to clarify that Segher's suggestion would also need to use 
VECTOR_ELT_ORDER_BIG.

Thanks,
Bill

> 
> -- 
> Michael Meissner, IBM
> IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
> email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH 5/6] haifa-sched.c: give up qsort checking when autoprefetch heuristic is in use

2017-07-31 Thread Jeff Law

On 07/15/2017 02:47 PM, Alexander Monakov wrote:
> The autopref_rank_for_schedule sub-comparator and its subroutine
> autopref_rank_data lack transitivity.  Skip checking if they are in use.
> 
> This heuristic is disabled by default everywhere except ARM and AArch64,
> so on other targets this does not suppress  checking all the time.
> 
>   * haifa-sched.c (ready_sort_real): Disable qsort checking if
> autoprefetch heuristic will be used.
> (autopref_rank_for_schedule): Add FIXME.
This is also OK once the final patch for consistency checking is approved.

jeff

Re: [PATCH] Fix libquadmath regression (bootstrap failure) on FreeBSD

2017-07-31 Thread Joseph Myers

On Wed, 19 Jul 2017, Jakub Jelinek wrote:

> >  And can you please push this
> > to glibc as well (which I do not have access to)?
> 
> glibc uses u_int32_t in all the places of *powl.c where libquadmath
> uses uint32_t, and various other files, so I guess it is intentional.

I'd consider it a relic of old fdlibm code, i.e. code (all over libm) that 
hasn't yet been cleaned up to use uintN_t instead of u_intN_t internally 
(but which it would be desirable to clean up like that).

I think it would be desirable to be able to do libquadmath imports via a 
script that automatically applies substitutions to turn glibc code into 
code that can be used in libquadmath.  Right now required substitutions 
would include s/u_int/uint/, and converting L() to apply appropriate 
suffixes, and converting function calls to the *q form, while some other 
substitutions such as s/_Float128/__float128/ would not actually be needed 
for the code to build but would make the results of substitution closer to 
the present libquadmath contents, and so easier to compare.  If then the 
number of substitutions needed can be reduced over time (e.g. defining L() 
as a macro in quadmath-imp.h rather than substituting it in the sources), 
that would be good.  Moving glibc code (across libm, not just ldbl-128) to 
use uintN_t would be in that category.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Add -std=c++2a

2017-07-31 Thread Joseph Myers

On Thu, 20 Jul 2017, Andrew Sutton wrote:

> This adds a new C++ dialect, enabled by -std=c++2a.

I don't see documentation here.  I'd expect additions to invoke.texi and 
standards.texi discussing the new option.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 1/2] x86,s390: add compiler memory barriers when expanding atomic_thread_fence (PR 80640)

2017-07-31 Thread Alexander Monakov

On Mon, 31 Jul 2017, Jeff Law wrote:
> > Please consider that expand_mem_thread_fence is used to place fences around
> > seq-cst atomic loads&stores when the backend doesn't provide a direct 
> > pattern.
> > With compiler barriers on both sides of the machine barrier, the generated
> > sequence for a seq-cst atomic load will be 7 insns:
> > 
> >   asm volatile ("":::"memory");
> >   machine_seq_cst_fence ();
> >   asm volatile ("":::"memory");
> >   dst = mem[src];
> >   asm volatile ("":::"memory");
> >   machine_seq_cst_fence ();
> >   asm volatile ("":::"memory");
> > 
> > I can easily imagine people looking at RTL dumps with this overkill fencing
> > being unhappy about this.
> But the extra fences aren't actually going to impact anything except
> perhaps an unmeasurable compile-time hit.  ie, it may look bad, but I'd
> have a hard time believing it matters in practice.

I agree it doesn't matter that much from compile-time/memory standpoint.
I meant it matters from the standpoint of a person working with the RTL dumps,
i.e. having to work through all that, especially if they need to work with
non-slim dumps.

> > I'd be more happy with detecting empty expansion via get_last_insn ().
> That seems like an unnecessary complication to me.

I think it's quite simple by GCC standards:

  {
rtx_insn last = get_last_insn ();
emit_insn (targetm.gen_mem_thread_fence (GEN_INT (model)));
if (last == get_last_insn () && !is_mm_relaxed (model))
  expand_asm_memory_barrier ();
  }

> I'd prefer instead to just emit the necessary fencing in the generic code
> and update the MD docs so that maintainers know they don't have to emit
> the RTL fences themselves.

I agree the docs need an update, but hopefully not in this way.  The legacy
__sync_synchronize barrier has always been required to be a compiler barrier
when expanded to RTL, and it's quite natural to use the same RTL structure
for new atomic fences as the old memory_barrier expansion.  The only problem
in current practice seen so far is with empty expansion for new patterns.

Thanks.
Alexander

Re: [PATCH 6/6] qsort comparator consistency checking

2017-07-31 Thread Jeff Law

On 07/15/2017 02:47 PM, Alexander Monakov wrote:
> This is the updated qsort comparator verifier.
> 
> Since we have vec::qsort(cmp), the patch uses the macro argument counting
> trick to redirect only the four-argument invocations of qsort to qsort_chk.
> I realize that won't win much sympathies, but a patch doing mass-renaming
> of qsort in the whole GCC codebase would win even fewer, I suspect.
> 
> The macro ENABLE_FULL_QSORT_CHECKING could be used to enable full O(n^2)
> checking, but there's no accompanying configure.ac change.  I'm not sure
> that would be useful in practice, so I'd rather turn it into #if 0.
> 
> The suppression in genmodes was needed because it's not linked with vec.o.
> 
>   * genmodes.c (calc_wider_mode): Suppress qsort macro.
> * system.h [CHECKING_P] (qsort): Redirect to qsort_chk.
> (qsort_nochk): New macro.
> (qsort_chk): Declare.
> (qsort_disable_checking): Declare.
> * vec.c (qsort_chk_error): New static function.
> (qsort_disable_checking): Define.
> (qsort_chk): New function.
I  must have missed something.  Can't you just define

qsort (BASE, NMEMB, SIZE, COMPARE) into

qsort_chk (BASE, NMEMB, SIZE, COMPARE)

That shouldn't affect the qsort from vec?  Right?  Or am I missing something

Jeff

Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-07-31 Thread Prathamesh Kulkarni

On 23 May 2017 at 19:10, Prathamesh Kulkarni
 wrote:
> On 19 May 2017 at 19:02, Jan Hubicka  wrote:
>>>
>>> * LTO and memory management
>>> This is a general question about LTO and memory management.
>>> IIUC the following sequence takes place during normal LTO:
>>> LGEN: generate_summary, write_summary
>>> WPA: read_summary, execute ipa passes, write_opt_summary
>>>
>>> So I assumed it was OK in LGEN to allocate return_callees_map in
>>> generate_summary and free it in write_summary and during WPA, allocate
>>> return_callees_map in read_summary and free it after execute (since
>>> write_opt_summary does not require return_callees_map).
>>>
>>> However with fat LTO, it seems the sequence changes for LGEN with
>>> execute phase takes place after write_summary. However since
>>> return_callees_map is freed in pure_const_write_summary and
>>> propagate_malloc() accesses it in execute stage, it results in
>>> segmentation fault.
>>>
>>> To work around this, I am using the following hack in 
>>> pure_const_write_summary:
>>> // FIXME: Do not free if -ffat-lto-objects is enabled.
>>> if (!global_options.x_flag_fat_lto_objects)
>>>   free_return_callees_map ();
>>> Is there a better approach for handling this ?
>>
>> I think most passes just do not free summaries with -flto.  We probably want
>> to fix it to make it possible to compile multiple units i.e. from plugin by
>> adding release_summaries method...
>> So I would say it is OK to do the same as others do and leak it with -flto.
>>> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
>>> index e457166ea39..724c26e03f6 100644
>>> --- a/gcc/ipa-pure-const.c
>>> +++ b/gcc/ipa-pure-const.c
>>> @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "tree-scalar-evolution.h"
>>>  #include "intl.h"
>>>  #include "opts.h"
>>> +#include "ssa.h"
>>>
>>>  /* Lattice values for const and pure functions.  Everything starts out
>>> being const, then may drop to pure and then neither depending on
>>> @@ -69,6 +70,15 @@ enum pure_const_state_e
>>>
>>>  const char *pure_const_names[3] = {"const", "pure", "neither"};
>>>
>>> +enum malloc_state_e
>>> +{
>>> +  PURE_CONST_MALLOC_TOP,
>>> +  PURE_CONST_MALLOC,
>>> +  PURE_CONST_MALLOC_BOTTOM
>>> +};
>>
>> It took me a while to work out what PURE_CONST means here :)
>> I would just call it something like STATE_MALLOC_TOP... or so.
>> ipa_pure_const is outdated name from the time pass was doing only
>> those two.
>>> @@ -109,6 +121,10 @@ typedef struct funct_state_d * funct_state;
>>>
>>>  static vec funct_state_vec;
>>>
>>> +/* A map from node to subset of callees. The subset contains those callees
>>> + * whose return-value is returned by the node. */
>>> +static hash_map< cgraph_node *, vec* > *return_callees_map;
>>> +
>>
>> Hehe, a special case of return jump function.  We ought to support those 
>> more generally.
>> How do you keep it up to date over callgraph changes?
>>> @@ -921,6 +1055,23 @@ end:
>>>if (TREE_NOTHROW (decl))
>>>  l->can_throw = false;
>>>
>>> +  if (ipa)
>>> +{
>>> +  vec v = vNULL;
>>> +  l->malloc_state = PURE_CONST_MALLOC_BOTTOM;
>>> +  if (DECL_IS_MALLOC (decl))
>>> + l->malloc_state = PURE_CONST_MALLOC;
>>> +  else if (malloc_candidate_p (DECL_STRUCT_FUNCTION (decl), v))
>>> + {
>>> +   l->malloc_state = PURE_CONST_MALLOC_TOP;
>>> +   vec *callees_p = new vec (vNULL);
>>> +   for (unsigned i = 0; i < v.length (); ++i)
>>> + callees_p->safe_push (v[i]);
>>> +   return_callees_map->put (fn, callees_p);
>>> + }
>>> +  v.release ();
>>> +}
>>> +
>>
>> I would do non-ipa variant, too.  I think most attributes can be detected 
>> that way
>> as well.
>>
>> The patch generally makes sense to me.  It would be nice to make it easier 
>> to write such
>> a basic propagators across callgraph (perhaps adding a template doing the 
>> basic
>> propagation logic). Also I think you need to solve the problem with keeping 
>> your
>> summaries up to date across callgraph node removal and duplications.
> Thanks for the suggestions, I will try to address them in a follow-up patch.
> IIUC, I would need to modify ipa-pure-const cgraph hooks -
> add_new_function, remove_node_data, duplicate_node_data
> to keep return_callees_map up-to-date across callgraph node insertions
> and removal ?
>
> Also, if instead of having a separate data-structure like return_callees_map,
> should we rather have a flag within cgraph_edge, which marks that the
> caller may return the value of the callee ?
Hi,
Sorry for the very late response. I have attached an updated version
of the prototype patch,
which adds a non-ipa variant, and keeps return_callees_map up-to-date
across callgraph
node insertions and removal. For the non-ipa variant,
malloc_candidate_p() additionally checks
that all the "return callees" have DECL_IS_MALLOC set to true.
Bootstrapped+tested and LTO bootstrapped+tested on x86_64-unknown-linux-gnu.
Does it look OK so

Re: [PATCH 6/6] qsort comparator consistency checking

2017-07-31 Thread Alexander Monakov

On Mon, 31 Jul 2017, Jeff Law wrote:
> I  must have missed something.  Can't you just define
> 
> qsort (BASE, NMEMB, SIZE, COMPARE) into
> 
> qsort_chk (BASE, NMEMB, SIZE, COMPARE)
> 
> That shouldn't affect the qsort from vec?  Right?  Or am I missing something

If you do

  #define qsort(base, n, sz, cmp) qsort_chk (base, n, sz, cmp)

then all invocations of vec::qsort, i.e.

  myvec.qsort (mycmp);

will yield a preprocessing error due to wrong number of arguments supplied
to the qsort macro (one instead of four).

Alexander

Re: [PATCH] Make mempcpy more optimal (PR middle-end/70140).

2017-07-31 Thread Jeff Law

On 07/20/2017 12:59 AM, Martin Liška wrote:
> Hello.
> 
> Following patch does sharing of expansion for mem{p,}cpy and also strpcy 
> (with a known constant as source)
> so that we use same type of expansion (direct insns emission, direct emission 
> with a loop instruction and
> library call). As mentioned in the PR, glibc does not provide an optimized 
> version for majority of targets.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-07-17  Martin Liska  
> 
>   PR middle-end/70140
>   * gcc.dg/string-opt-1.c: Adjust test-case to scan for memcpy.
> 
> gcc/ChangeLog:
> 
> 2017-07-17  Martin Liska  
> 
>   PR middle-end/70140
>   * builtins.c (expand_builtin_memcpy_args): Remove.
>   (expand_builtin_memcpy): Call newly added function
>   expand_builtin_memory_copy_args.
>   (expand_builtin_memcpy_with_bounds): Likewise.
>   (expand_builtin_mempcpy): Remove last argument.
>   (expand_builtin_mempcpy_with_bounds): Likewise.
>   (expand_builtin_memory_copy_args): New function created from
>   expand_builtin_mempcpy_args with small modifications.
>   (expand_builtin_mempcpy_args): Remove.
>   (expand_builtin_stpcpy): Remove unused argument.
>   (expand_builtin): Likewise.
>   (expand_builtin_with_bounds): Likewise.


> 
> 
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 2deef725620..016f68d2cb6 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -3493,94 +3419,103 @@ expand_builtin_mempcpy_with_bounds (tree exp, rtx 
> target, machine_mode mode)
>  }
>  }
>  
> -/* Helper function to do the actual work for expand_builtin_mempcpy.  The
> -   arguments to the builtin_mempcpy call DEST, SRC, and LEN are broken out
> -   so that this can also be called without constructing an actual CALL_EXPR.
> -   The other arguments and return value are the same as for
> -   expand_builtin_mempcpy.  */
> +/* Helper function to do the actual work for expand of memory copy family
> +   functions (memcpy, mempcpy, stpcpy).  Expansing should assign LEN bytes
"Expansing"?

OK with the nit fixed.


Jeff

Re: [PATCH 00/19] cleanup of memory stats prototypes

2017-07-31 Thread Jeff Law

On 07/27/2017 02:43 AM, Richard Biener wrote:
> On Thu, Jul 27, 2017 at 10:30 AM,   wrote:
>> From: Trevor Saunders 
>>
>> The preC++ way of passing information about the call site of a function was 
>> to
>> use a macro that passed __file__, __LINE__, and __FUNCTION__ to a function 
>> with
>> the same name with _stat appended to it.  The way this is now done with C++ 
>> is
>> to have arguments where the default value is __LINE__, __FILE__, and
>> __FUNCTION__ in the caller.  This has the significant advantage that if you
>> look for "^function (" you find the correct function, where in the C way of
>> doing things you need to realize its a macro and check the definition of the
>> macro to see what to look for next.  So this removes a layer of indirection,
>> and makes things somewhat more consistant in using the C++ way of doing 
>> things.
>>
>> patches independently bootstrapped and regtested on ppc64le-linux-gnu.  I
>> successfully ran make all-gcc with --enable-gather-detailed-mem-stats, but
>> couldn't complete a bootstrap before the series was applied, because the
>> ddrs_table in tree-loop-distribution.c causes memory statistics gathering to 
>> crash before the series as well as after it.  ok?
> 
> Thanks!  This was on my list of things todo...
> 
> The series is ok.
Just wanted to relay a thanks from me too.  I always find it annoying to
remember which functions are _stat thingies when debugging.  No more!

Jeff

Re: [1/2] PR 78736: New warning -Wenum-conversion

2017-07-31 Thread Prathamesh Kulkarni

On 11 July 2017 at 17:59, Prathamesh Kulkarni
 wrote:
> On 13 June 2017 at 01:47, Joseph Myers  wrote:
>> This is OK with one fix:
>>
>>> +C ObjC Var(warn_enum_conversion) Init(0) Warning LangEnabledBy(C Objc,Wall)
>>
>> I believe the LangEnabledBy arguments are case-sensitive, so you need to
>> have ObjC not Objc there for it to work correctly.  (*.opt parsing isn't
>> very good at detecting typos and giving errors rather than silently
>> ignoring things.)
> Hi,
> Sorry for the late response, I was on a vacation.
> The attached patch is rebased and bootstrap+tested on 
> x86_64-unknown-linux-gnu.
> I have modified it slightly to not warn for enums with different names
> but having same value ranges.
> For eg:
> enum e1 { e1_1, e1_2 };
> enum e2 { e2_1, e2_2 };
>
> enum e1 x = e2_1;
> With this version, there would be no warning for the above assignment
> since both e1 and e2 have
> same value ranges.  Is that OK ?
>
> The patch has following fallouts in the testsuite:
>
> a) libgomp:
> I initially assume it was a false positive because I thought enum
> gomp_schedule_type
> and enum omp_sched_t have same value-ranges but it looks like omp_sched_t
> has range [1, 4] while gomp_schedule_type has range [0, 4] with one
> extra element.
> Is the warning then correct for this case ?
>
> b) libgfortran:
> i) Implicit conversion from unit_mode to file_mode
> ii) Implicit conversion from unit_sign_s to unit_sign.
> I suppose the warning is OK for these cases since unit_mode, file_mode
> have different value-ranges and similarly for unit_sign_s, unit_sign ?
>
> Also I tested the warning by compiling the kernel for x86_64 with
> allmodconifg (attached), and there
> have been quite few instances of the warning (attached). I have been
> through few cases which I don't think are false positives
> but I wonder then whether we should relegate the warning to Wextra instead ?
ping: https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00514.html

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
>>
>> --
>> Joseph S. Myers
>> jos...@codesourcery.com

[PATCH] Update myself in MAINTAINERS

2017-07-31 Thread Richard Henderson

A change of jobs requires a change of email.  And I'm going to finally
admit that I've not been active gcc development in the last year or two.
I am far enough behind that I cannot globally review changes.  That said,
I do not plan to abandon Alpha just yet.


r~
---
 MAINTAINERS | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 18bfbbed619..7effec1287f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22,7 +22,6 @@ Maintainers
 
 Richard Earnshaw   
 Richard Biener 
-Richard Henderson  
 Jakub Jelinek  
 Richard Kenner 
 Jeff Law   
@@ -42,7 +41,7 @@ their own patches from other maintainers or reviewers.
 
 aarch64 port   Marcus Shawcroft
 aarch64 port   Richard Earnshaw
-alpha port Richard Henderson   
+alpha port Richard Henderson   
 arc port   Joern Rennecke  
 arm port   Nick Clifton
 arm port   Richard Earnshaw
@@ -101,7 +100,6 @@ s390 port   Andreas Krebbel 

 score port Chen Liqin  
 sh portAlexandre Oliva 
 sh portOleg Endo   
-sparc port Richard Henderson   
 sparc port David S. Miller 
 sparc port Eric Botcazou   
 spu port   Trevor Smigiel  

@@ -141,7 +139,6 @@ cygwin, mingw-w64   Jonathan Yong   
<10wa...@gmail.com>
Language Front Ends Maintainers
 
 C front end/ISO C99Joseph Myers
-C front end/ISO C99Richard Henderson   
 Ada front end  Arnaud Charlet  
 Ada front end  Eric Botcazou   
 BRIG (HSAIL) front end Pekka Jääskeläinen  

@@ -162,7 +159,6 @@ fp-bit  Ian Lance Taylor

 libdecnumber   Ben Elliston
 libgcc Ian Lance Taylor
 libgo  Ian Lance Taylor
-libgompRichard Henderson   
 libgompJakub Jelinek   
 libiberty  DJ Delorie  
 libiberty  Ian Lance Taylor
@@ -417,6 +413,7 @@ Mark Heffernan  

 George Helffrich   
 Daniel Hellstrom   
 Fergus Henderson   
+Richard Henderson  
 Stuart Henderson   
 Matthew Hiller 
 Kazu Hirata
-- 
2.13.3

[PATCH, Fortran] Support for legacy %FILL fields in STRUCTUREs

2017-07-31 Thread Fritz Reese

All,

Attached is a patch which extends the DEC STRUCTURE support to include
special fields named '%FILL'. These fields generate anonymous
inaccessible components, commonly used in DEC FORTRAN to control
alignment of fields within STRUCTUREs. The patch is fairly
straightforward. The match is performed during parsing in decl.c
(variable_decl) by generating a component with an internal anonymous
name which is an invalid Fortran identifier to replace any components
specified as '%FILL'.

Regtests on x86_64-redhat-linux. OK for trunk?

---
Fritz Reese


>From a94d1aefe608447ff0de469ca90cbcc4942f6168 Mon Sep 17 00:00:00 2001
From: Fritz Reese 
Date: Mon, 31 Jul 2017 14:18:25 -0400
Subject: [PATCH] Support for legacy %FILL field in STRUCTUREs.

gcc/fortran/
* decl.c (attr_seen): New static variable.
* decl.c (variable_decl): Match %FILL in STRUCTURE body.
* gfortran.texi: Update documentation.

gcc/testsuite/gfortran.dg/
* dec_structure_18.f90, dec_structure_19.f90, dec_structure_20.f90,
dec_structure_21.f90: New.
---
From a94d1aefe608447ff0de469ca90cbcc4942f6168 Mon Sep 17 00:00:00 2001
From: Fritz Reese 
Date: Mon, 31 Jul 2017 14:18:25 -0400
Subject: [PATCH] Support for legacy %FILL field in STRUCTUREs.

gcc/fortran/
	* decl.c (attr_seen): New static variable.
	* decl.c (variable_decl): Match %FILL in STRUCTURE body.
	* gfortran.texi: Update documentation.

gcc/testsuite/gfortran.dg/
	* dec_structure_18.f90, dec_structure_19.f90, dec_structure_20.f90,
	dec_structure_21.f90: New.
---
 gcc/fortran/decl.c | 56 +-
 gcc/fortran/gfortran.texi  | 14 +++
 gcc/testsuite/gfortran.dg/dec_structure_18.f90 | 38 +
 gcc/testsuite/gfortran.dg/dec_structure_19.f90 | 38 +
 gcc/testsuite/gfortran.dg/dec_structure_20.f90 | 18 +
 gcc/testsuite/gfortran.dg/dec_structure_21.f90 | 10 +
 6 files changed, 172 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/dec_structure_18.f90
 create mode 100644 gcc/testsuite/gfortran.dg/dec_structure_19.f90
 create mode 100644 gcc/testsuite/gfortran.dg/dec_structure_20.f90
 create mode 100644 gcc/testsuite/gfortran.dg/dec_structure_21.f90

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index bd310703557..0c56b4010bc 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -54,6 +54,7 @@ static gfc_typespec current_ts;
 static symbol_attribute current_attr;
 static gfc_array_spec *current_as;
 static int colon_seen;
+static int attr_seen;
 
 /* The current binding label (if any).  */
 static const char* curr_binding_label;
@@ -2140,6 +2141,7 @@ static match
 variable_decl (int elem)
 {
   char name[GFC_MAX_SYMBOL_LEN + 1];
+  static unsigned int fill_id = 0;
   gfc_expr *initializer, *char_len;
   gfc_array_spec *as;
   gfc_array_spec *cp_as; /* Extra copy for Cray Pointees.  */
@@ -2157,9 +2159,47 @@ variable_decl (int elem)
   /* When we get here, we've just matched a list of attributes and
  maybe a type and a double colon.  The next thing we expect to see
  is the name of the symbol.  */
-  m = gfc_match_name (name);
+
+  /* If we are parsing a structure with legacy support, we allow the symbol
+ name to be '%FILL' which gives it an anonymous (inaccessible) name.  */
+  m = MATCH_NO;
+  gfc_gobble_whitespace ();
+  if (gfc_peek_ascii_char () == '%')
+{
+  gfc_next_ascii_char ();
+  m = gfc_match ("fill");
+}
+
   if (m != MATCH_YES)
-goto cleanup;
+{
+  m = gfc_match_name (name);
+  if (m != MATCH_YES)
+	goto cleanup;
+}
+
+  else
+{
+  m = MATCH_ERROR;
+  if (gfc_current_state () != COMP_STRUCTURE)
+	{
+	  if (flag_dec_structure)
+	gfc_error ("%s not allowed outside STRUCTURE at %C", "%FILL");
+	  else
+	gfc_error ("%s at %C is a DEC extension, enable with "
+		   "%<-fdec-structure%>", "%FILL");
+	  goto cleanup;
+	}
+
+  if (attr_seen)
+	{
+	  gfc_error ("%s entity cannot have attributes at %C", "%FILL");
+	  goto cleanup;
+	}
+
+  /* %FILL components are given invalid fortran names.  */
+  snprintf (name, GFC_MAX_SYMBOL_LEN + 1, "%%FILL%u", fill_id++);
+  m = MATCH_YES;
+}
 
   var_locus = gfc_current_locus;
 
@@ -2260,6 +2300,14 @@ variable_decl (int elem)
 	}
 }
 
+  /* %FILL components may not have initializers.  */
+  if (strncmp (name, "%FILL", 5) == 0 && gfc_match_eos () != MATCH_YES)
+{
+  gfc_error ("%s entity cannot have an initializer at %C", "%FILL");
+  m = MATCH_ERROR;
+  goto cleanup;
+}
+
   /*  If this symbol has already shown up in a Cray Pointer declaration,
   and this is not a component declaration,
   then we want to set the type & bail out.  */
@@ -3858,6 +3906,7 @@ match_attr_spec (void)
 
   current_as = NULL;
   colon_seen = 0;
+  attr_seen = 0;
 
   /* See if we get all of the keywords up to the final double colon.  */
   for (d = GFC_DE

Re: [PATCH] enhance overflow and truncation detection in strncpy and strncat (PR 81117)

2017-07-31 Thread Martin Sebor


So I think the fixes exposed by the new warning are OK to go in as-is
immediately if you wish to do so.  Minor questions on the actual
improved warnings inline.



Sure, thanks.




-static inline tree
+static tree
 compute_objsize (tree dest, int ostype)
 {

...

+  type = TYPE_MAIN_VARIANT (type);
+
+  if (TREE_CODE (type) == ARRAY_TYPE)
+return TYPE_SIZE_UNIT (type);

So I *think* TYPE_SIZE_UNIT isn't necessarily guaranteed to be a
INTEGER_CST, it could be a non-constant expression for the size.  Are
the callers of compute_objsize prepared to handle that?  Just to be
clear, I'd prefer to return TYPE_SIZE_UNIT even if it's not an
INTEGER_CST, I'm just worried about the callers ability to handle that
correctly.


They should be prepared for it.  If not, it's a bug.  I've added
a few more test cases though I'm not sure the case you're concerned
about actually arises (VLA sizes are represented as gimple calls to
__builtin_alloca_with_align so the code doesn't get this far).




diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d0b9050..e4208d9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -5170,6 +5170,57 @@ whether to issue a warning.  Similarly to 
@option{-Wstringop-overflow=3} this
 setting of the option may result in warnings for benign code.
 @end table

+@item -Wstringop-truncation
+@opindex Wstringop-overflow
+@opindex Wno-stringop-overflow


Looks like I have a couple of typos up there.  The option name
should be the same in all three entries.  Let me fix that.


+Warn for calls to bounded string manipulation functions such as @code{strncat},
+@code{strncpy}, and @code{stpncpy} that may either truncate the copied string
+or leave the destination unchanged.
+
+In the following example, the call to @code{strncat} specifies the length
+of the source string as the bound.  If the destination contains a non-empty
+string the copy of the source will be truncated.  Therefore, the call is
+diagnosed.  To avoid the warning use @code{strlen (d) - 4)} as the bound;
+
+@smallexample
+void append (char *d)
+@{
+  strncat (d, ".txt", 4);
+@}
+@end smallexample

Sorry.  I don't follow this particular example.  Where's the truncation
when strlen (SRC) == N for strncat?  In that case aren't we going to
append SRC without the nul terminator, then nul terminate the result?
Isn't that the same as appending SRC with its nul terminator?  Am I
missing something here?


You're right that there is no truncation and the effect is
the same but only in the unlikely case when the destination
is empty.  Otherwise the result is truncated.

(Although well defined, calling strncat with an empty destination
and with the bound as big as the source is a misuse of the API.
The expected way to copy a string is to call one of the copy
functions and the warning relies on this as the basis for the
diagnostic.)



Also your advice for avoiding the warning seems wrong.  Isn't it going
to produce a huge value if strlen (d) < 4?


Right, that advice is wrong.  I had fixed this in my local tree
before posting the patch but must not have updated the patch.
Let me refresh the patch.


 @item -Wsizeof-array-argument
 @opindex Wsizeof-array-argument
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index d94dc9c..5d021f9 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1856,21 +1893,68 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
   return true;
 }

+  if (!nowarn && cmpsrc <= 0)
+   {
+ /* To avoid certain truncation the specified bound should also
+not be equal to (or less than) the length of the source.  */
+ location_t loc = gimple_location (stmt);
+ if (warning_at (loc, OPT_Wstringop_truncation,
+ cmpsrc == 0
+ ? G_("%qD specified bound %E equals source length")
+ : G_("%qD specified bound %E is less than source "
+  "length %u"),
+ gimple_call_fndecl (stmt), len, srclen))
+   gimple_set_no_warning (stmt, true);
+   }

So I think this corresponds to the example above I asked about.  Where's
the truncation when the bound is equal to the source length?


Yes.  Hopefully the above answers the question.


NIT: s/potentiall/potentially/



Sure. Fixed along with the other typo above.




@@ -2512,6 +2651,19 @@ strlen_optimize_stmt (gimple_stmt_iterator *gsi)
  case BUILT_IN_STPCPY_CHK_CHKP:
handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi);
break;
+
+ case BUILT_IN_STRNCAT:
+ case BUILT_IN_STRNCAT_CHK:
+   check_builtin_strncat (DECL_FUNCTION_CODE (callee), gsi);
+   break;
+
+ case BUILT_IN_STPNCPY:
+ case BUILT_IN_STPNCPY_CHK:
+ case BUILT_IN_STRNCPY:
+ case BUILT_IN_STRNCPY_CHK:
+   check_builtin_stxncpy (DECL_FUNCTION_CODE (callee), gsi);
+   break;
+

So we've got calls to check the arg

Re: [PATCH, rs6000] Fix PR81622

2017-07-31 Thread Bill Schmidt


> On Jul 31, 2017, at 11:27 AM, Jakub Jelinek  wrote:
> 
> On Mon, Jul 31, 2017 at 11:19:26AM -0500, Bill Schmidt wrote:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81622 reports an ICE-on-invalid
>> for the vec_ld and vec_st built-in functions.  This fires when the last
>> argument of the built-in is not a pointer or array type, as is required.
>> We break on this during early expansion of the built-ins into tree code
>> during parsing.  The solution, as with other ill-formed uses of these
>> built-ins, is to avoid the early expansion when the argument has an invalid
>> type, so that normal error handling can kick in later.
>> 
>> (The long-term solution is to move the vec_ld and vec_st built-ins to the
>> gimple folding work that Will Schmidt has been doing, but that hasn't
>> happened yet.)
>> 
>> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
>> Is this ok for trunk and GCC 7?  I'd like to get it into 7.2 since it
>> is a 7 regression.
> 
> See the patch I've attached in the PR, this isn't sufficient
> (and for the ARRAY_TYPE I wonder if you can ever see there ARRAY_TYPE),
> the function has various other issues, including e.g. ICE on
> vec_cmpne (1, 2) with -mpower9.

Yes, I'll withdraw the patch (but the ARRAY_TYPE thing is necessary as
we've discussed).  I'll step out of your way on this one since you've got it
well in hand.  It would be great to have a fix in for 7.2, though.

Bill

> 
>   Jakub
>

New French PO file for 'gcc' (version 7.1.0)

2017-07-31 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the French team of translators.  The file is available at:

http://translationproject.org/latest/gcc/fr.po

(This file, 'gcc-7.1.0.fr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH, RFC] Proposed PowerPC IEEE 128-bit floating point changes

2017-07-31 Thread Joseph Myers

On Fri, 21 Jul 2017, Michael Meissner wrote:

> The first change is to enable the C language to use _Float128 keyword (but not
> __float128) without having to use the -mfloat128 option on power7-power9
> systems.  My question is in the TR that introduced _Float128, is there any
> expectation that outside of the built-in functions we already provide, that we
> need to provide runtime functions?  Yes, glibc 2.26 will be coming along
> shortly, and it should provide most/all of the F128 functions, but distros
> won't pick this library up for some time.

In formal standards terms, TS 18661-3 builds on TS 18661-1 (for binary 
floating point) or 18661-2 (for decimal floating point), both of which add 
the contents of ,  and the numeric conversion functions 
from  to the library features required of freestanding 
implementations.

I don't think that makes any difference to what GCC should do, however.  
Just as stdatomic.h is provided by GCC, though strictly a hosted library 
feature rather than one required of freestanding implementations, and just 
as GCC always requires a C library to provide memcpy / memmove / memset, 
even the case of a freestanding implementation requires a cooperating 
compiler / library pair, and in this case it makes the most sense for 
those library facilities (both C11 ones and newer TS 18661 ones such as 
for float128) to come from an external libc/libm.  At some point I should 
update standards.texi to discuss these variations from the old rule of 
thumb of GCC providing exactly those library facilities required of 
freestanding implementations.  Practically, code expecting compiler 
support for _Float128 to imply library support is just like code expecting 
__STDC_VERSION__ to imply C11 library support; people may write such code, 
but it's not practically portable.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH 00/19] cleanup of memory stats prototypes

2017-07-31 Thread Martin Sebor


On 07/27/2017 02:30 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

The preC++ way of passing information about the call site of a function was to
use a macro that passed __file__, __LINE__, and __FUNCTION__ to a function with
the same name with _stat appended to it.  The way this is now done with C++ is
to have arguments where the default value is __LINE__, __FILE__, and
__FUNCTION__ in the caller.  This has the significant advantage that if you
look for "^function (" you find the correct function, where in the C way of
doing things you need to realize its a macro and check the definition of the
macro to see what to look for next.  So this removes a layer of indirection,
and makes things somewhat more consistant in using the C++ way of doing things.


So that's what these things are for! :)

I must be missing something either about the macros or about your
changes.  The way I read the changes it seems that it's no longer
possible to call, say,

  t = make_node (code);

and have __FILE__, __LINE__, and __FUNCTION__ passed as arguments
behind the scenes.

Instead, to achieve that, make_node has to be called like so

  t = make_node (code PASS_MEM_STAT);

Otherwise calling make_node() with just one argument will end up
calling it with the defaults.

Martin

Re: [PATCH 00/19] cleanup of memory stats prototypes

2017-07-31 Thread Trevor Saunders

On Mon, Jul 31, 2017 at 02:56:40PM -0600, Martin Sebor wrote:
> On 07/27/2017 02:30 AM, tbsaunde+...@tbsaunde.org wrote:
> > From: Trevor Saunders 
> > 
> > The preC++ way of passing information about the call site of a function was 
> > to
> > use a macro that passed __file__, __LINE__, and __FUNCTION__ to a function 
> > with
> > the same name with _stat appended to it.  The way this is now done with C++ 
> > is
> > to have arguments where the default value is __LINE__, __FILE__, and
> > __FUNCTION__ in the caller.  This has the significant advantage that if you
> > look for "^function (" you find the correct function, where in the C way of
> > doing things you need to realize its a macro and check the definition of the
> > macro to see what to look for next.  So this removes a layer of indirection,
> > and makes things somewhat more consistant in using the C++ way of doing 
> > things.
> 
> So that's what these things are for! :)
> 
> I must be missing something either about the macros or about your
> changes.  The way I read the changes it seems that it's no longer
> possible to call, say,
> 
>   t = make_node (code);
> 
> and have __FILE__, __LINE__, and __FUNCTION__ passed as arguments
> behind the scenes.
> 
> Instead, to achieve that, make_node has to be called like so
> 
>   t = make_node (code PASS_MEM_STAT);
> 
> Otherwise calling make_node() with just one argument will end up
> calling it with the defaults.

calling it with the defaults will do the same thing the macro used to
do, so its fine unless you want to pass in a location that you got as an
argument.

I'm guessing you are missing that the functions when used as the default
argument provide the location of the call site of the function, not the
location of the prototype.  Which is not the most obvious thing.

Trev

> 
> Martin
>

Re: [RFC] Remaining references of Java

2017-07-31 Thread Joseph Myers

On Tue, 25 Jul 2017, Richard Biener wrote:

> > 2) fold-const.c:
> >
> >   1882/* The following code implements the floating point to integer
> >   1883   conversion rules required by the Java Language Specification,
> >   1884   that IEEE NaNs are mapped to zero and values that overflow
> >   1885   the target precision saturate, i.e. values greater than
> >   1886   INT_MAX are mapped to INT_MAX, and values less than INT_MIN
> >   1887   are mapped to INT_MIN.  These semantics are allowed by the
> >   1888   C and C++ standards that simply state that the behavior of
> >   1889   FP-to-integer conversion is unspecified upon overflow.  */
> >   1890
> >   1891wide_int val;
> >   1892REAL_VALUE_TYPE r;
> >   1893REAL_VALUE_TYPE x = TREE_REAL_CST (arg1);
> >
> > Can we somehow remove that Richi?
> 
> Remove what?  The comment?  It still matches the code and we have to
> do sth.  So I don't see why we should change this.
> 
> Maybe Joseph knows whether future standards / TRs recommend sth different
> or exactly this.

The requirement under Annex F is to return an unspecified value (with 
"invalid" raised) for such overflowing / NaN conversions to integer.

"unspecified value" means that any particular execution of the code must 
behave as if some consistent value was chosen; after

double d;
int i = d;

you can't have one subsequent conditional show i == 0 and another show i 
== INT_MAX.  But it's fine for different executions of that code to use 
different values.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: C PATCH to detect clashing attributes (PR c/81544)

2017-07-31 Thread Joseph Myers

On Tue, 25 Jul 2017, Marek Polacek wrote:

> PR c/81544 complaints that we aren't detecting clashing noreturn /
> warn_unused_result attributes so this patch adds that checking.  Martin
> plans to do more systematic checking in this area but meanwhile we
> might want to go with this.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

I'd expect exactly the same cases to be diagnosed for the two attributes 
on the same declaration, in either order, whether or not inside a single 
__attribute__, as are diagnosed when multiple declarations are involved 
(this applies to all such cases of invalid attribute combinations, not 
just this one).  Whether or not this patch achieves this, the testcase 
doesn't seem to cover the case of a single declaration with both 
attributes (and I don't see an existing such test in c-c++-common or 
gcc.dg either).

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] i386: Update naked-1.c for PIC

2017-07-31 Thread H.J. Lu

For ia32 targets, -fPIC may generate

call __x86.get_pc_thunk.ax
...
__x86.get_pc_thunk.ax:
movl(%esp), %eax
ret

We should check "ret" only for non-PIC or non-ia32 targets.

OK for trunk?

H.J.
---
* gcc.target/i386/naked-1.c: Check "ret" only for non-PIC or
non-ia32 targets.
---
 gcc/testsuite/gcc.target/i386/naked-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/i386/naked-1.c 
b/gcc/testsuite/gcc.target/i386/naked-1.c
index 440dbe9ee7a..dda354371ba 100644
--- a/gcc/testsuite/gcc.target/i386/naked-1.c
+++ b/gcc/testsuite/gcc.target/i386/naked-1.c
@@ -11,4 +11,4 @@ foo (void)
 }
 /* { dg-final { scan-assembler "# naked" } } */
 /* { dg-final { scan-assembler "ud2" } } */
-/* { dg-final { scan-assembler-not "ret" } } */
+/* { dg-final { scan-assembler-not "ret" { target { nonpic || { ! ia32 } } } } 
} */
-- 
2.13.3

Re: [PATCH 00/19] cleanup of memory stats prototypes

2017-07-31 Thread Martin Sebor


On 07/31/2017 03:34 PM, Trevor Saunders wrote:

On Mon, Jul 31, 2017 at 02:56:40PM -0600, Martin Sebor wrote:

On 07/27/2017 02:30 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

The preC++ way of passing information about the call site of a function was to
use a macro that passed __file__, __LINE__, and __FUNCTION__ to a function with
the same name with _stat appended to it.  The way this is now done with C++ is
to have arguments where the default value is __LINE__, __FILE__, and
__FUNCTION__ in the caller.  This has the significant advantage that if you
look for "^function (" you find the correct function, where in the C way of
doing things you need to realize its a macro and check the definition of the
macro to see what to look for next.  So this removes a layer of indirection,
and makes things somewhat more consistant in using the C++ way of doing things.


So that's what these things are for! :)

I must be missing something either about the macros or about your
changes.  The way I read the changes it seems that it's no longer
possible to call, say,

  t = make_node (code);

and have __FILE__, __LINE__, and __FUNCTION__ passed as arguments
behind the scenes.

Instead, to achieve that, make_node has to be called like so

  t = make_node (code PASS_MEM_STAT);

Otherwise calling make_node() with just one argument will end up
calling it with the defaults.


calling it with the defaults will do the same thing the macro used to
do, so its fine unless you want to pass in a location that you got as an
argument.


Sorry, I'm still not sure I understand.  Please bear with me
if I'm just being slow.

When GATHER_STATISTICS is defined, make_node_stat is declared
like so in current GCC (without your patch):

  extern tree
  make_node_stat (enum tree_code , const char * _loc_name,
  int _loc_line,  const char * _loc_function);

and the make_node macro like so:

  #define make_node(t) make_node_stat (t MEM_STAT_INFO)

with MEM_STAT_INFO being defined as follows:

  #define MEM_STAT_INFO , ALONE_MEM_STAT_INFO
  #define ALONE_MEM_STAT_INFO __FILE__, __LINE__, __FUNCTION__

So a call to

  t = make_node (code);

expands to

  t = make_node_stat (code, __FILE__, __LINE__, __FUNCTION__);

If I'm reading your patch correctly (please correct me if/where
I'm not) then it treats the same call as just an ordinary call
to the make_node function with no macro expansion taking place
at the call  site.  I.e., it will be made with the _loc_name,
_loc_line, and _loc_function arguments set to whatever their
defaults are where the function is declared.  To have the call
do the same thing as before it will have to be made with explicit
arguments, e.g., like so:

  t = make_node (code MEM_STAT_INFO);

That seems like a (significant) difference.


I'm guessing you are missing that the functions when used as the default
argument provide the location of the call site of the function, not the
location of the prototype.  Which is not the most obvious thing.


I am indeed missing that.  How do/can they do that when that's
not how C++ 98 default arguments work?  (It's only possible with
GCC's __builtin_{FILE,LINE,FUNCTION} but not without these
extensions.)

Martin

Re: [PATCH 4/N] Recover GOTO predictor.

2017-07-31 Thread David Edelsohn

This patch breaks bootstrap on AIX and probably is a hidden bug on all targets.

With the patch, libbacktrace/xcoff.c fails to compile with the error:

/nasfarm/edelsohn/src/src/libbacktrace/xcoff.c: In function 'xcoff_add':
/nasfarm/edelsohn/src/src/libbacktrace/xcoff.c:822:13: error: 'incl'
may be used uninitialized in this function
[-Werror=maybe-uninitialized]
filename = incl->filename;
~^~~~
/nasfarm/edelsohn/src/src/libbacktrace/xcoff.c:777:22: note: 'incl'
was declared here
   struct xcoff_incl *incl;
  ^~~~

incl is used immediately above the line that produces the warning --
with no warning.

I can provide a pre-processed xcoff.c, if you need it.

Thanks, David

1 2 >

1 - 100 of 122 matches

Mail list logo