[testsuite] XFAIL gcc.dg/tree-ssa/ssa-dom-cse-2.c on SPARC 64-bit

2016-01-27 Thread Eric Botcazou
For the same reason as on Alpha and PowerPC 64-bit.

Tested on x86-64/Linux and SPARC64/Solaris, applied on the mainline.


2016-01-27  Eric Botcazou  

* gcc.dg/tree-ssa/ssa-dom-cse-2.c: XFAIL on SPARC 64-bit.

-- 
Eric BotcazouIndex: gcc.dg/tree-ssa/ssa-dom-cse-2.c
===
--- gcc.dg/tree-ssa/ssa-dom-cse-2.c	(revision 232773)
+++ gcc.dg/tree-ssa/ssa-dom-cse-2.c	(working copy)
@@ -21,4 +21,4 @@ foo ()
but the loop reads only one element at a time, and DOM cannot resolve these.
The same happens on powerpc depending on the SIMD support available.  */
 
-/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail alpha*-*-* powerpc64*-*-* } } } */
+/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail { { alpha*-*-* powerpc64*-*-* } || { sparc*-*-* && lp64 } } } } } */


Re: [PATCH] Fix PR69336

2016-01-27 Thread Eric Botcazou
> The new test case fails on s390x:

Likewise on SPARC 32-bit and 64-bit.

-- 
Eric Botcazou


Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-01-27 Thread Marc Glisse

On Tue, 26 Jan 2016, H.J. Lu wrote:


On Tue, Jan 26, 2016 at 1:40 PM, Jakub Jelinek  wrote:

On Tue, Jan 26, 2016 at 01:21:52PM -0800, H.J. Lu wrote:

Like this:

/* Returns true if TYPE is POD for the purpose of layout and an empty
   class or an class with empty classes.  */

static bool
is_empty_record (tree type)
{
  if (type == error_mark_node)
return false;

  if (!CLASS_TYPE_P (type))
return false;

  if (CLASSTYPE_NON_LAYOUT_POD_P (type))
return false;

  gcc_assert (COMPLETE_TYPE_P (type));

  if (CLASSTYPE_EMPTY_P (type))
return true;

  tree field;

  for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
if (TREE_CODE (field) == FIELD_DECL
&& !DECL_ARTIFICIAL (field)
&& !is_empty_record (TREE_TYPE (field)))
  return false;

  return true;
}


So you say that K1 in e.g.:

struct A1 {}; struct A2 {};
struct B1 { A1 a; A2 b; }; struct B2 { A1 a; A2 b; };
struct C1 { B1 a; B2 b; }; struct C2 { B1 a; B2 b; };
struct D1 { C1 a; C2 b; }; struct D2 { C1 a; C2 b; };
struct E1 { D1 a; D2 b; }; struct E2 { D1 a; D2 b; };
struct F1 { E1 a; E2 b; }; struct F2 { E1 a; E2 b; };
struct G1 { F1 a; F2 b; }; struct G2 { F1 a; F2 b; };
struct H1 { G1 a; G2 b; }; struct H2 { G1 a; G2 b; };
struct I1 { H1 a; H2 b; }; struct I2 { H1 a; H2 b; };
struct J1 { I1 a; I2 b; }; struct J2 { I1 a; I2 b; };
struct K1 { J1 a; J2 b; };
int v;
__attribute__((noinline, noclone))
K1 foo (int a, K1 x, int b)
{
  v = a + b;
  return x;
}
K1 k, m;
void
bar (void)
{
  m = foo (1, k, 2);
}

is empty class?  What does clang do with this?

Jakub


Revised:

/* Returns true if TYPE is POD of one-byte or less in size for the purpose
  of layout and an empty class or an class with empty classes.  */

static bool
is_empty_record (tree type)
{
 if (type == error_mark_node)
   return false;

 if (!CLASS_TYPE_P (type))
   return false;

 if (CLASSTYPE_NON_LAYOUT_POD_P (type))
   return false;

 gcc_assert (COMPLETE_TYPE_P (type));

 if (CLASSTYPE_EMPTY_P (type))
   return true;

 if (int_size_in_bytes (type) > 1)
   return false;


That's completely arbitrary :-(

If we are defining a new ABI, I preferred your previous version (at least 
the idea, I didn't check the code). If we are trying to follow clang, then 
we need either a clear English definition from a clang developer or at 
least someone should check what conditions they use in the code. Trying to 
infer it from a couple examples sounds too fragile for an ABI decision.



 tree field;

 for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
   if (TREE_CODE (field) == FIELD_DECL
   && !DECL_ARTIFICIAL (field)
   && !is_empty_record (TREE_TYPE (field)))
 return false;

 return true;
}


--
Marc Glisse


Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-01-27 Thread Jakub Jelinek
On Wed, Jan 27, 2016 at 09:10:14AM +0100, Marc Glisse wrote:
> >Revised:
> >
> >/* Returns true if TYPE is POD of one-byte or less in size for the purpose
> >  of layout and an empty class or an class with empty classes.  */
> >
> >static bool
> >is_empty_record (tree type)
> >{
> > if (type == error_mark_node)
> >   return false;
> >
> > if (!CLASS_TYPE_P (type))
> >   return false;
> >
> > if (CLASSTYPE_NON_LAYOUT_POD_P (type))
> >   return false;
> >
> > gcc_assert (COMPLETE_TYPE_P (type));
> >
> > if (CLASSTYPE_EMPTY_P (type))
> >   return true;
> >
> > if (int_size_in_bytes (type) > 1)
> >   return false;
> 
> That's completely arbitrary :-(

Yeah.  Because (adapted to be compilable with C):
struct A1 {}; struct A2 {};
struct B1 { struct A1 a; struct A2 b; }; struct B2 { struct A1 a; struct A2 b; 
};
struct C1 { struct B1 a; struct B2 b; }; struct C2 { struct B1 a; struct B2 b; 
};
struct D1 { struct C1 a; struct C2 b; }; struct D2 { struct C1 a; struct C2 b; 
};
struct E1 { struct D1 a; struct D2 b; }; struct E2 { struct D1 a; struct D2 b; 
};
struct F1 { struct E1 a; struct E2 b; }; struct F2 { struct E1 a; struct E2 b; 
};
struct G1 { struct F1 a; struct F2 b; }; struct G2 { struct F1 a; struct F2 b; 
};
struct H1 { struct G1 a; struct G2 b; }; struct H2 { struct G1 a; struct G2 b; 
};
struct I1 { struct H1 a; struct H2 b; }; struct I2 { struct H1 a; struct H2 b; 
};
struct J1 { struct I1 a; struct I2 b; }; struct J2 { struct I1 a; struct I2 b; 
};
struct K1 { struct J1 a; struct J2 b; };
int v;
__attribute__((noinline, noclone))
struct K1 foo (int a, struct K1 x, int b)
{
  v = a + b;
  return x;
}
struct K1 k, m;
void
bar (void)
{
  m = foo (1, k, 2);
}
then would have a different calling convention between C and C++,
so where is the argument that we change anything just to make the two
compatible?  Though, of course, those two will never be compatible,
it is enough to add struct L1 { int a; struct K1 b; int c; }; and
that structure has 1024+8 bytes in C++ and 8 bytes in C.
As clang generates different code for the above between C and C++, it
clearly special cases for some reason just the most common case.
IMHO it is not worth to change GCC ABI...

Jakub


Re: [PR 69355] Correct hole detection when total_scalarization fails

2016-01-27 Thread Richard Biener
On Tue, 26 Jan 2016, Martin Jambor wrote:

> Hi,
> 
> PR 69355 has revealed that when SRA attempts total scalarization of an
> aggregate but this fails because the user type-casts a scalar field
> and stores into a it a smaller aggregate (and the scalar field is not
> written to, whether directly or as a part of an aggregate store), the
> pass can loose track of unscalarized data there.
> 
> I think that this can happen only when violating strict aliasing rules
> but with -fno-strict-aliasing it should work.
> 
> Fixed thusly with the patch below (the condition is there to avoid
> detecting padding after aggregate-fields in totally-scalarized
> aggregates as unscalarized data).  Bootstrapped and tested on
> x86_64-linux.  OK for trunk?  And the gcc-5 branch?

Ok.

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> 2016-01-26  Martin Jambor  
> 
>   PR tree-optimization/69355
>   * tree-sra.c (analyze_access_subtree): Correct hole detection when
>   total_scalarization fails.
> 
> testsuite/
>   * gcc.dg/tree-ssa/pr69355.c: New test.
> 
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr69355.c | 44 
> +
>  gcc/tree-sra.c  |  2 +-
>  2 files changed, 45 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr69355.c
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69355.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr69355.c
> new file mode 100644
> index 000..f515c21
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69355.c
> @@ -0,0 +1,44 @@
> +/* { dg-do run } */
> +/* { dg-options "-O -fno-strict-aliasing" } */
> +
> +struct S
> +{
> +  void *a;
> +  long double b;
> +};
> +
> +struct Z
> +{
> +  long long l;
> +  short s;
> +} __attribute__((packed));
> +
> +struct S __attribute__((noclone, noinline))
> +foo (void *v, struct Z *z)
> +{
> +  struct S t;
> +  t.a = v;
> +  *(struct Z *) &t.b = *z;
> +  return t;
> +}
> +
> +struct Z gz;
> +
> +int
> +main (int argc, char **argv)
> +{
> +  struct S s;
> +
> +  if (sizeof (long double) < sizeof (struct Z))
> +return 0;
> +
> +  gz.l = 0xbeef;
> +  gz.s = 0xab;
> +
> +  s = foo ((void *) 0, &gz);
> +
> +  if struct Z *) &s.b)->l != gz.l)
> +  || (((struct Z *) &s.b)->s != gz.s))
> +__builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 740542f..b0e737a 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -2421,7 +2421,7 @@ analyze_access_subtree (struct access *root, struct 
> access *parent,
>  
>if (covered_to < limit)
>   hole = true;
> -  if (scalar)
> +  if (scalar || !allow_replacements)
>   root->grp_total_scalarization = 0;
>  }
>  
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[PATCH, testsuite]: Disable some tests in gcc.dg/torture/pr68264.c for older glibcs

2016-01-27 Thread Uros Bizjak
2016-01-27  Uros Bizjak  

* gcc.dg/torture/pr68264.c: Disable log1p test for glibc < 2.22
and expm1 test for glibc < 2.11.

Tested on x86_64-linux-gnu, CentOS 5.11 and Fedora 23.

OK for mainline?

Uros.
diff --git a/gcc/testsuite/gcc.dg/torture/pr68264.c 
b/gcc/testsuite/gcc.dg/torture/pr68264.c
index 8396b34..9294d5a 100644
--- a/gcc/testsuite/gcc.dg/torture/pr68264.c
+++ b/gcc/testsuite/gcc.dg/torture/pr68264.c
@@ -74,8 +74,10 @@ test (void)
 #endif
 TEST (log2 (d), LARGE_NEG_EDOM);
   TEST (log10 (d), LARGE_NEG_EDOM);
-  /* Disabled due to glibc PR 6792, fixed in Apr 2015.  */
+#if defined(__GLIBC__) && (__GLIBC__ < 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ 
< 22))
+  /* Disabled due to glibc PR 6792, fixed in glibc 2.22.  */
   if (0)
+#endif
 TEST (log1p (d), LARGE_NEG_EDOM);
   TEST (exp (d), POWER_ERANGE);
 #if (defined (__sun__) || defined(__hppa__)) && defined (__unix__)
@@ -85,7 +87,11 @@ test (void)
 #endif
 {
   TEST (exp2 (d), POWER_ERANGE);
-  TEST (expm1 (d), POWER_ERANGE);
+#if defined(__GLIBC__) && (__GLIBC__ < 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ 
< 11))
+  /* Disabled due to glibc PR 6788, fixed in glibc 2.11.  */
+  if (0)
+#endif
+   TEST (expm1 (d), POWER_ERANGE);
 }
   TEST (sqrt (d), LARGE_NEG_EDOM);
   TEST (pow (100.0, d), POWER_ERANGE);


Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-01-27 Thread Marc Glisse

On Wed, 27 Jan 2016, Jakub Jelinek wrote:


On Wed, Jan 27, 2016 at 09:10:14AM +0100, Marc Glisse wrote:

Revised:

/* Returns true if TYPE is POD of one-byte or less in size for the purpose
 of layout and an empty class or an class with empty classes.  */

static bool
is_empty_record (tree type)
{
if (type == error_mark_node)
  return false;

if (!CLASS_TYPE_P (type))
  return false;

if (CLASSTYPE_NON_LAYOUT_POD_P (type))
  return false;

gcc_assert (COMPLETE_TYPE_P (type));

if (CLASSTYPE_EMPTY_P (type))
  return true;

if (int_size_in_bytes (type) > 1)
  return false;


That's completely arbitrary :-(


Yeah.  Because (adapted to be compilable with C):
struct A1 {}; struct A2 {};
struct B1 { struct A1 a; struct A2 b; }; struct B2 { struct A1 a; struct A2 b; 
};
struct C1 { struct B1 a; struct B2 b; }; struct C2 { struct B1 a; struct B2 b; 
};
struct D1 { struct C1 a; struct C2 b; }; struct D2 { struct C1 a; struct C2 b; 
};
struct E1 { struct D1 a; struct D2 b; }; struct E2 { struct D1 a; struct D2 b; 
};
struct F1 { struct E1 a; struct E2 b; }; struct F2 { struct E1 a; struct E2 b; 
};
struct G1 { struct F1 a; struct F2 b; }; struct G2 { struct F1 a; struct F2 b; 
};
struct H1 { struct G1 a; struct G2 b; }; struct H2 { struct G1 a; struct G2 b; 
};
struct I1 { struct H1 a; struct H2 b; }; struct I2 { struct H1 a; struct H2 b; 
};
struct J1 { struct I1 a; struct I2 b; }; struct J2 { struct I1 a; struct I2 b; 
};
struct K1 { struct J1 a; struct J2 b; };
int v;
__attribute__((noinline, noclone))
struct K1 foo (int a, struct K1 x, int b)
{
 v = a + b;
 return x;
}
struct K1 k, m;
void
bar (void)
{
 m = foo (1, k, 2);
}
then would have a different calling convention between C and C++,
so where is the argument that we change anything just to make the two
compatible?  Though, of course, those two will never be compatible,
it is enough to add struct L1 { int a; struct K1 b; int c; }; and
that structure has 1024+8 bytes in C++ and 8 bytes in C.


I don't know how empty classes are used in C in practice, but it could 
make sense to have ABI compatibility as long as no empty struct is used as 
a member of another struct (I also suggested an attribute to let 
C++ use the same layout as C here: PR63579). But then the usual definition 
of empty would be sufficient.



As clang generates different code for the above between C and C++, it
clearly special cases for some reason just the most common case.
IMHO it is not worth to change GCC ABI...


I was interested in this change because it improves C++, C compatibility 
was a convenient excuse ;-)


--
Marc Glisse


[PATCH] Remove xfail from thread_local-order2.C.

2016-01-27 Thread Dominik Vogt
g++.dg/tls/thread_local-order2.C no longer fail with Glibc-2.18 or
newer since this commit:

  2014-08-01  Zifei Tong  

* libsupc++/atexit_thread.cc (HAVE___CXA_THREAD_ATEXIT_IMPL): Add
_GLIBCXX_ prefix to macro.

  git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@213504 138bc75d-0d04-0410-96

https://gcc.gnu.org/ml/gcc-patches/2014-07/msg02091.html

So, is it time to remove the xfail from the test case?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/testsuite/ChangeLog

* g++.dg/tls/thread_local-order2.C: Remove xfail.
>From 0b0abbd2e6d9d8b6857622065bdcbdde31b5ddb0 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 27 Jan 2016 09:54:07 +0100
Subject: [PATCH] Remove xfail from thread_local-order2.C.

This should work with Glibc-2.18 or newer.
---
 gcc/testsuite/g++.dg/tls/thread_local-order2.C | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/tls/thread_local-order2.C b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
index f8df917..d3351e6 100644
--- a/gcc/testsuite/g++.dg/tls/thread_local-order2.C
+++ b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
@@ -2,7 +2,6 @@
 // that isn't reverse order of construction.  We need to move
 // __cxa_thread_atexit into glibc to get this right.
 
-// { dg-do run { xfail *-*-* } }
 // { dg-require-effective-target c++11 }
 // { dg-add-options tls }
 // { dg-require-effective-target tls_runtime }
-- 
2.3.0



[PATCH] S/390: Require a hardware vector support for test to succeed.

2016-01-27 Thread Dominik Vogt
The test case works on S/390 too, but only with -march=z13 or
later (i.e. if Gcc can make use of hardware vector support).
Otherwise the optimization gets too complex.  The attached patch
forces Gcc to use -march=z13 instead of xfail'ing the test on
S/390.

What do you think about this approach?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/testsuite/ChangeLog

* gcc.dg/tree-ssa/ssa-dom-cse-2.c: Require a hardware vector support for
test to succeed.
>From d855b255132d6a3ef46571e84aa75c95d8c3c737 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 27 Jan 2016 08:55:03 +0100
Subject: [PATCH] S/390: Require a hardware vector support for test to
 succeed.

---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
index 2349400..f68b53b 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
@@ -1,5 +1,9 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized --param sra-max-scalarization-size-Ospeed=32" } */
+/* S390 needs hardware vector support for this to work (the optimization gets
+ * too complex without it.
+ * { dg-additional-options "-march=z13" { target { s390*-*-* } } } */
+
 
 int
 foo ()
-- 
2.3.0



Re: [C PATCH] Fix -Wunused-function (PR debug/66869)

2016-01-27 Thread Richard Biener
On Tue, Jan 26, 2016 at 5:18 PM, Jakub Jelinek  wrote:
> On Tue, Jan 26, 2016 at 04:21:08PM +0100, Richard Biener wrote:
>> > --- gcc/c/c-decl.c.jj   2016-01-21 00:41:47.0 +0100
>> > +++ gcc/c/c-decl.c  2016-01-25 16:36:31.973504082 +0100
>> > @@ -10741,11 +10741,19 @@ c_write_global_declarations_1 (tree glob
>> >if (TREE_CODE (decl) == FUNCTION_DECL
>> >   && DECL_INITIAL (decl) == 0
>> >   && DECL_EXTERNAL (decl)
>> > - && !TREE_PUBLIC (decl)
>> > - && C_DECL_USED (decl))
>> > + && !TREE_PUBLIC (decl))
>> > {
>> > - pedwarn (input_location, 0, "%q+F used but never defined", decl);
>> > - TREE_NO_WARNING (decl) = 1;
>> > + if (C_DECL_USED (decl))
>> > +   {
>> > + pedwarn (input_location, 0, "%q+F used but never defined", 
>> > decl);
>> > + TREE_NO_WARNING (decl) = 1;
>> > +   }
>> > + /* For -Wunused-function push the unused statics into cgraph,
>> > +so that check_global_declaration emits the warning.  */
>> > + else if (warn_unused_function
>> > +  && ! DECL_ARTIFICIAL (decl)
>> > +  && ! TREE_NO_WARNING (decl))
>> > +   cgraph_node::get_create (decl);
>>
>> Err, so why not warn here directly?
>
> You mean check if it has a cgraph node (i.e. get instead of get_create) and
> if it doesn't, warn?  What I'm worried in that case is that it might have a
> cgraph node created later on for whatever reason and that we'll get double
> warning (from here and from cgraphunit.c (check_global_declaration)).
> I can try it though.

No, simply warn and set TREE_NO_WARNING so cgraph doesn't warn again.

Richard.

> Jakub


Re: [PATCH] Fix up wi::lrshift (PR c++/69399)

2016-01-27 Thread Richard Biener
On Wed, Jan 27, 2016 at 12:41 AM, Jakub Jelinek  wrote:
> On Tue, Jan 26, 2016 at 01:55:41PM -0800, Mike Stump wrote:
>> On Jan 26, 2016, at 1:26 PM, Jakub Jelinek  wrote:
>> > will do cc1plus size comparison afterwards.
>>
>> We know the dynamic check is larger.  You can’t tell the advantage of
>> speed from size.  Better would be to time compiling any random large
>> translation unit.
>>
>> Nice to see that only 14 calls remain, that’s way better than the 34 I
>> thought.
>
> So, it seems probably the PR65656 changes made things actually significantly
> worse, while I see a (small) difference in the generated code between the two
> patches if I compile say tree-ssa-ccp.c with g++ 5.x, in the bootstrapped
> compiler there is no difference at all, the compilers with either patch
> have identical objdump -dr cc1plus.  Already at *.gimple time all the
> relevant __builtin_constant_p are resolved and it seems all to 0.
>
> So please agree on one of the two patches (don't care which), and I'll try
> to distill a testcase to look at for PR65656.

I don't care if the wi::lshift by LOG2_BITS_PER_UNIT done by
get_ref_base_and_extent
are compiled to as good quality as before (I suspect it doesn't matter
in this case
as the shift amount is constant, but maybe due to PR65656 the
non-STATIC_CONSTANT_P
variant is better).

Richard.

> Jakub


[PATCH] PR other/69006: S/390: Fix extra newlines after diagnostics.

2016-01-27 Thread Dominik Vogt
The attached patch removes a blank line after an error message.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69006

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

PR other/69006
* config/s390/s390-c.c (s390_resolve_overloaded_builtin): Remove
trailing blank line from error message.
>From ae4938b344821b3f6d244ec7fc5f9703599d3d55 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 27 Jan 2016 11:50:49 +0100
Subject: [PATCH] PR other/69006: S/390: Fix extra newlines after
 diagnostics.

---
 gcc/config/s390/s390-c.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
index 14d030d..2b6e405 100644
--- a/gcc/config/s390/s390-c.c
+++ b/gcc/config/s390/s390-c.c
@@ -909,7 +909,7 @@ s390_resolve_overloaded_builtin (location_t loc,
 }
   else if (num_matches > 1)
 {
-  error_at (loc, "ambiguous overload for intrinsic: %s\n",
+  error_at (loc, "ambiguous overload for intrinsic: %s",
 	 IDENTIFIER_POINTER (DECL_NAME (ob_fndecl)));
   return error_mark_node;
 }
-- 
2.3.0



Re: Patch RFA: Add option -fcollectible-pointers, use it in ivopts

2016-01-27 Thread Bernd Schmidt

+This option is disabled by default for most languages, enabled by
+default for languages that use garbage collection.


This is not true as of this patch.


Index: tree-ssa-loop-ivopts.c
===
--- tree-ssa-loop-ivopts.c  (revision 232580)
+++ tree-ssa-loop-ivopts.c  (working copy)
@@ -2815,6 +2815,16 @@
struct iv_cand *cand = NULL;
tree type, orig_type;

+  /* -fkeep-gc-roots-live means that we have to keep a real pointer
+ live, but the ivopts code may replace a real pointer with one
+ pointing before or after the memory block that is then adjusted
+ into the memory block during the loop.  FIXME: It would likely be
+ better to actually force the pointer live and still use ivopts;
+ for example, it would be enough to write the pointer into memory
+ and keep it there until after the loop.  */
+  if (flag_keep_gc_roots_live && POINTER_TYPE_P (TREE_TYPE (base)))
+return NULL;


Is there a reason to exclude a candidate if the base is just a pointer 
to an object? Or does it just not make a difference?


I feel somewhat uneasy that this probably isn't a full solution either. 
But it seems like a reasonable starting point. I'm still not sure we 
want to advertise the option to users, as it might overpromise at the 
moment. Do we have precedent for internal-only options?


Or maybe we do need some other form to represent the need to keep a 
pointer alive in the IL. I suspect that for gcc-6 your patch is probably 
reasonable, but we should mark the option as internal and likely to go 
away in the future.



Bernd


Re: C++ PATCH for c++/69496 (ICE on VLA in constexpr function)

2016-01-27 Thread Marek Polacek
On Tue, Jan 26, 2016 at 08:58:06PM -0700, Martin Sebor wrote:
> On 01/26/2016 04:02 PM, Marek Polacek wrote:
> >The (invalid) testcase was causing an ICE because we were passing the result
> >of array_type_nelts_top immediately into tree_int_cst_lt, but for VLAs, the
> >result doesn't have to be a constant.  Fixed by evaluating the bound of the
> >array so that we're able to give a proper out-of-bounds diagnostics.  And the
> >VERIFY_CONSTANT should ensure that if the array bound couldn't be reduced to
> >a constant, we bail out rather than evoke an ICE.
> 
> Wow, you are quick! :)
> 
> I wonder if it might be better to instead reject VLAs in constexpr
> functions altogether.  Not because they're not in C++, but because
> C (or gcc) doesn't allow them to be initialized (and so accepting
> an initialized VLA is a g++ extension of an extension), and
> because in constexpr functions they are rejected without
> initialization just like other uninitialized variables.

I don't think we can do this at this time.  E.g. the following program works
even with GCC 5 and -std=c++14:

constexpr int
foo (int n)
{
int a[n] = { 1, 2, 3 };
int z = 0;
for (unsigned i = 0; i < 3; ++i)
  z += a[i];
return z;
}

int
main ()
{
  constexpr int n = foo (3);
  __builtin_printf ("%d\n", n);
}

So starting to reject such a code might broke working programs.  And we're
able to reject non-standard code: -pedantic-errors.
 
> FWIW, it seems to me the fewer extensions we support the less risk
> of something going wrong because of unforeseen interactions with
> other features.

True... (hi statement expressions!).

Marek


Re: [PATCH, PR69110] Don't return NULL access_fns in dr_analyze_indices

2016-01-27 Thread Tom de Vries

On 26/01/16 17:59, Sebastian Pop wrote:

Tom de Vries wrote:

diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index a40f40d..4c29fc2 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -1510,8 +1510,9 @@ initialize_data_dependence_relation (struct 
data_reference *a,
if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
  {
   if (loop_nest.exists ()
-&& !object_address_invariant_in_loop_p (loop_nest[0],
-   DR_BASE_OBJECT (a)))
+&& (!object_address_invariant_in_loop_p (loop_nest[0],
+ DR_BASE_OBJECT (a))
+|| DR_NUM_DIMENSIONS (a) == 0))


Also please fix the indentation of all this if stmt.



Done.


{
  DDR_ARE_DEPENDENT (res) = chrec_dont_know;
  return res;
@@ -1548,8 +1549,9 @@ initialize_data_dependence_relation (struct 
data_reference *a,
   analyze it.  TODO -- in fact, it would suffice to record that there may
   be arbitrary dependences in the loops where the base object varies.  */
if (loop_nest.exists ()
-  && !object_address_invariant_in_loop_p (loop_nest[0],
- DR_BASE_OBJECT (a)))
+  && (!object_address_invariant_in_loop_p (loop_nest[0],
+  DR_BASE_OBJECT (a))
+ || DR_NUM_DIMENSIONS (a) == 0))
  {
DDR_ARE_DEPENDENT (res) = chrec_dont_know;
return res;


Let's check for DR_NUM_DIMENSIONS (a) == 0 independently of loop_nest.exists ().


Done.


We check for the loop_nest because we need to access the outer loop loop_nest[0]
to analyze the base object of a.

Otherwise the change looks good to me.



Bootstrapped and reg-tested on x86_64.

Committed as attached to trunk, 5.0 and 4.9 (And fixed up pass number in 
testcases in 5.0 and 4.9).


Thanks,
- Tom
Handle DR_NUM_DIMENSIONS == 0 in initialize_data_dependence_relation

2016-01-12  Tom de Vries  

	PR tree-optimization/69110
	* tree-data-ref.c (initialize_data_dependence_relation): Handle
	DR_NUM_DIMENSIONS == 0.

	* gcc.dg/autopar/pr69110.c: New test.

	* testsuite/libgomp.c/pr69110.c: New test.

---
 gcc/testsuite/gcc.dg/autopar/pr69110.c | 17 +
 gcc/tree-data-ref.c| 21 +++--
 libgomp/testsuite/libgomp.c/pr69110.c  | 26 ++
 3 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/autopar/pr69110.c b/gcc/testsuite/gcc.dg/autopar/pr69110.c
new file mode 100644
index 000..27cdae5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/autopar/pr69110.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -ftree-parallelize-loops=2 -fno-tree-loop-im -fdump-tree-parloops2-details" } */
+
+#define N 1000
+
+unsigned int i = 0;
+
+void
+foo (void)
+{
+  unsigned int z;
+  for (z = 0; z < N; ++z)
+++i;
+}
+
+/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 0 "parloops2" } } */
+/* { dg-final { scan-tree-dump-times "FAILED: data dependencies exist across iterations" 1 "parloops2" } } */
diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index a40f40d..d6d9ffc 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -1509,13 +1509,14 @@ initialize_data_dependence_relation (struct data_reference *a,
   /* The case where the references are exactly the same.  */
   if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
 {
- if (loop_nest.exists ()
-&& !object_address_invariant_in_loop_p (loop_nest[0],
-   	DR_BASE_OBJECT (a)))
-  {
-DDR_ARE_DEPENDENT (res) = chrec_dont_know;
-return res;
-  }
+  if ((loop_nest.exists ()
+	   && !object_address_invariant_in_loop_p (loop_nest[0],
+		   DR_BASE_OBJECT (a)))
+	  || DR_NUM_DIMENSIONS (a) == 0)
+	{
+	  DDR_ARE_DEPENDENT (res) = chrec_dont_know;
+	  return res;
+	}
   DDR_AFFINE_P (res) = true;
   DDR_ARE_DEPENDENT (res) = NULL_TREE;
   DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
@@ -1547,9 +1548,9 @@ initialize_data_dependence_relation (struct data_reference *a,
   /* If the base of the object is not invariant in the loop nest, we cannot
  analyze it.  TODO -- in fact, it would suffice to record that there may
  be arbitrary dependences in the loops where the base object varies.  */
-  if (loop_nest.exists ()
-  && !object_address_invariant_in_loop_p (loop_nest[0],
- 	  DR_BASE_OBJECT (a)))
+  if ((loop_nest.exists ()
+   && !object_address_invariant_in_loop_p (loop_nest[0], DR_BASE_OBJECT (a)))
+  || DR_NUM_DIMENSIONS (a) == 0)
 {
   DDR_ARE_DEPENDENT (res) = chrec_dont_know;
   return res;
diff --git a/libgomp/testsuite/libgomp.c/pr69110.c b/libgomp/testsuite/libgomp.c/pr69110.c
new file mode 100644
index 000..0d9e5ca
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/pr69110.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-ftree-para

Re: [PATCH] Fix up wi::lrshift (PR c++/69399)

2016-01-27 Thread Jakub Jelinek
On Wed, Jan 27, 2016 at 11:38:33AM +0100, Richard Biener wrote:
> On Wed, Jan 27, 2016 at 12:41 AM, Jakub Jelinek  wrote:
> > On Tue, Jan 26, 2016 at 01:55:41PM -0800, Mike Stump wrote:
> >> On Jan 26, 2016, at 1:26 PM, Jakub Jelinek  wrote:
> >> > will do cc1plus size comparison afterwards.
> >>
> >> We know the dynamic check is larger.  You can’t tell the advantage of
> >> speed from size.  Better would be to time compiling any random large
> >> translation unit.
> >>
> >> Nice to see that only 14 calls remain, that’s way better than the 34 I
> >> thought.
> >
> > So, it seems probably the PR65656 changes made things actually significantly
> > worse, while I see a (small) difference in the generated code between the 
> > two
> > patches if I compile say tree-ssa-ccp.c with g++ 5.x, in the bootstrapped
> > compiler there is no difference at all, the compilers with either patch
> > have identical objdump -dr cc1plus.  Already at *.gimple time all the
> > relevant __builtin_constant_p are resolved and it seems all to 0.
> >
> > So please agree on one of the two patches (don't care which), and I'll try
> > to distill a testcase to look at for PR65656.
> 
> I don't care if the wi::lshift by LOG2_BITS_PER_UNIT done by
> get_ref_base_and_extent
> are compiled to as good quality as before (I suspect it doesn't matter
> in this case
> as the shift amount is constant, but maybe due to PR65656 the
> non-STATIC_CONSTANT_P
> variant is better).

Ok, I'll commit the non-STATIC_CONSTANT_P variant for now, we can revisit it
later.  I have distilled a testcase for Jason in PR65656.  It seems the
problematic STATIC_CONSTANT_P is only if it is inside of the condition of
?: expression.  So we could work around it by using something like:
-  if (STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT)
-  ? (STATIC_CONSTANT_P (shift < HOST_BITS_PER_WIDE_INT - 1)
- && xi.len == 1
- && xi.val[0] <= (HOST_WIDE_INT) ((unsigned HOST_WIDE_INT)
-  HOST_WIDE_INT_MAX >> shift))
-  : precision <= HOST_BITS_PER_WIDE_INT)
+  bool fast_path = false;
+  if (STATIC_CONSTANT_P (xi.precision > HOST_BITS_PER_WIDE_INT)
+{
+  if (STATIC_CONSTANT_P (shift < HOST_BITS_PER_WIDE_INT - 1)
+  && xi.len == 1
+  && xi.val[0] <= (HOST_WIDE_INT) ((unsigned HOST_WIDE_INT)
+   HOST_WIDE_INT_MAX >> shift))
+fast_path = true;
+}
+  else if (precision <= HOST_BITS_PER_WIDE_INT)
+fast_path = true;
+  if (fast_path)
{
  val[0] = xi.ulow () << shift;
  result.set_len (1);
}
  else
result.set_len (lshift_large (val, xi.val, xi.len,
  precision, shift));
and similarly in lrshift.

Jakub


Re: [hsa merge 07/10] IPA-HSA pass

2016-01-27 Thread Martin Liška
On 01/26/2016 12:41 AM, Jan Hubicka wrote:
>> On Mon, Jan 25, 2016 at 04:21:50PM +0100, Martin Liška wrote:
>>> On 01/16/2016 11:00 AM, Jan Hubicka wrote:
 Can't it be represented via explicit REF_ADDR or something like that?

 Honza
>>>
>>> Hi.
>>>
>>> Sure, I've just done a patch that can do that. However, as we're currently 
>>> in stage4,
>>> that change would probably require explicit permission of a release manager?
>>
>> If Honza is fine with it and you've tested it, this is ok for trunk.
> 
> It looks fine to me.
> 
> Honza
> 

I've just bootregtested the patch on x86_64-linux-pc and I'm going to install
it to trunk.

Martin


[PATCH] Fix PR sanitizer/PR69276

2016-01-27 Thread Martin Liška
Following patch was kind of pre-approved by Jakub in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69276#c4

Patch can bootstrap in x86_64-linux-gnu and survives regression tests.
I also verified that newly added test-case works with the patch.

Ready for trunk?
Thanks,
Martin


>From 4e4575cfef5d06d8e8477716ce2f4d7e28ae66f0 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 14 Jan 2016 18:15:04 +0100
Subject: [PATCH] Fix PR sanitizer/PR69276

gcc/testsuite/ChangeLog:

2016-01-14  Martin Liska  

	* g++.dg/asan/pr69276.C: New test.

gcc/ChangeLog:

2016-01-14  Martin Liska  

	PR sanitizer/PR69276
	* asan.c (has_stmt_been_instrumented_p): Instrument gimple calls
	that are gimple_store_p.
	(maybe_instrument_call): Likewise.
---
 gcc/asan.c  | 21 
 gcc/testsuite/g++.dg/asan/pr69276.C | 38 +
 2 files changed, 59 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/asan/pr69276.C

diff --git a/gcc/asan.c b/gcc/asan.c
index 2f9f92f..1747e90 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -897,6 +897,16 @@ has_stmt_been_instrumented_p (gimple *stmt)
 	  return true;
 	}
 }
+  else if (is_gimple_call (stmt) && gimple_store_p (stmt))
+{
+  asan_mem_ref r;
+  asan_mem_ref_init (&r, NULL, 1);
+
+  r.start = gimple_call_lhs (stmt);
+  r.access_size = int_size_in_bytes (TREE_TYPE (r.start));
+  return has_mem_ref_been_instrumented (&r);
+}
+
   return false;
 }
 
@@ -2038,6 +2048,17 @@ maybe_instrument_call (gimple_stmt_iterator *iter)
   gimple_set_location (g, gimple_location (stmt));
   gsi_insert_before (iter, g, GSI_SAME_STMT);
 }
+  else if (gimple_store_p (stmt))
+{
+  tree ref_expr = gimple_call_lhs (stmt);
+  instrument_derefs (iter, ref_expr,
+			 gimple_location (stmt),
+			 /*is_store=*/true);
+
+  gsi_next (iter);
+  return true;
+}
+
   return false;
 }
 
diff --git a/gcc/testsuite/g++.dg/asan/pr69276.C b/gcc/testsuite/g++.dg/asan/pr69276.C
new file mode 100644
index 000..ff43650
--- /dev/null
+++ b/gcc/testsuite/g++.dg/asan/pr69276.C
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-shouldfail "asan" } */
+/* { dg-additional-options "-O0 -fno-lto" } */
+
+#include 
+
+typedef __SIZE_TYPE__ size_t;
+inline void * operator new (size_t, void *p) { return p; }
+
+
+struct vec
+{
+  int size;
+};
+
+struct vnull
+{
+  operator vec() { return vec(); }
+};
+vnull vNULL;
+
+struct A
+{
+  A(): value2 (vNULL), value3 (vNULL) {}
+  int value;
+  vec value2;
+  vec value3;
+};
+
+int main()
+{
+  int *array = (int *)malloc (sizeof (int) * 1);
+  A *a = new (array) A ();
+  free (array);
+}
+
+/* { dg-output "ERROR: AddressSanitizer: heap-buffer-overflow.*(\n|\r\n|\r)" } */
+/* { dg-output "#0 0x\[0-9a-f\]+ +in A::A()" } */
-- 
2.7.0



[PATCH] Fix PR pch/68758

2016-01-27 Thread Martin Liška
Hello.

As mentioned in: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68758#c5, I 
consider
more logic to encapsulate valgrind annotation magic within a 
ENABLE_VALGRIND_ANNOTATIONS
macro rather than ENABLE_VALGRIND_CHECKING.

With patch applied, ggc invalid reads/writes, reported by valgrind tool, are 
gone.
The patch can survive regression tests and bootstrap on x86_64-linux-gnu?

Ready for trunk?
Thanks,
Martin
>From bfa1e418c8039f8f81047fc1c60c185a95a4deca Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 6 Jan 2016 13:29:33 +0100
Subject: [PATCH] Fix PR pch/68758.

gcc/ChangeLog:

2016-01-06  Martin Liska  

	PR pch/68758
	* ggc-common.c (gt_pch_save): Use ENABLE_VALGRIND_ANNOTATIONS macro
	instead of ENABLE_VALGRIND_CHECKING.
---
 gcc/ggc-common.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/ggc-common.c b/gcc/ggc-common.c
index c919ba8..9b291aa 100644
--- a/gcc/ggc-common.c
+++ b/gcc/ggc-common.c
@@ -492,7 +492,7 @@ gt_pch_save (FILE *f)
 
   ggc_pch_prepare_write (state.d, state.f);
 
-#if defined ENABLE_VALGRIND_CHECKING && defined VALGRIND_GET_VBITS
+#if defined ENABLE_VALGRIND_ANNOTATIONS && defined VALGRIND_GET_VBITS
   vec vbits = vNULL;
 #endif
 
@@ -504,7 +504,7 @@ gt_pch_save (FILE *f)
 	  this_object_size = state.ptrs[i]->size;
 	  this_object = XRESIZEVAR (char, this_object, this_object_size);
 	}
-#if defined ENABLE_VALGRIND_CHECKING && defined VALGRIND_GET_VBITS
+#if defined ENABLE_VALGRIND_ANNOTATIONS && defined VALGRIND_GET_VBITS
   /* obj might contain uninitialized bytes, e.g. in the trailing
 	 padding of the object.  Avoid warnings by making the memory
 	 temporarily defined and then restoring previous state.  */
@@ -561,7 +561,7 @@ gt_pch_save (FILE *f)
 			state.ptrs[i]->note_ptr_fn == gt_pch_p_S);
   if (state.ptrs[i]->note_ptr_fn != gt_pch_p_S)
 	memcpy (state.ptrs[i]->obj, this_object, state.ptrs[i]->size);
-#if defined ENABLE_VALGRIND_CHECKING && defined VALGRIND_GET_VBITS
+#if defined ENABLE_VALGRIND_ANNOTATIONS && defined VALGRIND_GET_VBITS
   if (__builtin_expect (get_vbits == 1, 0))
 	{
 	  (void) VALGRIND_SET_VBITS (state.ptrs[i]->obj, vbits.address (),
@@ -575,7 +575,7 @@ gt_pch_save (FILE *f)
 	}
 #endif
 }
-#if defined ENABLE_VALGRIND_CHECKING && defined VALGRIND_GET_VBITS
+#if defined ENABLE_VALGRIND_ANNOTATIONS && defined VALGRIND_GET_VBITS
   vbits.release ();
 #endif
 
-- 
2.7.0



Re: [PATCH] Fix PR sanitizer/PR69276

2016-01-27 Thread Jakub Jelinek
On Wed, Jan 27, 2016 at 01:47:10PM +0100, Martin Liška wrote:
> Following patch was kind of pre-approved by Jakub in:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69276#c4
> 
> Patch can bootstrap in x86_64-linux-gnu and survives regression tests.
> I also verified that newly added test-case works with the patch.
> 
> Ready for trunk?

Ok, with nits.

> >From 4e4575cfef5d06d8e8477716ce2f4d7e28ae66f0 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Thu, 14 Jan 2016 18:15:04 +0100
> Subject: [PATCH] Fix PR sanitizer/PR69276
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-01-14  Martin Liska  
> 
>   * g++.dg/asan/pr69276.C: New test.
> 
> gcc/ChangeLog:
> 
> 2016-01-14  Martin Liska  
> 
>   PR sanitizer/PR69276
>   * asan.c (has_stmt_been_instrumented_p): Instrument gimple calls
>   that are gimple_store_p.
>   (maybe_instrument_call): Likewise.
> ---
>  gcc/asan.c  | 21 
>  gcc/testsuite/g++.dg/asan/pr69276.C | 38 
> +
>  2 files changed, 59 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/asan/pr69276.C
> 
> diff --git a/gcc/asan.c b/gcc/asan.c
> index 2f9f92f..1747e90 100644
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -2038,6 +2048,17 @@ maybe_instrument_call (gimple_stmt_iterator *iter)
>gimple_set_location (g, gimple_location (stmt));
>gsi_insert_before (iter, g, GSI_SAME_STMT);
>  }
> +  else if (gimple_store_p (stmt))

I'd remove the "else " here, I believe if a noreturn call returns aggregate
the lhs is not removed and the store can still (partially) happen, if it is
returned by invisible reference.  Do you instrument it before the call or
after btw?  Before the call might be premature, the call might not return
and not store anything into the result, after the call might be too late
(store happened already).

Jakub


Re: [PATCH] Fix PR pch/68758

2016-01-27 Thread Richard Biener
On Wed, Jan 27, 2016 at 1:52 PM, Martin Liška  wrote:
> Hello.
>
> As mentioned in: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68758#c5, I 
> consider
> more logic to encapsulate valgrind annotation magic within a 
> ENABLE_VALGRIND_ANNOTATIONS
> macro rather than ENABLE_VALGRIND_CHECKING.
>
> With patch applied, ggc invalid reads/writes, reported by valgrind tool, are 
> gone.
> The patch can survive regression tests and bootstrap on x86_64-linux-gnu?
>
> Ready for trunk?

Ok.

THanks,
RIchard.

> Thanks,
> Martin


Re: [PATCH, 4.9, rs6000, testsuite] Fix PR69479

2016-01-27 Thread David Edelsohn
On Tue, Jan 26, 2016 at 4:46 PM, Bill Schmidt
 wrote:
> Hi,
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69479 notes that
> gcc.dg/and-1.c fails a scan-assembler-not test for nand, but the test
> does pass in subsequent releases.  The test author indicates in comment
> #1 that we can just remove this test for powerpc*-*-*, which this patch
> does.  Verified for 4.9 on powerpc64le-unknown-linux-gnu.  Ok to commit
> to that branch?
>
> Thanks,
> Bill
>
>
> 2016-01-26  Bill Schmidt  
>
> * gcc.dg/and-1.c: Remove nand test for powerpc*-*-*.

Sigh. This testcase should have been placed in gcc.target.

This patch is okay.

Thanks, David


Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-01-27 Thread H.J. Lu
On Wed, Jan 27, 2016 at 1:03 AM, Marc Glisse  wrote:
> On Wed, 27 Jan 2016, Jakub Jelinek wrote:
>
>> On Wed, Jan 27, 2016 at 09:10:14AM +0100, Marc Glisse wrote:

 Revised:

 /* Returns true if TYPE is POD of one-byte or less in size for the
 purpose
  of layout and an empty class or an class with empty classes.  */

 static bool
 is_empty_record (tree type)
 {
 if (type == error_mark_node)
   return false;

 if (!CLASS_TYPE_P (type))
   return false;

 if (CLASSTYPE_NON_LAYOUT_POD_P (type))
   return false;

 gcc_assert (COMPLETE_TYPE_P (type));

 if (CLASSTYPE_EMPTY_P (type))
   return true;

 if (int_size_in_bytes (type) > 1)
   return false;
>>>
>>>
>>> That's completely arbitrary :-(
>>
>>
>> Yeah.  Because (adapted to be compilable with C):
>> struct A1 {}; struct A2 {};
>> struct B1 { struct A1 a; struct A2 b; }; struct B2 { struct A1 a; struct
>> A2 b; };
>> struct C1 { struct B1 a; struct B2 b; }; struct C2 { struct B1 a; struct
>> B2 b; };
>> struct D1 { struct C1 a; struct C2 b; }; struct D2 { struct C1 a; struct
>> C2 b; };
>> struct E1 { struct D1 a; struct D2 b; }; struct E2 { struct D1 a; struct
>> D2 b; };
>> struct F1 { struct E1 a; struct E2 b; }; struct F2 { struct E1 a; struct
>> E2 b; };
>> struct G1 { struct F1 a; struct F2 b; }; struct G2 { struct F1 a; struct
>> F2 b; };
>> struct H1 { struct G1 a; struct G2 b; }; struct H2 { struct G1 a; struct
>> G2 b; };
>> struct I1 { struct H1 a; struct H2 b; }; struct I2 { struct H1 a; struct
>> H2 b; };
>> struct J1 { struct I1 a; struct I2 b; }; struct J2 { struct I1 a; struct
>> I2 b; };
>> struct K1 { struct J1 a; struct J2 b; };
>> int v;
>> __attribute__((noinline, noclone))
>> struct K1 foo (int a, struct K1 x, int b)
>> {
>>  v = a + b;
>>  return x;
>> }
>> struct K1 k, m;
>> void
>> bar (void)
>> {
>>  m = foo (1, k, 2);
>> }
>> then would have a different calling convention between C and C++,
>> so where is the argument that we change anything just to make the two
>> compatible?  Though, of course, those two will never be compatible,
>> it is enough to add struct L1 { int a; struct K1 b; int c; }; and
>> that structure has 1024+8 bytes in C++ and 8 bytes in C.
>
>
> I don't know how empty classes are used in C in practice, but it could make
> sense to have ABI compatibility as long as no empty struct is used as a
> member of another struct (I also suggested an attribute to let C++ use the
> same layout as C here: PR63579). But then the usual definition of empty
> would be sufficient.
>
>> As clang generates different code for the above between C and C++, it
>> clearly special cases for some reason just the most common case.
>> IMHO it is not worth to change GCC ABI...
>
>
> I was interested in this change because it improves C++, C compatibility was
> a convenient excuse ;-)
>

I opened  a clang bug:

https://llvm.org/bugs/show_bug.cgi?id=26337

I propose the following definitions:

i. An empty record is:
   1. A class without member.  Or
   2. An array of empty records.  Or
   3. A class with empty records.
ii. An empty record type for parameter passing is POD for the purpose of layout
and
   1. A class without member.  Or
   2. A class with empty records.

/* An empty record is:
   1. A class without member.  Or
   2. An array of empty records.  Or
   3. A class with empty records.  */

/* Returns true if TYPE is an empty record or an array of empty records.  */

static bool
is_empty_record_or_array_of_empty_record (tree type)
{
  if (CLASS_TYPE_P (type))
{
  if (CLASSTYPE_EMPTY_P (type))
return true;

  tree field;

  for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
if (TREE_CODE (field) == FIELD_DECL
&& !DECL_ARTIFICIAL (field)
&& !is_empty_record_or_array_of_empty_record (TREE_TYPE (field)))
  return false;
  return true;
}
  else if (TREE_CODE (type) == ARRAY_TYPE)
return is_empty_record_or_array_of_empty_record (TREE_TYPE (type));
  return false;
}

/* Returns true if TYPE is POD for the purpose of layout and
   1. A class without member.  Or
   2. A class with empty records.  */

static bool
is_empty_record_for_parm (tree type)
{
  if (type == error_mark_node)
return false;

  if (!CLASS_TYPE_P (type))
return false;

  if (CLASSTYPE_NON_LAYOUT_POD_P (type))
return false;

  gcc_assert (COMPLETE_TYPE_P (type));

  if (CLASSTYPE_EMPTY_P (type))
return true;

  tree field;

  for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
if (TREE_CODE (field) == FIELD_DECL
&& !DECL_ARTIFICIAL (field)
&& !is_empty_record_or_array_of_empty_record (TREE_TYPE (field)))
  return false;

  return true;
}


-- 
H.J.


Re: [PATCH] Fix PR sanitizer/PR69276

2016-01-27 Thread Martin Liška
On 01/27/2016 01:58 PM, Jakub Jelinek wrote:
> On Wed, Jan 27, 2016 at 01:47:10PM +0100, Martin Liška wrote:
>> Following patch was kind of pre-approved by Jakub in:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69276#c4
>>
>> Patch can bootstrap in x86_64-linux-gnu and survives regression tests.
>> I also verified that newly added test-case works with the patch.
>>
>> Ready for trunk?
> 
> Ok, with nits.
> 
>> >From 4e4575cfef5d06d8e8477716ce2f4d7e28ae66f0 Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Thu, 14 Jan 2016 18:15:04 +0100
>> Subject: [PATCH] Fix PR sanitizer/PR69276
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-01-14  Martin Liska  
>>
>>  * g++.dg/asan/pr69276.C: New test.
>>
>> gcc/ChangeLog:
>>
>> 2016-01-14  Martin Liska  
>>
>>  PR sanitizer/PR69276
>>  * asan.c (has_stmt_been_instrumented_p): Instrument gimple calls
>>  that are gimple_store_p.
>>  (maybe_instrument_call): Likewise.
>> ---
>>  gcc/asan.c  | 21 
>>  gcc/testsuite/g++.dg/asan/pr69276.C | 38 
>> +
>>  2 files changed, 59 insertions(+)
>>  create mode 100644 gcc/testsuite/g++.dg/asan/pr69276.C
>>
>> diff --git a/gcc/asan.c b/gcc/asan.c
>> index 2f9f92f..1747e90 100644
>> --- a/gcc/asan.c
>> +++ b/gcc/asan.c
>> @@ -2038,6 +2048,17 @@ maybe_instrument_call (gimple_stmt_iterator *iter)
>>gimple_set_location (g, gimple_location (stmt));
>>gsi_insert_before (iter, g, GSI_SAME_STMT);
>>  }
>> +  else if (gimple_store_p (stmt))
> 
> I'd remove the "else " here, I believe if a noreturn call returns aggregate
> the lhs is not removed and the store can still (partially) happen, if it is
> returned by invisible reference.  

Hi Jakub.

Ah, you are right, good nit!

Do you instrument it before the call or
> after btw?  Before the call might be premature, the call might not return
> and not store anything into the result, after the call might be too late
> (store happened already).

The instrumentation is applied before the call. Hope we do not introduce
a new false positives.

Can I apply the v2?

Thanks,
Martin

> 
>   Jakub
> 

>From d78539fdd0a0d3c3275eb0cdbdd50d7b6ddf9c4c Mon Sep 17 00:00:00 2001
From: marxin 
Date: Thu, 14 Jan 2016 18:15:04 +0100
Subject: [PATCH] Fix PR sanitizer/PR69276

gcc/testsuite/ChangeLog:

2016-01-14  Martin Liska  

	* g++.dg/asan/pr69276.C: New test.

gcc/ChangeLog:

2016-01-14  Martin Liska  

	PR sanitizer/PR69276
	* asan.c (has_stmt_been_instrumented_p): Instrument gimple calls
	that are gimple_store_p.
	(maybe_instrument_call): Likewise.
---
 gcc/asan.c  | 22 +
 gcc/testsuite/g++.dg/asan/pr69276.C | 38 +
 2 files changed, 60 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/asan/pr69276.C

diff --git a/gcc/asan.c b/gcc/asan.c
index 2f9f92f..aeb840d 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -897,6 +897,16 @@ has_stmt_been_instrumented_p (gimple *stmt)
 	  return true;
 	}
 }
+  else if (is_gimple_call (stmt) && gimple_store_p (stmt))
+{
+  asan_mem_ref r;
+  asan_mem_ref_init (&r, NULL, 1);
+
+  r.start = gimple_call_lhs (stmt);
+  r.access_size = int_size_in_bytes (TREE_TYPE (r.start));
+  return has_mem_ref_been_instrumented (&r);
+}
+
   return false;
 }
 
@@ -2038,6 +2048,18 @@ maybe_instrument_call (gimple_stmt_iterator *iter)
   gimple_set_location (g, gimple_location (stmt));
   gsi_insert_before (iter, g, GSI_SAME_STMT);
 }
+
+  if (gimple_store_p (stmt))
+{
+  tree ref_expr = gimple_call_lhs (stmt);
+  instrument_derefs (iter, ref_expr,
+			 gimple_location (stmt),
+			 /*is_store=*/true);
+
+  gsi_next (iter);
+  return true;
+}
+
   return false;
 }
 
diff --git a/gcc/testsuite/g++.dg/asan/pr69276.C b/gcc/testsuite/g++.dg/asan/pr69276.C
new file mode 100644
index 000..ff43650
--- /dev/null
+++ b/gcc/testsuite/g++.dg/asan/pr69276.C
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-shouldfail "asan" } */
+/* { dg-additional-options "-O0 -fno-lto" } */
+
+#include 
+
+typedef __SIZE_TYPE__ size_t;
+inline void * operator new (size_t, void *p) { return p; }
+
+
+struct vec
+{
+  int size;
+};
+
+struct vnull
+{
+  operator vec() { return vec(); }
+};
+vnull vNULL;
+
+struct A
+{
+  A(): value2 (vNULL), value3 (vNULL) {}
+  int value;
+  vec value2;
+  vec value3;
+};
+
+int main()
+{
+  int *array = (int *)malloc (sizeof (int) * 1);
+  A *a = new (array) A ();
+  free (array);
+}
+
+/* { dg-output "ERROR: AddressSanitizer: heap-buffer-overflow.*(\n|\r\n|\r)" } */
+/* { dg-output "#0 0x\[0-9a-f\]+ +in A::A()" } */
-- 
2.7.0



Re: Patch RFA: Add option -fcollectible-pointers, use it in ivopts

2016-01-27 Thread Ian Lance Taylor
On Wed, Jan 27, 2016 at 3:18 AM, Bernd Schmidt  wrote:
>> +This option is disabled by default for most languages, enabled by
>> +default for languages that use garbage collection.
>
>
> This is not true as of this patch.

Yes.  As I said elsewhere, my intent is to do that as a separate patch.


>> Index: tree-ssa-loop-ivopts.c
>> ===
>> --- tree-ssa-loop-ivopts.c  (revision 232580)
>> +++ tree-ssa-loop-ivopts.c  (working copy)
>> @@ -2815,6 +2815,16 @@
>> struct iv_cand *cand = NULL;
>> tree type, orig_type;
>>
>> +  /* -fkeep-gc-roots-live means that we have to keep a real pointer
>> + live, but the ivopts code may replace a real pointer with one
>> + pointing before or after the memory block that is then adjusted
>> + into the memory block during the loop.  FIXME: It would likely be
>> + better to actually force the pointer live and still use ivopts;
>> + for example, it would be enough to write the pointer into memory
>> + and keep it there until after the loop.  */
>> +  if (flag_keep_gc_roots_live && POINTER_TYPE_P (TREE_TYPE (base)))
>> +return NULL;
>
>
> Is there a reason to exclude a candidate if the base is just a pointer to an
> object? Or does it just not make a difference?

Off hand I don't see how it would make a difference.  If the pointer
is not incremented or indexed, ivopts isn't going to be able to do
anything with it anyhow.


> I feel somewhat uneasy that this probably isn't a full solution either. But
> it seems like a reasonable starting point. I'm still not sure we want to
> advertise the option to users, as it might overpromise at the moment. Do we
> have precedent for internal-only options?

I suppose an option marked as "Undocumented" could be considered to be
internal-only.

The option does in principle over-promise.  On the other hand, it's
probably pretty good in practice; Java and Go have gotten by without
this option for many years.  I only have a single test case that shows
a real problem, and it only shows up on PPC64.  That said, I'm fine
with making the option undocumented if you prefer.

> Or maybe we do need some other form to represent the need to keep a pointer
> alive in the IL. I suspect that for gcc-6 your patch is probably reasonable,
> but we should mark the option as internal and likely to go away in the
> future.

Why would the option  be likely to go away?  The option name is chosen
(based on Richi's suggestion) so that it can be applied to other
passes besides ivopts, and it's useful for languages that do not
normally do garbage collection.  If we improve the implementation as
you suggest, it would still be reasonable to keep the option.

Ian


Re: C++ PATCH for c++/69496 (ICE on VLA in constexpr function)

2016-01-27 Thread Jason Merrill

OK, but the testcase should go in ext/.

Jason


Re: C++ PATCH for c++/69496 (ICE on VLA in constexpr function)

2016-01-27 Thread Marek Polacek
On Wed, Jan 27, 2016 at 09:08:27AM -0500, Jason Merrill wrote:
> OK, but the testcase should go in ext/.

Oops, will move it there then, thanks!

Marek


Re: [hsa merge 00/10] Merge of HSA branch

2016-01-27 Thread Martin Jambor
Hi,

sorry for getting so late to this:

On Thu, Jan 21, 2016 at 05:10:17PM -0600, Gerald Pfeifer wrote:
> On Tue, 19 Jan 2016, Richard Biener wrote:
> > I think the merge warrants a NEWS entry on gcc.gnu.org/
> 
> ...and gcc-6/changes.html. :-)
> 
> Martin, happy to help.  Want to propose some text (or even patch)?
> 

So what would you think about the following?  Perhaps it is too
verbose but I wanted to mention the few areas users should know have
changed, if they happen to try HSA out.  I can certainly cut it down a
bit.

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.52
diff -u -r1.52 changes.html
--- changes.html25 Jan 2016 15:09:55 -  1.52
+++ changes.html27 Jan 2016 14:15:49 -
@@ -272,6 +272,30 @@

 
 
+Heterogeneous Systems Architecture
+   
+ GCC can now generate HSAIL for simple OpenMP device constructs
+   if configured with --enable-offload-targets=hsa.  A new
+   libgomp plugin then run these HSAIL kernels implementing these
+   constructs on HSA capable GPUs via standard HSA run-time.
+   
+   If the HSA compilation back-end determines it cannot output HSAIL
+   for a particular input, it gives a warning by default.  These
+   warnings can be suppressed with -Wno-hsa.  To give a
+   few examples, the HSA back-end does not implement compilation of
+   code using function pointers and variable-sized variables and
+   parameters, functions with variadic arguments as well as a number of
+   other less common programming constructs.
+
+   When compilation for HSA is enabled, the compiler attempts to
+   compile composite OpenMP constructs
+
+#pragma omp target teams distribute parallel for
+into parallel HSA GPU kernels.
+ 
+   
+
+
 IA-32/x86-64

  GCC now supports the Intel CPU named Skylake with AVX-512 extensions

The change to the news on the main page might then be:

Index: index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.992
diff -u -r1.992 index.html
--- index.html  24 Jan 2016 23:54:36 -  1.992
+++ index.html  27 Jan 2016 14:16:25 -
@@ -52,6 +52,13 @@
 
 
 
+ Heterogeneous Systems Architecture support in GCC
+ [2016-01-27]
+ http://www.hsafoundation.com/";> Heterogeneous Systems
+ Architecture 1.0 https://gcc.gnu.org/gcc-6/changes.html#hsa";>
+ support was added to GCC.  Contributed by Martin Jambor, Martin Liška
+ and Michael Matz from SUSE.
+
 GCC 5.3 released
 [2015-12-04]
 

Any comments welcome.

Thanks,

Martin


Re: [PATCH] PR other/69006: S/390: Fix extra newlines after diagnostics.

2016-01-27 Thread David Malcolm
On Wed, 2016-01-27 at 12:01 +0100, Dominik Vogt wrote:
> The attached patch removes a blank line after an error message.
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69006

Presumably this was exposed by the stricter testing I added to lib/gcc
-dg.exp in r232837?

>else if (num_matches > 1)
>  {
> -  error_at (loc, "ambiguous overload for intrinsic: %s\n",
> +  error_at (loc, "ambiguous overload for intrinsic: %s",
>IDENTIFIER_POINTER (DECL_NAME (ob_fndecl)));
>return error_mark_node;
>  }

I'm not a reviewer, but FWIW the removal of the trailing "\n" looks
sane to me.

Should this code be using %qs rather than %s? (or somesuch, or is that
a gcc 7 thing)

Dave



[PATCH] Fix PR69166

2016-01-27 Thread Richard Biener

The following fixes a missing check on commutativity / associativity in
reduction detection.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-01-27  Richard Biener  

PR tree-optimization/69166
* tree-vect-loop.c (vect_is_simple_reduction): Always check
reduction code for commutativity / associativity.

* gcc.dg/torture/pr69166.c: New testcase.

Index: gcc/tree-vect-loop.c
===
*** gcc/tree-vect-loop.c(revision 232867)
--- gcc/tree-vect-loop.c(working copy)
*** vect_is_simple_reduction (loop_vec_info
*** 2750,2766 
&& SSA_NAME_DEF_STMT (op1) == phi)
  code = PLUS_EXPR;
  
!   if (check_reduction)
  {
!   if (code == COND_EXPR)
*v_reduc_type = COND_REDUCTION;
!   else if (!commutative_tree_code (code) || !associative_tree_code (code))
!   {
! if (dump_enabled_p ())
!   report_vect_op (MSG_MISSED_OPTIMIZATION, def_stmt,
!   "reduction: not commutative/associative: ");
! return NULL;
!   }
  }
  
if (get_gimple_rhs_class (code) != GIMPLE_BINARY_RHS)
--- 2741,2757 
&& SSA_NAME_DEF_STMT (op1) == phi)
  code = PLUS_EXPR;
  
!   if (code == COND_EXPR)
  {
!   if (check_reduction)
*v_reduc_type = COND_REDUCTION;
! }
!   else if (!commutative_tree_code (code) || !associative_tree_code (code))
! {
!   if (dump_enabled_p ())
!   report_vect_op (MSG_MISSED_OPTIMIZATION, def_stmt,
!   "reduction: not commutative/associative: ");
!   return NULL;
  }
  
if (get_gimple_rhs_class (code) != GIMPLE_BINARY_RHS)
*** vect_is_simple_reduction (loop_vec_info
*** 2856,2866 
   and therefore vectorizing reductions in the inner-loop during
   outer-loop vectorization is safe.  */
  
!   if (*v_reduc_type != COND_REDUCTION)
  {
/* CHECKME: check for !flag_finite_math_only too?  */
!   if (SCALAR_FLOAT_TYPE_P (type) && !flag_associative_math
! && check_reduction)
{
  /* Changing the order of operations changes the semantics.  */
  if (dump_enabled_p ())
--- 2847,2857 
   and therefore vectorizing reductions in the inner-loop during
   outer-loop vectorization is safe.  */
  
!   if (*v_reduc_type != COND_REDUCTION
!   && check_reduction)
  {
/* CHECKME: check for !flag_finite_math_only too?  */
!   if (SCALAR_FLOAT_TYPE_P (type) && !flag_associative_math)
{
  /* Changing the order of operations changes the semantics.  */
  if (dump_enabled_p ())
*** vect_is_simple_reduction (loop_vec_info
*** 2868,2874 
"reduction: unsafe fp math optimization: ");
  return NULL;
}
!   else if (INTEGRAL_TYPE_P (type) && check_reduction)
{
  if (!operation_no_trapping_overflow (type, code))
{
--- 2859,2865 
"reduction: unsafe fp math optimization: ");
  return NULL;
}
!   else if (INTEGRAL_TYPE_P (type))
{
  if (!operation_no_trapping_overflow (type, code))
{
*** vect_is_simple_reduction (loop_vec_info
*** 2891,2897 
  return NULL;
}
}
!   else if (SAT_FIXED_POINT_TYPE_P (type) && check_reduction)
{
  /* Changing the order of operations changes the semantics.  */
  if (dump_enabled_p ())
--- 2882,2888 
  return NULL;
}
}
!   else if (SAT_FIXED_POINT_TYPE_P (type))
{
  /* Changing the order of operations changes the semantics.  */
  if (dump_enabled_p ())
Index: gcc/testsuite/gcc.dg/torture/pr69166.c
===
*** gcc/testsuite/gcc.dg/torture/pr69166.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr69166.c  (working copy)
***
*** 0 
--- 1,14 
+ /* { dg-do compile } */
+ 
+ void fn2(double *e, double a)
+ {
+   int b = 0;
+   for (; b < 256; b++)
+ {
+   int c = 0;
+   double x = e[b];
+   for (; c < 256; ++c)
+   x /= a;
+   e[b] = x;
+ }
+ }


Re: [Fortran, gcc-5, patch, pr69268, v1] [5 Regression] Sourced allocation calls function twice

2016-01-27 Thread Andre Vehreschild
Hi Paul,

thanks for the review. Commited as r232876.

Regards,
Andre

On Tue, 26 Jan 2016 18:36:28 +0100
Paul Richard Thomas  wrote:

> Dear Andre,
> 
> The patch looks fine to me. OK for 5-branch.
> 
> Thanks for the patch.
> 
> Paul
> 
> On 26 January 2016 at 13:28, Andre Vehreschild  wrote:
> > Hi all,
> >
> > please find attached a patch to solve the issue of evaluating a source=
> > expression of an allocate() twice in gcc-5. The patch is a combination
> > and partial back port of several prs of the mainline (namely, but not
> > the complete list: pr44672, pr65548).
> >
> > The patch needed the counts of builtin_mallocs/frees in
> > allocatable_scalar_13 to be adapted. There are now fewer calls to the
> > memory management routines. Valgrind does not report any memory issues
> > in the modified code, but that does not mean there aren't any. I am
> > happy to hear about any issue, this patch causes (still having issues
> > getting the sanitizer to work).
> >
> > Bootstrapped and regtested on x86_64-linux-gnu/F23.
> >
> > Ok, for gcc-5-branch?
> >
> > Regards,
> > Andre
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  
> 
> 
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 232870)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,11 @@
+2016-01-27  Andre Vehreschild  
+
+	PR fortran/p69268
+	* trans-stmt.c (gfc_trans_allocate): Make sure the source=
+	expression is evaluated once only. Use gfc_trans_assignment ()
+	instead of explicitly calling gfc_trans_string_copy () to
+	reduce the code complexity in trans_allocate.
+
 2016-01-25  Dominique d'Humieres 
 
 	PR fortran/68283
Index: gcc/fortran/trans-stmt.c
===
--- gcc/fortran/trans-stmt.c	(Revision 232870)
+++ gcc/fortran/trans-stmt.c	(Arbeitskopie)
@@ -5108,7 +5108,7 @@
 gfc_trans_allocate (gfc_code * code)
 {
   gfc_alloc *al;
-  gfc_expr *expr;
+  gfc_expr *expr, *e3rhs = NULL;
   gfc_se se, se_sz;
   tree tmp;
   tree parm;
@@ -5130,6 +5130,7 @@
   stmtblock_t post;
   tree nelems;
   bool upoly_expr, tmp_expr3_len_flag = false, al_len_needs_set;
+  gfc_symtree *newsym = NULL;
 
   if (!code->ext.alloc.list)
 return NULL_TREE;
@@ -5239,16 +5240,28 @@
 	 false, false);
 	  gfc_add_block_to_block (&block, &se.pre);
 	  gfc_add_block_to_block (&post, &se.post);
-	  /* Prevent aliasing, i.e., se.expr may be already a
-		 variable declaration.  */
+
 	  if (!VAR_P (se.expr))
 		{
+		  tree var;
+
 		  tmp = build_fold_indirect_ref_loc (input_location,
 		 se.expr);
-		  tmp = gfc_evaluate_now (tmp, &block);
+
+		  /* We need a regular (non-UID) symbol here, therefore give a
+		 prefix.  */
+		  var = gfc_create_var (TREE_TYPE (tmp), "source");
+		  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (tmp)))
+		{
+		  gfc_allocate_lang_decl (var);
+		  GFC_DECL_SAVED_DESCRIPTOR (var) = GFC_DECL_SAVED_DESCRIPTOR (tmp);
+		}
+		  gfc_add_modify_loc (input_location, &block, var, tmp);
+		  tmp = var;
 		}
 	  else
 		tmp = se.expr;
+
 	  if (!code->expr3->mold)
 		expr3 = tmp;
 	  else
@@ -5357,6 +5370,71 @@
 	  else
 	expr3_esize = TYPE_SIZE_UNIT (
 		  gfc_typenode_for_spec (&code->expr3->ts));
+
+	  /* The routine gfc_trans_assignment () already implements all
+	 techniques needed.  Unfortunately we may have a temporary
+	 variable for the source= expression here.  When that is the
+	 case convert this variable into a temporary gfc_expr of type
+	 EXPR_VARIABLE and used it as rhs for the assignment.  The
+	 advantage is, that we get scalarizer support for free,
+	 don't have to take care about scalar to array treatment and
+	 will benefit of every enhancements gfc_trans_assignment ()
+	 gets.
+	 Exclude variables since the following block does not handle
+	 array sections.  In any case, there is no harm in sending
+	 variables to gfc_trans_assignment because there is no
+	 evaluation of variables.  */
+	  if (code->expr3->expr_type != EXPR_VARIABLE
+	  && code->expr3->mold != 1 && expr3 != NULL_TREE
+	  && DECL_P (expr3) && DECL_ARTIFICIAL (expr3))
+	{
+	  /* Build a temporary symtree and symbol.  Do not add it to
+		 the current namespace to prevent accidently modifying
+		 a colliding symbol's as.  */
+	  newsym = XCNEW (gfc_symtree);
+	  /* The name of the symtree should be unique, because
+		 gfc_create_var () took care about generating the
+		 identifier.  */
+	  newsym->name = gfc_get_string (IDENTIFIER_POINTER (
+	   DECL_NAME (expr3)));
+	  newsym->n.sym = gfc_new_symbol (newsym->name, NULL);
+	  /* The backend_decl is known.  It is expr3, which is inserted
+		 here.  */
+	  newsym->n.sym->backend_decl = expr3;
+	  e3rhs = gfc_get_expr ();
+	  e3rhs->ts = code->e

Re: [PATCH][Testsuite] Fix scan-tree-dump failures with vect_multiple_sizes

2016-01-27 Thread Richard Biener
On Fri, Jan 22, 2016 at 3:20 PM, Alan Lawrence  wrote:
> Since r230292, these tests in gcc.dg/vect have been failing on ARM, AArch64, 
> and x86_64 with -march=haswell (among others - when prefer_avx128 is true):
>
> vect-outer-1-big-array.c scan-tree-dump-times vect "grouped access in outer 
> loop" 2
> vect-outer-1.c scan-tree-dump-times vect "grouped access in outer loop" 2
> vect-outer-1a-big-array.c scan-tree-dump-times vect "grouped access in outer 
> loop" 2
> vect-outer-1a.c scan-tree-dump-times vect "grouped access in outer loop" 2
> vect-outer-1b-big-array.c scan-tree-dump-times vect "grouped access in outer 
> loop" 2
> vect-outer-1b.c scan-tree-dump-times vect "grouped access in outer loop" 2
> vect-outer-2b.c scan-tree-dump-times vect "grouped access in outer loop" 2
> vect-outer-3b.c scan-tree-dump-times vect "grouped access in outer loop" 4
> vect-reduc-dot-s8b.c scan-tree-dump-times vect 
> "vect_recog_widen_mult_pattern: detected" 2
> (plus all the -flto -ffat-lto-objects equivalents).
>
> This is because that commit changed vect_analyze_loop{,_2} to bail out early
> and avoid retrying with a different vector size on finding a fatal error;
> all the above tests are conditioned on { target vect_multiple_sizes }.
>
> Hence, drop the vect_multiple_sizes version of the scan-tree-dump, as we now
> expect those messages to show up once everywhere.
>
> The optional extra would be to add a message that vect_analyze_loop was 
> failing
> with *fatal* error, and scan for that, but that doesn't really seem warranted.
>
> Tested vect.exp on aarch64-none-elf, arm-none-eabi, and x86_64 linux with 
> -march=haswell.
>
> OK for trunk?

Ok.

Thanks,
Richard.

> Cheers, Alan
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/vect/vect-outer-1-big-array.c: Drop vect_multiple_sizes;
> use same scan-tree-dump-times on all platforms.
> * gcc.dg/vect/vect-outer-1.c: Likewise.
> * gcc.dg/vect/vect-outer-1a-big-array.c: Likewise.
> * gcc.dg/vect/vect-outer-1a.c: Likewise.
> * gcc.dg/vect/vect-outer-1b-big-array.c: Likewise.
> * gcc.dg/vect/vect-outer-1b.c: Likewise.
> * gcc.dg/vect/vect-outer-2b.c: Likewise.
> * gcc.dg/vect/vect-outer-3b.c: Likewise.
> * gcc.dg/vect/vect-reduc-dot-s8b.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-outer-1-big-array.c  | 3 +--
>  gcc/testsuite/gcc.dg/vect/vect-outer-1.c| 3 +--
>  gcc/testsuite/gcc.dg/vect/vect-outer-1a-big-array.c | 3 +--
>  gcc/testsuite/gcc.dg/vect/vect-outer-1a.c   | 3 +--
>  gcc/testsuite/gcc.dg/vect/vect-outer-1b-big-array.c | 3 +--
>  gcc/testsuite/gcc.dg/vect/vect-outer-1b.c   | 3 +--
>  gcc/testsuite/gcc.dg/vect/vect-outer-2b.c   | 3 +--
>  gcc/testsuite/gcc.dg/vect/vect-outer-3b.c   | 3 +--
>  gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8b.c  | 3 +--
>  9 files changed, 9 insertions(+), 18 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-1-big-array.c 
> b/gcc/testsuite/gcc.dg/vect/vect-outer-1-big-array.c
> index 6c61b09..63918ad 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-outer-1-big-array.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-outer-1-big-array.c
> @@ -22,5 +22,4 @@ foo (){
>  }
>
>  /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail *-*-* } } } */
> -/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 1 "vect" 
> { target { ! vect_multiple_sizes } } } } */
> -/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 2 "vect" 
> { target vect_multiple_sizes } } } */
> +/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 1 "vect" 
> } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-1.c 
> b/gcc/testsuite/gcc.dg/vect/vect-outer-1.c
> index 5fdaa83..b1bcbc2 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-outer-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-outer-1.c
> @@ -22,5 +22,4 @@ foo (){
>  }
>
>  /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail *-*-* } } } */
> -/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 1 "vect" 
> { target { ! vect_multiple_sizes } } } } */
> -/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 2 "vect" 
> { target vect_multiple_sizes } } } */
> +/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 1 "vect" 
> } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-1a-big-array.c 
> b/gcc/testsuite/gcc.dg/vect/vect-outer-1a-big-array.c
> index 68b25f9..98dfcfb 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-outer-1a-big-array.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-outer-1a-big-array.c
> @@ -20,5 +20,4 @@ foo (){
>  }
>
>  /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { 
> xfail *-*-* } } } */
> -/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 1 "vect" 
> { target { ! vect_multiple_sizes } } } } */
> -/* { dg-final { scan-tree-dump-times "grouped access in outer loop" 2 "vect" 
> { t

[patch] libstdc++/69295 Set FP options for failing special functions tests

2016-01-27 Thread Jonathan Wakely

This should fix the FAILs seen on trunk for some targets.

Tested powerpc64-linux, comitted to trunk.
commit 6f952bce33752add3fd08efac03bc19782ef09b8
Author: Jonathan Wakely 
Date:   Wed Jan 27 13:55:45 2016 +

Set FP options for failing special functions tests

	PR libstdc++/69295
	* testsuite/ext/special_functions/hyperg/check_value.cc: Use
	-ffp-contract=off, and -ffloat-store to disable excess precision.
	* testsuite/special_functions/02_assoc_legendre/check_value.cc: Use
	-ffp-contract=off.

diff --git a/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_value.cc b/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_value.cc
index 218c07a..d87fcef 100644
--- a/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_value.cc
+++ b/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_value.cc
@@ -1,5 +1,6 @@
-// { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__" }
-//
+// { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__ -ffp-contract=off" }
+// { dg-additional-options "-ffloat-store" { target { m68*-*-* || ia32 } } }
+
 // Copyright (C) 2016 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
diff --git a/libstdc++-v3/testsuite/special_functions/02_assoc_legendre/check_value.cc b/libstdc++-v3/testsuite/special_functions/02_assoc_legendre/check_value.cc
index a514207..0c60ac6 100644
--- a/libstdc++-v3/testsuite/special_functions/02_assoc_legendre/check_value.cc
+++ b/libstdc++-v3/testsuite/special_functions/02_assoc_legendre/check_value.cc
@@ -1,5 +1,5 @@
-// { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__" }
-//
+// { dg-options "-D__STDCPP_WANT_MATH_SPEC_FUNCS__ -ffp-contract=off" }
+
 // Copyright (C) 2016 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free


Re: Patch RFA: Add option -fcollectible-pointers, use it in ivopts

2016-01-27 Thread Bernd Schmidt

On 01/27/2016 02:59 PM, Ian Lance Taylor wrote:

+This option is disabled by default for most languages, enabled by
+default for languages that use garbage collection.



This is not true as of this patch.


Yes.  As I said elsewhere, my intent is to do that as a separate patch.


Then the followup patch should adjust the documentation.


The option does in principle over-promise.  On the other hand, it's
probably pretty good in practice; Java and Go have gotten by without
this option for many years.  I only have a single test case that shows
a real problem, and it only shows up on PPC64.  That said, I'm fine
with making the option undocumented if you prefer.


Or maybe we do need some other form to represent the need to keep a pointer
alive in the IL. I suspect that for gcc-6 your patch is probably reasonable,
but we should mark the option as internal and likely to go away in the
future.


Why would the option  be likely to go away?  The option name is chosen
(based on Richi's suggestion) so that it can be applied to other
passes besides ivopts, and it's useful for languages that do not
normally do garbage collection.  If we improve the implementation as
you suggest, it would still be reasonable to keep the option.


I think it would be better described in the IL, so that in an LTO 
situation the restrictions would apply only to code which needs it. 
Although I suppose you could then enable it for C and make that generate 
the necessary representation.


Still, I feel uncomfortable about making a promise we don't really 
expect to fully keep yet, so I would prefer this option to be 
undocumented for now. I won't object if someone else wants to approve it 
as a normal option however.



Bernd



Re: Patch RFA: Add option -fcollectible-pointers, use it in ivopts

2016-01-27 Thread Ian Lance Taylor
On Wed, Jan 27, 2016 at 7:16 AM, Bernd Schmidt  wrote:
>
> Still, I feel uncomfortable about making a promise we don't really expect to
> fully keep yet, so I would prefer this option to be undocumented for now. I
> won't object if someone else wants to approve it as a normal option however.

Are you approving the patch if I change the option to be undocumented?

Ian


[COMMITTED] Add myself as GCC maintainer

2016-01-27 Thread Wilco Dijkstra
I've added myself to the "Write After Approval" maintainers (Committed revision 
232880):

Index: ChangeLog
===
--- ChangeLog   (revision 232874)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2015-01-27  Wilco Dijkstra  
+
+   * MAINTAINERS: (Write After Approval): Add myself.
+
 2016-01-25  Aditya Kumar  
Sebastian Pop  
 
Index: MAINTAINERS
===
--- MAINTAINERS (revision 232874)
+++ MAINTAINERS (working copy)
@@ -377,6 +377,7 @@
 Bud Davis  
 Chris Demetriou
 Sameera Deshpande  
+Wilco Dijkstra 
 Benoit Dupont de Dinechin  

 Michael Eager  
 Bernd Edlinger 




Re: C++ PATCH for c++/69379 (ICE with PTRMEM_CST wrapped in NOP_EXPR)

2016-01-27 Thread Jason Merrill

On 01/25/2016 03:00 PM, Marek Polacek wrote:

It appears that handling the case when the types don't match is sufficient, at
least all the tests pass, thus the following should be enough.


OK.


+   "reinterpret_cast has different types");


Let's say "a reinterpret_cast is not a constant-expression" here.

OK with that change.

Jason




Re: Patch RFA: Add option -fcollectible-pointers, use it in ivopts

2016-01-27 Thread Bernd Schmidt

On 01/27/2016 04:18 PM, Ian Lance Taylor wrote:

On Wed, Jan 27, 2016 at 7:16 AM, Bernd Schmidt  wrote:


Still, I feel uncomfortable about making a promise we don't really expect to
fully keep yet, so I would prefer this option to be undocumented for now. I
won't object if someone else wants to approve it as a normal option however.


Are you approving the patch if I change the option to be undocumented?


Yes.


Bernd


[PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread Ilya Enkovich
Hi,

Currently STV pass may require a stack realignment if any
transformation occurs to enable SSE registers spill/fill.
It appears it's invalid to increase stack alignment requirements
at this point.  Thus we have to either assume we need stack to be
aligned if are going to run STV pass or disable STV if stack is
not properly aligned.  I suppose we shouldn't ignore explicitly
requested stack alignment not beeing sure we really optimize
anything (and STV is not an optimization frequiently applied).
So I think we may disable TARGET_STV for such cases as Jakub
suggested.  This patch was bootstrapped and regtested on
x86_64-pc-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2016-01-27  Jakub Jelinek  
Ilya Enkovich  

PR target/69454
* config/i386/i386.c (convert_scalars_to_vector): Remove
stack alignment fixes.
(ix86_option_override_internal): Disable TARGET_STV if stack
is not properly aligned.

gcc/testsuite/

2016-01-27  Ilya Enkovich  

PR target/69454
* gcc.target/i386/pr69454-1.c: New test.
* gcc.target/i386/pr69454-2.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 34b57a4..9fb8db8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
   bitmap_obstack_release (NULL);
   df_process_deferred_rescans ();
 
-  /* Conversion means we may have 128bit register spills/fills
- which require aligned stack.  */
-  if (converted_insns)
-{
-  if (crtl->stack_alignment_needed < 128)
-   crtl->stack_alignment_needed = 128;
-  if (crtl->stack_alignment_estimated < 128)
-   crtl->stack_alignment_estimated = 128;
-}
-
   return 0;
 }
 
@@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
 opts->x_target_flags |= MASK_VZEROUPPER;
   if (!(opts_set->x_target_flags & MASK_STV))
 opts->x_target_flags |= MASK_STV;
+  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
+ stack realignment will be extra cost the pass doesn't take into
+ account and the pass can't realign the stack.  */
+  if (ix86_preferred_stack_boundary < 64)
+opts->x_target_flags &= ~MASK_STV;
   if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
   && !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
 opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
diff --git a/gcc/testsuite/gcc.target/i386/pr69454-1.c 
b/gcc/testsuite/gcc.target/i386/pr69454-1.c
new file mode 100644
index 000..12ecfd3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69454-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -msse2 -mno-accumulate-outgoing-args 
-mpreferred-stack-boundary=2" } */
+
+typedef struct { long long w64[2]; } V128;
+extern V128* fn2(void);
+long long a;
+V128 b;
+void fn1() {
+  V128 *c = fn2();
+  c->w64[0] = a ^ b.w64[0];
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr69454-2.c 
b/gcc/testsuite/gcc.target/i386/pr69454-2.c
new file mode 100644
index 000..28bab93
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69454-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -mpreferred-stack-boundary=2" } */
+
+extern void fn2 ();
+long long a, b;
+
+void fn1 ()
+{
+  long long c = a;
+  a = b ^ a;
+  fn2 ();
+  a = c;
+}


Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-01-27 Thread H.J. Lu
On Wed, Jan 27, 2016 at 5:46 AM, H.J. Lu  wrote:
> On Wed, Jan 27, 2016 at 1:03 AM, Marc Glisse  wrote:
>> On Wed, 27 Jan 2016, Jakub Jelinek wrote:
>>
>>> On Wed, Jan 27, 2016 at 09:10:14AM +0100, Marc Glisse wrote:
>
> Revised:
>
> /* Returns true if TYPE is POD of one-byte or less in size for the
> purpose
>  of layout and an empty class or an class with empty classes.  */
>
> static bool
> is_empty_record (tree type)
> {
> if (type == error_mark_node)
>   return false;
>
> if (!CLASS_TYPE_P (type))
>   return false;
>
> if (CLASSTYPE_NON_LAYOUT_POD_P (type))
>   return false;
>
> gcc_assert (COMPLETE_TYPE_P (type));
>
> if (CLASSTYPE_EMPTY_P (type))
>   return true;
>
> if (int_size_in_bytes (type) > 1)
>   return false;


 That's completely arbitrary :-(
>>>
>>>
>>> Yeah.  Because (adapted to be compilable with C):
>>> struct A1 {}; struct A2 {};
>>> struct B1 { struct A1 a; struct A2 b; }; struct B2 { struct A1 a; struct
>>> A2 b; };
>>> struct C1 { struct B1 a; struct B2 b; }; struct C2 { struct B1 a; struct
>>> B2 b; };
>>> struct D1 { struct C1 a; struct C2 b; }; struct D2 { struct C1 a; struct
>>> C2 b; };
>>> struct E1 { struct D1 a; struct D2 b; }; struct E2 { struct D1 a; struct
>>> D2 b; };
>>> struct F1 { struct E1 a; struct E2 b; }; struct F2 { struct E1 a; struct
>>> E2 b; };
>>> struct G1 { struct F1 a; struct F2 b; }; struct G2 { struct F1 a; struct
>>> F2 b; };
>>> struct H1 { struct G1 a; struct G2 b; }; struct H2 { struct G1 a; struct
>>> G2 b; };
>>> struct I1 { struct H1 a; struct H2 b; }; struct I2 { struct H1 a; struct
>>> H2 b; };
>>> struct J1 { struct I1 a; struct I2 b; }; struct J2 { struct I1 a; struct
>>> I2 b; };
>>> struct K1 { struct J1 a; struct J2 b; };
>>> int v;
>>> __attribute__((noinline, noclone))
>>> struct K1 foo (int a, struct K1 x, int b)
>>> {
>>>  v = a + b;
>>>  return x;
>>> }
>>> struct K1 k, m;
>>> void
>>> bar (void)
>>> {
>>>  m = foo (1, k, 2);
>>> }
>>> then would have a different calling convention between C and C++,
>>> so where is the argument that we change anything just to make the two
>>> compatible?  Though, of course, those two will never be compatible,
>>> it is enough to add struct L1 { int a; struct K1 b; int c; }; and
>>> that structure has 1024+8 bytes in C++ and 8 bytes in C.
>>
>>
>> I don't know how empty classes are used in C in practice, but it could make
>> sense to have ABI compatibility as long as no empty struct is used as a
>> member of another struct (I also suggested an attribute to let C++ use the
>> same layout as C here: PR63579). But then the usual definition of empty
>> would be sufficient.
>>
>>> As clang generates different code for the above between C and C++, it
>>> clearly special cases for some reason just the most common case.
>>> IMHO it is not worth to change GCC ABI...
>>
>>
>> I was interested in this change because it improves C++, C compatibility was
>> a convenient excuse ;-)
>>
>
> I opened  a clang bug:
>
> https://llvm.org/bugs/show_bug.cgi?id=26337
>
> I propose the following definitions:
>
> i. An empty record is:
>1. A class without member.  Or
>2. An array of empty records.  Or
>3. A class with empty records.
> ii. An empty record type for parameter passing is POD for the purpose of 
> layout
> and
>1. A class without member.  Or
>2. A class with empty records.
>
> /* An empty record is:
>1. A class without member.  Or
>2. An array of empty records.  Or
>3. A class with empty records.  */
>
> /* Returns true if TYPE is an empty record or an array of empty records.  */
>
> static bool
> is_empty_record_or_array_of_empty_record (tree type)
> {
>   if (CLASS_TYPE_P (type))
> {
>   if (CLASSTYPE_EMPTY_P (type))
> return true;
>
>   tree field;
>
>   for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
> if (TREE_CODE (field) == FIELD_DECL
> && !DECL_ARTIFICIAL (field)
> && !is_empty_record_or_array_of_empty_record (TREE_TYPE (field)))
>   return false;
>   return true;
> }
>   else if (TREE_CODE (type) == ARRAY_TYPE)
> return is_empty_record_or_array_of_empty_record (TREE_TYPE (type));
>   return false;
> }
>
> /* Returns true if TYPE is POD for the purpose of layout and
>1. A class without member.  Or
>2. A class with empty records.  */
>
> static bool
> is_empty_record_for_parm (tree type)
> {
>   if (type == error_mark_node)
> return false;
>
>   if (!CLASS_TYPE_P (type))
> return false;
>
>   if (CLASSTYPE_NON_LAYOUT_POD_P (type))
> return false;
>
>   gcc_assert (COMPLETE_TYPE_P (type));
>
>   if (CLASSTYPE_EMPTY_P (type))
> return true;
>
>   tree field;
>
>   for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
> if (TREE_CODE (field) == FIELD_DECL
> && !DECL_ARTIFICIAL (field)
> && !is_empty_record_or_arr

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread H.J. Lu
On Wed, Jan 27, 2016 at 7:34 AM, Ilya Enkovich  wrote:
> Hi,
>
> Currently STV pass may require a stack realignment if any
> transformation occurs to enable SSE registers spill/fill.
> It appears it's invalid to increase stack alignment requirements
> at this point.  Thus we have to either assume we need stack to be
> aligned if are going to run STV pass or disable STV if stack is
> not properly aligned.  I suppose we shouldn't ignore explicitly
> requested stack alignment not beeing sure we really optimize
> anything (and STV is not an optimization frequiently applied).
> So I think we may disable TARGET_STV for such cases as Jakub
> suggested.  This patch was bootstrapped and regtested on
> x86_64-pc-linux-gnu.  OK for trunk?
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2016-01-27  Jakub Jelinek  
> Ilya Enkovich  
>
> PR target/69454
> * config/i386/i386.c (convert_scalars_to_vector): Remove
> stack alignment fixes.
> (ix86_option_override_internal): Disable TARGET_STV if stack
> is not properly aligned.
>
> gcc/testsuite/
>
> 2016-01-27  Ilya Enkovich  
>
> PR target/69454
> * gcc.target/i386/pr69454-1.c: New test.
> * gcc.target/i386/pr69454-2.c: New test.
>
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 34b57a4..9fb8db8 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
>bitmap_obstack_release (NULL);
>df_process_deferred_rescans ();
>
> -  /* Conversion means we may have 128bit register spills/fills
> - which require aligned stack.  */
> -  if (converted_insns)
> -{
> -  if (crtl->stack_alignment_needed < 128)
> -   crtl->stack_alignment_needed = 128;
> -  if (crtl->stack_alignment_estimated < 128)
> -   crtl->stack_alignment_estimated = 128;
> -}
> -
>return 0;
>  }
>
> @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>  opts->x_target_flags |= MASK_VZEROUPPER;
>if (!(opts_set->x_target_flags & MASK_STV))
>  opts->x_target_flags |= MASK_STV;
> +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
> + stack realignment will be extra cost the pass doesn't take into
> + account and the pass can't realign the stack.  */
> +  if (ix86_preferred_stack_boundary < 64)
> +opts->x_target_flags &= ~MASK_STV;
>if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
>&& !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>  opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;

The right fix is

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a03a515..62af55a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
   bitmap_obstack_release (NULL);
   df_process_deferred_rescans ();

-  /* Conversion means we may have 128bit register spills/fills
- which require aligned stack.  */
-  if (converted_insns)
-{
-  if (crtl->stack_alignment_needed < 128)
- crtl->stack_alignment_needed = 128;
-  if (crtl->stack_alignment_estimated < 128)
- crtl->stack_alignment_estimated = 128;
-}
-
   return 0;
 }

@@ -29300,8 +29290,10 @@ ix86_minimum_alignment (tree exp, machine_mode mode,
 return align;

   /* Don't do dynamic stack realignment for long long objects with
- -mpreferred-stack-boundary=2.  */
-  if ((mode == DImode || (type && TYPE_MODE (type) == DImode))
+ -mpreferred-stack-boundary=2.  The STV pass uses SSE2 instructions
+ on DImode which needs 64-bit alignment for DImode.  */
+  if (!(TARGET_STV && TARGET_SSE2 && optimize > 1)
+  && (mode == DImode || (type && TYPE_MODE (type) == DImode))
   && (!type || !TYPE_USER_ALIGN (type))
   && (!decl || !DECL_USER_ALIGN (decl)))
 return 32;



-- 
H.J.


Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread Jakub Jelinek
On Wed, Jan 27, 2016 at 06:34:41PM +0300, Ilya Enkovich wrote:
> @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>  opts->x_target_flags |= MASK_VZEROUPPER;
>if (!(opts_set->x_target_flags & MASK_STV))
>  opts->x_target_flags |= MASK_STV;
> +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed

The comment doesn't match the code, you disable STV only for
-mpreferred-stack-boundary=2.

> + stack realignment will be extra cost the pass doesn't take into
> + account and the pass can't realign the stack.  */
> +  if (ix86_preferred_stack_boundary < 64)
> +opts->x_target_flags &= ~MASK_STV;
>if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
>&& !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>  opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;

Jakub


Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread H.J. Lu
On Wed, Jan 27, 2016 at 7:34 AM, Ilya Enkovich  wrote:
> Hi,
>
> Currently STV pass may require a stack realignment if any
> transformation occurs to enable SSE registers spill/fill.
> It appears it's invalid to increase stack alignment requirements
> at this point.  Thus we have to either assume we need stack to be
> aligned if are going to run STV pass or disable STV if stack is
> not properly aligned.  I suppose we shouldn't ignore explicitly
> requested stack alignment not beeing sure we really optimize
> anything (and STV is not an optimization frequiently applied).
> So I think we may disable TARGET_STV for such cases as Jakub
> suggested.  This patch was bootstrapped and regtested on
> x86_64-pc-linux-gnu.  OK for trunk?
>

> diff --git a/gcc/testsuite/gcc.target/i386/pr69454-1.c 
> b/gcc/testsuite/gcc.target/i386/pr69454-1.c
> new file mode 100644
> index 000..12ecfd3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr69454-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { ia32 } } } */
> +/* { dg-options "-O2 -msse2 -mno-accumulate-outgoing-args 
> -mpreferred-stack-boundary=2" } */
> +
> +typedef struct { long long w64[2]; } V128;
> +extern V128* fn2(void);
> +long long a;
> +V128 b;
> +void fn1() {
> +  V128 *c = fn2();
> +  c->w64[0] = a ^ b.w64[0];
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr69454-2.c 
> b/gcc/testsuite/gcc.target/i386/pr69454-2.c
> new file mode 100644
> index 000..28bab93
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr69454-2.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile { target { ia32 } } } */
> +/* { dg-options "-O2 -mpreferred-stack-boundary=2" } */

This needs:

+/* { dg-options "-O2 -msse2 -mno-accumulate-outgoing-args
-mpreferred-stack-boundary=2" } */

to trigger.

> +extern void fn2 ();
> +long long a, b;
> +
> +void fn1 ()
> +{
> +  long long c = a;
> +  a = b ^ a;
> +  fn2 ();
> +  a = c;
> +}

Here is a different patch, which I believe is the right fix.

-- 
H.J.
From 86e040399dd5ca6b23597be4aff5edb9ac2ab5d7 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 25 Jan 2016 12:31:45 -0800
Subject: [PATCH] Don't align DImode to 32 bits if the STV pass is enabled

Since the STV pass uses SSE2 instructions on DImode which needs 64-bit
alignment for DImode, don't align DImode to 32 bits if the STV pass is
enabled.

gcc/

	PR target/69454
	* config/i386/i386.c (convert_scalars_to_vector): Don't change
	stack alignment here.
	(ix86_minimum_alignment): Don't align DImode to 32 bits if the
	STV pass is enabled.

gcc/testsuite/

	PR target/69454
	* gcc.target/i386/pr69454-1.c: New test.
	* gcc.target/i386/pr69454-2.c: Likewise.
---
 gcc/config/i386/i386.c| 16 
 gcc/testsuite/gcc.target/i386/pr69454-1.c | 11 +++
 gcc/testsuite/gcc.target/i386/pr69454-2.c | 13 +
 3 files changed, 28 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr69454-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr69454-2.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index cfbdf0f..8babdaf 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
   bitmap_obstack_release (NULL);
   df_process_deferred_rescans ();
 
-  /* Conversion means we may have 128bit register spills/fills
- which require aligned stack.  */
-  if (converted_insns)
-{
-  if (crtl->stack_alignment_needed < 128)
-	crtl->stack_alignment_needed = 128;
-  if (crtl->stack_alignment_estimated < 128)
-	crtl->stack_alignment_estimated = 128;
-}
-
   return 0;
 }
 
@@ -29299,8 +29289,10 @@ ix86_minimum_alignment (tree exp, machine_mode mode,
 return align;
 
   /* Don't do dynamic stack realignment for long long objects with
- -mpreferred-stack-boundary=2.  */
-  if ((mode == DImode || (type && TYPE_MODE (type) == DImode))
+ -mpreferred-stack-boundary=2.  The STV pass uses SSE2 instructions
+ on DImode which needs 64-bit alignment for DImode.  */
+  if (!(TARGET_STV && TARGET_SSE2 && optimize > 1)
+  && (mode == DImode || (type && TYPE_MODE (type) == DImode))
   && (!type || !TYPE_USER_ALIGN (type))
   && (!decl || !DECL_USER_ALIGN (decl)))
 return 32;
diff --git a/gcc/testsuite/gcc.target/i386/pr69454-1.c b/gcc/testsuite/gcc.target/i386/pr69454-1.c
new file mode 100644
index 000..12ecfd3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69454-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -msse2 -mno-accumulate-outgoing-args -mpreferred-stack-boundary=2" } */
+
+typedef struct { long long w64[2]; } V128;
+extern V128* fn2(void);
+long long a;
+V128 b;
+void fn1() {
+  V128 *c = fn2();
+  c->w64[0] = a ^ b.w64[0];
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr69454-2.c b/gcc/testsuite/gcc.target/i386/pr69454-2.c
new file mode 100644
index 000..4820b76
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69454-2.c
@@ -0,0 +1,13 @@
+/* { dg

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread Ilya Enkovich
2016-01-27 18:43 GMT+03:00 H.J. Lu :
> On Wed, Jan 27, 2016 at 7:34 AM, Ilya Enkovich  wrote:
>> Hi,
>>
>> Currently STV pass may require a stack realignment if any
>> transformation occurs to enable SSE registers spill/fill.
>> It appears it's invalid to increase stack alignment requirements
>> at this point.  Thus we have to either assume we need stack to be
>> aligned if are going to run STV pass or disable STV if stack is
>> not properly aligned.  I suppose we shouldn't ignore explicitly
>> requested stack alignment not beeing sure we really optimize
>> anything (and STV is not an optimization frequiently applied).
>> So I think we may disable TARGET_STV for such cases as Jakub
>> suggested.  This patch was bootstrapped and regtested on
>> x86_64-pc-linux-gnu.  OK for trunk?
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2016-01-27  Jakub Jelinek  
>> Ilya Enkovich  
>>
>> PR target/69454
>> * config/i386/i386.c (convert_scalars_to_vector): Remove
>> stack alignment fixes.
>> (ix86_option_override_internal): Disable TARGET_STV if stack
>> is not properly aligned.
>>
>> gcc/testsuite/
>>
>> 2016-01-27  Ilya Enkovich  
>>
>> PR target/69454
>> * gcc.target/i386/pr69454-1.c: New test.
>> * gcc.target/i386/pr69454-2.c: New test.
>>
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 34b57a4..9fb8db8 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
>>bitmap_obstack_release (NULL);
>>df_process_deferred_rescans ();
>>
>> -  /* Conversion means we may have 128bit register spills/fills
>> - which require aligned stack.  */
>> -  if (converted_insns)
>> -{
>> -  if (crtl->stack_alignment_needed < 128)
>> -   crtl->stack_alignment_needed = 128;
>> -  if (crtl->stack_alignment_estimated < 128)
>> -   crtl->stack_alignment_estimated = 128;
>> -}
>> -
>>return 0;
>>  }
>>
>> @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>>  opts->x_target_flags |= MASK_VZEROUPPER;
>>if (!(opts_set->x_target_flags & MASK_STV))
>>  opts->x_target_flags |= MASK_STV;
>> +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
>> + stack realignment will be extra cost the pass doesn't take into
>> + account and the pass can't realign the stack.  */
>> +  if (ix86_preferred_stack_boundary < 64)
>> +opts->x_target_flags &= ~MASK_STV;
>>if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
>>&& !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>>  opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
>
> The right fix is
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index a03a515..62af55a 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
>bitmap_obstack_release (NULL);
>df_process_deferred_rescans ();
>
> -  /* Conversion means we may have 128bit register spills/fills
> - which require aligned stack.  */
> -  if (converted_insns)
> -{
> -  if (crtl->stack_alignment_needed < 128)
> - crtl->stack_alignment_needed = 128;
> -  if (crtl->stack_alignment_estimated < 128)
> - crtl->stack_alignment_estimated = 128;
> -}
> -
>return 0;
>  }
>
> @@ -29300,8 +29290,10 @@ ix86_minimum_alignment (tree exp, machine_mode mode,
>  return align;
>
>/* Don't do dynamic stack realignment for long long objects with
> - -mpreferred-stack-boundary=2.  */
> -  if ((mode == DImode || (type && TYPE_MODE (type) == DImode))
> + -mpreferred-stack-boundary=2.  The STV pass uses SSE2 instructions
> + on DImode which needs 64-bit alignment for DImode.  */
> +  if (!(TARGET_STV && TARGET_SSE2 && optimize > 1)
> +  && (mode == DImode || (type && TYPE_MODE (type) == DImode))
>&& (!type || !TYPE_USER_ALIGN (type))
>&& (!decl || !DECL_USER_ALIGN (decl)))
>  return 32;
>

DImode object doesn't mean STV will be applied.  So you might just
ignore preferred stack
alignment for no reason. 'Right' here depends on what is more
important in such case.

Thanks,
Ilya

>
>
> --
> H.J.


Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread Ilya Enkovich
On 27 Jan 16:44, Jakub Jelinek wrote:
> On Wed, Jan 27, 2016 at 06:34:41PM +0300, Ilya Enkovich wrote:
> > @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
> >  opts->x_target_flags |= MASK_VZEROUPPER;
> >if (!(opts_set->x_target_flags & MASK_STV))
> >  opts->x_target_flags |= MASK_STV;
> > +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
> 
> The comment doesn't match the code, you disable STV only for
> -mpreferred-stack-boundary=2.

Thanks, here is an updated version.

Ilya
--
gcc/

2016-01-27  Jakub Jelinek  
Ilya Enkovich  

PR target/69454
* config/i386/i386.c (convert_scalars_to_vector): Remove
stack alignment fixes.
(ix86_option_override_internal): Disable TARGET_STV if stack
is not properly aligned.

gcc/testsuite/

2016-01-27  Ilya Enkovich  

PR target/69454
* gcc.target/i386/pr69454-1.c: New test.
* gcc.target/i386/pr69454-2.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 34b57a4..9fb8db8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
   bitmap_obstack_release (NULL);
   df_process_deferred_rescans ();
 
-  /* Conversion means we may have 128bit register spills/fills
- which require aligned stack.  */
-  if (converted_insns)
-{
-  if (crtl->stack_alignment_needed < 128)
-   crtl->stack_alignment_needed = 128;
-  if (crtl->stack_alignment_estimated < 128)
-   crtl->stack_alignment_estimated = 128;
-}
-
   return 0;
 }
 
@@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
 opts->x_target_flags |= MASK_VZEROUPPER;
   if (!(opts_set->x_target_flags & MASK_STV))
 opts->x_target_flags |= MASK_STV;
+  /* Disable STV if -mpreferred-stack-boundary=2 - the needed
+ stack realignment will be extra cost the pass doesn't take into
+ account and the pass can't realign the stack.  */
+  if (ix86_preferred_stack_boundary < 64)
+opts->x_target_flags &= ~MASK_STV;
   if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
   && !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
 opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
diff --git a/gcc/testsuite/gcc.target/i386/pr69454-1.c 
b/gcc/testsuite/gcc.target/i386/pr69454-1.c
new file mode 100644
index 000..12ecfd3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69454-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -msse2 -mno-accumulate-outgoing-args 
-mpreferred-stack-boundary=2" } */
+
+typedef struct { long long w64[2]; } V128;
+extern V128* fn2(void);
+long long a;
+V128 b;
+void fn1() {
+  V128 *c = fn2();
+  c->w64[0] = a ^ b.w64[0];
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr69454-2.c 
b/gcc/testsuite/gcc.target/i386/pr69454-2.c
new file mode 100644
index 000..28bab93
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69454-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -mpreferred-stack-boundary=2" } */
+
+extern void fn2 ();
+long long a, b;
+
+void fn1 ()
+{
+  long long c = a;
+  a = b ^ a;
+  fn2 ();
+  a = c;
+}


Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread H.J. Lu
On Wed, Jan 27, 2016 at 8:11 AM, Ilya Enkovich  wrote:
> On 27 Jan 16:44, Jakub Jelinek wrote:
>> On Wed, Jan 27, 2016 at 06:34:41PM +0300, Ilya Enkovich wrote:
>> > @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>> >  opts->x_target_flags |= MASK_VZEROUPPER;
>> >if (!(opts_set->x_target_flags & MASK_STV))
>> >  opts->x_target_flags |= MASK_STV;
>> > +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
>>
>> The comment doesn't match the code, you disable STV only for
>> -mpreferred-stack-boundary=2.
>
> Thanks, here is an updated version.
>
> Ilya
> --
> gcc/
>
> 2016-01-27  Jakub Jelinek  
> Ilya Enkovich  
>
> PR target/69454
> * config/i386/i386.c (convert_scalars_to_vector): Remove
> stack alignment fixes.
> (ix86_option_override_internal): Disable TARGET_STV if stack
> is not properly aligned.
>
> gcc/testsuite/
>
> 2016-01-27  Ilya Enkovich  
>
> PR target/69454
> * gcc.target/i386/pr69454-1.c: New test.
> * gcc.target/i386/pr69454-2.c: New test.
>
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 34b57a4..9fb8db8 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
>bitmap_obstack_release (NULL);
>df_process_deferred_rescans ();
>
> -  /* Conversion means we may have 128bit register spills/fills
> - which require aligned stack.  */
> -  if (converted_insns)
> -{
> -  if (crtl->stack_alignment_needed < 128)
> -   crtl->stack_alignment_needed = 128;
> -  if (crtl->stack_alignment_estimated < 128)
> -   crtl->stack_alignment_estimated = 128;
> -}
> -
>return 0;
>  }
>
> @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>  opts->x_target_flags |= MASK_VZEROUPPER;
>if (!(opts_set->x_target_flags & MASK_STV))
>  opts->x_target_flags |= MASK_STV;
> +  /* Disable STV if -mpreferred-stack-boundary=2 - the needed
> + stack realignment will be extra cost the pass doesn't take into
> + account and the pass can't realign the stack.  */
> +  if (ix86_preferred_stack_boundary < 64)
> +opts->x_target_flags &= ~MASK_STV;
>if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
>&& !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>  opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;

MINIMUM_ALIGNMENT keeps track stack alignment.  It is OK
to disable STV for -mpreferred-stack-boundary=2.  But you should
also update ix86_minimum_alignment to make sure that STV is
disabled before returning 32 for DImode.


-- 
H.J.


[Patch, fortran, GCC-5/6, PR62536, v1] ICE (segfault) for invalid END BLOCK statement

2016-01-27 Thread Andre Vehreschild
Hi all,

attached are two patches one for trunk and one for the gcc-5-branch,
which fix an ICE when BLOCK statements are not closed correctly (too
many END BLOCKs). Unfortunately did the patch I tailored for gcc-5 not
work on trunk.

GCC 5: The ICE is prevented by making sure that gfc_current_ns is not
set to NULL, when too many closing END BLOCK statements are found. This
may lead to gfortran be in the wrong namespace, but gfortran is already
in error. Therefore this does not matter any more. We just need to exit
cleanly.

GCC 6/trunk: Here the ICE is prevented by making sure, that unnesting
of BLOCK namespaces is only done, when the END encountered is not the
one of a BLOCK. This prevents gfortran from removing a correctly
initialized and for further analysis required namespace.

Bootstrapped and regtested ok on x86_64-linux-gnu/F23. 

Ok for trunk and gcc-5-branch, respectively?

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 2708413..adea3f3 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -6454,9 +6454,16 @@ cleanup:
 	  prev_ns = ns;
 	  ns = ns->sibling;
 	}
-  
-  gfc_free_namespace (gfc_current_ns);
-  gfc_current_ns = parent_ns;
+
+  if (parent_ns)
+	{
+	  /* Free the current namespace only when the parent one exists.  This
+	 prevents an ICE when more END BLOCK then BLOCK statements are
+	 present.  It does not mean any further harm, because we already
+	 have errored.  */
+	  gfc_free_namespace (gfc_current_ns);
+	  gfc_current_ns = parent_ns;
+	}
 }
 
   return MATCH_ERROR;
diff --git a/gcc/testsuite/gfortran.dg/block_14.f08 b/gcc/testsuite/gfortran.dg/block_14.f08
new file mode 100644
index 000..4e5903b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/block_14.f08
@@ -0,0 +1,21 @@
+! { dg-do compile }
+!
+! Contributed by Tobias Burnus  
+! Check fix for PR62536 works as expected.
+
+function f2 (x)
+implicit none
+  integer f2, x
+  block
+   block named ! { dg-error "Unclassifiable statement" }
+integer a ! should be SAVEd
+a = a + x ! should increment by y every time
+f2 = a
+   end block named ! { dg-error "Syntax error in END BLOCK statement" }
+  end block ! { dg-error "Expected block name of " }
+  return
+endfunction ! { dg-error "Expecting END BLOCK statement" }
+
+end ! { dg-error "END BLOCK statement expected" }
+! { dg-excess-errors "Previous error cause unexpected end of file." }
+
gcc/testsuite/ChangeLog:

2016-01-27  Andre Vehreschild  

PR fortran/62536
* gfortran.dg/block_14.f08: New test.


gcc/fortran/ChangeLog:

2016-01-27  Andre Vehreschild  

PR fortran/62536
* decl.c: Prevent setting gfc_current_ns to NULL when block statement's
nesting is incomplete.  There is already an error conditon, so having
gfc_current_ns pointing to an eventually wrong namespace does not matter
that much.

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 7c0e011..df81369 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -6327,6 +6327,7 @@ gfc_match_end (gfc_statement *st)
   gfc_namespace *parent_ns, *ns, *prev_ns;
   gfc_namespace **nsp;
   bool abreviated_modproc_decl;
+  bool got_matching_end = false;
 
   old_loc = gfc_current_locus;
   if (gfc_match ("end") != MATCH_YES)
@@ -6510,6 +6511,8 @@ gfc_match_end (gfc_statement *st)
 		 ? "END PROCEDURE" : gfc_ascii_statement(*st), &old_loc);
   goto cleanup;
 }
+  else
+got_matching_end = true;
 
   old_loc = gfc_current_locus;
   /* If we're at the end, make sure a block name wasn't required.  */
@@ -6581,7 +6584,7 @@ cleanup:
   /* If we are missing an END BLOCK, we created a half-ready namespace.
  Remove it from the parent namespace's sibling list.  */
 
-  while (state == COMP_BLOCK)
+  while (state == COMP_BLOCK && !got_matching_end)
 {
   parent_ns = gfc_current_ns->parent;
 
@@ -6601,7 +6604,7 @@ cleanup:
 	  prev_ns = ns;
 	  ns = ns->sibling;
 	}
-  
+
   gfc_free_namespace (gfc_current_ns);
   gfc_current_ns = parent_ns;
   gfc_state_stack = gfc_state_stack->previous;
diff --git a/gcc/testsuite/gfortran.dg/block_15.f08 b/gcc/testsuite/gfortran.dg/block_15.f08
new file mode 100644
index 000..fd1ca55
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/block_15.f08
@@ -0,0 +1,20 @@
+! { dg-do compile }
+!
+! Contributed by Tobias Burnus  
+! Check fix for PR62536 works as expected.
+
+function f2 (x)
+implicit none
+  integer f2, x
+  block
+   block named ! { dg-error "Unclassifiable statement" }
+integer a ! should be SAVEd
+a = a + x ! should increment by y every time
+f2 = a
+   end block named ! { dg-error "Syntax error in END BLOCK statement" }
+  end block
+  return
+endfunction
+
+end
+
diff --git a/gcc/testsuite/gfortran.dg/block_end_error_1.f90 b/gcc/testsuite/gfortran.dg/block_end_error_1.f90
index e7dcdfa..4bbe89b 100644
--- a/gcc/testsuite/gfortran.dg/block_end_error_1.

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread Ilya Enkovich
2016-01-27 19:18 GMT+03:00 H.J. Lu :
> On Wed, Jan 27, 2016 at 8:11 AM, Ilya Enkovich  wrote:
>> On 27 Jan 16:44, Jakub Jelinek wrote:
>>> On Wed, Jan 27, 2016 at 06:34:41PM +0300, Ilya Enkovich wrote:
>>> > @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>>> >  opts->x_target_flags |= MASK_VZEROUPPER;
>>> >if (!(opts_set->x_target_flags & MASK_STV))
>>> >  opts->x_target_flags |= MASK_STV;
>>> > +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
>>>
>>> The comment doesn't match the code, you disable STV only for
>>> -mpreferred-stack-boundary=2.
>>
>> Thanks, here is an updated version.
>>
>> Ilya
>> --
>> gcc/
>>
>> 2016-01-27  Jakub Jelinek  
>> Ilya Enkovich  
>>
>> PR target/69454
>> * config/i386/i386.c (convert_scalars_to_vector): Remove
>> stack alignment fixes.
>> (ix86_option_override_internal): Disable TARGET_STV if stack
>> is not properly aligned.
>>
>> gcc/testsuite/
>>
>> 2016-01-27  Ilya Enkovich  
>>
>> PR target/69454
>> * gcc.target/i386/pr69454-1.c: New test.
>> * gcc.target/i386/pr69454-2.c: New test.
>>
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 34b57a4..9fb8db8 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
>>bitmap_obstack_release (NULL);
>>df_process_deferred_rescans ();
>>
>> -  /* Conversion means we may have 128bit register spills/fills
>> - which require aligned stack.  */
>> -  if (converted_insns)
>> -{
>> -  if (crtl->stack_alignment_needed < 128)
>> -   crtl->stack_alignment_needed = 128;
>> -  if (crtl->stack_alignment_estimated < 128)
>> -   crtl->stack_alignment_estimated = 128;
>> -}
>> -
>>return 0;
>>  }
>>
>> @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>>  opts->x_target_flags |= MASK_VZEROUPPER;
>>if (!(opts_set->x_target_flags & MASK_STV))
>>  opts->x_target_flags |= MASK_STV;
>> +  /* Disable STV if -mpreferred-stack-boundary=2 - the needed
>> + stack realignment will be extra cost the pass doesn't take into
>> + account and the pass can't realign the stack.  */
>> +  if (ix86_preferred_stack_boundary < 64)
>> +opts->x_target_flags &= ~MASK_STV;
>>if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
>>&& !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>>  opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
>
> MINIMUM_ALIGNMENT keeps track stack alignment.  It is OK
> to disable STV for -mpreferred-stack-boundary=2.  But you should
> also update ix86_minimum_alignment to make sure that STV is
> disabled before returning 32 for DImode.

If -mpreferred-stack-boundary=2 then STV is disabled, if STV is enabled then
-mpreferred-stack-boundary>=3 and this condition in
ix86_minimum_alignment works:

  if (TARGET_64BIT || align != 64 || ix86_preferred_stack_boundary >= 64)
return align;

Thanks,
Ilya

>
>
> --
> H.J.


Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread H.J. Lu
On Wed, Jan 27, 2016 at 8:29 AM, Ilya Enkovich  wrote:
> 2016-01-27 19:18 GMT+03:00 H.J. Lu :
>> On Wed, Jan 27, 2016 at 8:11 AM, Ilya Enkovich  
>> wrote:
>>> On 27 Jan 16:44, Jakub Jelinek wrote:
 On Wed, Jan 27, 2016 at 06:34:41PM +0300, Ilya Enkovich wrote:
 > @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
 >  opts->x_target_flags |= MASK_VZEROUPPER;
 >if (!(opts_set->x_target_flags & MASK_STV))
 >  opts->x_target_flags |= MASK_STV;
 > +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed

 The comment doesn't match the code, you disable STV only for
 -mpreferred-stack-boundary=2.
>>>
>>> Thanks, here is an updated version.
>>>
>>> Ilya
>>> --
>>> gcc/
>>>
>>> 2016-01-27  Jakub Jelinek  
>>> Ilya Enkovich  
>>>
>>> PR target/69454
>>> * config/i386/i386.c (convert_scalars_to_vector): Remove
>>> stack alignment fixes.
>>> (ix86_option_override_internal): Disable TARGET_STV if stack
>>> is not properly aligned.
>>>
>>> gcc/testsuite/
>>>
>>> 2016-01-27  Ilya Enkovich  
>>>
>>> PR target/69454
>>> * gcc.target/i386/pr69454-1.c: New test.
>>> * gcc.target/i386/pr69454-2.c: New test.
>>>
>>>
>>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>>> index 34b57a4..9fb8db8 100644
>>> --- a/gcc/config/i386/i386.c
>>> +++ b/gcc/config/i386/i386.c
>>> @@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
>>>bitmap_obstack_release (NULL);
>>>df_process_deferred_rescans ();
>>>
>>> -  /* Conversion means we may have 128bit register spills/fills
>>> - which require aligned stack.  */
>>> -  if (converted_insns)
>>> -{
>>> -  if (crtl->stack_alignment_needed < 128)
>>> -   crtl->stack_alignment_needed = 128;
>>> -  if (crtl->stack_alignment_estimated < 128)
>>> -   crtl->stack_alignment_estimated = 128;
>>> -}
>>> -
>>>return 0;
>>>  }
>>>
>>> @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>>>  opts->x_target_flags |= MASK_VZEROUPPER;
>>>if (!(opts_set->x_target_flags & MASK_STV))
>>>  opts->x_target_flags |= MASK_STV;
>>> +  /* Disable STV if -mpreferred-stack-boundary=2 - the needed
>>> + stack realignment will be extra cost the pass doesn't take into
>>> + account and the pass can't realign the stack.  */
>>> +  if (ix86_preferred_stack_boundary < 64)
>>> +opts->x_target_flags &= ~MASK_STV;
>>>if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
>>>&& !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>>>  opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
>>
>> MINIMUM_ALIGNMENT keeps track stack alignment.  It is OK
>> to disable STV for -mpreferred-stack-boundary=2.  But you should
>> also update ix86_minimum_alignment to make sure that STV is
>> disabled before returning 32 for DImode.
>
> If -mpreferred-stack-boundary=2 then STV is disabled, if STV is enabled then
> -mpreferred-stack-boundary>=3 and this condition in
> ix86_minimum_alignment works:
>
>   if (TARGET_64BIT || align != 64 || ix86_preferred_stack_boundary >= 64)
> return align;
>

No, you shouldn't make assumptions in ix86_minimum_alignment. You
should check explicitly that STV is disabled in ix86_minimum_alignment.


-- 
H.J.


[PATCH] Add NULL check for vectype in vectorizable_comparison

2016-01-27 Thread Ilya Enkovich
Hi,

This is a trivial patch which adds a NULL check for vectype.
We may have a NULL vectype in case of void call.  Will commit
as obvious to trunk after testing.

Thanks,
Ilya
--
gcc/

2016-01-27  Ilya Enkovich  

* tree-vect-stmts.c (vectorizable_comparison): Add
NULL check for vectype.

gcc/testsuite/

2016-01-27  Ilya Enkovich  

* gcc.dg/declare-simd.c: New test.


diff --git a/gcc/testsuite/gcc.dg/declare-simd.c 
b/gcc/testsuite/gcc.dg/declare-simd.c
new file mode 100644
index 000..1c71b60
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/declare-simd.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fopenmp-simd" } */
+
+#pragma omp declare simd linear (p2, p3)
+extern void fn2 (float p1, float *p2, float *p3);
+
+float *a, *b;
+void fn1 (float *p1)
+{
+  int i;
+#pragma omp simd
+  for (i = 0; i < 1000; i++)
+fn2 (p1[i], a + i, b + i);
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 1dcd129..fa4a364 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7764,7 +7764,7 @@ vectorizable_comparison (gimple *stmt, 
gimple_stmt_iterator *gsi,
   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
 return false;
 
-  if (!VECTOR_BOOLEAN_TYPE_P (vectype))
+  if (!vectype || !VECTOR_BOOLEAN_TYPE_P (vectype))
 return false;
 
   mask_type = vectype;


Re: C++ PATCH for c++/69496 (ICE on VLA in constexpr function)

2016-01-27 Thread Martin Sebor

I wonder if it might be better to instead reject VLAs in constexpr
functions altogether.  Not because they're not in C++, but because
C (or gcc) doesn't allow them to be initialized (and so accepting
an initialized VLA is a g++ extension of an extension), and
because in constexpr functions they are rejected without
initialization just like other uninitialized variables.


I don't think we can do this at this time.  E.g. the following program works
even with GCC 5 and -std=c++14:

constexpr int
foo (int n)
{
 int a[n] = { 1, 2, 3 };
 int z = 0;
 for (unsigned i = 0; i < 3; ++i)
   z += a[i];
 return z;
}

int
main ()
{
   constexpr int n = foo (3);
   __builtin_printf ("%d\n", n);
}


This happens to work but I suspect it's only by accident.  When
the number of elements in the initializer is increased to exceed
the number of elements in the VLA GCC gets into infinite recursion.
(I opened bug 69516 with a test case).  The same error in a non-
constexpr function causes a SEGV at runtime (this is also
a regression WRT 4.9.3 -- I opened bug 69517 for it).


So starting to reject such a code might broke working programs.  And we're
able to reject non-standard code: -pedantic-errors.


I agree that there is some risk that it might break some working
programs.  I would expect the most common use of initialized VLAs
be to set all elements to zero using the "= { }" or "= { 0 }"
syntax.  Initializers with more elements are, IMO, likely to be
a bug where the user doesn't realize they defined a VLA rather
than an ordinary array.  Since VLAs are required to have at least
1 element, would diagnosing initializers with more than one element
more loudly (such as by default or with -Wall as opposed to with
-Wpedantic) be a good solution?

Martin


Fix PR ada/69488

2016-01-27 Thread Eric Botcazou
This is the en masse failure of the gnat.dg/sso tests on Windows, which is the 
same DOS line termination issue I run into for gcc.dg/sso on Visium:
 https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01133.html.

Rainer wrote and tested the attached patch on both Windows and Linux.  Applied 
on the mainline.


2016-01-27  Rainer Emrich  

PR ada/69488
* gnat.dg/sso/*.adb: Robustify dg-output directives.

-- 
Eric BotcazouIndex: gcc/testsuite/gnat.dg/sso/conv1.adb
===
--- gcc/testsuite/gnat.dg/sso/conv1.adb	(Revision 232815)
+++ gcc/testsuite/gnat.dg/sso/conv1.adb	(Arbeitskopie)
@@ -36,15 +36,15 @@ begin
X_L.S := 12345;
X_L.C := 'a';
Dump ("X_L", X_L.S, X_L.C);
-   -- { dg-output "X_L = \\(S => 12345, C => 'a'\\)\n" }
+   -- { dg-output "X_L = \\(S => 12345, C => 'a'\\).*\n" }
 
X_H.S := 23456;
X_H.C := 'b';
Dump ("X_H", X_H.S, X_H.C);
-   -- { dg-output "X_H = \\(S => 23456, C => 'b'\\)\n" }
+   -- { dg-output "X_H = \\(S => 23456, C => 'b'\\).*\n" }
 
X_H := R_H (X_L);
Dump ("X_H", X_H.S, X_H.C);
-   -- { dg-output "X_H = \\(S => 12345, C => 'a'\\)\n" }
+   -- { dg-output "X_H = \\(S => 12345, C => 'a'\\).*\n" }
 
 end;
Index: gcc/testsuite/gnat.dg/sso/p1.adb
===
--- gcc/testsuite/gnat.dg/sso/p1.adb	(Revision 232815)
+++ gcc/testsuite/gnat.dg/sso/p1.adb	(Arbeitskopie)
@@ -13,50 +13,50 @@ begin
   Put ("My_R1:");
   Dump (My_R1'Address, R1'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "My_R1: 78 56 34 12\n" }
+  -- { dg-output "My_R1: 78 56 34 12.*\n" }
 
   Put ("My_R2:");
   Dump (My_R2'Address, R2'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "My_R2: 12 34 56 78\n" }
+  -- { dg-output "My_R2: 12 34 56 78.*\n" }
 
   Local_R1 := My_R1;
   Put ("Local_R1 :");
   Dump (Local_R1'Address, R1'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R1 : 78 56 34 12\n" }
+  -- { dg-output "Local_R1 : 78 56 34 12.*\n" }
 
   Local_R2 := My_R2;
   Put ("Local_R2 :");
   Dump (Local_R2'Address, R2'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R2 : 12 34 56 78\n" }
+  -- { dg-output "Local_R2 : 12 34 56 78.*\n" }
 
   Local_R1.I := 16#12345678#;
 
   Put ("Local_R1 :");
   Dump (Local_R1'Address, R1'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R1 : 78 56 34 12\n" }
+  -- { dg-output "Local_R1 : 78 56 34 12.*\n" }
 
   Local_R2.I := 16#12345678#;
 
   Put ("Local_R2 :");
   Dump (Local_R2'Address, R2'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R2 : 12 34 56 78\n" }
+  -- { dg-output "Local_R2 : 12 34 56 78.*\n" }
 
   Local_R1.I := Local_R2.I;
 
   Put ("Local_R1 :");
   Dump (Local_R1'Address, R1'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R1 : 78 56 34 12\n" }
+  -- { dg-output "Local_R1 : 78 56 34 12.*\n" }
 
   Local_R2.I := Local_R1.I;
 
   Put ("Local_R2 :");
   Dump (Local_R2'Address, R2'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R2 : 12 34 56 78\n" }
+  -- { dg-output "Local_R2 : 12 34 56 78.*\n" }
 end;
Index: gcc/testsuite/gnat.dg/sso/p10.adb
===
--- gcc/testsuite/gnat.dg/sso/p10.adb	(Revision 232815)
+++ gcc/testsuite/gnat.dg/sso/p10.adb	(Arbeitskopie)
@@ -14,50 +14,50 @@ begin
   Put ("My_R1:");
   Dump (My_R1'Address, R1'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "My_R1: 78 56 34 12\n" }
+  -- { dg-output "My_R1: 78 56 34 12.*\n" }
 
   Put ("My_R2:");
   Dump (My_R2'Address, R2'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "My_R2: 12 34 56 78\n" }
+  -- { dg-output "My_R2: 12 34 56 78.*\n" }
 
   Local_R1 := My_R1;
   Put ("Local_R1 :");
   Dump (Local_R1'Address, R1'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R1 : 78 56 34 12\n" }
+  -- { dg-output "Local_R1 : 78 56 34 12.*\n" }
 
   Local_R2 := My_R2;
   Put ("Local_R2 :");
   Dump (Local_R2'Address, R2'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R2 : 12 34 56 78\n" }
+  -- { dg-output "Local_R2 : 12 34 56 78.*\n" }
 
   Local_R1.I := 16#12345678#;
 
   Put ("Local_R1 :");
   Dump (Local_R1'Address, R1'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R1 : 78 56 34 12\n" }
+  -- { dg-output "Local_R1 : 78 56 34 12.*\n" }
 
   Local_R2.I := 16#12345678#;
 
   Put ("Local_R2 :");
   Dump (Local_R2'Address, R2'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R2 : 12 34 56 78\n" }
+  -- { dg-output "Local_R2 : 12 34 56 78.*\n" }
 
   Local_R1.I := Local_R2.I;
 
   Put ("Local_R1 :");
   Dump (Local_R1'Address, R1'Max_Size_In_Storage_Elements);
   New_Line;
-  -- { dg-output "Local_R1 : 78 56 34 12\n" }
+  -- { dg-output "Local_R1 : 78 56 34 12.*\n" }
 
   Local_R2.I := Local_R1.I;
 
   Put

Re: NetBSD has SSP in libc

2016-01-27 Thread Jeff Law

On 01/26/2016 11:41 PM, Thomas Klausner wrote:

On Tue, Jan 26, 2016 at 11:35:15PM -0700, Jeff Law wrote:

On 01/23/2016 02:52 PM, Thomas Klausner wrote:

NetBSD has SSP included in libc:

http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/ssp/?only_with_tag=MAIN

gcc/configure does not know about this. The attached patch (against
SVN checkout from today) fixes this.

I've previously reported this as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68380 but was told that
patches should be sent here.

Thanks.

If I understand the history SSP first appeared in NetBSD 4.0, so this code
really ought to be conditional on NetBSD 4.0 or newer to be strictly
correct.

Unfortunately we don't have much interaction with NetBSD anymore -- so we
have no clue if supporting building on anything before NetBSD 4.0 is even
reasonable or feasible anymore.


The only supported release branches are NetBSD-6.x and NetBSD-7.x.

NetBSD-4.x was desupported when NetBSD 6.x came out, in 2012:
https://blog.netbsd.org/tnf/entry/end_of_life_for_4
OK.  Based on the above information I went ahead and committed your 
change.  Strictly speaking it's not a regression, but I think it's 
localized & safe enough that it can go in even at this stage in our 
development/release process.


Thanks,
Jeff


Re: [C/C++ PATCH] Don't emit invalid VEC_COND_EXPR for vector comparisons (PR c/68062)

2016-01-27 Thread Jeff Law

On 01/27/2016 12:26 AM, Marek Polacek wrote:

Ping.

On Wed, Jan 20, 2016 at 12:31:51PM +0100, Marek Polacek wrote:

On Wed, Jan 13, 2016 at 11:11:52PM +, Joseph Myers wrote:

The C front-end changes are OK.


Jason, is the C++ part of this patch here

(which is identical to the change in the C FE) ok?

Also, not sure about backporting this, maybe just to 5?

I'll go ahead and ack the C++ bits.  This is fine for the trunk.

WRT backporting, your call.

jeff


Re: [PATCH] libcpp: use better locations for _Pragma tokens (preprocessor/69126)

2016-01-27 Thread Jeff Law

On 01/17/2016 07:09 AM, David Malcolm wrote:

Our code for handling the "_Pragma" builtin macro is implemented in
libcpp/directives.c:destringize_and_run.

It handles _Pragma by creating a one-line buffer containing the _Pragma
content, then sending it to do_pragma, which tokenizes it and handles
the input as if it were spelled as #pragma.

Unfortunately the tokens it generates have bogus location values; the
values are in the current highest ordinary map at the time of expansion,
and this determines the effective location of the synthesized #pragma.

Hence for PR preprocessor/69126:


#pragma GCC diagnostic push
#define MACRO \
 _Pragma("GCC diagnostic ignored \"-Wunused-variable\"") \
 int x;

int g()
{
 MACRO;
 return 0;
}
#pragma GCC diagnostic pop


although -save-temps looks sane:


#pragma GCC diagnostic push

int g()
{
# 19 "../../src/gcc/testsuite/c-c++-common/pr69126.c"
#pragma GCC diagnostic ignored "-Wunused-variable"
# 19 "../../src/gcc/testsuite/c-c++-common/pr69126.c"
 int x;;
 return 0;
}
#pragma GCC diagnostic pop


the "ignored" directive is treated as if it occured in this
bogus location:

(gdb) call inform (226930, "pre, the location of the ignore")
/tmp/test.cc:17:24: note: pre, the location of the ignore
  MACRO;
 ^
which is effectively *after* the
 int x;;
line.

I believe this is a long-standing bug, but the cleanup of the
C++ FE's impl of -Wunused-variable in r226234 (to avoid using
%q+D) has exposed this as a regression (since the -Wunused-variable
warning is now emitted for this location:
 int x;
  ^
rather than at input_location:
  return 0;
  ^
and hence isn't suppressed by the _Pragma.

The function destringize_and_run is full of comments like
   /* Ugh; an awful kludge.  We are really not set up to be lexing
and:
   /* ??? Antique Disgusting Hack.  What does this do?  */
and similar "???" lines.

I believe it predates macro maps (I believe they were added in
r180081 on 2011-10-15); as far as I can tell, the code is set
up for *line* maps, i.e. it may even predate column information
(it gets the line right, but not the column).

To minimize the change to destringize_and_run at this stage, the
following patch papers over the problem by fixing up the locations
for the tokens synthesized for _Pragma, setting them all to be at the
expansion point of the _Pragma directive, rather than some columns
after it.

This fixes the location of the synthesized #pragma and hence
fixes the regression seen in the PR.  The -save-temps output (which
was already correct) is unaffected.

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu;
adds 3 PASS results to g++.sum and 1 to gcc.sum
(although the regression currently only affects C++, I put the new
test case into c-c++-common for good measure).

OK for trunk in stage 3?

gcc/testsuite/ChangeLog:
PR preprocessor/69126
* c-c++-common/pr69126.c: New test case.

libcpp/ChangeLog:
PR preprocessor/69126
* directives.c (destringize_and_run): Add expansion_loc param; use
it when handling unexpanded pragmas to fixup the locations of the
synthesized tokens.
(_cpp_do__Pragma): Add expansion_loc param and use it when calling
destringize_and_run.
* internal.h (_cpp_do__Pragma): Add expansion_loc param.
* macro.c (builtin_macro): Pass expansion location of _Pragma to
_cpp_do__Pragma.

It's a hack, but OK.

This would be precisely the kind of thing that ideally we'd cover the 
existing behaviour with unit tests before trying to clean up the layers 
of hackery in this code.


jeff



Re: RFA: Fix ICE compiling gcc.dg/lto/pr55113_0.c for x86/x86_64

2016-01-27 Thread Bernd Schmidt

On 01/08/2016 02:00 PM, Bernd Schmidt wrote:

On 01/08/2016 10:41 AM, Richard Biener wrote:

On Tue, Dec 22, 2015 at 10:55 AM, Nick Clifton  wrote:

Richard Biener wrote:

I think the option should be simply removed...


Tempting - but we are in stage 3...  My patch at least fixes the ICE for
now.


Yeah, but is it a complete fix for all the -fshort-double issues?


I vote for removing it given that it appears not to have worked for
several releases (PR60410). It is a target specific option really
(m68hc11 judging by one entry in ChangeLog-2001).


So I was looking at removing it, but it turns out t-mips-img tries to 
build a multilib using -fshort-double. Matthew - is this really 
required? I tried to build a mips-img-elf compiler, but building libgcc 
fails with


conftest.c:1:0: error: unsupported combination: 'mips32r6' 
-mno-explicit-relocs

 /* confdefs.h */


conftest.c:1:0: error: '-mno-gpopt' needs '-mexplicit-relocs'


Bernd



Re: [PATCH] fix gimplification of call parameters (PR cilkplus/69267)

2016-01-27 Thread Jeff Law

On 01/15/2016 03:41 PM, Ryan Burn wrote:

2016-01-15  Ryan Burn

 PR cilkplus/69267
 * cilk.c (cilk_gimplify_call_params_in_spawned_fn): Change to use
 gimplify_arg. Removed superfluous post_p argument.
 * c-family.h (cilk_gimplify_call_params_in_spawned_fn): Removed
 superfluous post_p argument.
 * c-gimplify.c (c_gimplify_expr): Likewise.

gcc/cp/ChangeLog:

2016-01-15  Ryan Burn

 PR cilkplus/69267
 * cp-gimplify.c (cilk_cp_gimplify_call_params_in_spawned_fn): Removed
 superfluous post_p argument in call to
 cilk_gimplify_call_params_in_spawned_fn.

gcc/testsuite/ChangeLog:

2016-01-15  Ryan Burn

PR cilkplus/69267
* g++.dg/cilk-plus/CK/pr69267.cc: New test.

Thanks.  Even though we're in stage4, this came in before stage3 closed. 
 I went ahead and committed it to the trunk.


Thanks again, and sorry about the delay.

jeff


RE: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

2016-01-27 Thread Ajit Kumar Agarwal


-Original Message-
From: Jeff Law [mailto:l...@redhat.com] 
Sent: Wednesday, January 27, 2016 12:48 PM
To: Ajit Kumar Agarwal; Richard Biener
Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; 
Nagaraju Mekala
Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa 
representation

On 01/18/2016 11:27 AM, Ajit Kumar Agarwal wrote:

>>
>> Ajit, can you confirm which of adpcm_code or adpcm_decode where path 
>> splitting is showing a gain?  I suspect it's the former but would 
>> like to make sure so that I can adjust the heuristics properly.
>>> I'd still like to have this answered when you can Ajit, just to be 
>>> 100% that it's the path splitting in adpcm_code that's responsible 
>>> for the improvements you're seeing in adpcm.
>
> The adpcm_coder get optimized with path splitting whereas the 
> adpcm_decoder is not optimized further with path splitting. In 
> adpcm_decoder the join node is duplicated into its predecessors and 
> with the duplication of join node the code is not optimized further.
>>Right.  Just wanted to make sure my analysis corresponded with what you were 
>>seeing in your benchmarking -- and it does.

>>I suspect that if we looked at this problem from the angle of isolating paths 
>>based on how constant PHI arguments feed into and allow simplifications in 
>>later blocks that we might get >>better long term results -- including 
>>improving adpcm_decoder which has the same idiom as adpcm_coder -- it's just 
>>in the wrong spot in the CFG.
>>But that's obviously gcc-7 material.

Can I look into it.

Thanks & Regards
Ajit

Jeff



Re: [PATCH, aarch64] Fix pr69305 -- addti miscompilation

2016-01-27 Thread James Greenhalgh
On Sun, Jan 24, 2016 at 03:19:35AM -0800, Richard Henderson wrote:
> As Jakub notes in the PR, the representation for add_compare and
> sub_compare were wrong.  And several of the add_carryin patterns
> were duplicates.
> 
> This adds a CC_Cmode for which only the Carry bit is valid.
> 
> The patch appears to generate moderately decent code.  For gcc7 we
> should look into why we'll prefer to mark an output REG_UNUSED
> instead of matching the pattern with that output removed.  This
> results in continuing to use adds (though simplifying adc) after
> we've proved that there will be no carry into the high part of an
> adds+adc pair.
> 
> Ok?
> 
> 
> r~

Hi Richard,

Some tiny nits below:

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 71fc514..363785e 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1755,6 +1755,44 @@
>[(set_attr "type" "alus_sreg,alus_imm,alus_imm")]
>  )
>  
> +(define_insn "*add3_compare1_cconly"

I don't understand the naming scheme, it got me a wee bit confused with
add3_compare0 and friends, where the 0 indicates a comparison with
zero...

> +  [(set (reg:CC_C CC_REGNUM)
> + (ne:CC_C
> +   (plus:
> + (zero_extend:
> +   (match_operand:GPI 0 "aarch64_reg_or_zero" "%rZ,rZ,rZ"))
> + (zero_extend:
> +   (match_operand:GPI 1 "aarch64_plus_operand" "r,I,J")))
> +   (zero_extend:
> + (plus:GPI (match_dup 0) (match_dup 1)]
> +  ""
> +  "@
> +  cmn\\t%0, %1
> +  cmn\\t%0, %1
> +  cmp\\t%0, #%n1"
> +  [(set_attr "type" "alus_sreg,alus_imm,alus_imm")]
> +)
> +
> +(define_insn "add3_compare1"
> +  [(set (reg:CC_C CC_REGNUM)
> + (ne:CC_C
> +   (plus:
> + (zero_extend:
> +   (match_operand:GPI 1 "aarch64_reg_or_zero" "%rZ,rZ,rZ"))
> + (zero_extend:
> +   (match_operand:GPI 2 "aarch64_plus_operand" "r,I,J")))
> +   (zero_extend:
> + (plus:GPI (match_dup 1) (match_dup 2)
> +   (set (match_operand:GPI 0 "register_operand" "=r,r,r")
> + (plus:GPI (match_dup 1) (match_dup 2)))]
> +  ""
> +  "@
> +  adds\\t%0, %1, %2
> +  adds\\t%0, %1, %2
> +  subs\\t%0, %1, #%n2"
> +  [(set_attr "type" "alus_sreg,alus_imm,alus_imm")]
> +)
> +
>  (define_insn "*adds_shift_imm_"
>[(set (reg:CC_NZ CC_REGNUM)
>   (compare:CC_NZ



> +;; Note that a single add with carry is matched by cinc,
> +;; and the adc_reg and csel types are matched into the same
> +;; pipelines by existing cores.

I can't see us remembering to update this comment on pipeline models
were it to ever become false. Maybe just drop it?

> @@ -2440,13 +2427,53 @@
>[(set_attr "type" "alu_ext")]
>  )
>  
> -(define_insn "sub3_carryin"
> -  [(set
> -(match_operand:GPI 0 "register_operand" "=r")
> -(minus:GPI (minus:GPI
> - (match_operand:GPI 1 "register_operand" "r")
> - (ltu:GPI (reg:CC CC_REGNUM) (const_int 0)))
> -(match_operand:GPI 2 "register_operand" "r")))]
> +;; The hardware description is op1 + ~op2 + C.
> +;;   = op1 + (-op2 + 1) + (1 - !C)
> +;;   = op1 - op2 - 1 + 1 - !C
> +;;   = op1 - op2 - !C.
> +;; We describe the later.

s/later/latter/

Otherwise, this is OK.

Thanks,
James



Re: Patch RFA: Add option -fcollectible-pointers, use it in ivopts

2016-01-27 Thread Ian Lance Taylor
On Wed, Jan 27, 2016 at 7:24 AM, Bernd Schmidt  wrote:
> On 01/27/2016 04:18 PM, Ian Lance Taylor wrote:
>>
>> On Wed, Jan 27, 2016 at 7:16 AM, Bernd Schmidt 
>> wrote:
>>>
>>>
>>> Still, I feel uncomfortable about making a promise we don't really expect
>>> to
>>> fully keep yet, so I would prefer this option to be undocumented for now.
>>> I
>>> won't object if someone else wants to approve it as a normal option
>>> however.
>>
>>
>> Are you approving the patch if I change the option to be undocumented?
>
>
> Yes.

Committed as follows.

gcc/ChangeLog:

2016-01-27  Ian Lance Taylor  

* common.opt (fkeep-gc-roots-live): New undocumented option.
* tree-ssa-loop-ivopts.c (add_candidate_1): If
-fkeep-gc-roots-live, skip pointers.
(add_iv_candidate_for_biv): Handle add_candidate_1 returning
NULL.

gcc/testsuite/ChangeLog:

2016-01-27  Ian Lance Taylor  

* gcc.dg/tree-ssa/ivopt_5.c: New test.
Index: gcc/common.opt
===
--- gcc/common.opt  (revision 232580)
+++ gcc/common.opt  (working copy)
@@ -1380,6 +1380,10 @@
 Enable hoisting adjacent loads to encourage generating conditional move
 instructions.
 
+fkeep-gc-roots-live
+Common Undocumented Report Var(flag_keep_gc_roots_live) Optimization
+; Always keep a pointer to a live memory block
+
 floop-parallelize-all
 Common Report Var(flag_loop_parallelize_all) Optimization
 Mark all loops as parallel.
Index: gcc/tree-ssa-loop-ivopts.c
===
--- gcc/tree-ssa-loop-ivopts.c  (revision 232580)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -2815,6 +2815,16 @@
   struct iv_cand *cand = NULL;
   tree type, orig_type;
 
+  /* -fkeep-gc-roots-live means that we have to keep a real pointer
+ live, but the ivopts code may replace a real pointer with one
+ pointing before or after the memory block that is then adjusted
+ into the memory block during the loop.  FIXME: It would likely be
+ better to actually force the pointer live and still use ivopts;
+ for example, it would be enough to write the pointer into memory
+ and keep it there until after the loop.  */
+  if (flag_keep_gc_roots_live && POINTER_TYPE_P (TREE_TYPE (base)))
+return NULL;
+
   /* For non-original variables, make sure their values are computed in a type
  that does not invoke undefined behavior on overflows (since in general,
  we cannot prove that these induction variables are non-wrapping).  */
@@ -3083,8 +3093,11 @@
  cand = add_candidate_1 (data,
  iv->base, iv->step, true, IP_ORIGINAL, NULL,
  SSA_NAME_DEF_STMT (def));
- cand->var_before = iv->ssa_name;
- cand->var_after = def;
+ if (cand)
+   {
+ cand->var_before = iv->ssa_name;
+ cand->var_after = def;
+   }
}
   else
gcc_assert (gimple_bb (phi) == data->current_loop->header);
Index: gcc/testsuite/gcc.dg/tree-ssa/ivopt_5.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ivopt_5.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopt_5.c (working copy)
@@ -0,0 +1,23 @@
+/* { dg-options "-O2 -fdump-tree-ivopts -fkeep-gc-roots-live" } */
+
+/* Only integer ivopts here when using -fkeep-gc-roots-live.   */
+
+void foo (char *pstart, int n)
+{
+  char *p;
+  char *pend = pstart + n;
+
+  for (p = pstart; p < pend; p++)
+*p = 1;
+}
+
+void foo1 (char *pstart, int n)
+{
+  char *p;
+  char *pend = pstart + n;
+
+  for (p = pstart; p != pend; p++)
+*p = 1;
+}
+
+/* { dg-final { scan-tree-dump-times "ivtmp.\[0-9_\]* = PHI <\[^0\]" 0 
"ivopts"} } */


Re: [C/C++ PATCH] Don't emit invalid VEC_COND_EXPR for vector comparisons (PR c/68062)

2016-01-27 Thread Marek Polacek
On Wed, Jan 27, 2016 at 10:02:36AM -0700, Jeff Law wrote:
> On 01/27/2016 12:26 AM, Marek Polacek wrote:
> >Ping.
> >
> >On Wed, Jan 20, 2016 at 12:31:51PM +0100, Marek Polacek wrote:
> >>On Wed, Jan 13, 2016 at 11:11:52PM +, Joseph Myers wrote:
> >>>The C front-end changes are OK.
> >>
> >>Jason, is the C++ part of this patch here
> >>
> >>(which is identical to the change in the C FE) ok?
> >>
> >>Also, not sure about backporting this, maybe just to 5?
> I'll go ahead and ack the C++ bits.  This is fine for the trunk.

Thanks.
 
> WRT backporting, your call.

I think I'll put it into GCC 5 (it's safe and should apply cleanly),
but leave 4.9 alone.

Marek


[PATCH v3] PR48344: Fix unrecognizable insn error with -fstack-limit-register=r2

2016-01-27 Thread Kelvin Nilsen
This patch has bootstrapped and tested on powerpc64le-unknown-linux-gnu 
with no regressions.  Is this ok for the trunk?


See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48344 for the original 
problem report.  The error resulted because gcc's processing of 
command-line options within gcc initialization code originally preceded 
the processing of target-specific configuration hooks.


In the unpatched gcc implementation, the Pmode (pointer mode) variable 
has not been initialized at the time the -fstack-limit-register 
command-line option is processed.  As a consequence, the stack-limiting 
register is not assigned a proper mode.  Thus, rtl instructions that 
make use of this stack-limiting register have an unspecified mode, and 
are therefore not matched by any known instructions.


The fix represented in this patch is to defer the command-line 
processing related to command-line specification of a stack-limiting 
register until after target-specific initialization has been completed.


Some questions and issues raised in response to version 2 of this patch 
are addressed below:


1. Concerns regarding possible unintended consequences that might result 
from moving all target-specific initialization to precede the invocation 
of the handle_common_deferred_options () function are addressed by 
preserving the original initialization order and moving the relevant 
command-line processing to follow the target-specific initialization.


2. A question was raised as to whether Pmode can change with attribute 
target.  It cannot.



gcc/ChangeLog:

2016-01-27  Kelvin Nilsen  

* toplev.c (do_compile): Invoke finish_deferred_option_handling ()
upon return from process_options () and provide comment to explain
why. 
* opts-global.c (handle_common_deferred_options): Introduce static
variables opt_fstack_limit_symbol_arg and
opt_fstack_limit_register_no to remember command-line options
that cannot be processed until after target-specific
initialization has been completed.  Modify the handling of the
OPT_fstack_limit_register_ and OPT_fstack_limit_symbol_
command-line options to defer processing.
(finish_deferred_option_handling): New function to
complete the command-line processing associated with the
OPT_fstack_limit_register_ and OPT_fstack_limit_symbol_
command-line options.
* opts.h: New declaration of finish_deferred_option_handling.

gcc/testsuite/ChangeLog:

2016-01-27  Kelvin Nilsen  

* gcc.target/powerpc/pr48344-1.c: New test.


Index: gcc/testsuite/gcc.target/powerpc/pr48344-1.c
===
--- gcc/testsuite/gcc.target/powerpc/pr48344-1.c(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr48344-1.c(revision 232633)
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-fstack-limit-register=r2" } */
+void foo ()
+{
+  int N = 2;
+  int slots[N];
+
+}
+
Index: gcc/toplev.c
===
--- gcc/toplev.c(revision 232135)
+++ gcc/toplev.c(working copy)
@@ -1938,7 +1938,10 @@ standard_type_bitsize (int bitsize)
 static void
 do_compile ()
 {
+  /* process_options () performs target-specific initialization.  Upon
+ return from process_options (), Pmode has a meaningful value. */
   process_options ();
+  finish_deferred_option_handling ();
 
   /* Don't do any more if an error has already occurred.  */
   if (!seen_error ())
Index: gcc/opts.h
===
--- gcc/opts.h  (revision 232135)
+++ gcc/opts.h  (working copy)
@@ -372,6 +372,7 @@ extern void control_warning_option (unsigned int o
 extern char *write_langs (unsigned int mask);
 extern void print_ignored_options (void);
 extern void handle_common_deferred_options (void);
+extern void finish_deferred_option_handling (void);
 extern bool common_handle_option (struct gcc_options *opts,
  struct gcc_options *opts_set,
  const struct cl_decoded_option *decoded,
Index: gcc/opts-global.c
===
--- gcc/opts-global.c   (revision 232135)
+++ gcc/opts-global.c   (working copy)
@@ -310,6 +310,31 @@ decode_options (struct gcc_options *opts, struct g
   finish_options (opts, opts_set, loc);
 }
 
+/* During execution of handle_common_deferred_options (), the Pmode
+   variable cannot be used because it has not yet been initialized.
+   For this reason, handling of the OPT_fstack_limit_register_ and
+   OPT_fstack_limit_symbol_ options is deferred until execution
+   of finish_deferred_option_handling (), which is invoked following
+   target-specific initialization.
+
+   The variable opt_fstack_limit_symbol_arg represents the name
+   of the register specified in an OPT_fstack_limit_symbol_ command
+   line option

Re: [PATCH, aarch64] Fix pr69305 -- addti miscompilation

2016-01-27 Thread Richard Henderson
On 01/27/2016 09:31 AM, James Greenhalgh wrote:
>> +(define_insn "*add3_compare1_cconly"
> 
> I don't understand the naming scheme, it got me a wee bit confused with
> add3_compare0 and friends, where the 0 indicates a comparison with
> zero...

Ah, well, I didn't get that 0 indicated comparison with zero.
I thought it was simply zero-th out of a list of alternatives.
Certainly quite a lot of other patterns are like that.

Indeed, the add.*carryin patterns that I delete in this very
patch were marked [0-3].

>> +;; Note that a single add with carry is matched by cinc,
>> +;; and the adc_reg and csel types are matched into the same
>> +;; pipelines by existing cores.
> 
> I can't see us remembering to update this comment on pipeline models
> were it to ever become false. Maybe just drop it?

Fair enough; I'll just mention cinc without the scheduling note.


r~


Go patch committed: enable -fkeep-gc-roots-live for Go

2016-01-27 Thread Ian Lance Taylor
This patch enables the new -fkeep-gc-roots-live option for Go.  This
fixes https://golang.org/issue/13662 .  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

2016-01-27  Ian Lance Taylor  

* go-lang.c (go_langhook_init_options_struct): Default to
-fkeep-gc-roots-live.
Index: gcc/go/go-lang.c
===
--- gcc/go/go-lang.c(revision 232888)
+++ gcc/go/go-lang.c(working copy)
@@ -150,6 +150,9 @@
   opts->x_flag_exceptions = 1;
   opts->x_flag_non_call_exceptions = 1;
 
+  /* We need to keep pointers live for the garbage collector.  */
+  opts->x_flag_keep_gc_roots_live = 1;
+
   /* Go programs expect runtime.Callers to work, and that uses
  libbacktrace that uses debug info.  Set the debug info level to 1
  by default.  In post_options we will set the debug type if the


[committed] Build sanitizer builtins in lto on demand (PR lto/69254)

2016-01-27 Thread Jakub Jelinek
Hi!

On Tue, Jan 26, 2016 at 09:54:18PM +0100, Jakub Jelinek wrote:
> On Tue, Jan 26, 2016 at 04:44:39PM +0100, Jakub Jelinek wrote:
> > 2016-01-26  Jakub Jelinek  
> > 
> > PR lto/69254
> > * lto-opts.c (lto_write_options): Write also -f{,no-}sanitize=
> > options.
> > * lto-wrapper.c (struct lto_decoded_options): New type.
> > (append_option, merge_and_complain, append_compiler_options,
> > append_linker_options, append_offload_options,
> > compile_offload_image, compile_images_for_offload_targets,
> > find_and_merge_options): Pass around options
> > in struct lto_decoded_options instead of struct cl_decoded_option
> > pointer and count pair.
> > (get_options_from_collect_gcc_options): Likewise.  Parse -fsanitize=
> > options and if in the end any ub sanitizers are enabled, set
> > decoded_opts->sanitize_undefined to true.
> > (run_gcc): Adjust callers of these functions.  If
> > fdecoded_options.sanitize_undefined is true, append
> > -fsanitize=shift after the linker options.
> 
> Now successfully bootstrapped/regtested on x86_64-linux and i686-linux.

After IRC discussions, I've bootstrapped/regtested following patch instead
and Richard pre-approved it on IRC, thus I've committed it to trunk:

2016-01-27  Jakub Jelinek  

PR lto/69254
* sanitizer.def: Add BEGIN_SANITIZER_BUILTINS and
END_SANITIZER_BUILTINS markers using DEF_BUILTIN_STUB.
* asan.c (DEF_BUILTIN_STUB): Temporarily define.
* tree-streamer-in.c: Include asan.h.
(streamer_get_builtin_tree): For builtins in sanitizer
range call initialize_sanitizer_builtins and retry.

--- gcc/sanitizer.def.jj2016-01-04 14:55:53.0 +0100
+++ gcc/sanitizer.def   2016-01-27 16:12:47.285764673 +0100
@@ -20,12 +20,16 @@ along with GCC; see the file COPYING3.
 
 /* Before including this file, you should define a macro:
 
+ DEF_BUILTIN_STUB(ENUM, NAME)
  DEF_SANITIZER_BUILTIN (ENUM, NAME, TYPE, ATTRS)
 
See builtins.def for details.
The builtins are created by the C-family of FEs in c-family/c-common.c,
for other FEs by asan.c.  */
 
+/* This has to come before all the sanitizer builtins.  */
+DEF_BUILTIN_STUB(BEGIN_SANITIZER_BUILTINS, (const char *)0)
+
 /* Address Sanitizer */
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_INIT, "__asan_init",
  BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
@@ -515,3 +519,6 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_UBSAN_HAN
 DEF_SANITIZER_BUILTIN(BUILT_IN_SANITIZER_COV_TRACE_PC,
  "__sanitizer_cov_trace_pc",
  BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
+
+/* This has to come after all the sanitizer builtins.  */
+DEF_BUILTIN_STUB(END_SANITIZER_BUILTINS, (const char *)0)
--- gcc/asan.c.jj   2016-01-04 14:55:52.0 +0100
+++ gcc/asan.c  2016-01-27 16:20:06.304809636 +0100
@@ -2370,6 +2370,8 @@ initialize_sanitizer_builtins (void)
   /* ECF_COLD missing */ ATTR_CONST_NORETURN_NOTHROW_LEAF_LIST
 #undef ATTR_PURE_NOTHROW_LEAF_LIST
 #define ATTR_PURE_NOTHROW_LEAF_LIST ECF_PURE | ATTR_NOTHROW_LEAF_LIST
+#undef DEF_BUILTIN_STUB
+#define DEF_BUILTIN_STUB(ENUM, NAME)
 #undef DEF_SANITIZER_BUILTIN
 #define DEF_SANITIZER_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
   decl = add_builtin_function ("__builtin_" NAME, TYPE, ENUM,  \
@@ -2389,6 +2391,7 @@ initialize_sanitizer_builtins (void)
   ATTR_PURE_NOTHROW_LEAF_LIST)
 
 #undef DEF_SANITIZER_BUILTIN
+#undef DEF_BUILTIN_STUB
 }
 
 /* Called via htab_traverse.  Count number of emitted
--- gcc/tree-streamer-in.c.jj   2016-01-25 12:10:59.0 +0100
+++ gcc/tree-streamer-in.c  2016-01-27 16:20:40.294348588 +0100
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.
 #include "builtins.h"
 #include "ipa-chkp.h"
 #include "gomp-constants.h"
+#include "asan.h"
 
 
 /* Read a STRING_CST from the string table in DATA_IN using input
@@ -1136,13 +1137,21 @@ streamer_get_builtin_tree (struct lto_in
fatal_error (input_location,
 "machine independent builtin code out of range");
   result = builtin_decl_explicit (fcode);
-  if (!result
- && fcode > BEGIN_CHKP_BUILTINS
- && fcode < END_CHKP_BUILTINS)
+  if (!result)
{
- fcode = (enum built_in_function) (fcode - BEGIN_CHKP_BUILTINS - 1);
- result = builtin_decl_explicit (fcode);
- result = chkp_maybe_clone_builtin_fndecl (result);
+ if (fcode > BEGIN_CHKP_BUILTINS && fcode < END_CHKP_BUILTINS)
+   {
+ fcode = (enum built_in_function)
+ (fcode - BEGIN_CHKP_BUILTINS - 1);
+ result = builtin_decl_explicit (fcode);
+ result = chkp_maybe_clone_builtin_fndecl (result);
+   }
+ else if (fcode > BEGIN_SANITIZER_BUILTINS
+  && fcode < END_SANITIZER_BUILTINS)
+   {
+ initialize_sanitizer_builtins ();
+ result = builtin_decl_ex

Go patch committed: Accept map composite literals with omitted key types

2016-01-27 Thread Ian Lance Taylor
In the Go 1.5 release, the language was changed slightly to permit
omitting the type of a key in a composite literal of map type.  This
patch by Chris Manghane implements that in gccgo.  This fixes
https://golang.org/issue/10263 .  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 232858)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-9e68d67d65fd72b9b4f163f2f26e15cd0d3e2cd2
+8dce33f24dd3a34e3574c1d2604428586b63c1aa
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 232239)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -12583,69 +12583,7 @@ Map_construction_expression::do_dump_exp
   ast_dump_context->ostream() << "}";
 }
 
-// A general composite literal.  This is lowered to a type specific
-// version.
-
-class Composite_literal_expression : public Parser_expression
-{
- public:
-  Composite_literal_expression(Type* type, int depth, bool has_keys,
-  Expression_list* vals, bool all_are_names,
-  Location location)
-: Parser_expression(EXPRESSION_COMPOSITE_LITERAL, location),
-  type_(type), depth_(depth), vals_(vals), has_keys_(has_keys),
-  all_are_names_(all_are_names)
-  { }
-
- protected:
-  int
-  do_traverse(Traverse* traverse);
-
-  Expression*
-  do_lower(Gogo*, Named_object*, Statement_inserter*, int);
-
-  Expression*
-  do_copy()
-  {
-return new Composite_literal_expression(this->type_, this->depth_,
-   this->has_keys_,
-   (this->vals_ == NULL
-? NULL
-: this->vals_->copy()),
-   this->all_are_names_,
-   this->location());
-  }
-
-  void
-  do_dump_expression(Ast_dump_context*) const;
-  
- private:
-  Expression*
-  lower_struct(Gogo*, Type*);
-
-  Expression*
-  lower_array(Type*);
-
-  Expression*
-  make_array(Type*, const std::vector*, Expression_list*);
-
-  Expression*
-  lower_map(Gogo*, Named_object*, Statement_inserter*, Type*);
-
-  // The type of the composite literal.
-  Type* type_;
-  // The depth within a list of composite literals within a composite
-  // literal, when the type is omitted.
-  int depth_;
-  // The values to put in the composite literal.
-  Expression_list* vals_;
-  // If this is true, then VALS_ is a list of pairs: a key and a
-  // value.  In an array initializer, a missing key will be NULL.
-  bool has_keys_;
-  // If this is true, then HAS_KEYS_ is true, and every key is a
-  // simple identifier.
-  bool all_are_names_;
-};
+// Class Composite_literal_expression.
 
 // Traversal.
 
@@ -12664,12 +12602,17 @@ Composite_literal_expression::do_travers
   // The type may not be resolvable at this point.
   Type* type = this->type_;
 
-  for (int depth = this->depth_; depth > 0; --depth)
+  for (int depth = 0; depth < this->depth_; ++depth)
 {
   if (type->array_type() != NULL)
 type = type->array_type()->element_type();
   else if (type->map_type() != NULL)
-type = type->map_type()->val_type();
+{
+  if (this->key_path_[depth])
+type = type->map_type()->key_type();
+  else
+type = type->map_type()->val_type();
+}
   else
 {
   // This error will be reported during lowering.
@@ -12723,12 +12666,17 @@ Composite_literal_expression::do_lower(G
 {
   Type* type = this->type_;
 
-  for (int depth = this->depth_; depth > 0; --depth)
+  for (int depth = 0; depth < this->depth_; ++depth)
 {
   if (type->array_type() != NULL)
type = type->array_type()->element_type();
   else if (type->map_type() != NULL)
-   type = type->map_type()->val_type();
+{
+  if (this->key_path_[depth])
+type = type->map_type()->key_type();
+  else
+type = type->map_type()->val_type();
+}
   else
{
  if (!type->is_error())
Index: gcc/go/gofrontend/expressions.h
===
--- gcc/go/gofrontend/expressions.h (revision 232239)
+++ gcc/go/gofrontend/expressions.h (working copy)
@@ -47,6 +47,7 @@ class Bound_method_expression;
 class Field_reference_expression;
 class Interface_field_reference_expression;
 class Allocation_expression;
+class Composite_literal_expression;
 class Struct_construction_expression;
 class Array_c

Re: [C PATCH] Fix -Wunused-function (PR debug/66869)

2016-01-27 Thread Jakub Jelinek
Hi!

On Wed, Jan 27, 2016 at 11:17:18AM +0100, Richard Biener wrote:
> No, simply warn and set TREE_NO_WARNING so cgraph doesn't warn again.

This seems to work too, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

2016-01-25  Jakub Jelinek  

PR debug/66869
* c-decl.c (c_write_global_declarations_1): Warn with
warn_unused_function if static prototype without definition
is not C_DECL_USED.

* gcc.dg/pr66869.c: New test.

--- gcc/c/c-decl.c.jj   2016-01-25 22:33:11.813025064 +0100
+++ gcc/c/c-decl.c  2016-01-27 13:03:15.896068387 +0100
@@ -10741,11 +10741,22 @@ c_write_global_declarations_1 (tree glob
   if (TREE_CODE (decl) == FUNCTION_DECL
  && DECL_INITIAL (decl) == 0
  && DECL_EXTERNAL (decl)
- && !TREE_PUBLIC (decl)
- && C_DECL_USED (decl))
+ && !TREE_PUBLIC (decl))
{
- pedwarn (input_location, 0, "%q+F used but never defined", decl);
- TREE_NO_WARNING (decl) = 1;
+ if (C_DECL_USED (decl))
+   {
+ pedwarn (input_location, 0, "%q+F used but never defined", decl);
+ TREE_NO_WARNING (decl) = 1;
+   }
+ /* For -Wunused-function warn about unused static prototypes.  */
+ else if (warn_unused_function
+  && ! DECL_ARTIFICIAL (decl)
+  && ! TREE_NO_WARNING (decl))
+   {
+ warning (OPT_Wunused_function,
+  "%q+F declared % but never defined", decl);
+ TREE_NO_WARNING (decl) = 1;
+   }
}
 
   wrapup_global_declaration_1 (decl);
--- gcc/testsuite/gcc.dg/pr66869.c.jj   2016-01-27 12:59:46.997929005 +0100
+++ gcc/testsuite/gcc.dg/pr66869.c  2016-01-27 12:59:46.997929005 +0100
@@ -0,0 +1,6 @@
+/* PR debug/66869 */
+/* { dg-do compile } */
+/* { dg-options "-Wunused-function" } */
+
+static void test (void); /* { dg-warning "'test' declared 'static' but never 
defined" } */
+int i;


Jakub


[PATCH] Fix up a libstdc++ testsuite regression

2016-01-27 Thread Jakub Jelinek
Hi!

I've noticed recent
UNRESOLVED: libstdc++-prettyprinters/whatis.cc compilation failed to produce 
executable
regression and I bet it is caused by the  changes where it no
longer includes .  Tested on x86_64-linux, ok for trunk?

2016-01-27  Jakub Jelinek  

* testsuite/libstdc++-prettyprinters/whatis.cc: Include .

--- libstdc++-v3/testsuite/libstdc++-prettyprinters/whatis.cc   (revision 
232881)
+++ libstdc++-v3/testsuite/libstdc++-prettyprinters/whatis.cc   (working copy)
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 
 template
 void

Jakub


[P2][PR tree-optimization/68398] Refine when we allow the FSM threader to create irreducible inner loops

2016-01-27 Thread Jeff Law


Right now we only allow the FSM threader to create an irreducible inner 
loop when it's able to eliminate a multiway branch.  The theory being 
that the multiway branch is very expensive and potentially outweighs the 
value of loop optimizations.


That was a fine start, but as seen in 68398, further refinement is 
desirable.


Essentially in the 68398 case, we want to thread through the latch to a 
point in the loop that does not dominate the latch -- thus creating an 
irreducible inner loop.  But in the 68398 case, we aren't eliminating a 
multiway branch, just a simple conditional branch.


I stared at the jump threading paths, cfg and overall code for a long 
time over the last few days pondering what characteristics are important 
here.  In the end it seems so simple.


If the path has a high number of blocks relative to the number of 
statements in the path, then the path (and subsequent irreducible loop) 
really isn't likely to be helped by loop optimizations to begin with -- 
essentially the jump thread path is just a series of conditional 
branches where we know the result of the last one -- and there's very 
little computation going on in the path.


So that's what I'm keying on -- the ratio of blocks to statements in the 
path.   The higher that ratio, the more inclined we are to allow 
creation of the irreducible inner loop.


FWIW, I did look at simply allowing irreducible loops after the loop 
optimizers were complete -- that doesn't resolve the issues here. We 
need it to happen early so that the rest of the optimization pipeline 
(including dom, vrp2, dom2) see the jump-threaded code and refine it 
further.


We may still want to make that kind of change, but I'd like to see code 
that benefits from such a change rather than blindly going forward with it.


Going back to that ratio of blocks to statements.  We were counting PHIs 
in the thread path against the statement count.  Except for the final 
block, those PHIs are going to be const/copy propagated away.  So the 
code to count the PHIs was moved out of the loop and just looks at PHIs 
in the final block in the path.


Second we want to be able to compare the number of statements and the 
number of blocks in a jump threading path.  In particular we'd like to 
look at ratios between the two.  So there's two controlling PARAMS to 
scale the raw numbers, which then makes it easier to build ratios.  So 
we can say that once the number of blocks exceeds the number of 
statements by 1.5X, then allow creation of the irreducible loop (that's 
done by scaling the blocks by 3 and statements by 2.


Testing.  For the testcase in 68398, this improves things by another 
percent or two in terms of raw instruction counts.  I get something like 
206k-207k instruction references.


For pr66752-3.c the heuristic fires and we handle all 3 jump threads 
during VRP1 and obviously we continue to collapse away all 3 tests of FLAG.


For vrp46.c, ssa-dom-thread-2{c,d}.c I added statements on the jump 
threading path so that the heuristic wouldn't fire to preserve the 
spirit of those tests.


ssa-dom-thread-2{h,g}.c new tests, copied from the original 
ssa-dom-thread-2{c,d} which show the heuristic firing.


ssa-dom-thread-7.c does more threading early as the heuristic fires 
there.  So its expected output was adjusted.



Bootstrapped and regression tested on x86_64 linux.  Installing on the 
trunk.


This also fixes 3 minor issues Bernd spotted in the last round of changes.

Jeff
commit a5b3ed5d29ceb18b9e257894661f328462f581c8
Author: law 
Date:   Wed Jan 27 19:19:47 2016 +

PR tree-optimization/68398
* params.def (PARAM_FSM_SCALE_PATH_STMTS): New parameter.
(PARAM_FSM_SCALE_PATH_BLOCKS): Likewise.
* tree-ssa-threadbackward.c (fsm_find_control_statement_thread_paths):
Only count PHIs in the last block in the path.  The others will
const/copy propagate away.  Add heuristic to allow more irreducible
subloops to be created when it is likely profitable to do so.

* tree-ssa-threadbackward.c (fsm_find_control_statement_thread_paths):
Fix typo in comment.  Use gsi_after_labels and remove the GIMPLE_LABEL
check from within the loop.  Use gsi_next_nondebug rather than gsi_next.

PR tree-optimization/68398
* gcc.dg/tree-ssa/pr66752-3.c: Update expected output.
* gcc.dg/tree-ssa/ssa-dom-thread-2c.c: Add extra statements on thread
path to avoid new heuristic allowing more irreducible regions
* gcc.dg/tree-ssa/ssa-dom-thread-2d.c: Likewise.
* gcc.dg/tree-ssa/vrp46.c: Likewise.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Update expected output.
* gcc.dg/tree-ssa/ssa-dom-thread-2g.c: New test.
* gcc.dg/tree-ssa/ssa-dom-thread-2h.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@232897 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 1973060..76c7af2 100644

[PATCH] Don't change stack_alignment_needed for __tls_get_addr

2016-01-27 Thread H.J. Lu

__tls_get_addr must be called with 16-byte aligned stack, which is
guaranted by setting preferred_stack_boundary to 128 bits.  There
is no need to change stack_alignment_needed for __tls_get_addr.

Tested on x86-64.  OK for trunk?

H.J.
--
PR target/68986
* config/i386/i386.c (ix86_update_stack_boundary): Don't
change stack_alignment_needed for __tls_get_addr call.
---
 gcc/config/i386/i386.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index cfbdf0f..6dc1fa5 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12035,11 +12035,7 @@ ix86_update_stack_boundary (void)
   /* __tls_get_addr needs to be called with 16-byte aligned stack.  */
   if (ix86_tls_descriptor_calls_expanded_in_cfun
   && crtl->preferred_stack_boundary < 128)
-{
-  crtl->preferred_stack_boundary = 128;
-  if (crtl->stack_alignment_needed < 128)
-   crtl->stack_alignment_needed = 128;
-}
+crtl->preferred_stack_boundary = 128;
 }
 
 /* Handle the TARGET_GET_DRAP_RTX hook.  Return NULL if no DRAP is
-- 
2.5.0



Re: [PATCH] Fix up a libstdc++ testsuite regression

2016-01-27 Thread Jeff Law

On 01/27/2016 11:54 AM, Jakub Jelinek wrote:

Hi!

I've noticed recent
UNRESOLVED: libstdc++-prettyprinters/whatis.cc compilation failed to produce 
executable
regression and I bet it is caused by the  changes where it no
longer includes .  Tested on x86_64-linux, ok for trunk?

2016-01-27  Jakub Jelinek  

* testsuite/libstdc++-prettyprinters/whatis.cc: Include .

OK.
jeff



Re: [C PATCH] Fix -Wunused-function (PR debug/66869)

2016-01-27 Thread Jeff Law

On 01/27/2016 11:51 AM, Jakub Jelinek wrote:

Hi!

On Wed, Jan 27, 2016 at 11:17:18AM +0100, Richard Biener wrote:

No, simply warn and set TREE_NO_WARNING so cgraph doesn't warn again.


This seems to work too, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

2016-01-25  Jakub Jelinek  

PR debug/66869
* c-decl.c (c_write_global_declarations_1): Warn with
warn_unused_function if static prototype without definition
is not C_DECL_USED.

* gcc.dg/pr66869.c: New test.

OK.
jeff



Re: [PATCH 1/4] Make SRA scalarize constant-pool loads

2016-01-27 Thread David Edelsohn
The new sra-17.c and sra-18.c tests fail on AIX because the regex is
too restrictive -- AIX labels don't have exactly the same format.  On
AIX, the labels in the dumps look like "LC..0" instead of ".LC0".

This patch adds "*" and ".*" so that the "." prepended to LC is
optional and to allow characters between the "LC" and the "0".

I needed extra escapes for the sra-17.c line that matches multiple
times - for no apparent reason.

Okay?

Thanks, David

Index: sra-17.c
===
--- sra-17.c(revision 232872)
+++ sra-17.c(working copy)
@@ -15,5 +15,5 @@
   abort ();
 }

-/* { dg-final { scan-tree-dump-times "Removing load: a = \\\*.LC0;" 1
"esra" } } */
-/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ = \\\*.LC0\\\[" 4
"esra" } } */
+/* { dg-final { scan-tree-dump-times "Removing load: a =
\\\*.*LC.*0;" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*\\.*LC\\.*0\\\[" 4 "esra" } } */
Index: sra-18.c
===
--- sra-18.c(revision 232872)
+++ sra-18.c(working copy)
@@ -21,8 +21,8 @@
   abort ();
 }

-/* { dg-final { scan-tree-dump-times "Removing load: a = \\\*.LC0;" 1
"esra" } } */
-/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.LC0\\.b\\\[0\\\]\\.f\\\[0\\\]\\.x" 1 "esra" } } */
-/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.LC0\\.b\\\[0\\\]\\.f\\\[1\\\]\\.x" 1 "esra" } } */
-/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.LC0\\.b\\\[1\\\]\\.f\\\[0\\\]\\.x" 1 "esra" } } */
-/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.LC0\\.b\\\[1\\\]\\.f\\\[1\\\]\\.x" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "Removing load: a =
\\\*.*LC.*0;" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.*LC.*0\\.b\\\[0\\\]\\.f\\\[0\\\]\\.x" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.*LC.*0\\.b\\\[0\\\]\\.f\\\[1\\\]\\.x" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.*LC.*0\\.b\\\[1\\\]\\.f\\\[0\\\]\\.x" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.*LC.*0\\.b\\\[1\\\]\\.f\\\[1\\\]\\.x" 1 "esra" } } */



Re: [P2][PR tree-optimization/68398] Refine when we allow the FSM threader to create irreducible inner loops

2016-01-27 Thread Bernhard Reutner-Fischer
On January 27, 2016 8:20:47 PM GMT+01:00, Jeff Law  wrote:

>
>This also fixes 3 minor issues Bernd spotted in the last round of
>changes.

Bernhard, not Bernd i suppose. Note that the !is_gimple_debug (stmt) should now 
be redundant in the hunk below.
Thanks,

@@ -280,33 +280,19 @@ fsm_find_control_statement_thread_paths (tree name,
  break;
}
 
- for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+ for (gsi = gsi_after_labels (bb);
+  !gsi_end_p (gsi);
+  gsi_next_nondebug (&gsi))
{
  gimple *stmt = gsi_stmt (gsi);
  /* Do not count empty statements and labels.  */
  if (gimple_code (stmt) != GIMPLE_NOP
- && gimple_code (stmt) != GIMPLE_LABEL
  && !(gimple_code (stmt) == GIMPLE_ASSIGN
   && gimple_assign_rhs_code (stmt) == ASSERT_EXPR)
  && !is_gimple_debug (stmt))
++n_insns;
}

>
>Jeff




[PATCH, i386]: Fix PR69512: ICE when using avx with i586 (edit)

2016-01-27 Thread Uros Bizjak
Hello!

-march=i586 activates TARGET_ZERO_EXTEND_WITH_AND, which disables movz
instructions. These targets have to use and instructions, and when
flags are not allowed to be clobbered a sequence of register clear +
movstrict.

2016-01-27  Uros Bizjak  

PR target/69512
* config/i386/i386.md (*zext_doubleword_and): New pattern.
(*zext_doubleword): Disable for TARGET_ZERO_EXTEND_WITH_AND.

testsuite/ChangeLog:

2016-01-27  Uros Bizjak  

PR target/69512
* gcc.target/i386/pr69512.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 232878)
+++ config/i386/i386.md (working copy)
@@ -3874,10 +3874,29 @@
   [(set_attr "type" "imovx")
(set_attr "mode" "SI")])
 
+(define_insn_and_split "*zext_doubleword_and"
+  [(set (match_operand:DI 0 "register_operand" "=&")
+   (zero_extend:DI (match_operand:SWI12 1 "nonimmediate_operand" "m")))]
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
+   && TARGET_ZERO_EXTEND_WITH_AND && optimize_function_for_speed_p (cfun)"
+  "#"
+  "&& reload_completed && GENERAL_REG_P (operands[0])"
+  [(set (match_dup 2) (const_int 0))]
+{
+  split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);
+
+  emit_move_insn (operands[0], const0_rtx);
+
+  gcc_assert (!TARGET_PARTIAL_REG_STALL);
+  emit_insn (gen_movstrict
+(gen_lowpart (mode, operands[0]), operands[1]));
+})
+
 (define_insn_and_split "*zext_doubleword"
   [(set (match_operand:DI 0 "register_operand" "=r")
(zero_extend:DI (match_operand:SWI12 1 "nonimmediate_operand" "m")))]
-  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
+  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2
+   && !(TARGET_ZERO_EXTEND_WITH_AND && optimize_function_for_speed_p (cfun))"
   "#"
   "&& reload_completed && GENERAL_REG_P (operands[0])"
   [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
Index: testsuite/gcc.target/i386/pr69512.c
===
--- testsuite/gcc.target/i386/pr69512.c (nonexistent)
+++ testsuite/gcc.target/i386/pr69512.c (working copy)
@@ -0,0 +1,14 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-march=i586 -mavx -O2" } */
+
+extern double s1[];
+extern double s2[];
+extern long long e[];
+
+void test (void)
+{
+  int i;
+
+  for (i = 0; i < 2; i++)
+e[i] = !__builtin_isunordered(s1[i], s2[i]) && s1[i] != s2[i] ? -1 : 0;
+}


Re: [P2][PR tree-optimization/68398] Refine when we allow the FSM threader to create irreducible inner loops

2016-01-27 Thread Jeff Law

On 01/27/2016 12:39 PM, Bernhard Reutner-Fischer wrote:

On January 27, 2016 8:20:47 PM GMT+01:00, Jeff Law  wrote:



This also fixes 3 minor issues Bernd spotted in the last round of
changes.


Bernhard, not Bernd i suppose. Note that the !is_gimple_debug (stmt) should now 
be redundant in the hunk below.
Thanks,

Terribly sorry.  Yes, it should have been Bernhard.

The is_gimple_debug is not redundant -- the first statement in the block 
could be a debug statement.  If we had a gsi-first-non-debug-non-label 
function, we could use that and then the is_gimple_debug would be redundant


jeff



Re: [PATCH] Don't change stack_alignment_needed for __tls_get_addr

2016-01-27 Thread Uros Bizjak
On Wed, Jan 27, 2016 at 8:25 PM, H.J. Lu  wrote:
>
> __tls_get_addr must be called with 16-byte aligned stack, which is
> guaranted by setting preferred_stack_boundary to 128 bits.  There
> is no need to change stack_alignment_needed for __tls_get_addr.
>
> Tested on x86-64.  OK for trunk?

You know the purpose of these flags better than I, so - OK.

Thanks,
Uros.

>
> H.J.
> --
> PR target/68986
> * config/i386/i386.c (ix86_update_stack_boundary): Don't
> change stack_alignment_needed for __tls_get_addr call.
> ---
>  gcc/config/i386/i386.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index cfbdf0f..6dc1fa5 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12035,11 +12035,7 @@ ix86_update_stack_boundary (void)
>/* __tls_get_addr needs to be called with 16-byte aligned stack.  */
>if (ix86_tls_descriptor_calls_expanded_in_cfun
>&& crtl->preferred_stack_boundary < 128)
> -{
> -  crtl->preferred_stack_boundary = 128;
> -  if (crtl->stack_alignment_needed < 128)
> -   crtl->stack_alignment_needed = 128;
> -}
> +crtl->preferred_stack_boundary = 128;
>  }
>
>  /* Handle the TARGET_GET_DRAP_RTX hook.  Return NULL if no DRAP is
> --
> 2.5.0
>


Re: new port: vn8

2016-01-27 Thread Mike Stump
On Jan 26, 2016, at 10:33 PM, Nguyễn Sinh Ngọc  
wrote:
> I wonder that what paper is?
> Is it an introduction about new feature in our target?

I was not able to make any sense of these two question.  Likely a language 
barrier.  If you can find a way to rephrase them…  I can try and answer.

Re: [P2][PR tree-optimization/68398] Refine when we allow the FSM threader to create irreducible inner loops

2016-01-27 Thread Jakub Jelinek
On Wed, Jan 27, 2016 at 12:48:31PM -0700, Jeff Law wrote:
> On 01/27/2016 12:39 PM, Bernhard Reutner-Fischer wrote:
> >On January 27, 2016 8:20:47 PM GMT+01:00, Jeff Law  wrote:
> >
> >>
> >>This also fixes 3 minor issues Bernd spotted in the last round of
> >>changes.
> >
> >Bernhard, not Bernd i suppose. Note that the !is_gimple_debug (stmt) should 
> >now be redundant in the hunk below.
> >Thanks,
> Terribly sorry.  Yes, it should have been Bernhard.
> 
> The is_gimple_debug is not redundant -- the first statement in the block
> could be a debug statement.  If we had a gsi-first-non-debug-non-label
> function, we could use that and then the is_gimple_debug would be redundant

But then using gsi_next_nondebug doesn't make much sense.

Jakub


Re: [PATCH, testsuite] Fix g++.dg/pr67989.C test failure when running with -march or -mcpu

2016-01-27 Thread Mike Stump
On Jan 26, 2016, at 10:35 PM, Thomas Preud'homme 
 wrote:
> On Monday, January 18, 2016 11:33:47 AM Thomas Preud'homme wrote:
>> On Wednesday, January 13, 2016 06:39:20 PM Bernd Schmidt wrote:
>>> On 01/12/2016 08:55 AM, Thomas Preud'homme wrote:
 On Monday, January 11, 2016 04:57:18 PM Bernd Schmidt wrote:
> On 01/08/2016 10:33 AM, Thomas Preud'homme wrote:
>> 2016-01-08  Thomas Preud'homme  
>> 
>>  * g++.dg/pr67989.C: Remove ARM-specific option.
>>  * gcc.target/arm/pr67989.C: New file.
> 
> I checked some other arm tests and they have things like
> 
> /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } {
> "-march=*" } { "-march=armv4t" } } */
> /* { dg-skip-if "avoid conflicting multilib options" { *-*-* } {
> "-mthumb" } { "" } } */
> 
> Do you need the same in your testcase?
 
 That was the first approach I took but Kyrill suggested me to use
 arm_arch_v4t and arm_arch_v4t_ok machinery instead. It should take care
 about whether the architecture can be selected.
>>> 
>>> Hmm, the ones I looked at did use dg-add-options, but not the
>>> corresponding _ok requirement. So I think this is OK.
>> 
>> Just to make sure: ok as in OK to commit as is?

Ok.


Re: [PATCH, testsuite]: Disable some tests in gcc.dg/torture/pr68264.c for older glibcs

2016-01-27 Thread Mike Stump
On Jan 27, 2016, at 12:58 AM, Uros Bizjak  wrote:
> 2016-01-27  Uros Bizjak  
> 
>* gcc.dg/torture/pr68264.c: Disable log1p test for glibc < 2.22
>and expm1 test for glibc < 2.11.
> 
> Tested on x86_64-linux-gnu, CentOS 5.11 and Fedora 23.
> 
> OK for mainline?

So, unless we have a glibc style person that wants to comment further on it, I 
think the patch is Ok.

The only part I was worried about, is that you included defined(__GLIBC__) &&, 
and you did.  Darwin and newlib for example, won’t define that.  You also 
included the glibc bug number, which is excellent.

RFA: patch to fix PR69299

2016-01-27 Thread Vladimir Makarov

  The following patch fixes PR69299.

  The details of the problem is described on

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69299

  The patch was successfully bootstrapped and tested on x86/x86-64.

  The patch introduces a new type of constraints 
define_special_memory_constraint for memory constraints whose address 
reload can not make memory to satisfy the constraint.  It is useful when 
specifically aligned memory is necessary or desirable. I don't know what 
is the best name for this constraint.  I use special_memory_constraint 
but it could be more specific, e.g. aligned_memory_constraint.  Please 
let me know what is the best name for you.


  Is the patch ok to commit?




Index: ChangeLog
===
--- ChangeLog	(revision 232903)
+++ ChangeLog	(working copy)
@@ -1,3 +1,37 @@
+2016-01-27  Vladimir Makarov  
+
+	PR target/68990
+	* config/i386/constraints.md (Bm): Describe as special memory
+	constraint.
+	* doc/md.texi (DEFINE_SPECIAL_MEMORY_CONSTRAINT): Describe it.
+	* genoutput.c (main): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT.
+	* genpreds.c (struct constraint_data): Add is_special_memory.
+	(have_special_memory_constraints, special_memory_start): New
+	static vars.
+	(special_memory_end): Ditto.
+	(add_constraint): Add new arg is_special_memory.  Add code to
+	process its true value.  Update have_special_memory_constraints.
+	(process_define_constraint): Pass the new arg.
+	(process_define_register_constraint): Ditto.
+	(choose_enum_order): Process special memory.
+	(write_tm_preds_h): Generate enum const CT_SPECIAL_MEMORY and
+	function insn_extra_special_memory_constraint.
+	(main): Process DEFINE_SPECIAL_MEMORY_CONSTRAINT.
+	* gensupport.c (process_rtx): Process
+	DEFINE_SPECIAL_MEMORY_CONSTRAINT.
+	* ira-costs.c (record_reg_classes): Process CT_SPECIAL_MEMORY.
+	* ira-lives.c (single_reg_class): Use
+	insn_extra_special_memory_constraint.
+	* ira.c (ira_setup_alts): Process CT_SPECIAL_MEMORY.
+	* lra-constraints.c (process_alt_operands): Ditto.
+	(curr_insn_transform): Use insn_extra_special_memory_constraint.
+	* recog.c (asm_operand_ok, preprocess_constraints): Process
+	CT_SPECIAL_MEMORY.
+	* reload.c (find_reloads): Ditto.
+	* rtl.def (DEFINE_SPECIFAL_MEMORY_CONSTRAINT): New.
+	* stmt.c (parse_input_constraint): Use
+	insn_extra_special_memory_constraint.
+
 2016-01-27  H.J. Lu  
 
 	PR target/68986
Index: config/i386/constraints.md
===
--- config/i386/constraints.md	(revision 232571)
+++ config/i386/constraints.md	(working copy)
@@ -161,7 +161,7 @@
   "@internal GOT memory operand."
   (match_operand 0 "GOT_memory_operand"))
 
-(define_constraint "Bm"
+(define_special_memory_constraint "Bm"
   "@internal Vector memory operand."
   (match_operand 0 "vector_memory_operand"))
 
Index: doc/md.texi
===
--- doc/md.texi	(revision 232571)
+++ doc/md.texi	(working copy)
@@ -4424,6 +4424,20 @@ The syntax and semantics are otherwise i
 @code{define_constraint}.
 @end deffn
 
+@deffn {MD Expression} define_special_memory_constraint name docstring exp
+Use this expression for constraints that match a subset of all memory
+operands: that is, @code{reload} can not make them match by reloading
+the address as it is described for @code{define_memory_constraint} or
+such address reload is undesirable with the performance point of view.
+
+For example, @code{define_special_memory_constraint} can be useful if
+specifically aligned memory is necessary or desirable for some insn
+operand.
+
+The syntax and semantics are otherwise identical to
+@code{define_constraint}.
+@end deffn
+
 @deffn {MD Expression} define_address_constraint name docstring exp
 Use this expression for constraints that match a subset of all address
 operands: that is, @code{reload} can make the constraint match by
Index: genoutput.c
===
--- genoutput.c	(revision 232571)
+++ genoutput.c	(working copy)
@@ -1019,6 +1019,7 @@ main (int argc, char **argv)
   case DEFINE_REGISTER_CONSTRAINT:
   case DEFINE_ADDRESS_CONSTRAINT:
   case DEFINE_MEMORY_CONSTRAINT:
+  case DEFINE_SPECIAL_MEMORY_CONSTRAINT:
 	note_constraint (&info);
 	break;
 
Index: genpreds.c
===
--- genpreds.c	(revision 232571)
+++ genpreds.c	(working copy)
@@ -659,11 +659,11 @@ write_one_predicate_function (struct pre
 
 /* Constraints fall into two categories: register constraints
(define_register_constraint), and others (define_constraint,
-   define_memory_constraint, define_address_constraint).  We
-   work out automatically which of the various old-style macros
-   they correspond to, and produce appropriate code.  They all
-   go in the same hash table so we can verify that there are no
-   duplicate names.  */
+   define_memory_con

Re: C++ PATCH for c++/69496 (ICE on VLA in constexpr function)

2016-01-27 Thread Jason Merrill

On 01/27/2016 11:52 AM, Martin Sebor wrote:

I agree that there is some risk that it might break some working
programs.  I would expect the most common use of initialized VLAs
be to set all elements to zero using the "= { }" or "= { 0 }"
syntax.  Initializers with more elements are, IMO, likely to be
a bug where the user doesn't realize they defined a VLA rather
than an ordinary array.  Since VLAs are required to have at least
1 element, would diagnosing initializers with more than one element
more loudly (such as by default or with -Wall as opposed to with
-Wpedantic) be a good solution?


That makes sense to me.

Jason




Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)

2016-01-27 Thread Bernd Edlinger
On 26.01.2016 22:18, Richard Sandiford wrote:
> [cc-ing Eric as RTL maintainer]
>
> Matthew Fortune  writes:
>> Bernd Edlinger  writes:
>>> Matthew Fortune  writes:
 Has the patch been tested beyond just building GCC? I can do a
 test run for you if you don't have things set up to do one yourself.
>>>
>>> I built a cross-gcc with all languages and a cross-glibc, but I have
>>> not set up an emulation environment, so if you could give it a test
>>> that would be highly welcome.
>>
>> mipsel-linux-gnu test results are the same before and after this patch.
>>
>> Please go ahead and commit.
>
> I still object to this.  And it feels like the patch was posted
> as though it was a new one in order to avoid answering the objections
> that were raised when it was last posted:
>
>https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02218.html
>

Richard, I am really sorry when you feel now like I did not take your
objections seriously.  Let me first explain what happened from my point
of view:

When I posted this response to your objections here:

  https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00235.html

I was waiting for your response, but nothing happened, so I kind of
forgot about the issue.  In the meantime Ubuntu and Debian began to
roll out GCC-6 and they got stuck at the same issue, but instead of
waiting for pr69012 to be eventually resolved they created a duplicate
pr69129, and then last Thursday Nick applied my initial patch without
my intervention, probably because of the pressure they put on us.

That changed significantly how I looked at the issue after that point,
as there was no actual regression anymore, the revised patch still made
sense, but for another reason.  When you look at the 20+ targets in the
gcc tree you'll see that almost all of them have a frame-layout
computation function and all except mips have a shortcut
"if (reload_completed) return;" in that function.  And OTOH mips has
one of the most complicated frame layout functions of all targets.

For all of these reasons I posted a new patch which tries to resolve
differences between mips and other targets inital_elimination_offset
functions.

I still think that it is not OK for the mips target to do the frame
layout already in mips_frame_pointer_required because the frame layout
will change several times, until reload is completed, and that function
is only called right in the beginning.

And I think that it is not really well-designed to have a frame layout
function F(x,y)->z unless you assume F to be a trivial O(1) function
that has no significant computation overhead.  When you assume F to be
an expensive O(N) or higher complexity you would design the interface
like compute_frame_layout_now() and
get_cached_initial_elimination_offset(x,y)->z.

Also note, that with the current design, of initial_elimination_offset
the frame layout is already computed several times, because reload has
to call the function with different parameters, and there is no way how
to know when the cached value can be used and when not.

I do also think that having to cache the initial elimination offset
values in the back-end that are already cached in the target does not
improve anything.  Then I would prefer to have a set of new target
callbacks that makes it easy to access the cached target values.

If you want to propose a patch for that I will not object, but I doubt
it will be suitable for Stage 4.


Thanks
Bernd.


> IMO the problem is that rtx_addr_can_trap_p_1 duplicates a large
> bit of LRA/reload logic:
>
> 
> /* Compute an approximation for the offset between the register
> FROM and TO for the current function, as it was at the start
> of the routine.  */
>
> static HOST_WIDE_INT
> get_initial_register_offset (int from, int to)
> {
> #ifdef ELIMINABLE_REGS
>static const struct elim_table_t
>{
>  const int from;
>  const int to;
>} table[] = ELIMINABLE_REGS;
>HOST_WIDE_INT offset1, offset2;
>unsigned int i, j;
>
>if (to == from)
>  return 0;
>
>/* It is not safe to call INITIAL_ELIMINATION_OFFSET
>   before the reload pass.  We need to give at least
>   an estimation for the resulting frame size.  */
>if (! reload_completed)
>  {
>offset1 = crtl->outgoing_args_size + get_frame_size ();
> #if !STACK_GROWS_DOWNWARD
>offset1 = - offset1;
> #endif
>if (to == STACK_POINTER_REGNUM)
>   return offset1;
>else if (from == STACK_POINTER_REGNUM)
>   return - offset1;
>else
>   return 0;
>   }
>
>for (i = 0; i < ARRAY_SIZE (table); i++)
>if (table[i].from == from)
>   {
> if (table[i].to == to)
>   {
> INITIAL_ELIMINATION_OFFSET (table[i].from, table[i].to,
> offset1);
> return offset1;
>   }
> for (j = 0; j < ARRAY_SIZE (table); j++)
>   {
>  

Re: [PATCH] Low-hanging C++-lexer speedup (PR c++/24208)

2016-01-27 Thread Jason Merrill

On 01/25/2016 11:45 PM, Patrick Palka wrote:

+/* This needs to be set to TRUE before the lexer-debugging infrastructure can
+   be used.  The point of this flag is to help the compiler to fold away calls
+   to cp_lexer_debugging_p within this source file at compile time, when the
+   lexer is not being debugged.  */
+
+#define LEXER_DEBUGGING_ENABLED_P false


I wonder about using an existing flag, such as CHECKING_P, but I guess 
this is OK too.


Jason



Re: [Patch AArch64] Restrict 16-bit sqrdml{sa}h instructions to FP_LO_REGS

2016-01-27 Thread Evandro Menezes

On 01/26/16 10:04, James Greenhalgh wrote:

Hi,

In their forms using 16-bit lanes, the sqrdmlah and sqrdmlsh instruction
available when compiling with -march=armv8.1-a are only usable with
a register number in the range 0 to 15 for operand 3, as gas will point
out:

   Error: register number out of range 0 to 15 at
 operand 3 -- `sqrdmlsh v2.4h,v4.4h,v23.h[5]'

This patch teaches GCC to avoid registers outside of this range when
appropriate, in the same fashion as we do for other instructions with
this limitation.

Tested on an internal testsuite targeting Neon intrinsics.

OK?

Thanks,
James

---
2016-01-25  James Greenhalgh  

* config/aarch64/aarch64.md
(arch64_sqrdmlh_lane): Fix register
constraints for operand 3.
(aarch64_sqrdmlh_laneq): Likewise.



LGTM

--
Evandro Menezes



RE: [PATCH] MIPS: Prevent the p5600-bonding.c test from being run for the n32 and 64 ABIs

2016-01-27 Thread Moore, Catherine


> -Original Message-
> From: Andrew Bennett [mailto:andrew.benn...@imgtec.com]
> Sent: Wednesday, January 20, 2016 9:42 AM
> To: Matthew Fortune; gcc-patches@gcc.gnu.org
> Cc: Moore, Catherine
> Subject: RE: [PATCH] MIPS: Prevent the p5600-bonding.c test from being run
> for the n32 and 64 ABIs
> 
> Ping.
> 

This is OK now.

> 
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org
> > [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Andrew Bennett
> > Sent: 02 September 2015 14:55
> > To: Matthew Fortune; gcc-patches@gcc.gnu.org
> > Cc: Moore, Catherine (catherine_mo...@mentor.com)
> > Subject: RE: [PATCH] MIPS: Prevent the p5600-bonding.c test from being
> > run for the n32 and 64 ABIs
> >
> > > > diff --git a/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > > > b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > > > index 0890ffa..20c26ca 100644
> > > > --- a/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > > > +++ b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > > > @@ -1,6 +1,7 @@
> > > >  /* { dg-do compile } */
> > > >  /* { dg-options "-dp -mtune=p5600  -mno-micromips -mno-mips16" }
> > > > */
> > > >  /* { dg-skip-if "Bonding needs peephole optimization." { *-*-* } { 
> > > > "-O0"
> > "-
> > > O1" } { "" } }
> > > > */
> > > > +/* { dg-skip-if "There is no DI mode support for load/store
> > > > +bonding" { *-
> > *-
> > > * } { "-
> > > > mabi=n32" "-mabi=64" } { "" } } */  typedef int VINT32
> > > > __attribute__ ((vector_size((16;
> > >
> > > If the best fix we can do for this test is to limit what it tests
> > > then we should still not just skip it. There is some precedence for
> > > tests that require a specific arch with the isa=loongson special
> > > case. I'd rather just lock the test down to p5600 as per the filename.
> >
> > I have changed the testcase's dg-options so that it is only built for p5600.
> > The updated patch and ChangeLog are below.
> >
> > Ok to commit?
> >
> > Many thanks,
> >
> >
> >
> > Andrew
> >
> >
> > testsuite/
> > * gcc.target/mips/p5600-bonding.c (dg-options): Force the test to be
> > always
> > built for p5600.
> > * gcc.target/mips/mips.exp (mips-dg-options): Add support for the
> > isa=p5600
> > dg-option.
> >
> >
> > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp
> > b/gcc/testsuite/gcc.target/mips/mips.exp
> > index 42e7fff..e8d1895 100644
> > --- a/gcc/testsuite/gcc.target/mips/mips.exp
> > +++ b/gcc/testsuite/gcc.target/mips/mips.exp
> > @@ -142,6 +142,9 @@
> >  #   isa=loongson
> >  #  select a Loongson processor
> >  #
> > +#   isa=p5600
> > +#  select a P5600 processor
> > +#
> >  #   addressing=absolute
> >  #  force absolute addresses to be used
> >  #
> > @@ -1009,6 +1012,10 @@ proc mips-dg-options { args } {
> > if { ![regexp {^-march=loongson} $arch] } {
> > set arch "-march=loongson2f"
> > }
> > +   } elseif { [string equal $spec "isa=p5600"] } {
> > +   if { ![regexp {^-march=p5600} $arch] } {
> > +   set arch "-march=p5600"
> > +   }
> > } else {
> > if { ![regexp {^(isa(?:|_rev))(=|<=|>=)([0-9]*)$} \
> >$spec dummy prop relation value nocpus] } {
> > diff --git a/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > index 0890ffa..0bc6d91 100644
> > --- a/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > +++ b/gcc/testsuite/gcc.target/mips/p5600-bonding.c
> > @@ -1,5 +1,5 @@
> >  /* { dg-do compile } */
> > -/* { dg-options "-dp -mtune=p5600  -mno-micromips -mno-mips16" } */
> > +/* { dg-options "-dp isa=p5600 -mtune=p5600 -mno-micromips
> > +-mno-mips16" } */
> >  /* { dg-skip-if "Bonding needs peephole optimization." { *-*-* } {
> > "-O0" "- O1" } { "" } } */  typedef int VINT32 __attribute__
> > ((vector_size((16;



[lra, committed] Fix PR 69447

2016-01-27 Thread Richard Henderson
This PR appears to be related to PR 66424, which Vlad fixed back in September,
but different in that the insn being rematerialized used the entire DImode
pseudo instead of an SImode subreg of a DImode pseudo.

The effect was the same, however, in that it lengthened the lifetime of one
half of the double-word pseudo, causing it to conflict with the coloring set up
by IRA.

While this solves the problem by refusing to remat any double-word pseudo that
has ever had one of its subreg taken, a more complete solution is to also sync
IRA and LRA so that both of them track lifetimes of subregs.  Not for now,
obviously.

Tested on x86_64, i686, and armv7hf-linux.
Approved by Vlad in the PR.


r~
PR rtl-opt/69447
* lra-remat.c (subreg_regs): New.
(dump_candidates_and_remat_bb_data): Dump it.
(operand_to_remat): Reject if operand in subreg_regs.
(set_bb_regs): Collect subreg_regs.
(lra_remat): Init and free subreg_regs.  Compute
calculate_local_reg_remat_bb_data before create_cands.

testsuite/
* gcc.c-torture/execute/pr69447.c: New test.


diff --git a/gcc/lra-remat.c b/gcc/lra-remat.c
index 6f490b9..4d8099f 100644
--- a/gcc/lra-remat.c
+++ b/gcc/lra-remat.c
@@ -77,6 +77,9 @@ static int call_used_regs_arr[FIRST_PSEUDO_REGISTER];
 /* Bitmap used for different calculations.  */
 static bitmap_head temp_bitmap;
 
+/* Registers accessed via subreg_p.  */
+static bitmap_head subreg_regs;
+
 typedef struct cand *cand_t;
 typedef const struct cand *const_cand_t;
 
@@ -383,30 +386,30 @@ operand_to_remat (rtx_insn *insn)
 return -1;
   /* First find a pseudo which can be rematerialized.  */
   for (reg = id->regs; reg != NULL; reg = reg->next)
-/* True FRAME_POINTER_NEEDED might be because we can not follow
-   changing sp offsets, e.g. alloca is used.  If the insn contains
-   stack pointer in such case, we can not rematerialize it as we
-   can not know sp offset at a rematerialization place.  */
-if (reg->regno == STACK_POINTER_REGNUM && frame_pointer_needed)
-  return -1;
-else if (reg->type == OP_OUT && ! reg->subreg_p
-&& find_regno_note (insn, REG_UNUSED, reg->regno) == NULL)
-  {
-   /* We permits only one spilled reg.  */
-   if (found_reg != NULL)
- return -1;
-   found_reg = reg;
-  }
-/* IRA calculates conflicts separately for subregs of two words
-   pseudo.  Even if the pseudo lives, e.g. one its subreg can be
-   used lately, another subreg hard register can be already used
-   for something else.  In such case, it is not safe to
-   rematerialize the insn.  */
-else if (reg->type == OP_IN && reg->subreg_p
-&& reg->regno >= FIRST_PSEUDO_REGISTER
-&& (GET_MODE_SIZE (PSEUDO_REGNO_MODE (reg->regno))
-== 2 * UNITS_PER_WORD))
-  return -1;
+{
+  /* True FRAME_POINTER_NEEDED might be because we can not follow
+changing sp offsets, e.g. alloca is used.  If the insn contains
+stack pointer in such case, we can not rematerialize it as we
+can not know sp offset at a rematerialization place.  */
+  if (reg->regno == STACK_POINTER_REGNUM && frame_pointer_needed)
+   return -1;
+  else if (reg->type == OP_OUT && ! reg->subreg_p
+  && find_regno_note (insn, REG_UNUSED, reg->regno) == NULL)
+   {
+ /* We permits only one spilled reg.  */
+ if (found_reg != NULL)
+   return -1;
+ found_reg = reg;
+}
+  /* IRA calculates conflicts separately for subregs of two words
+pseudo.  Even if the pseudo lives, e.g. one its subreg can be
+used lately, another subreg hard register can be already used
+for something else.  In such case, it is not safe to
+rematerialize the insn.  */
+  if (reg->regno >= FIRST_PSEUDO_REGISTER
+ && bitmap_bit_p (&subreg_regs, reg->regno))
+   return -1;
+}
   if (found_reg == NULL)
 return -1;
   if (found_reg->regno < FIRST_PSEUDO_REGISTER)
@@ -631,6 +634,9 @@ dump_candidates_and_remat_bb_data (void)
   lra_dump_bitmap_with_title ("avout cands in BB",
  &get_remat_bb_data (bb)->avout_cands, 
bb->index);
 }
+  fprintf (lra_dump_file, "subreg regs:");
+  dump_regset (&subreg_regs, lra_dump_file);
+  putc ('\n', lra_dump_file);
 }
 
 /* Free all BB data.  */
@@ -655,21 +661,24 @@ finish_remat_bb_data (void)
 
 
 
-/* Update changed_regs and dead_regs of BB from INSN.  */
+/* Update changed_regs, dead_regs, subreg_regs of BB from INSN.  */
 static void
 set_bb_regs (basic_block bb, rtx_insn *insn)
 {
   lra_insn_recog_data_t id = lra_get_insn_recog_data (insn);
+  remat_bb_data_t bb_info = get_remat_bb_data (bb);
   struct lra_insn_reg *reg;
 
   for (reg = id->regs; reg != NULL; reg = reg->next)
-if (reg->type != OP_IN)
-  bitmap_set_bit (&get_remat_bb_data (bb)->changed_regs, reg->regno);
-else
-  

Re: [PATCH] Low-hanging C++-lexer speedup (PR c++/24208)

2016-01-27 Thread Patrick Palka
On Wed, Jan 27, 2016 at 4:33 PM, Jason Merrill  wrote:
> On 01/25/2016 11:45 PM, Patrick Palka wrote:
>>
>> +/* This needs to be set to TRUE before the lexer-debugging infrastructure
>> can
>> +   be used.  The point of this flag is to help the compiler to fold away
>> calls
>> +   to cp_lexer_debugging_p within this source file at compile time, when
>> the
>> +   lexer is not being debugged.  */
>> +
>> +#define LEXER_DEBUGGING_ENABLED_P false
>
>
> I wonder about using an existing flag, such as CHECKING_P, but I guess this
> is OK too.

The way I look at it is that you need to deliberately add calls to
cp_lexer_[start|stop]_debugging in parser.c in order to make use of
the lexer debugging stuff anyway, so it's not too invasive to guard
this functionality behind a distinct flag that needs to be manually
toggled within parser.c.  And it is nice to avoid this overhead even
when CHECKING_P, to provide a minor speed-up in bootstrap times and
stuff.


[PATCH] add test for target/17381 - Unnecessary register move for float extend

2016-01-27 Thread Martin Sebor

The attached patch adds a test for the apparently long fixed
bug.

FWIW, I've been trying to close out some of these old bugs and
while it doesn't seem to be done consistently, it occurs to me
that it might be nice to add tests for them.  Please let me
know if you don't think it's worth the trouble (not just mine
but also reviewing the tests and maintaining them).

Martin
PR target/17381 - Unnecessary register move for float extend

gcc/testsuite/ChangeLog:
2016-01-27  Martin Sebor  

	PR target/17381
	* gcc.target/powerpc/pr17381.c: New test.
Index: gcc/testsuite/gcc.target/powerpc/pr17381.c
===
--- gcc/testsuite/gcc.target/powerpc/pr17381.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr17381.c	(working copy)
@@ -0,0 +1,11 @@
+/* PR target/17381 - Unnecessary register move for float extend */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+double d;
+float test1(float fParm)
+{
+  d = fParm + 1.0;
+  return fParm + 1.0f;
+}
+/* { dg-final { scan-assembler-times "fmr" 1 } } */


[Patch, Fortran, committed] PR 69484: [5/6 Regression] documentation issue: -Wtabs and -Wall

2016-01-27 Thread Janus Weil
Hi all,

I have just committed a rather obvious documentation fix to trunk:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=232906

I'll also backport this to the gcc-5 branch soonish ...

Cheers,
Janus


Another suspicious code hunk in c-family/c-common.c

2016-01-27 Thread Andrew MacLeod
another chock with the static type code in 
c-family/c-common:handle_deprecated_attribute()

 around line 8919-8923


handle_deprecated_attribute (tree *node, tree name,
 tree args, int flags,
 bool *no_add_attrs)
{
<...>
if (DECL_P (*node))
{
  tree decl = *node;
  type = TREE_TYPE (decl);
<...>
}
else if (TYPE_P (*node))
{
  if (!(flags & (int) ATTR_FLAG_TYPE_IN_PLACE))
*node = build_variant_type_copy (*node);
  TREE_DEPRECATED (*node) = 1;
  type = *node;
}
<...>

if (type && TYPE_NAME (type))
{
  if (TREE_CODE (TYPE_NAME (type)) == IDENTIFIER_NODE)
(*) what = TYPE_NAME (*node);
  else if (TREE_CODE (TYPE_NAME (type)) == TYPE_DECL
   && DECL_NAME (TYPE_NAME (type)))
what = DECL_NAME (TYPE_NAME (type));
}
  if (what)
warning (OPT_Wattributes, "%qE attribute ignored for %qE", 
name, what);



I think the (*) line is suppose to be
  what = TYPE_NAME (type);

It will work fine for the case where a type is passed in since type is 
equivilent to *name,  but in some cases of DECL I'd say it's not looking 
at the correct field..


Andrew


[PATCH, rs6000] Disable static branch prediction in absence of real profile data

2016-01-27 Thread Pat Haugen
The following patch prevents static prediction if we don't have real 
profile data. Testing on SPEC CPU2006 showed a couple improvements in 
specint and specfp neutral. Bootstrap/regtest on powerpc64 with no new 
regressions. Ok for trunk?


-Pat


2016-01-27  Pat Haugen  

* config/rs6000/rs6000.c (output_cbranch): Don't statically predict
branches if using guessed profile.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c	(revision 232881)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -21424,14 +21424,15 @@ output_cbranch (rtx op, const char *labe
   /* PROB is the difference from 50%.  */
   int prob = XINT (note, 0) - REG_BR_PROB_BASE / 2;
 
-  /* Only hint for highly probable/improbable branches on newer
-	 cpus as static prediction overrides processor dynamic
-	 prediction.  For older cpus we may as well always hint, but
+  /* Only hint for highly probable/improbable branches on newer cpus when
+	 we have real profile data, as static prediction overrides processor
+	 dynamic prediction.  For older cpus we may as well always hint, but
 	 assume not taken for branches that are very close to 50% as a
 	 mispredicted taken branch is more expensive than a
 	 mispredicted not-taken branch.  */
   if (rs6000_always_hint
 	  || (abs (prob) > REG_BR_PROB_BASE / 100 * 48
+	  && (profile_status_for_fn (cfun) != PROFILE_GUESSED)
 	  && br_prob_note_reliable_p (note)))
 	{
 	  if (abs (prob) > REG_BR_PROB_BASE / 20


Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-01-27 Thread Evandro Menezes

On 01/22/16 07:52, Wilco Dijkstra wrote:

On 12/16/2015 03:30 PM, Evandro Menezes wrote:

 On 10/30/2015 05:24 AM, Marcus Shawcroft wrote:

 On 20 October 2015 at 00:40, Evandro Menezes  
wrote:

 In the existing targets, it seems that it's always faster to zero 
up a DF

 register with "movi %d0, #0" instead of "fmov %d0, xzr".

 This patch modifies the respective pattern.


 Hi Evandro,

 This patch changes the generic, u architecture independent instruction
 selection. The ARM ARM (C3.5.3) makes a specific recommendation about
 the choice of instruction in this situation and the current
 implementation in GCC follows that recommendation.  Wilco has also
 picked up on this issue he has the same patch internal to ARM along
 with an ongoing discussion with ARM architecture folk regarding this
 recommendation.  I'm reluctant to take this patch right now on the
 basis that it runs contrary to ARM ARM recommendation pending the
 conclusion of Wilco's discussion with ARM architecture folk.


 Have you had a chance to discuss this internally further?

Yes, it was decided to remove the recommendation from future ARM ARM's.

Several review comments on your patch 
(https://patchwork.ozlabs.org/patch/532736):

* This should be added to movhf, movsf and movdf - movtf already does this.
* It is important to set the "fp" and "simd" attributes so that the movi 
variant can
only be selected if it is available.



Hi, Wilco.

   2016-01-27 Evandro Menezes 

Replace insn to zero up SIMD registers

gcc/
* config/aarch64/aarch64.md
(*movhf_aarch64): Add "movi %0, #0" to zero up register.
(*movsf_aarch64): Likewise.
(*movdf_aarch64): Likewise.

When this decision is final, methinks that this patch would be close to 
addressing this change.


I have a question though: is it necessary to add the "fp" and "simd" 
attributes to both movsf_aarch64 and movdf_aarch64 as well?


Thank you,

--
Evandro Menezes

>From b390d411b56cfcdedf4601d0487baf1ef84e79ab Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Mon, 19 Oct 2015 18:31:48 -0500
Subject: [PATCH] Replace insn to zero up SIMD registers

gcc/
	* config/aarch64/aarch64.md
	(*movhf_aarch64): Add "movi %0, #0" to zero up register.
	(*movsf_aarch64): Likewise.
	(*movdf_aarch64): Likewise.
---
 gcc/config/aarch64/aarch64.md | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f6c8eb1..43a3854 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1164,64 +1164,67 @@
   operands[1] = force_reg (mode, operands[1]);
   }
 )
 
 (define_insn "*movhf_aarch64"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=w, ?r,w,w,m,r,m ,r")
-	(match_operand:HF 1 "general_operand"  "?rY, w,w,m,w,m,rY,r"))]
+  [(set (match_operand:HF 0 "nonimmediate_operand" "=w,w,?r,w,w,m,r,m ,r")
+	(match_operand:HF 1 "general_operand"  "?r,Y, w,w,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], HFmode)
 || aarch64_reg_or_fp_zero (operands[1], HFmode))"
   "@
mov\\t%0.h[0], %w1
+   movi\\t%0.8h, #0
umov\\t%w0, %1.h[0]
mov\\t%0.h[0], %1.h[0]
ldr\\t%h0, %1
str\\t%h1, %0
ldrh\\t%w0, %1
strh\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "neon_from_gp,neon_to_gp,neon_move,\
+  [(set_attr "type" "neon_from_gp,neon_move,neon_to_gp,neon_move,\
  f_loads,f_stores,load1,store1,mov_reg")
-   (set_attr "simd" "yes,yes,yes,*,*,*,*,*")
-   (set_attr "fp"   "*,*,*,yes,yes,*,*,*")]
+   (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")
+   (set_attr "fp"   "*,*,*,*,yes,yes,*,*,*")]
 )
 
 (define_insn "*movsf_aarch64"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:SF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:SF 1 "general_operand"  "?r,Y, w,w,Ufc,m,w,m,rY,r"))]
   "TARGET_FLOAT && (register_operand (operands[0], SFmode)
 || aarch64_reg_or_fp_zero (operands[1], SFmode))"
   "@
fmov\\t%s0, %w1
+   movi\\t%0.4s, #0
fmov\\t%w0, %s1
fmov\\t%s0, %s1
fmov\\t%s0, %1
ldr\\t%s0, %1
str\\t%s1, %0
ldr\\t%w0, %1
str\\t%w1, %0
mov\\t%w0, %w1"
-  [(set_attr "type" "f_mcr,f_mrc,fmov,fconsts,\
+  [(set_attr "type" "f_mcr,neon_move,f_mrc,fmov,fconsts,\
  f_loads,f_stores,load1,store1,mov_reg")]
 )
 
 (define_insn "*movdf_aarch64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=w, ?r,w,w  ,w,m,r,m ,r")
-	(match_operand:DF 1 "general_operand"  "?rY, w,w,Ufc,m,w,m,rY,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=w,w,?r,w,w  ,w,m,r,m ,r")
+	(match_operand:DF 1 "general_operand"  "?r,Y, w,

Re: [Patch, fortran, GCC-5/6, PR62536, v1] ICE (segfault) for invalid END BLOCK statement

2016-01-27 Thread Jerry DeLisle
On 01/27/2016 08:25 AM, Andre Vehreschild wrote:
> Hi all,
> 
> attached are two patches one for trunk and one for the gcc-5-branch,
> which fix an ICE when BLOCK statements are not closed correctly (too
> many END BLOCKs). Unfortunately did the patch I tailored for gcc-5 not
> work on trunk.
> 
> GCC 5: The ICE is prevented by making sure that gfc_current_ns is not
> set to NULL, when too many closing END BLOCK statements are found. This
> may lead to gfortran be in the wrong namespace, but gfortran is already
> in error. Therefore this does not matter any more. We just need to exit
> cleanly.
> 
> GCC 6/trunk: Here the ICE is prevented by making sure, that unnesting
> of BLOCK namespaces is only done, when the END encountered is not the
> one of a BLOCK. This prevents gfortran from removing a correctly
> initialized and for further analysis required namespace.
> 
> Bootstrapped and regtested ok on x86_64-linux-gnu/F23. 
> 
> Ok for trunk and gcc-5-branch, respectively?
> 

OK and thanks for fixes. Much appreciated?

Jerry


Re: Another suspicious code hunk in c-family/c-common.c

2016-01-27 Thread Jeff Law

On 01/27/2016 03:41 PM, Andrew MacLeod wrote:

another chock with the static type code in
c-family/c-common:handle_deprecated_attribute()
  around line 8919-8923


handle_deprecated_attribute (tree *node, tree name,
  tree args, int flags,
  bool *no_add_attrs)
{
<...>
if (DECL_P (*node))
 {
   tree decl = *node;
   type = TREE_TYPE (decl);
<...>
 }
else if (TYPE_P (*node))
 {
   if (!(flags & (int) ATTR_FLAG_TYPE_IN_PLACE))
 *node = build_variant_type_copy (*node);
   TREE_DEPRECATED (*node) = 1;
   type = *node;
 }
<...>

if (type && TYPE_NAME (type))
 {
   if (TREE_CODE (TYPE_NAME (type)) == IDENTIFIER_NODE)
(*) what = TYPE_NAME (*node);
   else if (TREE_CODE (TYPE_NAME (type)) == TYPE_DECL
&& DECL_NAME (TYPE_NAME (type)))
 what = DECL_NAME (TYPE_NAME (type));
 }
   if (what)
 warning (OPT_Wattributes, "%qE attribute ignored for %qE",
name, what);


I think the (*) line is suppose to be
   what = TYPE_NAME (type);

It will work fine for the case where a type is passed in since type is
equivilent to *name,  but in some cases of DECL I'd say it's not looking
at the correct field..

That would be my suspicion as well.

Good to see this work exposing this kind of issue at compile time -- 
that was definitely one of the things I want to get out of this class of 
infrastructure changes.


jeff


Re: [PATCH] add test for target/17381 - Unnecessary register move for float extend

2016-01-27 Thread Jeff Law

On 01/27/2016 03:38 PM, Martin Sebor wrote:

The attached patch adds a test for the apparently long fixed
bug.

FWIW, I've been trying to close out some of these old bugs and
while it doesn't seem to be done consistently, it occurs to me
that it might be nice to add tests for them.  Please let me
know if you don't think it's worth the trouble (not just mine
but also reviewing the tests and maintaining them).
I suspect it's more oversight than anything.  And if an old bug isn't on 
the regression lists, it doesn't tend to get much attention.


I'm all for adding the regression tests and getting this stuff closed out.

I'll let the PPC maintainers weigh in on the actual test, but my 
inclination would be to include it if it works across the various PPC 
targets.



jeff


Re: [PATCH 1/4] Make SRA scalarize constant-pool loads

2016-01-27 Thread Jeff Law

On 01/27/2016 12:39 PM, David Edelsohn wrote:

The new sra-17.c and sra-18.c tests fail on AIX because the regex is
too restrictive -- AIX labels don't have exactly the same format.  On
AIX, the labels in the dumps look like "LC..0" instead of ".LC0".

This patch adds "*" and ".*" so that the "." prepended to LC is
optional and to allow characters between the "LC" and the "0".

I needed extra escapes for the sra-17.c line that matches multiple
times - for no apparent reason.
The joys of expect/tcl.  I just keep escaping until the regex that I 
developed outside the suite works.  I have been trying to get away from 
using .* though.  The longest match nature sometimes gives surprising 
results.  In theory .*? ought to work better, but I haven't tried using 
it much.


Anyway, the change looks fine to me.

jeff



Re: Is it OK for rtx_addr_can_trap_p_1 to attempt to compute the frame layout? (was Re: [PATCH] Skip re-computing the mips frame info after reload completed)

2016-01-27 Thread Eric Botcazou
> [cc-ing Eric as RTL maintainer]

Sorry for the delay, the message apparently bounced]

> IMO the problem is that rtx_addr_can_trap_p_1 duplicates a large
> bit of LRA/reload logic:
> 
> [...]
> 
> Under the current interface macros like INITIAL_ELIMINATION_OFFSET
> are expected to trigger the calculation of the target's frame layout
> (since you need that information to answer the question).
> To me it seems wrong that we're attempting to call that sort of
> macro in a query routine like rtx_addr_can_trap_p_1.

I'm a little uncomfortable stepping in here because, while I essentially share 
your objections and was opposed to the patch (I was almost sure that it would 
introduce maintainance issues for no practical benefit), I didn't imagine that 
such a failure mode would have been possible (computing an approximation of 
the frame layout after reload being problematic) so I didn't really object to 
being overruled after seeing Bernd's patch...

> IMO we should cache the information we need@the start of each
> LRA/reload cycle.  This will be more robust, because there will
> be no accidental changes to global state either during or after
> LRA/reload.  It will also be more efficient because
> rtx_addr_can_trap_p_1 can read the cached variables rather
> than calling back into the target.

That would be a better design for sure and would eliminate the maintainance 
issues I was originally afraid of.

My recommendation would be to back out Bernd's patch for GCC 6.0 (again, it 
doesn't fix any regression and, more importantly, any bug in real software, 
but only corner case bugs in dumb computer-generated testcases) but with the 
commitment to address the underlying issue for GCC 7.0 and backport the fix to 
GCC 6.x unless really impracticable.  That being said, it's ultimately Jakub 
and Richard's call.

-- 
Eric Botcazou


Re: [PATCH 1/4] Make SRA scalarize constant-pool loads

2016-01-27 Thread David Edelsohn
On Wed, Jan 27, 2016 at 6:36 PM, Jeff Law  wrote:
> On 01/27/2016 12:39 PM, David Edelsohn wrote:
>>
>> The new sra-17.c and sra-18.c tests fail on AIX because the regex is
>> too restrictive -- AIX labels don't have exactly the same format.  On
>> AIX, the labels in the dumps look like "LC..0" instead of ".LC0".
>>
>> This patch adds "*" and ".*" so that the "." prepended to LC is
>> optional and to allow characters between the "LC" and the "0".
>>
>> I needed extra escapes for the sra-17.c line that matches multiple
>> times - for no apparent reason.
>
> The joys of expect/tcl.  I just keep escaping until the regex that I
> developed outside the suite works.  I have been trying to get away from
> using .* though.  The longest match nature sometimes gives surprising
> results.  In theory .*? ought to work better, but I haven't tried using it
> much.
>
> Anyway, the change looks fine to me.

Segher pointed out to me that my revised regex was matching multiple
lines, so it was not triggering multiple times without the restriction
on the pattern.

A revised, tighter patch uses "?"

Index: sra-17.c
===
--- sra-17.c(revision 232904)
+++ sra-17.c(working copy)
@@ -15,5 +15,5 @@
   abort ();
 }

-/* { dg-final { scan-tree-dump-times "Removing load: a = \\\*.LC0;" 1
"esra" } } */
-/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ = \\\*.LC0\\\[" 4
"esra" } } */
+/* { dg-final { scan-tree-dump-times "Removing load: a =
\\\*\\.?LC\\.?\\.?0;" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "SR\\.\[0-9_\]+ =
\\\*\\.?LC\\.?\\.?0\\\[" 4 "esra" } } */
Index: sra-18.c
===
--- sra-18.c(revision 232904)
+++ sra-18.c(working copy)
@@ -21,8 +21,8 @@
   abort ();
 }

-/* { dg-final { scan-tree-dump-times "Removing load: a = \\\*.LC0;" 1
"esra" } } */
-/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.LC0\\.b\\\[0\\\]\\.f\\\[0\\\]\\.x" 1 "esra" } } */
-/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.LC0\\.b\\\[0\\\]\\.f\\\[1\\\]\\.x" 1 "esra" } } */
-/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.LC0\\.b\\\[1\\\]\\.f\\\[0\\\]\\.x" 1 "esra" } } */
-/* { dg-final { scan-tree-dump-times "SR.\[0-9_\]+ =
\\\*.LC0\\.b\\\[1\\\]\\.f\\\[1\\\]\\.x" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "Removing load: a =
\\\*\\.?LC\\.?\\.?0;" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "SR\\.\[0-9_\]+ =
\\\*\\.?LC\\.?\\.?0\\.b\\\[0\\\]\\.f\\\[0\\\]\\.x" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "SR\\.\[0-9_\]+ =
\\\*\\.?LC\\.?\\.?0\\.b\\\[0\\\]\\.f\\\[1\\\]\\.x" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "SR\\.\[0-9_\]+ =
\\\*\\.?LC\\.?\\.?0\\.b\\\[1\\\]\\.f\\\[0\\\]\\.x" 1 "esra" } } */
+/* { dg-final { scan-tree-dump-times "SR\\.\[0-9_\]+ =
\\\*\\.?LC\\.?\\.?0\\.b\\\[1\\\]\\.f\\\[1\\\]\\.x" 1 "esra" } } */


  1   2   >