date:20160119

Re: [ping] pending patches

2016-01-19 Thread Eric Botcazou

> 2016-01-05  Eric Botcazou  
> 
>   * dwarf2out.c (need_endianity_attribute_p): New inline predicate.
>   (base_type_die): Add REVERSE parameter and attach DW_AT_endianity to
>   the DIE accordingly.
>   (modified_type_die): Add REVERSE parameter and pass it recursively,
>   as well as to base_type_die.  Adjust presence check accordingly.
>   (base_type_for_mode): Adjust call to modified_type_die.
>   (add_type_attribute): Add REVERSE parameter and pass it to
>   modified_type_die.
>   (generic_parameter_die): Adjust call to add_type_attribute.
>   (add_scalar_info): Likewise.
>   (add_subscript_info): Likewise.
>   (gen_array_type_die): Likewise.
>   (gen_descr_array_type_die): Likewise.
>   (gen_entry_point_die): Likewise.
>   (gen_enumeration_type_die): Likewise.
>   (gen_formal_parameter_die): Likewise.
>   (gen_subprogram_die): Likewise.
>   (gen_variable_die ): Likewise.
>   (gen_const_die): Likewise.
>   (gen_field_die): Likewise.
>   (gen_pointer_type_die): Likewise.
>   (gen_reference_type_die): Likewise.
>   (gen_ptr_to_mbr_type_die): Likewise.
>   (gen_inheritance_die): Likewise.
>   (gen_subroutine_type_die): Likewise.
>   (gen_typedef_die): Likewise.
>   (force_type_die): Adjust call to modified_type_die.
> 
> 2016-01-05  Eric Botcazou  
> 
>   * gcc.dg/debug/dwarf2/sso.c: New test.

https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00132.html

-- 
Eric Botcazou

Re: [PATCH][RFC][Offloading] Fix PR68463

2016-01-19 Thread Richard Biener

On Mon, 18 Jan 2016, Ilya Verbin wrote:

> On Fri, Jan 15, 2016 at 09:15:01 +0100, Richard Biener wrote:
> > On Fri, 15 Jan 2016, Ilya Verbin wrote:
> > > II) The __offload_func_table, __offload_funcs_end, __offload_var_table,
> > > __offload_vars_end are now provided by the linker script, instead of
> > > crtoffload{begin,end}.o, this allows to surround all offload objects, even
> > > those that are not claimed by lto-plugin.
> > > > Unfortunately it works only with ld, but doesn't work with gold, because
> > > https://sourceware.org/bugzilla/show_bug.cgi?id=15373
> > > Any thoughts how to enable this linker script for gold?
> > 
> > > The easiest way would probably be to add this handling to the default
> > "linker script" in gold.  I don't see an easy way around requiring
> > changes to gold here - maybe dumping the default linker script from
> > bfd and injecting the rules with some scripting so you have a complete
> > script.  Though likely gold won't grok that result.
> > 
> > Really a question for Ian though.
> 
> Or the gcc driver can add crtoffload{begin,end}.o, but the problem is that it
> can't determine whether the program contains offloading or not.  So it can add
> them to all -fopenmp/-fopenacc programs, if the compiler was configured with
> --enable-offload-targets=...  The overhead would be about 340 bytes for
> binaries which don't use offloading.  Is this acceptable?  (Jakub?)

Can lto-wrapper add them as plugin outputs?  Or does that wreck ordering?

Richard.

> 
> > > I used the following testcase:
> > > $ cat main.c
> > > void foo1 ();
> > > void foo2 ();
> > > void foo3 ();
> > > void foo4 ();
> > > 
> > > int main ()
> > > {
> > >   foo1 ();
> > >   foo2 ();
> > >   foo3 ();
> > >   foo4 ();
> > >   return 0;
> > > }
> > > 
> > > $ cat test.c
> > > #include 
> > > #include 
> > > #define MAKE_FN_NAME(x) foo ## x
> > > #define FN_NAME(x) MAKE_FN_NAME(x)
> > > void FN_NAME(NUM) ()
> > > {
> > >   int x, d;
> > >   #pragma omp target map(from: x, d)
> > > {
> > >   x = NUM;
> > >   d = omp_is_initial_device ();
> > > }
> > >   printf ("%s:\t%s ()\tx = %d\n", d ? "HOST" : "TARGET", __FUNCTION__, x);
> > >   if (x != NUM)
> > > printf ("^\n");
> > > }
> > > 
> > > $ gcc -DNUM=1 -c -flto test.c -o obj1.o
> > > $ gcc -DNUM=2 -c -fopenmp test.c -o obj2.o
> > > $ gcc -DNUM=3 -c test.c -o obj3.o
> > > $ gcc -DNUM=4 -c -flto -fopenmp test.c -o obj4.o
> > > $ gcc -c main.c -o main.o
> > > $ gcc -fopenmp obj1.o obj2.o obj3.o obj4.o main.o && ./a.out
> > > $ gcc -fopenmp obj2.o obj3.o obj4.o obj1.o main.o && ./a.out
> > > $ gcc -fopenmp obj3.o obj1.o obj2.o obj4.o main.o && ./a.out
> > 
> > Did you try linking an archive with both offload-but-no-LTO and
> > offload-and-LTO objects inside?
> 
> No.  And it didn't work, because archives are handled by ld a bit differently.
> I will fix it.  Thanks!  From ld/ldlang.c:
> 
> /* Find the insert point for the plugin's replacement files.  We
>place them after the first claimed real object file, or if the
>first claimed object is an archive member, after the last real
>object file immediately preceding the archive.
> 
>   -- Ilya
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Fix ICE with asm "m" (stmt-expr) operand (PR middle-end/67653)

2016-01-19 Thread Richard Biener

On Tue, 19 Jan 2016, Jakub Jelinek wrote:

> Hi!
> 
> Here is an attempt to fix ICE on statement expression in "m" asm input
> operand.  The problem is that gimplify_asm_expr attempts to mark it
> addressable, but that can be just too late, a temporary the stmt-expression
> gimplifies to might not be addressable and may be used already in the
> gimplified code.  Normally the C/C++ FEs attempt to mark the operand
> addressable already, but in case of statement expression the temporaries
> might not exist yet.
> The patch turns also the PR29119 testcase into invalid test, but you've
> already said in that PR it should be invalid and I agree with that.

Hmm, but can't we detect this in the FE?

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

What happens if we just do _not_ mark the memory input addressable?
Shouldn't IRA/LRA in the end satisfy the constraint by spilling
a non-memory input and using the spill slot?

Richard.

> 2016-01-19  Jakub Jelinek  
> 
>   PR middle-end/67653
>   * gimplify.c (gimplify_asm_expr): Error if it is too late to
>   attempt to mark memory input operand addressable.
> 
>   * c-c++-common/pr67653.c: New test.
>   * gcc.dg/torture/pr29119.c: Add dg-error.
> 
> --- gcc/gimplify.c.jj 2016-01-15 20:37:30.0 +0100
> +++ gcc/gimplify.c2016-01-18 16:05:21.125640974 +0100
> @@ -5305,6 +5305,27 @@ gimplify_asm_expr (tree *expr_p, gimple_
>   TREE_VALUE (link) = error_mark_node;
> tret = gimplify_expr (&TREE_VALUE (link), pre_p, post_p,
>   is_gimple_lvalue, fb_lvalue | fb_mayfail);
> +   if (tret != GS_ERROR)
> + {
> +   /* Unlike output operands, memory inputs are not guaranteed
> +  to be lvalues by the FE, and while the expressions are
> +  marked addressable there, if it is e.g. a statement
> +  expression, temporaries in it might not end up being
> +  addressable.  They might be already used in the IL and thus
> +  it is too late to make them addressable now though.  */
> +   tree x = TREE_VALUE (link);
> +   while (handled_component_p (x))
> + x = TREE_OPERAND (x, 0);
> +   if (TREE_CODE (x) == MEM_REF
> +   && TREE_CODE (TREE_OPERAND (x, 0)) == ADDR_EXPR)
> + x = TREE_OPERAND (TREE_OPERAND (x, 0), 0);
> +   if ((TREE_CODE (x) == VAR_DECL
> +|| TREE_CODE (x) == PARM_DECL
> +|| TREE_CODE (x) == RESULT_DECL)
> +   && !TREE_ADDRESSABLE (x)
> +   && is_gimple_reg (x))
> + tret = GS_ERROR;
> + }
> mark_addressable (TREE_VALUE (link));
> if (tret == GS_ERROR)
>   {
> --- gcc/testsuite/c-c++-common/pr67653.c.jj   2016-01-18 16:03:49.302899912 
> +0100
> +++ gcc/testsuite/c-c++-common/pr67653.c  2016-01-18 16:03:20.0 
> +0100
> @@ -0,0 +1,8 @@
> +/* PR middle-end/67653 */
> +/* { dg-do compile } */
> +
> +void
> +foo (void)
> +{
> +  __asm__ ("" : : "m" (({ static int a; a; }))); /* { dg-error "memory 
> input 0 is not directly addressable" } */
> +}
> --- gcc/testsuite/gcc.dg/torture/pr29119.c.jj 2014-09-25 15:02:28.0 
> +0200
> +++ gcc/testsuite/gcc.dg/torture/pr29119.c2016-01-18 22:33:32.090515087 
> +0100
> @@ -2,6 +2,6 @@
>  
>  void ldt_add_entry(void)
>  {
> -   __asm__ ("" :: "m"(({unsigned __v; __v;})));
> +   __asm__ ("" :: "m"(({unsigned __v; __v;})));  /* { dg-error "memory 
> input 0 is not directly addressable" } */
>  }
>  
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
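
For readers following the PR: with the proposed check, an "m" input whose operand is a statement expression that yields a temporary is rejected with the "not directly addressable" error. A minimal sketch of the usual way around that, assuming GNU C inline asm and a hypothetical function name (pass_through_memory is not part of the patch), is to give the value a named local so that the operand is an ordinary addressable lvalue:

```c
/* Sketch only: pass_through_memory is a hypothetical name used for
   illustration; it does not appear in the patch or the testcases.  */
int
pass_through_memory (int value)
{
  /* A named local can be marked addressable by the front end, so it is
     a valid operand for the "m" constraint.  */
  int slot = value;
  __asm__ volatile ("" : : "m" (slot));
  return slot;
}
```

The empty asm template keeps the sketch target-independent; only the constraint handling is exercised.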

[PATCH] Fix PR69336

2016-01-19 Thread Richard Biener


The following patch enhances the recent change to DOM's memory reference
value-numbering to cover PR69336 (all handled components instead of
just ones with outermost ARRAY_REF).

Bootstrapped and tested on x86_64-unknown-linux-gnu.

I'm waiting until I have committed the fix for PR69352.

Richard.

2016-01-19  Richard Biener  

PR tree-optimization/69336
* tree-ssa-scopedtables.c (avail_expr_hash): Handle all
handled components with get_ref_base_and_extent.
(equal_mem_array_ref_p): Adjust.

* g++.dg/tree-ssa/pr69336.C: New testcase.

Index: gcc/tree-ssa-scopedtables.c
===
*** gcc/tree-ssa-scopedtables.c (revision 232508)
--- gcc/tree-ssa-scopedtables.c (working copy)
*** avail_expr_hash (class expr_hash_elt *p)
*** 214,220 
  {
/* T could potentially be a switch index or a goto dest.  */
tree t = expr->ops.single.rhs;
!   if (TREE_CODE (t) == MEM_REF || TREE_CODE (t) == ARRAY_REF)
{
  /* Make equivalent statements of both these kinds hash together.
 Dealing with both MEM_REF and ARRAY_REF allows us not to care
--- 214,220 
  {
/* T could potentially be a switch index or a goto dest.  */
tree t = expr->ops.single.rhs;
!   if (TREE_CODE (t) == MEM_REF || handled_component_p (t))
{
  /* Make equivalent statements of both these kinds hash together.
 Dealing with both MEM_REF and ARRAY_REF allows us not to care
*** avail_expr_hash (class expr_hash_elt *p)
*** 251,259 
  static bool
  equal_mem_array_ref_p (tree t0, tree t1)
  {
!   if (TREE_CODE (t0) != MEM_REF && TREE_CODE (t0) != ARRAY_REF)
  return false;
!   if (TREE_CODE (t1) != MEM_REF && TREE_CODE (t1) != ARRAY_REF)
  return false;
  
if (!types_compatible_p (TREE_TYPE (t0), TREE_TYPE (t1)))
--- 251,259 
  static bool
  equal_mem_array_ref_p (tree t0, tree t1)
  {
!   if (TREE_CODE (t0) != MEM_REF && ! handled_component_p (t0))
  return false;
!   if (TREE_CODE (t1) != MEM_REF && ! handled_component_p (t1))
  return false;
  
if (!types_compatible_p (TREE_TYPE (t0), TREE_TYPE (t1)))
Index: gcc/testsuite/g++.dg/tree-ssa/pr69336.C
===
*** gcc/testsuite/g++.dg/tree-ssa/pr69336.C (revision 0)
--- gcc/testsuite/g++.dg/tree-ssa/pr69336.C (working copy)
***
*** 0 
--- 1,86 
+ // { dg-do compile }
+ // { dg-options "-O3 -fdump-tree-optimized -std=c++14" }
+ 
+ #include 
+ #include 
+ 
+ 
+ template struct static_map
+ {
+   using key_type = Key;
+   using mapped_type = T;
+   using value_type = std::pair;
+ private:
+   using _value_type = std::pair;
+   _value_type _values[N];
+   static constexpr _value_type _new_value_type(const std::pair &v)
+   {
+ return std::make_pair(0, std::make_pair(v.first, v.second));
+   }
+ public:
+   template constexpr static_map(U &&...il) : _values{ 
_new_value_type(il)... } { }
+   constexpr mapped_type &operator[](const key_type &k) { return at(k); }
+   constexpr const mapped_type &operator[](const key_type &k) const { return 
at(k); }
+   constexpr mapped_type &at(const key_type &k)
+   {
+ for (size_t n = 0; n < N; n++)
+   if (_values[n].second.first == k)
+ return _values[n].second.second;
+ throw std::out_of_range("Key not found");
+   }
+   constexpr const mapped_type &at(const key_type &k) const
+   {
+ for (size_t n = 0; n < N; n++)
+   if (_values[n].second.first == k)
+ return _values[n].second.second;
+ throw std::out_of_range("Key not found");
+   }
+ };
+ namespace detail
+ {
+   template constexpr 
static_map static_map_from_array(const std::pair(&il)[N], 
std::index_sequence)
+   {
+ return static_map(il[I]...);
+   }
+ }
+ template constexpr static_map 
make_static_map(const std::pair (&il)[N])
+ {
+   return detail::static_map_from_array(il, 
std::make_index_sequence());
+ }
+ 
+ /* Two phase construction, required because heterogeneous braced init
+ in C++ 14 has a big limitation: template auto make(Args &&...)
+ will accept make({ 5, "apple" }) as make(int, const char *) but
+ make({ 5, "apple" }, { 8, "pear" }) will fail to deduce Args as a
+ heterogeneous initializer_list is not permitted. This forces something
+ like make(make_pair{ 5, "apple" }, make_pair{ 8, "pear" }, ...) which
+ is less succinct than using a constexpr C array for the nested braced init.
+ */
+ constexpr std::pair map_data[] = {
+   { 5, "apple" },
+   { 8, "pear" },
+   { 0, "banana" }
+ };
+ 
+ template constexpr int cstrcmp(const char *a, const char *b)
+ {
+   for (size_t n = 0; n < N; n++)
+   {
+ if (a[n] < b[n]) return -1;
+ if (a[n] > b[n]) return 1;
+   }
+   return 0;
+ }
+ 
+ int main(void)
+ {
+   constexpr auto cmap = make_static_map(map_data);
+   // No abort() appears in assembler, so this was executed constexpr

[PATCH] Fix PR69352

2016-01-19 Thread Richard Biener


The following should fix PR69352 where hash collisions make
equal_mem_array_ref_p get refs it wasn't supposed to compare
and will not compare correctly (those with variable indices).

Profiledbootstrap and testing running on x86_64-unknown-linux-gnu.

Richard.

2016-01-19  Richard Biener  

PR tree-optimization/69352
* tree-ssa-scopedtables.c (equal_mem_array_ref_p): Constrain
max size properly.

* gcc.dg/torture/pr69352.c: New testcase.

Index: gcc/tree-ssa-scopedtables.c
===
--- gcc/tree-ssa-scopedtables.c (revision 232519)
+++ gcc/tree-ssa-scopedtables.c (working copy)
@@ -261,10 +261,14 @@ equal_mem_array_ref_p (tree t0, tree t1)
   bool rev0;
   HOST_WIDE_INT off0, sz0, max0;
   tree base0 = get_ref_base_and_extent (t0, &off0, &sz0, &max0, &rev0);
+  if (sz0 != max0)
+return false;
 
   bool rev1;
   HOST_WIDE_INT off1, sz1, max1;
   tree base1 = get_ref_base_and_extent (t1, &off1, &sz1, &max1, &rev1);
+  if (sz1 != max1)
+return false;
 
   /* Types were compatible, so these are sanity checks.  */
   gcc_assert (sz0 == sz1);
Index: gcc/testsuite/gcc.dg/torture/pr69352.c
===
*** gcc/testsuite/gcc.dg/torture/pr69352.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr69352.c  (working copy)
***
*** 0 
--- 1,40 
+ /* { dg-do compile } */
+ 
+ int a[10][14], b, c, d, e, f, g, h, i;
+ void bar (void);
+ int
+ foo (int x)
+ {
+   unsigned j;
+   int k = 0, l;
+   int m;
+   if (h)
+ m = 12;
+   else
+ m = 13;
+   if (a[x][m])
+ l = (long) foo;
+   a[x][i] = l;
+   while (c)
+ {
+   if (b)
+   {
+ if (f)
+   k = 1;
+ bar ();
+   }
+   for (; d;)
+   j++;
+ }
+   while (c)
+ {
+   if (a[x][12])
+   {
+ if (g)
+   k = 1;
+ j++;
+   }
+   c = e;
+ }
+   return k;
+ }

Re: [PATCH] Fix PR69352

2016-01-19 Thread Richard Biener

On Tue, 19 Jan 2016, Richard Biener wrote:

> 
> The following should fix PR69352 where hash collisions make
> equal_mem_array_ref_p get refs it wasn't supposed to compare
> and will not compare correctly (those with variable indices).
> 
> Profiledbootstrap and testing running on x86_64-unknown-linux-gnu.

The following is an updated patch based on comments - handles
size == -1 properly as well as the reverse storage order flag.

Richard.

2016-01-19  Richard Biener  

PR tree-optimization/69352
* tree-ssa-scopedtables.c (equal_mem_array_ref_p): Constrain
max size properly.

* gcc.dg/torture/pr69352.c: New testcase.

Index: gcc/tree-ssa-scopedtables.c
===
--- gcc/tree-ssa-scopedtables.c (revision 232519)
+++ gcc/tree-ssa-scopedtables.c (working copy)
@@ -225,7 +225,8 @@ avail_expr_hash (class expr_hash_elt *p)
   &reverse);
  /* Strictly, we could try to normalize variable-sized accesses too,
but here we just deal with the common case.  */
- if (size == max_size)
+ if (size != -1
+ && size == max_size)
{
  enum tree_code code = MEM_REF;
  hstate.add_object (code);
@@ -261,15 +262,22 @@ equal_mem_array_ref_p (tree t0, tree t1)
   bool rev0;
   HOST_WIDE_INT off0, sz0, max0;
   tree base0 = get_ref_base_and_extent (t0, &off0, &sz0, &max0, &rev0);
+  if (sz0 == -1
+  || sz0 != max0)
+return false;
 
   bool rev1;
   HOST_WIDE_INT off1, sz1, max1;
   tree base1 = get_ref_base_and_extent (t1, &off1, &sz1, &max1, &rev1);
+  if (sz1 == -1
+  || sz1 != max1)
+return false;
+
+  if (rev0 != rev1)
+return false;
 
-  /* Types were compatible, so these are sanity checks.  */
+  /* Types were compatible, so this is a sanity check.  */
   gcc_assert (sz0 == sz1);
-  gcc_assert (max0 == max1);
-  gcc_assert (rev0 == rev1);
 
   return (off0 == off1) && operand_equal_p (base0, base1, 0);
 }
Index: gcc/testsuite/gcc.dg/torture/pr69352.c
===
--- gcc/testsuite/gcc.dg/torture/pr69352.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr69352.c  (working copy)
@@ -0,0 +1,40 @@
+/* { dg-do compile } */
+
+int a[10][14], b, c, d, e, f, g, h, i;
+void bar (void);
+int
+foo (int x)
+{
+  unsigned j;
+  int k = 0, l;
+  int m;
+  if (h)
+m = 12;
+  else
+m = 13;
+  if (a[x][m])
+l = (long) foo;
+  a[x][i] = l;
+  while (c)
+{
+  if (b)
+   {
+ if (f)
+   k = 1;
+ bar ();
+   }
+  for (; d;)
+   j++;
+}
+  while (c)
+{
+  if (a[x][12])
+   {
+ if (g)
+   k = 1;
+ j++;
+   }
+  c = e;
+}
+  return k;
+}
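
The rule the updated patch enforces can be sketched outside of GCC. The following is an illustration under stated assumptions, not the GCC code: struct ref_extent is a hypothetical stand-in for the values that get_ref_base_and_extent reports, the base object is reduced to an integer id, and the size comparison replaces the gcc_assert in the patch. Two references are only treated as equal when both have a known size that matches their maximum extent, and when their storage order agrees:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the values reported by
   get_ref_base_and_extent; not a GCC data structure.  */
struct ref_extent
{
  int base;        /* Identifies the base object.  */
  long offset;     /* Offset of the access in bits.  */
  long size;       /* Size of the access in bits, -1 if unknown.  */
  long max_size;   /* Maximum extent the access may cover, in bits.  */
  bool reverse;    /* Reverse storage order flag.  */
};

/* True if the access has a known size and touches exactly that many
   bits, i.e. it has no variable index.  */
bool
constant_extent_p (const struct ref_extent *r)
{
  return r->size != -1 && r->size == r->max_size;
}

/* Compare two references; refuse to compare variable-extent ones.  */
bool
equal_ref_extent_p (const struct ref_extent *a, const struct ref_extent *b)
{
  if (!constant_extent_p (a) || !constant_extent_p (b))
    return false;
  if (a->reverse != b->reverse)
    return false;
  return a->size == b->size
	 && a->offset == b->offset
	 && a->base == b->base;
}
```

A reference whose max_size exceeds its size stands for an access with a variable index, which is the kind of reference that a hash collision could previously hand to the comparison.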

Re: [PATCH][RFC][Offloading] Fix PR68463

2016-01-19 Thread Jakub Jelinek

On Tue, Jan 19, 2016 at 09:57:01AM +0100, Richard Biener wrote:
> On Mon, 18 Jan 2016, Ilya Verbin wrote:
> 
> > On Fri, Jan 15, 2016 at 09:15:01 +0100, Richard Biener wrote:
> > > On Fri, 15 Jan 2016, Ilya Verbin wrote:
> > > > II) The __offload_func_table, __offload_funcs_end, __offload_var_table,
> > > > __offload_vars_end are now provided by the linker script, instead of
> > > > crtoffload{begin,end}.o, this allows to surround all offload objects, 
> > > > even
> > > > those that are not claimed by lto-plugin.
> > > > > Unfortunately it works only with ld, but doesn't work with gold, because
> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=15373
> > > > Any thoughts how to enable this linker script for gold?
> > > 
> > > > The easiest way would probably be to add this handling to the default
> > > "linker script" in gold.  I don't see an easy way around requiring
> > > changes to gold here - maybe dumping the default linker script from
> > > bfd and injecting the rules with some scripting so you have a complete
> > > script.  Though likely gold won't grok that result.
> > > 
> > > Really a question for Ian though.
> > 
> > Or the gcc driver can add crtoffload{begin,end}.o, but the problem is that 
> > it
> > can't determine whether the program contains offloading or not.  So it can 
> > add
> > them to all -fopenmp/-fopenacc programs, if the compiler was configured with
> > --enable-offload-targets=...  The overhead would be about 340 bytes for
> > binaries which don't use offloading.  Is this acceptable?  (Jakub?)
> 
> Can lto-wrapper add them as plugin outputs?  Or does that wreck ordering?

Yeah, if that would work, it would be certainly appreciated, one thing is
wasting .text space and relocations in all -fopenmp programs (for -fopenacc
programs one kind of assumes there will be some offloading in there),
another one some extra constructor/destructor or what that would be even
worse.

Jakub
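
The ordering concern in this thread follows from how the tables are delimited: a begin symbol and an end symbol bracket a contiguous run of entries, so entries from any object that ends up outside the pair are not seen. A minimal sketch of that bracketing, with hypothetical names and a plain array standing in for the section the linker assembles (it does not model crtoffload{begin,end}.o themselves):

```c
#include <stddef.h>

/* Hypothetical entry type; the real tables hold addresses.  */
typedef void (*offload_entry) (void);

/* Number of entries lying between the begin and end markers.  Entries
   placed before table_begin or after table_end are simply not counted,
   which is why link order matters.  */
size_t
offload_table_length (const offload_entry *table_begin,
		      const offload_entry *table_end)
{
  return (size_t) (table_end - table_begin);
}
```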

Re: [PING] genattrab.c generate switch

2016-01-19 Thread Richard Biener

On Mon, Jan 18, 2016 at 7:48 PM, Jeff Law  wrote:
> On 01/18/2016 07:09 AM, Jesper Broge Jørgensen wrote:
>>
>> Ping patch:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00784.html
>
> I'd put it in my gcc-7 queue.  But if Richard, Bernd, Richi or someone else
> wants to work through the changes as a bugfix for bootstrapping on platforms
> with crippled compilers, I won't object.

I'd take it as a bugfix but the patch still needs review.

Richard.

> jeff

Re: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c

2016-01-19 Thread Christophe Lyon

On 19 January 2016 at 04:05, H.J. Lu  wrote:
> On Thu, Dec 24, 2015 at 3:55 AM, Alan Lawrence  wrote:
>> This version changes the test cases to fix failures on some platforms, by
>> rewriting the initializers so that they aren't pushed out to the constant 
>> pool.
>>
>> gcc/ChangeLog:
>>
>> * tree-ssa-scopedtables.c (avail_expr_hash): Hash MEM_REF and 
>> ARRAY_REF
>> using get_ref_base_and_extent.
>> (equal_mem_array_ref_p): New.
>> (hashable_expr_equal_p): Add call to previous.
>>
>
> This caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69352
>

Hi Alan,

This patch also caused regressions on arm-none-linux-gnueabihf
with GCC configured as:
--with-thumb --with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8

These tests now fail:
gcc.dg/torture/pr61742.c   -O2  (test for excess errors)
gcc.dg/torture/pr61742.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
gcc.dg/torture/pr61742.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
gcc.dg/torture/pr61742.c   -O3 -g  (test for excess errors)

Christophe

Re: -z bndplt documentation in GCC manual

2016-01-19 Thread Ilya Enkovich

2016-01-19 5:25 GMT+03:00 Sandra Loosemore :
> I think the documentation relating to '-z bndplt' in the GCC manual
> description of -fcheck-pointer-bounds is incorrect.  It looks like, as of
> r225862, the GCC driver is supposed to emit an error message if GCC was
> configured with a linker that doesn't support this option and you pass -mmpx
> without -static.  Is that right?  I'll fix the documentation once I'm clear
> on what the actual behavior is.

The compiler just emits a note warning the user that the GCC configuration
may lead to decreased instrumentation coverage.

Thanks,
Ilya

>
> -Sandra
>

Re: [PATCH, testsuite]: Disable LTO for gcc.c-torture/execute/builtins/{memops,strstr}-asm.c

2016-01-19 Thread Jakub Jelinek

On Mon, Jan 18, 2016 at 11:53:55PM +0100, Uros Bizjak wrote:
> Hello!
> 
> As explained by Honza in the PR, these two tests are not suitable for
> LTO tests. Also, the dg-options directives are not effective in this
> directory.
> 
> 2016-01-19  Uros Bizjak  
> 
> PR testsuite/68820
> * gcc.c-torture/execute/builtins/memops-asm.x: New file.
> * gcc.c-torture/execute/builtins/strstr-asm.x: Ditto.
> * gcc.c-torture/execute/builtins/strstr-asm.c: Remove dg-options.
> 
> Tested on x86_64-linux-gnu and checked logs that LTO tests are really skipped.
> 
> OK for mainline and branches?

Ok, thanks.

Jakub

Re: [hsa merge 00/10] Merge of HSA branch

2016-01-19 Thread Martin Jambor

Hi,

On Wed, Jan 13, 2016 at 06:39:25PM +0100, Martin Jambor wrote:
> Hi,
> 
> this is hopefully the last big re-post of the HSA patches...

I have committed the combined patch as revision 232549 after
bootstrapping and testing all languages on x86_64-linux and i686-linux
and verifying I did not break powerpc-aix more than it was before.

I will be updating gcc offloading wiki in a few days, meanwhile you
can use README.hsa file from the branch:

https://gcc.gnu.org/viewcvs/gcc/branches/hsa/gcc/README.hsa?view=markup

I will be also posting followup testsuite patches.

> 
> Thanks everybody for patience and feedback.  While we are of course
> open for more of it, let's also hope the approval process will
> finish soon as it should now.

I can't but repeat my thanks, especially to Jakub for the review and
help with the many last-minute issues.

Martin

[PATCH, PR tree-optimization/69328] Fix vectorization of boolean vector comparison in COND_EXPR

2016-01-19 Thread Ilya Enkovich

Hi,

Currently the vectorizer incorrectly handles the case where a COND_EXPR
has a boolean vector comparison.  Firstly, a masked COND_EXPR is
detected incorrectly.  Also, we don't check that the vector types of the
compared values are compatible.  This patch fixes these problems.
Bootstrapped and regtested for x86_64-pc-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2016-01-19  Ilya Enkovich  
Richard Biener  

PR tree-optimization/69328
* tree-vect-stmts.c (vect_is_simple_cond): Check compared
vectors have same number of elements.
(vectorizable_condition): Fix masked version recognition.


gcc/testsuite/

2016-01-19  Ilya Enkovich  

PR tree-optimization/69328
* gcc.dg/pr69328.c: New test.


diff --git a/gcc/testsuite/gcc.dg/pr69328.c b/gcc/testsuite/gcc.dg/pr69328.c
new file mode 100644
index 000..a495596
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69328.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a, b;
+void fn1() {
+  int c;
+  char *d;
+  for (; a; ++a) {
+int e, f;
+e = d[a];
+if (!e && f || !f && e)
+  ++c;
+  }
+  if (c)
+b = .499;
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 635c797..9d4d286 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7441,6 +7441,10 @@ vect_is_simple_cond (tree cond, vec_info *vinfo, tree 
*comp_vectype)
   && TREE_CODE (rhs) != FIXED_CST)
 return false;
 
+  if (vectype1 && vectype2
+  && TYPE_VECTOR_SUBPARTS (vectype1) != TYPE_VECTOR_SUBPARTS (vectype2))
+return false;
+
   *comp_vectype = vectype1 ? vectype1 : vectype2;
   return true;
 }
@@ -7544,13 +7548,9 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
   if (!vect_is_simple_use (else_clause, stmt_info->vinfo, &def_stmt, &dt))
 return false;
 
-  if (VECTOR_BOOLEAN_TYPE_P (comp_vectype))
-{
-  vec_cmp_type = comp_vectype;
-  masked = true;
-}
-  else
-vec_cmp_type = build_same_sized_truth_vector_type (comp_vectype);
+  masked = !COMPARISON_CLASS_P (cond_expr);
+  vec_cmp_type = build_same_sized_truth_vector_type (comp_vectype);
+
   if (vec_cmp_type == NULL_TREE)
 return false;

Remove outdated text from lto.texi

2016-01-19 Thread Kugan

Hi,

lto.texi has "Currently, the linker plugin works only in combination
with the Gold linker, but a GNU ld implementation is under development".
I don't think this is true any more.  The attached patch removes this.  Is
this OK for trunk?

Thanks,
Kugan

gcc/ChangeLog:

2016-01-19  Kugan Vivekanandarajah  

* doc/lto.texi: Remove text that says only Gold has linker plugin
support.
diff --git a/gcc/doc/lto.texi b/gcc/doc/lto.texi
index 51aa796..9269e55 100644
--- a/gcc/doc/lto.texi
+++ b/gcc/doc/lto.texi
@@ -538,10 +538,6 @@ plugin obtains the symbol resolution information which 
specifies
 which symbols provided by the claimed objects are bound from the
 rest of a binary being linked.
 
-Currently, the linker plugin  works only in combination
-with the Gold linker, but a GNU ld implementation is under
-development.
-
 GCC is designed to be independent of the rest of the toolchain
 and aims to support linkers without plugin support.  For this
 reason it does not use the linker plugin by default.  Instead,

Re: [PATCH] ARM PR68620 (ICE with FP16 on armeb)

2016-01-19 Thread Christophe Lyon

On 18 January 2016 at 20:01, Alan Lawrence  wrote:
> Thanks for working on this, Christophe, and sorry I missed the PR. You got
> further in fixing more things than I did though :). A couple of comments:
>
>> For the vec_set_internal and neon_vld1_dup patterns, I
>> switched to an existing iterator which already had the needed
>> V4HF/V8HF (so I switched to VD_LANE and VQ2).
>
> It's a separate issue, and I hadn't done this either, but looking again - I
> don't see any reason why we shouldn't apply VD->VD_LANE to the vec_extract
> standard name pattern too. (At present looks like we have vec_extractv8hf but 
> no
> vec_extractv4hf ?)
>
OK, I'll add that to my patch.

>> For neon_vdupn, I chose to implement neon_vdup_nv4hf and
>> neon_vdup_nv8hf instead of updating the VX iterator because I thought
>> it was not desirable to impact neon_vrev32.
>
> Well, the same instruction will suffice for vrev32'ing vectors of HF just as
> well as vectors of HI, so I think I'd argue that's harmless enough. To gain 
> the
> benefit, we'd need to update arm_evpc_neon_vrev with a few new cases, though.
>
Since this is more intrusive, I'd rather leave that part for later. OK?

>> @@ -5252,12 +5252,22 @@ vget_lane_s32 (int32x2_t __a, const int __b)
>> were marked always-inline so there were no call sites, the declaration
>> would nonetheless raise an error.  Hence, we must use a macro instead.  
>> */
>>
>> +  /* For big-endian, GCC's vector indices are the opposite way around
>> + to the architectural lane indices used by Neon intrinsics.  */
>
> Not quite the opposite way around, as you take into account yourself! 
> 'Reversed
> within each 64 bits', perhaps?
>
OK, I'll try to rephrase that.

>> +#ifdef __ARM_BIG_ENDIAN
>> +  /* Here, 3 is (4-1) where 4 is the number of lanes. This is also the
>> + right value for vectors with 8 lanes.  */
>> +#define __arm_lane(__vec, __idx) (__idx ^ 3)
>> +#else
>> +#define __arm_lane(__vec, __idx) __idx
>> +#endif
>> +
>
> Looks right, but sounds... my concern here is that I'm hoping at some point we
> will move the *other* vget/set_lane intrinsics to use GCC vector extensions
> too. At which time (unlike __aarch64_lane which can be used everywhere) this
> will be the wrong formula. Can we name (and/or comment) it to avoid misleading
> anyone? The key characteristic seems to be that it is for vectors of 16-bit
> elements only.
>
I'm not sure I follow here. Looking at the patterns for
neon_vget_lane_*internal in neon.md,
I can see 2 flavours: one for VD, one for VQ2. The latter uses "halfelts".

Do you prefer that I create 2 macros (say __arm_lane and __arm_laneq),
that would be similar to the aarch64 ones (by computing the number of
lanes of the input vector), but the "q" one would use half the total
number of lanes instead?

>> @@ -5334,7 +5344,7 @@ vgetq_lane_s32 (int32x4_t __a, const int __b)
>>  ({   \
>>float16x8_t __vec = (__v); \
>>__builtin_arm_lane_check (8, __idx);   \
>> -  float16_t __res = __vec[__idx];\
>> +  float16_t __res = __vec[__arm_lane(__vec, __idx)]; \
>
> In passing - the function name in the @@ header is of course misleading, this 
> is #define vgetq_lane_f16 (and the later hunks)
>
> Thanks, Alan
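
To make the remark about "reversed within each 64 bits" concrete, here is a small sketch of the big-endian branch of the quoted macro, written as a function with a hypothetical name (arm_lane_be16 is illustrative only). It assumes 16-bit elements, so four lanes make up 64 bits:

```c
/* Hypothetical helper mirroring the big-endian definition of
   __arm_lane in the quoted hunk; for vectors of 16-bit elements only.  */
int
arm_lane_be16 (int idx)
{
  /* XOR with 3 flips the low two bits, which reverses the lane order
     inside each group of four lanes and leaves the group itself in
     place.  */
  return idx ^ 3;
}
```

For a 4-lane vector this maps 0, 1, 2, 3 to 3, 2, 1, 0. For an 8-lane vector lanes 4 to 7 map to 7, 6, 5, 4, so the upper half stays in the upper half, which is why the same constant serves both widths but would not be right for other element sizes.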

Do not ICE with -fdump-ipa-cgraph when building firefox

2016-01-19 Thread Jan Hubicka

Hi,
cgraph dump now segfaults as we have symbols without DECL_NAME set.  This patch
makes us print the node by asm name, or as <unnamed> if everything fails.

Honza

* symtab.c (symtab_node::asm_name): Do not call printable name directly.
(symtab_node::name): Report name as unnamed if DECL_NAME is not set.
Index: symtab.c
===
--- symtab.c(revision 232466)
+++ symtab.c(working copy)
@@ -504,7 +513,7 @@ const char *
 symtab_node::asm_name () const
 {
   if (!DECL_ASSEMBLER_NAME_SET_P (decl))
-return lang_hooks.decl_printable_name (decl, 2);
+return name ();
   return IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
 }
 
@@ -513,6 +522,13 @@ symtab_node::asm_name () const
 const char *
 symtab_node::name () const
 {
+  if (!DECL_NAME (decl))
+{
+  if (DECL_ASSEMBLER_NAME_SET_P (decl))
+   return asm_name ();
+  else
+   return "<unnamed>";
+}
   return lang_hooks.decl_printable_name (decl, 2);
 }
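
The two accessors in this patch call each other, so it is worth seeing why the fallback terminates. The sketch below is a plain C model under stated assumptions: struct sym and both function names are hypothetical, a NULL pointer stands for "not set", and the "<unnamed>" placeholder string is taken from the ChangeLog wording rather than from GCC itself:

```c
#include <stddef.h>

/* Hypothetical model of a symbol: either name may be missing.  */
struct sym
{
  const char *decl_name;   /* Printable name, NULL if not set.  */
  const char *asm_name;    /* Assembler name, NULL if not set.  */
};

const char *sym_name (const struct sym *s);

/* Assembler name, falling back to the printable name.  */
const char *
sym_asm_name (const struct sym *s)
{
  if (s->asm_name == NULL)
    return sym_name (s);
  return s->asm_name;
}

/* Printable name, falling back to the assembler name and finally to a
   placeholder.  sym_asm_name is only called when asm_name is set, so
   the mutual recursion cannot loop.  */
const char *
sym_name (const struct sym *s)
{
  if (s->decl_name == NULL)
    {
      if (s->asm_name != NULL)
	return sym_asm_name (s);
      return "<unnamed>";
    }
  return s->decl_name;
}
```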

Re: reject decl with incomplete struct/union type in check_global_declaration()

2016-01-19 Thread Marek Polacek

Sorry for speaking up late, but I think we could do better with formatting
in this patch:

On Sat, Jan 16, 2016 at 03:45:22PM +0530, Prathamesh Kulkarni wrote:
> diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> index 915376d..d36fc67 100644
> --- a/gcc/c/c-decl.c
> +++ b/gcc/c/c-decl.c
> @@ -4791,6 +4791,13 @@ finish_decl (tree decl, location_t init_loc, tree init,
>  TREE_TYPE (decl) = error_mark_node;
>}
>  
> +  if ((RECORD_OR_UNION_TYPE_P (TREE_TYPE (decl))
> +   || TREE_CODE (TREE_TYPE (decl)) == ENUMERAL_TYPE)
> +   && DECL_SIZE (decl) == 0 && TREE_STATIC (decl))

DECL_SIZE yields a tree, so I'd rather see NULL_TREE instead of 0 here (yeah,
the enclosing code uses 0s :().  The "&& TREE_STATIC..." should be on its own
line.

> + {
> +   incomplete_record_decls.safe_push (decl);
> + }
> +

Redundant braces.

> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index a0e0052..3c8a496 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -59,6 +59,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-expr.h"
>  #include "context.h"
>  
> +vec<tree> incomplete_record_decls = vNULL;

This could use a comment.

> +
> +  for (unsigned i = 0; i < incomplete_record_decls.length (); ++i)
> +{
> +  tree decl = incomplete_record_decls[i];
> +  if (DECL_SIZE (decl) == 0 && TREE_TYPE (decl) != error_mark_node)

I'd s/0/NULL_TREE/.

Marek

[chkp] Do not stream bodies of instrumentation thunks

2016-01-19 Thread Jan Hubicka

Hi,
instrumentation thunks are de-facto transparent aliases and thus their bodies
should not be streamed (longer term plan is to reorganize them to aliases).
This avoids an ICE when mixing instrumented and non-instrumented files in LTO.

Bootstrapped/regtested x86_64-linux, plan to commit it if there are no
complaints.

Honza

* lto-streamer-out.c (lto_output): Do not stream instrumentation
thunks.
Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 232466)
+++ lto-streamer-out.c  (working copy)
@@ -2320,7 +2320,8 @@ lto_output (void)
   if (cgraph_node *node = dyn_cast <cgraph_node *> (snode))
{
  if (lto_symtab_encoder_encode_body_p (encoder, node)
- && !node->alias)
+ && !node->alias
+ && (!node->thunk.thunk_p || !node->instrumented_version))
{
  if (flag_checking)
{
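
The condition the patch arrives at can be read as a small predicate.  The
sketch below restates it on a hypothetical, simplified NodeInfo struct; the
field names mirror the ones in the diff but this is not GCC's cgraph_node,
and should_stream_body is a name made up for illustration.

```cpp
#include <cassert>

// Hypothetical, simplified view of a call-graph node.
struct NodeInfo {
  bool encode_body;               // models lto_symtab_encoder_encode_body_p
  bool alias;                     // models node->alias
  bool thunk_p;                   // models node->thunk.thunk_p
  bool has_instrumented_version;  // models node->instrumented_version != NULL
};

// A body is streamed only if it is selected for encoding, the node is not
// an alias, and the node is not an instrumentation thunk (a thunk that has
// an instrumented version).
inline bool should_stream_body(const NodeInfo &n) {
  return n.encode_body && !n.alias
         && (!n.thunk_p || !n.has_instrumented_version);
}
```

An ordinary thunk (thunk_p set, no instrumented version) still gets its body
streamed; only the instrumentation thunks are skipped.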

Fix ICE in lto-symtab

2016-01-19 Thread Jan Hubicka

Hi,
this patch fixes an ICE with abstract decls.  Those need not be linked.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

* lto-symtab.c (lto_symtab_prevailing_virtual_decl): Abstract
decls have no assemblernames.
* g++.dg/torture/pr69136.C: New testcase.
Index: lto/lto-symtab.c
===
--- lto/lto-symtab.c(revision 232466)
+++ lto/lto-symtab.c(working copy)
@@ -987,6 +1013,8 @@ lto_symtab_merge_symbols (void)
 tree
 lto_symtab_prevailing_virtual_decl (tree decl)
 {
+  if (DECL_ABSTRACT_P (decl))
+return decl;
   gcc_checking_assert (!type_in_anonymous_namespace_p (DECL_CONTEXT (decl))
   && DECL_ASSEMBLER_NAME_SET_P (decl));
 
Index: testsuite/g++.dg/torture/pr69136.C
===
--- testsuite/g++.dg/torture/pr69136.C  (revision 0)
+++ testsuite/g++.dg/torture/pr69136.C  (revision 0)
@@ -0,0 +1,6 @@
+// { dg-do compile }
+class GrBufferAllocPool {
+  virtual ~GrBufferAllocPool();
+};
+GrBufferAllocPool::~GrBufferAllocPool() { static long a; }
+

Re: [Ping^3][PATCH][GCC][ARM] testcase memset-inline-10.c uses -mfloat-abi=hard but does not check whether target supports it

2016-01-19 Thread Andre Vieira (lists)


On 05/01/16 17:40, Andre Vieira wrote:

On 27/11/15 14:28, Andre Vieira wrote:

On 12/11/15 15:16, Andre Vieira wrote:

On 12/11/15 15:08, Andre Vieira wrote:

Hi,

   This patch changes the memset-inline-10.c testcase to make sure that
it is only compiled for ARM targets that support -mfloat-abi=hard using
the fact that all non-thumb1 targets do.

   This is correct because all targets for which -mthumb causes the
compiler to use thumb2 will support the generation of FP instructions.

   Tested by running regressions for this testcase for various ARM
targets.

   Is this OK to commit?

   Thanks,
   Andre Vieira

gcc/testsuite/ChangeLog:
2015-11-06  Andre Vieira  

 * gcc.target/arm/memset-inline-10.c: Added
 dg-require-effective-target arm_thumb2_ok.


Now with attachment, sorry about that.

Cheers,
Andre


Ping.



Ping.


Ping.

Re: [PATCH] Fix RTL DSE (PR rtl-optimization/68955, take 2)

2016-01-19 Thread Eric Botcazou

> 2016-01-19  Jakub Jelinek  
> 
>   PR rtl-optimization/68955
>   PR rtl-optimization/64557
>   * dse.c (record_store, check_mem_read_rtx): Don't call get_addr
>   here.  Fix up formatting.
>   * alias.c (get_addr): Handle VALUE + CONST_INT.

VALUE +/- CONST_INT (and actually also CONST_WIDE_INT).

>   * gcc.dg/torture/pr68955.c: New test.

OK, if you add "plus or minus an optional constant offset" to the head comment 
of the get_addr function, thanks.

Handling CONST_WIDE_INT looks superfluous to me though (as well as MINUS since 
it's non-canonical but we'd probably better be forgiving for this one).

-- 
Eric Botcazou

Fix ICE in get_untransformed_body

2016-01-19 Thread Jan Hubicka

Hi,
this patch fixes confusion of cgraph_node::get_untransformed_body about thunks
so we do not try to stream them in twice.

Honza

* cgraphunit.c (cgraph_node::expand_thunk): When forcing gimple
assume that the node has body.
* cgraph.c (cgraph_node::get_untransformed_body): Use gimple_body_p
check.
* g++.dg/lto/pr69133_0.C: New testcase.
* g++.dg/lto/pr69133_1.C: New testcase.
Index: cgraphunit.c
===
--- cgraphunit.c(revision 232466)
+++ cgraphunit.c(working copy)
@@ -1664,7 +1666,9 @@ cgraph_node::expand_thunk (bool output_a
   greturn *ret;
   bool alias_is_noreturn = TREE_THIS_VOLATILE (alias);
 
-  if (in_lto_p)
+  /* We may be called from expand_thunk that releses body except for
+DECL_ARGUMENTS.  In this case force_gimple_thunk is true.  */
+  if (in_lto_p && !force_gimple_thunk)
get_untransformed_body ();
   a = DECL_ARGUMENTS (thunk_fndecl);
 
Index: testsuite/g++.dg/lto/pr69133_0.C
===
--- testsuite/g++.dg/lto/pr69133_0.C(revision 0)
+++ testsuite/g++.dg/lto/pr69133_0.C(revision 0)
@@ -0,0 +1,19 @@
+// { dg-lto-do link }
+// { dg-lto-options { { -flto -O2 } } }
+// { dg-extra-ld-options "-r -nostdlib -flto -flto-partition=none -O2" }
+namespace xercesc_3_1 {
+class XMLEntityHandler {
+public:
+  virtual ~XMLEntityHandler();
+  virtual void m_fn1();
+  virtual bool m_fn2();
+  virtual void m_fn3();
+  virtual int m_fn4();
+  virtual void m_fn5();
+} * a;
+void fn1() {
+  a->m_fn5();
+  a->m_fn1();
+}
+}
+
Index: testsuite/g++.dg/lto/pr69133_1.C
===
--- testsuite/g++.dg/lto/pr69133_1.C(revision 0)
+++ testsuite/g++.dg/lto/pr69133_1.C(revision 0)
@@ -0,0 +1,22 @@
+namespace xercesc_3_1 {
+class A {
+  virtual void m_fn1();
+};
+class XMLEntityHandler {
+public:
+  virtual ~XMLEntityHandler();
+  virtual void m_fn2(const int &);
+  virtual bool m_fn3();
+  virtual void m_fn4();
+  virtual int m_fn5() = 0;
+  virtual void m_fn6(const int &);
+};
+class B : A, XMLEntityHandler {};
+class C : B {
+  void m_fn2(const int &);
+  void m_fn6(const int &);
+};
+void C::m_fn2(const int &) {}
+void C::m_fn6(const int &) {}
+}
+
Index: cgraph.c
===
--- cgraph.c(revision 232466)
+++ cgraph.c(working copy)
@@ -3305,10 +3300,12 @@ cgraph_node::get_untransformed_body (voi
   size_t len;
   tree decl = this->decl;
 
-  if (DECL_RESULT (decl))
+  /* Check if body is already there.  Either we have gimple body or
+ the function is thunk and in that case we set DECL_ARGUMENTS.  */
+  if (DECL_ARGUMENTS (decl) || gimple_has_body_p (decl))
 return false;
 
-  gcc_assert (in_lto_p);
+  gcc_assert (in_lto_p && !DECL_RESULT (decl));
 
   timevar_push (TV_IPA_LTO_GIMPLE_IN);

Re: [PING] genattrab.c generate switch

2016-01-19 Thread Jesper Broge Jørgensen



On 19/01/16 10:44, Richard Biener wrote:

On Mon, Jan 18, 2016 at 7:48 PM, Jeff Law  wrote:

On 01/18/2016 07:09 AM, Jesper Broge Jørgensen wrote:

Ping patch:

https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00784.html

I'd put it in my gcc-7 queue.  But if Richard, Bernd, Richi or someone else
wants to work through the changes as a bugfix for bootstrapping on platforms
with crippled compilers, I won't object.

I'd take it as a bugfix but the patch still needs review.

Richard.


jeff

Here is the reformatted patch:


gcc/ChangeLog:

2016-01-19  Jesper Broge Jørgensen  

* genattrtab.c (check_attr_set_switch): New function
(write_attr_set): Write a switch instead of if condition, if possible


diff --git a/gcc/genattrtab.c b/gcc/genattrtab.c
index 2caf8f6..8e7f9e6 100644
--- a/gcc/genattrtab.c
+++ b/gcc/genattrtab.c
@@ -4113,6 +4113,103 @@ eliminate_known_true (rtx known_true, rtx exp, int insn_code, int insn_index)

   return exp;
 }

+/* Check if exp contains a series of IOR conditions on the same attr_name.
+   If it does it can be turned into a switch statement and returns true.
+   If write_cases is true it will write the cases of the switch to outf.  */
+
+static int
+check_attr_set_switch (FILE *outf, rtx exp, unsigned int attrs_cached,
+   int write_cases, int indent)
+{
+  if (GET_CODE (exp) != IOR)
+return 0;
+  if (GET_CODE (XEXP (exp, 0)) != EQ_ATTR)
+return 0;
+
+  rtx next = exp;
+  int ior_depth = 0;
+  int is_first = 1;
+
+  const char *attr_name_cmp = XSTR (XEXP (exp, 0), 0);
+
+  while (1)
+{
+  rtx op1 = XEXP (next, 0);
+  rtx op2 = XEXP (next, 1);
+
+  if (GET_CODE (op1) != EQ_ATTR)
+return 0;
+
+  const char *attr_name = XSTR (op1, 0);
+  const char *cmp_val = XSTR (op1, 1);
+
+  /* pointer compare is enough.  */
+  if (attr_name_cmp != attr_name)
+return 0;
+
+  if (write_cases)
+{
+  struct attr_desc *attr = find_attr (&attr_name, 0);
+  gcc_assert (attr);
+  if (is_first)
+{
+  fprintf (outf, "(");
+  is_first = 0;
+  int i;
+  for (i = 0; i < cached_attr_count; i++)
+if (attr->name == cached_attrs[i])
+  break;
+
+  if (i < cached_attr_count && (attrs_cached & (1U << i)) != 0)
+fprintf (outf, "cached_%s", attr->name);
+  else if (i < cached_attr_count &&
+   (attrs_to_cache & (1U << i)) != 0)
+fprintf (outf, "(cached_%s = get_attr_%s (insn))", attr->name,
+ attr->name);
+  else
+fprintf (outf, "get_attr_%s (insn)", attr->name);
+  fprintf (outf, ")\n");
+  write_indent (outf, indent);
+  fprintf (outf, "{\n");
+}
+  write_indent (outf, indent);
+  fprintf (outf, "case ");
+  write_attr_valueq (outf, attr, cmp_val);
+  fprintf (outf, ":\n");
+}
+
+  const int code = GET_CODE (op2);
+  if (code != IOR)
+{
+  if (code == EQ_ATTR)
+{
+  const char *attr_name = XSTR (op2, 0);
+  const char *cmp_val = XSTR (op2, 1);
+
+  if (attr_name == alternative_name)
+return 0;
+
+  struct attr_desc *attr = find_attr (&attr_name, 0);
+  gcc_assert (attr);
+
+  if (attr->is_const)
+return 0;
+  else if (write_cases)
+{
+  write_indent (outf, indent);
+  fprintf (outf, "case ");
+  write_attr_valueq (outf, attr, cmp_val);
+  fprintf (outf, ":\n");
+}
+}
+  break;
+}
+  next = op2;
+  ior_depth++;
+}
+  return ior_depth > 2;
+}
+
 /* Write out a series of tests and assignment statements to perform tests and
    sets of an attribute value.  We are passed an indentation amount and prefix
    and suffix strings to write around each attribute value (e.g., "return"
@@ -4123,6 +4220,7 @@ write_attr_set (FILE *outf, struct attr_desc *attr, int indent, rtx value,
 const char *prefix, const char *suffix, rtx known_true,
 int insn_code, int insn_index, unsigned int attrs_cached)
 {
+  int n_switches = 0;
   if (GET_CODE (value) == COND)
 {
   /* Assume the default value will be the default of the COND unless we
@@ -4132,6 +4230,7 @@ write_attr_set (FILE *outf, struct attr_desc *attr, int indent, rtx value,
   rtx newexp;
   int first_if = 1;
   int i;
+  int is_switch = 0;

   if (cached_attr_count)
 {
@@ -4176,40 +4275,68 @@ write_attr_set (FILE *outf, struct attr_desc *attr, int indent, rtx value,
   if (inner_true == false_rtx)
 continue;

+  is_switch = check_attr_set_switch (outf, testexp, attrs_cached, 0,
+ indent);
+
   attrs_cached_inside = attrs_cached;
   attrs_cached_after = attrs_cached;
   write_indent (outf, indent);
-  fprintf (outf, "%sif ", first_if ? "" : "else ");
-  first_if = 0;
-  write_test_expr (outf, testexp, attrs_cached,
-
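
The idea behind check_attr_set_switch (a chain of equality tests on one
attribute can be emitted as a switch) can be sketched without RTL.  In this
illustration AttrTest, can_use_switch and emit_test are hypothetical names,
the tests are plain (attribute, value) string pairs rather than EQ_ATTR
expressions, and min_cases is an illustrative threshold that is not meant to
match the patch's ior_depth check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One equality test "attribute == value", e.g. ("type", "TYPE_A").
using AttrTest = std::pair<std::string, std::string>;

// True when all tests compare the same attribute and there are at least
// min_cases of them (min_cases is an assumption of this sketch).
inline bool can_use_switch(const std::vector<AttrTest> &tests,
                           std::size_t min_cases = 3) {
  if (tests.empty() || tests.size() < min_cases)
    return false;
  for (const AttrTest &t : tests)
    if (t.first != tests.front().first)
      return false;
  return true;
}

// Emit either a switch header with one case label per value, or a plain
// "if" condition joining the tests with "||".
inline std::string emit_test(const std::vector<AttrTest> &tests) {
  std::string out;
  if (can_use_switch(tests)) {
    out = "switch (get_attr_" + tests.front().first + " (insn))\n{\n";
    for (const AttrTest &t : tests)
      out += "case " + t.second + ":\n";
    return out;
  }
  out = "if (";
  for (std::size_t i = 0; i < tests.size(); ++i) {
    if (i)
      out += " || ";
    out += "get_attr_" + tests[i].first + " (insn) == " + tests[i].second;
  }
  return out + ")\n";
}
```

A chain that mixes attributes falls back to the "if" form, which is also what
the patch does when its check returns 0.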

[RFC] [nvptx] Try to cope with cuLaunchKernel returning CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES

2016-01-19 Thread Thomas Schwinge

Hi!

With nvptx offloading, in one OpenACC test case, we're running into the
following fatal error (GOMP_DEBUG=1 output):

[...]
info: Function properties for 'LBM_performStreamCollide$_omp_fn$0':
info: used 87 registers, 0 stack, 8 bytes smem, 328 bytes cmem[0], 80 bytes cmem[2], 0 bytes lmem
[...]
  nvptx_exec: kernel LBM_performStreamCollide$_omp_fn$0: launch gangs=32, workers=32, vectors=32

libgomp: cuLaunchKernel error: too many resources requested for launch

Very likely this means that the number of registers used in this function
("used 87 registers"), multiplied by the thread block size (workers *
vectors, "workers=32, vectors=32"), exceeds the hardware maximum.
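
As a worked version of that arithmetic, here is a minimal sketch.  The
per-block limit of 65536 registers is an assumption picked for illustration;
the real value is device-specific and would have to be queried.  The helper
names are hypothetical and register allocation granularity is ignored.

```cpp
#include <cassert>

// Registers needed by one thread block: registers per thread times the
// number of threads in the block (workers * vectors).
constexpr unsigned long regs_per_block(unsigned regs_per_thread,
                                       unsigned workers, unsigned vectors) {
  return static_cast<unsigned long>(regs_per_thread) * workers * vectors;
}

// Assumed per-block register limit, for illustration only.
constexpr unsigned long kAssumedMaxRegsPerBlock = 65536;

// True when a block of the given shape fits under the assumed limit.
constexpr bool fits_in_block(unsigned regs_per_thread, unsigned workers,
                             unsigned vectors) {
  return regs_per_block(regs_per_thread, workers, vectors)
         <= kAssumedMaxRegsPerBlock;
}
```

Under that assumed limit, 87 registers with workers=32 and vectors=32 need
89088 registers and do not fit, while workers=16 needs 44544 and does.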

(One problem certainly might be that we're currently not doing any
register allocation for nvptx, as far as I remember based on the idea
that PTX is only a "virtual ISA", and the PTX JIT compiler would "fix
this up" for us -- which I'm not sure it actually is doing?)

Below I'm posting a prototype patch which makes the execution run
successfully:

[...]
  nvptx_exec: kernel LBM_performStreamCollide$_omp_fn$0: launch gangs=32, workers=32, vectors=32
cuLaunchKernel: CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES; retrying with reduced number of workers
  nvptx_exec: kernel LBM_performStreamCollide$_omp_fn$0: launch gangs=32, workers=16, vectors=32
  nvptx_exec: kernel LBM_performStreamCollide$_omp_fn$0: finished
[...]

As -- I think -- the maximum number of registers in a thread block is
fixed, it would be good to remember the modified dims[GOMP_DIM_WORKER]
(which my patch doesn't).
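
A back-off loop of this kind, including remembering the reduced worker count,
could look roughly like the sketch below.  It does not call the CUDA driver
API: LaunchStatus, LaunchFn and launch_with_backoff are hypothetical names,
the launch itself is a caller-supplied callback, and halving the worker count
on each retry is an assumption (the log only shows one step from 32 to 16).

```cpp
#include <cassert>
#include <functional>

enum class LaunchStatus { Ok, OutOfResources, OtherError };

// Hypothetical launch callback: tries to launch with the given number of
// workers and reports what happened.
using LaunchFn = std::function<LaunchStatus(unsigned workers)>;

// Retry the launch with half the workers each time it runs out of
// resources.  WORKERS is passed by reference so the caller keeps the value
// that finally worked and can start from it next time.
inline LaunchStatus launch_with_backoff(const LaunchFn &launch,
                                        unsigned &workers) {
  for (;;) {
    LaunchStatus st = launch(workers);
    if (st != LaunchStatus::OutOfResources || workers <= 1)
      return st;
    workers /= 2;
  }
}
```

Errors other than the out-of-resources case are returned unchanged, so they
stay fatal for the caller.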

Alternatively/additionally, we could try experimenting with using the
following of enum CUjit_option "Online compiler and linker options":

CU_JIT_MAX_REGISTERS = 0
    Max number of registers that a thread may use.
    Option type: unsigned int
    Applies to: compiler only
CU_JIT_THREADS_PER_BLOCK
    IN: Specifies minimum number of threads per block to target compilation
    for
    OUT: Returns the number of threads the compiler actually targeted.
    This restricts the resource utilization of the compiler (e.g. max
    registers) such that a block with the given number of threads should be
    able to launch based on register limitations. Note, this option does not
    currently take into account any other resource limitations, such as
    shared memory utilization. Cannot be combined with CU_JIT_TARGET.
    Option type: unsigned int
    Applies to: compiler only
[...]

..., to have the PTX JIT reduce the number of live registers (if
possible; I don't know), and/or could try experimenting with querying the
active device, enum CUdevice_attribute "Device properties":

[...]
CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_BLOCK = 12
    Maximum number of 32-bit registers available per block
[...]

..., and use that in combination with each function's enum
CUfunction_attribute "Function properties":

CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK = 0
    The maximum number of threads per block, beyond which a launch of the
    function would fail. This number depends on both the function and the
    device on which the function is currently loaded.
[...]
CU_FUNC_ATTRIBUTE_NUM_REGS = 4
    The number of registers used by each thread of this function.
[...]

... to determine an optimal number of threads per block given the number
of registers (maybe just querying CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK
would do that already?).  All these options however are more complicated
than the following simple "back-off" approach:

commit bb0bf9e50026feabe877c9d8174e78c021b002a4
Author: Thomas Schwinge 
Date:   Tue Jan 19 12:31:27 2016 +0100

[nvptx] Try to cope with cuLaunchKernel returning 
CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES
---
 gcc/gimple-fold.c |7 +++
 gcc/tree-vrp.c|1 +
 libgomp/plugin/plugin-nvptx.c |   28 
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git gcc/gimple-fold.c gcc/gimple-fold.c
index a0e7b7e..e75c58e 100644
--- gcc/gimple-fold.c
+++ gcc/gimple-fold.c
@@ -2935,6 +2935,13 @@ fold_internal_goacc_dim (const gimple *call)
 return NULL_TREE;
 
   int axis = get_oacc_ifn_dim_arg (call);
+  if (axis == GOMP_DIM_WORKER)
+{
+  /* libgomp's nvptx plugin might potentially modify
+dims[GOMP_DIM_WORKER].  */
+  return NULL_TREE;
+}
+
   int size = get_oacc_fn_dim_size (current_function_decl, axis);
   bool is_pos = gimple_call_internal_fn (call) == IFN_GOACC_DIM_POS;
   tree result = NULL_TREE;
diff --git gcc/tree-vrp.c gcc/tree-vrp.c
index e6c11e0..a0a78d2 100644
--- gcc/tree-vrp.c
+++ gcc/tree-vrp.c
@@ -3980,6 +3980,7 @@ extract_range_basic (value_range *vr, gimple *stmt)
  break;
case CFN_GOACC_DIM_SIZE:
case CFN_GOACC_DIM_POS:
+ //TODO: is this kosher regarding libgomp's nvptx plugin potentially 
modifying dims[GOMP_DIM_WORKER]?
  /* Optimizing these

Re: [PATCH][GCC][ARM] testcase memset-inline-10.c uses -mfloat-abi=hard but does not check whether target supports it

2016-01-19 Thread Ramana Radhakrishnan

On Thu, Nov 12, 2015 at 3:16 PM, Andre Vieira
 wrote:
> On 12/11/15 15:08, Andre Vieira wrote:
>>
>> Hi,
>>
>>This patch changes the memset-inline-10.c testcase to make sure that
>> it is only compiled for ARM targets that support -mfloat-abi=hard using
>> the fact that all non-thumb1 targets do.
>>
>>This is correct because all targets for which -mthumb causes the
>> compiler to use thumb2 will support the generation of FP instructions.
>>
>>Tested by running regressions for this testcase for various ARM
>> targets.
>>
>>Is this OK to commit?

This is OK - Sorry about the delay in reviewing this.

I'd like to restructure gcc.target/arm if I could at some point to be
more resilient to multilib testing and prevent such long lists of
directives in tests.

regards
Ramana

>>
>>Thanks,
>>Andre Vieira
>>
>> gcc/testsuite/ChangeLog:
>> 2015-11-06  Andre Vieira  
>>
>>  * gcc.target/arm/memset-inline-10.c: Added
>>  dg-require-effective-target arm_thumb2_ok.
>>
> Now with attachment, sorry about that.
>
> Cheers,
> Andre

Re: [PATCH] Fix RTL DSE (PR rtl-optimization/68955, take 2)

2016-01-19 Thread Jakub Jelinek

On Tue, Jan 19, 2016 at 12:43:53PM +0100, Eric Botcazou wrote:
> > 2016-01-19  Jakub Jelinek  
> > 
> > PR rtl-optimization/68955
> > PR rtl-optimization/64557
> > * dse.c (record_store, check_mem_read_rtx): Don't call get_addr
> > here.  Fix up formatting.
> > * alias.c (get_addr): Handle VALUE + CONST_INT.
> 
> VALUE +/- CONST_INT (and actually also CONST_WIDE_INT).

And CONST_DOUBLE.

> > * gcc.dg/torture/pr68955.c: New test.
> 
> OK, if you add "plus or minus an optional constant offset" to the head 
> comment 
> of the get_addr function, thanks.
> 
> Handling CONST_WIDE_INT looks superfluous to me though (as well as MINUS 
> since 
> it's non-canonical but we'd probably better be forgiving for this one).

So shall I take out the CONST_WIDE_INT/CONST_DOUBLE handling and just
check for CONST_INT_P instead of CONST_SCALAR_INT_P ?  I thought it is just
easy thing to handle, though for DSE which cares about addresses it really
does not matter.  Or can I leave it in?
DSE will only care about CONST_INT and +.  For minus, I thought it can be
canonical for the minimum signed value, if it is originally subtracted (not
the case for DSE).

Jakub
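
The point about the minimum signed value can be illustrated with ordinary
integers.  A minimal sketch, assuming a 64-bit constant as a stand-in for an
RTL CONST_INT (real constants are mode-dependent); both helper names are
hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// "x - c" can be rewritten as "x + (-c)" only when -c is representable,
// i.e. when c is not the minimum signed value.
constexpr bool can_rewrite_minus_as_plus(std::int64_t c) {
  return c != std::numeric_limits<std::int64_t>::min();
}

// Offset to use in the PLUS form; only meaningful when
// can_rewrite_minus_as_plus (c) holds.
constexpr std::int64_t negated_offset(std::int64_t c) {
  return -c;
}
```

For every other constant the PLUS form exists, which is why a MINUS with a
constant operand is normally not the canonical form.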

Re: Prune BLOCK_VARs lists in free_lang_data

2016-01-19 Thread Jan Hubicka

Hi,
here is the updated patch. It has the same effect as the former version.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* tree-ssa-live.c (remove_unused_scope_block_p): Also remove
redundant typedefs.
Index: tree-ssa-live.c
===
--- tree-ssa-live.c (revision 232466)
+++ tree-ssa-live.c (working copy)
@@ -470,7 +470,8 @@ remove_unused_scope_block_p (tree scope,
 types in different orders depending on whether debug
 information is being generated.  */
 
-  else if (TREE_CODE (*t) == TYPE_DECL
+  else if ((TREE_CODE (*t) == TYPE_DECL
+   && !DECL_IGNORED_P (*t) && !is_redundant_typedef (*t))
   || debug_info_level == DINFO_LEVEL_NORMAL
   || debug_info_level == DINFO_LEVEL_VERBOSE)
;

Re: [PATCH 1/2] DWARF: process all TYPE_DECL nodes when iterating on scopes

2016-01-19 Thread Pierre-Marie de Rodat


On 01/18/2016 12:18 PM, Richard Biener wrote:

Looking for TYPE_DECL_IS_STUB uses I come along dwarf2out_ignore_block
which you'd need to change as well I think.


That is true, thank you!

First, I’ll answer your last point:

Btw, not sure how you get at the "wrong" debug info gen order, I
can't seem to get at it with a C testcase.

As with the other patch this misses a testcase.


That’s true, sorry for that. I could not come up with a C reproducer either,
so here’s my original Ada testcase: I will include it in the next patch
under gnat.dg.


So the problem I’m trying to address is the following: Record_Type is 
declared in Debug5, and it is referenced by the R parameter in 
Debug5.Process (so far, so good). Now, because debug info for 
Debug5.Process is emitted before the one for Debug5, Record_Type is 
emitted in the global scope and is not relocated under Debug5 afterwards.



-  else if (TREE_CODE (decl_or_origin) == TYPE_DECL
-   && TYPE_DECL_IS_STUB (decl_or_origin))
+  else if (TREE_CODE (decl_or_origin) == TYPE_DECL)
  die = lookup_type_die (TREE_TYPE (decl_or_origin));


But ... I think this change is wrong.  It is supposed to use the _type_ DIE
in case the FE didn't create a proper TYPE_DECL.  So I think what is
maybe missing is

   else if (TREE_CODE (decl_or_origin) == TYPE_DECL)
 die = lookup_decl_die (decl_or_origin);

?  That is, why should we lookup the type if the type-decl isn't a stub?


While I agree that for non-stub TYPE_DECL, it is sound to look for the 
decl itself rather than the type, this does not fix the bug I intended 
to fix. Indeed, when we reach this point for the TYPE_DECL corresponding 
to Record_Type, add_type_attribute just creates a DW_TAG_typedef for it,
and the call it makes to modified_type_die just returns the existing
DW_TAG_record_type (still in the global scope).


So assuming your proposal of the change is correct, should we then use 
TYPE_CONTEXT in order to (potentially) relocate TREE_TYPE (decl) in 
add_type_attribute?


--
Pierre-Marie de Rodat
--  The aim of this test is to check that Ada types appear in the proper
--  context in the debug info.
-- 
--  Checking this directly would be really tedious just scanning for assembly
--  lines, so instead we rely on DWARFv4's .debug_types sections, which must be
--  created only for global-scope types. Checking the number of .debug_types is
--  some hackish way to check that types are output in the proper context (i.e.
--  at global or local scope).
--
--  { dg-options "-g -gdwarf-4 -cargs -fdebug-types-section -dA" }
--  { dg-final { scan-assembler-times "\\(DIE \\(0x\[a-f0-9\]*\\) DW_TAG_type_unit\\)" 0 } }

procedure Debug5 is
   type Array_Type is array (Natural range <>) of Integer;
   type Record_Type (L1, L2 : Natural) is record
  I1 : Integer;
  A1 : Array_Type (1 .. L1);
  I2 : Integer;
  A2 : Array_Type (1 .. L2);
  I3 : Integer;
   end record;

   function Get (L1, L2 : Natural) return Record_Type is
  Result : Record_Type (L1, L2);
   begin
  Result.I1 := 1;
  for I in Result.A1'Range loop
 Result.A1 (I) := I;
  end loop;
  Result.I2 := 2;
  for I in Result.A2'Range loop
 Result.A2 (I) := I;
  end loop;
  Result.I3 := 3;
  return Result;
   end Get;

   R1 : Record_Type := Get (0, 0);
   R2 : Record_Type := Get (1, 0);
   R3 : Record_Type := Get (0, 1);
   R4 : Record_Type := Get (2, 2);

   procedure Process (R : Record_Type) is
   begin
  null;
   end Process;

begin
   Process (R1);
   Process (R2);
   Process (R3);
   Process (R4);
end Debug5;

[C++] Add -fnull-this-pointer

2016-01-19 Thread Jan Hubicka

Hi,
according to Trevor, the assumption about THIS pointer being non-NULL breaks
several bigger C++ packages (definitely including Firefox, but I believe
kdevelop was mentioned, too).  This patch makes the feature controllable
by a dedicated flag.  I am not sure about the default. We now have a ubsan check
for the bug so I would hope the codebases get updated soon, but it did not
happen for Firefox for quite a while despite the fact that Martin Liska reported
it.

This patch defaults to -fno-null-this-pointer, but I would be OK with changing
the default and setting it on only in GCC 6. Main point of the patch is to
avoid need of those packages to be built with -fno-delete-null-pointer-checks
(which still subsumes the flag).

The patch is a bit inconsistent, because the C++ FE will still assume that the
this pointer is non-NULL when expanding multiple inheritance accesses.  We did
this from the very beginning. I do not know the FE well enough to see if it is
easy to change the behaviour here or if it is desired.

Bootstrapped/regtested x86_64-linux.

Honza

* c-family/c.opt (fnull-this-pointer): New flag.
* tree.c (nonnull_arg_p): Honor flag_null_this_pointer.
Index: tree.c
===
--- tree.c  (revision 232553)
+++ tree.c  (working copy)
@@ -14016,6 +14022,7 @@ nonnull_arg_p (const_tree arg)
   /* THIS argument of method is always non-NULL.  */
   if (TREE_CODE (TREE_TYPE (cfun->decl)) == METHOD_TYPE
   && arg == DECL_ARGUMENTS (cfun->decl)
+  && flag_null_this_pointer
   && flag_delete_null_pointer_checks)
 return true;
 
Index: c-family/c.opt
===
--- c-family/c.opt  (revision 232553)
+++ c-family/c.opt  (working copy)
@@ -1321,6 +1321,10 @@ Enum(ivar_visibility) String(public) Val
 EnumValue
 Enum(ivar_visibility) String(package) Value(IVAR_VISIBILITY_PACKAGE)
 
+fnull-this-pointer
+C++ ObjC++ Optimization Report Var(flag_null_this_pointer)
+Allow calling methods of NULL pointer
+
 fnonansi-builtins
 C++ ObjC++ Var(flag_no_nonansi_builtin, 0)
 
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 232553)
+++ doc/invoke.texi (working copy)
@@ -232,6 +232,7 @@ Objective-C and Objective-C++ Dialects}.
 -fobjc-std=objc1 @gol
 -fno-local-ivars @gol
 -fivar-visibility=@r{[}public@r{|}protected@r{|}private@r{|}package@r{]} @gol
+-fnull-this-pointer @gol
 -freplace-objc-classes @gol
 -fzero-link @gol
 -gen-decls @gol
@@ -2361,6 +2362,11 @@ errors if these functions are not inline
 Disable Wpedantic warnings about constructs used in MFC, such as implicit
 int and getting a pointer to member function via non-standard syntax.
 
+@item -fnull-this-pointer
+@opindex fnull-this-pointer
+Disable optimization which take advantage of the fact that calling method
+of @code{NULL} pointer is undefined.
+
 @item -fno-nonansi-builtins
 @opindex fno-nonansi-builtins
 Disable built-in declarations of functions that are not mandated by
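
For readers unfamiliar with the pattern under discussion, here is a minimal
sketch.  The Node type and list_length are hypothetical; the fragile form is
shown only as a comment because calling it through a null pointer is
undefined behaviour, which is exactly what lets the compiler drop the check.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
  int value;
  Node *next;
};

// Fragile pattern (comment only): a member function that tests `this`.
//
//   std::size_t Node::length () const
//   {
//     if (this == nullptr)   // may be removed when `this` is assumed non-NULL
//       return 0;
//     return 1 + next->length ();
//   }
//
// Well-defined alternative: a free function that takes the pointer as an
// ordinary argument, so the null check is never on `this`.
inline std::size_t list_length(const Node *n) {
  std::size_t len = 0;
  for (; n != nullptr; n = n->next)
    ++len;
  return len;
}
```

Code written in the second style does not depend on the flag proposed in this
patch or on -fno-delete-null-pointer-checks.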

Re: [PATCH] Fix RTL DSE (PR rtl-optimization/68955, take 2)

2016-01-19 Thread Eric Botcazou

> So shall I take out the CONST_WIDE_INT/CONST_DOUBLE handling and just
> check for CONST_INT_P instead of CONST_SCALAR_INT_P ?  I thought it is just
> easy thing to handle, though for DSE which cares about addresses it really
> does not matter.  Or can I leave it in?

Your call.

> DSE will only care about CONST_INT and +.  For minus, I thought it can be
> canonical for the minimum signed value, if it is originally subtracted (not
> the case for DSE).

OK.  And var-tracking also manipulates MINUS - CONST_INTs coming from stack 
decrements.

-- 
Eric Botcazou

[chkp] Clear instrumented_version and thunk_info in cgraph_node::reset

2016-01-19 Thread Jan Hubicka

Hi,
this patch makes the code that turns instrumentation thunks into transparent
aliases work.

Bootstrapped/regtested x86_64-linux, will commit it later today.

Honza

* cgraphunit.c (cgraph_node::reset): Clear thunk info and
instrumented_version, too.
Index: cgraphunit.c
===
--- cgraphunit.c(revision 232553)
+++ cgraphunit.c(working copy)
@@ -366,12 +366,14 @@ cgraph_node::reset (void)
   memset (&local, 0, sizeof (local));
   memset (&global, 0, sizeof (global));
   memset (&rtl, 0, sizeof (rtl));
+  memset (&thunk, 0, sizeof (thunk_info));
   analyzed = false;
   definition = false;
   alias = false;
   transparent_alias = false;
   weakref = false;
   cpp_implicit_alias = false;
+  instrumented_version = NULL;
 
   remove_callees ();
   remove_all_references ();

Re: genattrab.c generate switch

2016-01-19 Thread Bernd Schmidt


On 01/18/2016 11:44 PM, Jesper Broge Jørgensen wrote:

I found a formatting tool called uncrustify that comes with a gnu style
config
https://github.com/bengardner/uncrustify/blob/master/etc/gnu-indent.cfg
that needed a few tweaks to produce code that looks like what is already in
gcc/genattrtab.c

The tweaks were:

indent_with_tabs = 2 // instead of 0
sp_func_def_paren = add // instead of remove
sp_func_proto_paren  = add // instead of remove
sp_func_call_paren = add // instead of remove

So now the code should be correctly formatted.


Best to get that right when editing, though. emacs defaults to GNU style 
and other editors can also be tweaked.



Do i send in a new patch or just respond to the old one with the new
changes?


Usually best to send updated patches (as text/plain attachment to avoid 
word-wrapping and other whitespace damage).



I have also followed instructions at
https://gcc.gnu.org/ml/gcc/2003-06/txt00010.txt to get copyright
assignment though i have not yet received a reply.


Ok, we'll have to wait for that.


Bernd

Re: [C++] Add -fnull-this-pointer

2016-01-19 Thread Markus Trippelsdorf

On 2016.01.19 at 13:11 +0100, Jan Hubicka wrote:
> according to Trevor, the assumption about THIS pointer being non-NULL breaks
> several bigger C++ packages (definitely including Firefox, but I believe
> kdevelop was mentioned, too).  This patch makes the feature to be controllable
> by a dedicated flag.  I am not sure about the default. We now have ubsan check
> for the bug so I would hope the codebases to be updated soon, but it did not
> happen for Firefox for quite a while despite the fact that Martin Liska
> reported it.
> 
> This patch defaults to -fno-null-this-pointer, but I would be OK with changing
> the default and setting it on only in GCC 6. Main point of the patch is to
> avoid need of those packages to be built with -fno-delete-null-pointer-checks
> (which still subsumes the flag).

I can confirm that for QT-5, Chromium and Kdevelop this optimization
needs to be disabled. So it looks like a very common issue in large C++
codebases.

-- 
Markus

Re: Prune BLOCK_VARs lists in free_lang_data

2016-01-19 Thread Richard Biener

On Tue, 19 Jan 2016, Jan Hubicka wrote:

> Hi,
> here is the updated patch. It has the same effect as the former version.
> 
> Bootstrapped/regtested x86_64-linux, OK?

But what about the comment?

 We track no
 information on whether given type is used or not, so we have
 to keep them even when not emitting debug information,
 otherwise we may end up remapping variables and their (local)
 types in different orders depending on whether debug
 information is being generated.  */

which suggests that the TYPE_DECLs somehow "order" remapping
of local types and that is somehow important (maybe for VLA
types which refer to locals). OTOH local vars are also
duplicated in order before copying stmts (which may introduce
differences because of seeing debug stmts or not referring to
decls/types).

Richard.

> Honza
> 
>   * tree-ssa-live.c (remove_unused_scope_block_p): Also remove
>   redundant typedefs.
> Index: tree-ssa-live.c
> ===
> --- tree-ssa-live.c   (revision 232466)
> +++ tree-ssa-live.c   (working copy)
> @@ -470,7 +470,8 @@ remove_unused_scope_block_p (tree scope,
>types in different orders depending on whether debug
>information is being generated.  */
>  
> -  else if (TREE_CODE (*t) == TYPE_DECL
> +  else if ((TREE_CODE (*t) == TYPE_DECL
> + && !DECL_IGNORED_P (*t) && !is_redundant_typedef (*t))
>  || debug_info_level == DINFO_LEVEL_NORMAL
>  || debug_info_level == DINFO_LEVEL_VERBOSE)
>   ;
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Fix debug info handling in prepare_shrink_wrap (PR debug/65779)

2016-01-19 Thread Bernd Schmidt


On 01/19/2016 12:33 AM, Jakub Jelinek wrote:

+  if (MAY_HAVE_DEBUG_INSNS)
+{
+  for (dinsn = BB_END (bb); dinsn != insn; dinsn = PREV_INSN (dinsn))
+   if (DEBUG_INSN_P (dinsn))
+ {
+   df_ref use;
+   FOR_EACH_INSN_USE (use, dinsn)
+ if (refers_to_regno_p (dregno, end_dregno,
+DF_REF_REG (use), (rtx *) NULL))
+   dead_debug_add (debug, use, DF_REF_REGNO (use));
+ }
+}
+
   /* At this point we are committed to moving INSN, but let's try to
  move it as far as we can.  */
   do
@@ -363,6 +380,18 @@ move_insn_for_shrink_wrap (basic_block b
  if (!live_edge || EDGE_COUNT (live_edge->dest->preds) > 1)
break;
  next_block = live_edge->dest;
+ if (MAY_HAVE_DEBUG_INSNS)
+   {
+ FOR_BB_INSNS_REVERSE (bb, dinsn)
+   if (DEBUG_INSN_P (dinsn))
+ {
+   df_ref use;
+   FOR_EACH_INSN_USE (use, dinsn)
+ if (refers_to_regno_p (dregno, end_dregno,
+DF_REF_REG (use), (rtx *) NULL))
+   dead_debug_add (debug, use, DF_REF_REGNO (use));
+ }
+   }
}
 }


Is there a way to merge these two blocks (e.g. by moving this to the 
start of the loop and testing for insn or BB_HEAD)?



Bernd

Re: [testsuite][ARM target attributes] Fix effective_target tests

2016-01-19 Thread Kyrill Tkachov


Hi Christophe,

On 04/01/16 14:21, Christophe Lyon wrote:

On 4 January 2016 at 15:20, Christophe Lyon  wrote:

On 18 December 2015 at 15:16, Kyrill Tkachov
 wrote:

Hi Christophe,


On 17/12/15 22:17, Christophe Lyon wrote:

Hi,

Here is an updated version of this patch.
I did test it with
-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard in
addition to my usual set of options.

Compared to the previous version:
- I added some doc in sourcebuild.texi
- I no longer modify arm_vfp_ok...
- I replaced all uses of arm_vfp with the new arm_fp because I found
that the existing tests do not actually need to pass -mfpu=vfp: this
is implicitly set as the default when using -mfloat-abi={softfp|hard}
- I chose not to remove arm_vfp_ok because we may need it in the
future, if a test really needs vfp (as opposed to neon for instance)
- in gcc.target/arm/attr-crypto.c I force the initial fpu to be vfp
via pragma instead, so that the next pragma fpu
fpu=crypto-neon-fp-armv8 is always compatible, regardless of the
command-line options/default fpu
- same for attr-neon2.c and attr-neon3.c
- I updated cmp-2.c, unsigned-float.c, vfp-1.c, vfp-ldmdbd.c,
vfp-ldmdbs.c, vfp-ldmiad.c, vfp-ldmias.c, vfp-stmdbd.c, vfp-stmdbs.c,
vfp-stmiad.c, vfp-stmias.c, vnmul-[1234].c to use the new arm_fp
effective target instead of arm_vfp. This is so that they don't need
to use -mfpu=vfp and can use the new dg-add-options arm_fp
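
For illustration, a test relying on the new effective target might start roughly as follows. This is a minimal sketch: the function and the file content are hypothetical and not taken from the patch; only the arm_fp_ok effective target and the dg-add-options arm_fp directive come from the description above.

```c
/* Hypothetical testcase skeleton; not part of the patch.  */
/* { dg-do compile } */
/* { dg-require-effective-target arm_fp_ok } */
/* { dg-add-options arm_fp } */

/* A trivial floating-point function so the test has something to compile.  */
float
scale (float x)
{
  return x * 2.0f;
}
```

The point of the pairing is that the test no longer names an FPU itself; the options are supplied by the effective-target machinery.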

The validation results show (in addition to what I originally reported):
- attr-crypto.c and attr-neon3.c now ICE in some cases. This is PR68895.
- depending on the GCC configuration (e.g. --with-fpu=neon)
attr-neon3.c may fail. This is PR68896.

OK?


Thanks for following up on this.
I think you also need to document the new arm_crypto_pragma_ok.


Indeed, I forgot it.

Here is a new version of the patch with a few words added to document
this function.
I did not modify the testcase after Christian's comments and
PR68934: my understanding is that the testcases are valid after
all and Christian is working on fixing the ICE.


With the attachment, this time...


This is ok now.
Sorry for the delay.
I believe Christian is working on the ICEs exposed by this, so we should be ok.

Thanks,
Kyrill




2016-01-04  Christophe Lyon  

 * doc/sourcebuild.texi (arm_crypto_pragma_ok): Document new entry.
 (arm_fp_ok): Likewise.
 (arm_fp): Likewise.
 (arm_crypto): Likewise.
 * lib/target-supports.exp
 (check_effective_target_arm_fp_ok_nocache): New.
 (check_effective_target_arm_fp_ok): New.
 (add_options_for_arm_fp): New.
 (check_effective_target_arm_crypto_ok_nocache): Require
 target_arm_v8_neon_ok instead of arm32.
 (check_effective_target_arm_crypto_pragma_ok_nocache): New.
 (check_effective_target_arm_crypto_pragma_ok): New.
 (add_options_for_arm_vfp): New.
 * gcc.target/arm/attr-crypto.c: Use arm_crypto_pragma_ok effective
 target. Do not force -mfloat-abi=softfp, use arm_fp_ok effective
 target instead. Force initial fpu to vfp.
 * gcc.target/arm/attr-neon-builtin-fail.c: Do not force
 -mfloat-abi=softfp, use arm_fp_ok effective target instead.
 * gcc.target/arm/attr-neon-fp16.c: Likewise. Remove arm_neon_ok
 dependency.
 * gcc.target/arm/attr-neon2.c: Do not force -mfloat-abi=softfp,
 use arm_vfp effective target instead. Force initial fpu to vfp.
 * gcc.target/arm/attr-neon3.c: Likewise.
 * gcc.target/arm/cmp-2.c: Use arm_fp_ok effective target instead of
 arm_vfp_ok.
 * gcc.target/arm/unsigned-float.c: Likewise.
 * gcc.target/arm/vfp-1.c: Likewise.
 * gcc.target/arm/vfp-ldmdbd.c: Likewise.
 * gcc.target/arm/vfp-ldmdbs.c: Likewise.
 * gcc.target/arm/vfp-ldmiad.c: Likewise.
 * gcc.target/arm/vfp-ldmias.c: Likewise.
 * gcc.target/arm/vfp-stmdbd.c: Likewise.
 * gcc.target/arm/vfp-stmdbs.c: Likewise.
 * gcc.target/arm/vfp-stmiad.c: Likewise.
 * gcc.target/arm/vfp-stmias.c: Likewise.
 * gcc.target/arm/vnmul-1.c: Likewise.
 * gcc.target/arm/vnmul-2.c: Likewise.
 * gcc.target/arm/vnmul-3.c: Likewise.
 * gcc.target/arm/vnmul-4.c: Likewise.

OK?

Christophe.



Kyrill



Christophe

2015-12-17  Christophe Lyon  

  * doc/sourcebuild.texi (arm_fp_ok): Document new entry.
  (arm_fp): Likewise.
  * lib/target-supports.exp
  (check_effective_target_arm_fp_ok_nocache): New.
  (check_effective_target_arm_fp_ok): New.
  (add_options_for_arm_fp): New.
  (check_effective_target_arm_crypto_ok_nocache): Require
  target_arm_v8_neon_ok instead of arm32.
  (check_effective_target_arm_crypto_pragma_ok_nocache): New.
  (check_effective_target_arm_crypto_pragma_ok): New.
  (add_options_for_arm_vfp): New.
  * gcc.target/arm/attr-crypto.c: Use arm_crypto_pragma_ok effective
  target. Do not force -mfloat-abi=softfp, use arm_fp_ok effective
  target instead. Force initial fpu to vfp.
  * gcc.target/arm/attr-neon-builtin-f

Re: Prune BLOCK_VARs lists in free_lang_data

2016-01-19 Thread Jan Hubicka

> On Tue, 19 Jan 2016, Jan Hubicka wrote:
> 
> > Hi,
> > here is updated patch. It has same effect as the former version.
> > 
> > Bootstrapped/regtested x86_64-linux, OK?
> 
> But what about the comment?
> 
>  We track no
>  information on whether given type is used or not, so we have
>  to keep them even when not emitting debug information,
>  otherwise we may end up remapping variables and their (local)
>  types in different orders depending on whether debug
>  information is being generated.  */
> 
> which suggests that the TYPE_DECLs somehow "order" remapping
> of local types and that is somehow important (maybe for VLA
> types which refer to locals). OTOH local vars are also
> duplicated in order before copying stmts (which may introduce
> differences because of seeing debug stmts or not referring to
> decls/types).

The original patch is here:
https://gcc.gnu.org/ml/gcc-patches/2011-01/msg01344.html
My understanding is that it is all about DECL_UID being stable with -g0
and -g. My patch does not change that because I drop ignored and redundant
typedefs even with -g.

Honza

Re: Prune BLOCK_VARs lists in free_lang_data

2016-01-19 Thread Jan Hubicka

> > On Tue, 19 Jan 2016, Jan Hubicka wrote:
> > 
> > > Hi,
> > > here is updated patch. It has same effect as the former version.
> > > 
> > > Bootstrapped/regtested x86_64-linux, OK?
> > 
> > But what about the comment?
> > 
> >  We track no
> >  information on whether given type is used or not, so we have
> >  to keep them even when not emitting debug information,
> >  otherwise we may end up remapping variables and their (local)
> >  types in different orders depending on whether debug
> >  information is being generated.  */
> > 
> > which suggests that the TYPE_DECLs somehow "order" remapping
> > of local types and that is somehow important (maybe for VLA
> > types which refer to locals). OTOH local vars are also
> > duplicated in order before copying stmts (which may introduce
> > differences because of seeing debug stmts or not referring to
> > decls/types).
> 
> The original patch is here:
> https://gcc.gnu.org/ml/gcc-patches/2011-01/msg01344.html
> My understanding is that it is all about DECL_UID being stable with -g0
> and -g. My patch does not change that because I drop ignored and redundant
> typedefs even with -g.

Alexandre, also your original mail mentions:
> Anyhow, it seems to me that dropping local type decls from lexical
> scopes doesn't buy us much, and even though it is indeed a sledgehammer,
> as richi put it, this fixes the problem, and I can't envision other
> simpler solutions that wouldn't risk running into the problems mentioned
> above, so...

> Regstrapped on x86_64-linux-gnu and i686-pc-linux-gnu.  Ok to install?


> One solution I do envision, which might help, would be to try to figure
> out which types are unused, and discard those.  Say, scan all variables
> within a lexical scope (including nested blocks), deciding which ones
> can be discarded, and then, as we move out of the nesting, we can
> decide, from last to first, which types are unused, and mark as used
> types referenced from retained variables and other types that are used,
> removing those that, when reached during this backward scan, remain
> marked as unused.  Or something along these lines, taking nested
> functions into account, if needed, and anything else I may have missed
> ;-)

Dropping the TYPE_DECLs definitely buys us memory with LTO and Firefox.  How
hard would it be to implement the pruning of dead TYPE_DECLs as you suggest?
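
To make the quoted suggestion concrete, here is a minimal sketch of the backward scan on a toy model. Nothing here is GCC code: struct toy_decl, its fields and prune_unused_types are hypothetical names, and the sketch assumes each decl refers to at most one type and that the type is declared earlier in the same scope.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not GCC data structures: a declaration in one lexical scope.  */
enum decl_kind { DECL_VAR, DECL_TYPE };

struct toy_decl
{
  enum decl_kind kind;
  int refers_to;   /* Index of an earlier type decl this one uses, or -1.  */
  bool retained;   /* For variables: input.  For types: computed below.  */
};

/* Walk the scope from last to first.  Retained variables and used types mark
   the type they refer to as used; a type that is still unmarked when the
   scan reaches it is dropped.  Returns the number of decls kept.  */
size_t
prune_unused_types (struct toy_decl *decls, size_t n)
{
  size_t kept = 0;

  /* Types start out unused; variables keep the caller's decision.  */
  for (size_t i = 0; i < n; i++)
    if (decls[i].kind == DECL_TYPE)
      decls[i].retained = false;

  for (size_t i = n; i-- > 0;)
    {
      if (!decls[i].retained)
        continue;
      kept++;
      if (decls[i].refers_to >= 0)
        decls[decls[i].refers_to].retained = true;
    }
  return kept;
}
```

Because users come after the types they use, a single pass from the end is enough in this model; nested blocks and nested functions, which the quoted text mentions, are not modelled.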

Honza
> 
> Honza

[committed] Readd __tls_get_addr and interceptor_tls_get_addr to libtsan for ABI compatibility (PR sanitizer/68824)

2016-01-19 Thread Jakub Jelinek

Hi!

The fix for this PR landed upstream, so I've cherry picked it, and after
bootstrapping/regtesting it on x86_64-linux last night committed to trunk.
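
The idea of keeping a symbol around purely for ABI compatibility can be sketched in plain C. This is a toy under stated assumptions: real_lookup, init_interceptors and compat_lookup are hypothetical names standing in for the real function, InitializeInterceptors and the retained interceptor; the actual change uses the TSAN_INTERCEPTOR and REAL macros shown in the diff.

```c
#include <stddef.h>

/* Stand-in for the real implementation that the wrapper must reach.
   Everything here is a hypothetical toy.  */
static void *
real_lookup (void *arg)
{
  return arg;
}

/* Pointer resolved at initialisation time, as an interceptor would do.  */
static void *(*real_lookup_ptr) (void *) = NULL;

void
init_interceptors (void)
{
  real_lookup_ptr = real_lookup;
}

/* The symbol kept only for ABI compatibility: no extra work, just forward.  */
void *
compat_lookup (void *arg)
{
  return real_lookup_ptr (arg);
}
```

Callers linked against the old interface keep resolving compat_lookup, while its behaviour is exactly that of the real function.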

2016-01-19  Jakub Jelinek  

PR sanitizer/68824
* tsan/tsan_interceptors.cc (NEED_TLS_GET_ADDR, __tls_get_addr,
InitializeInterceptors): Cherry pick upstream r258119.

--- libsanitizer/tsan/tsan_interceptors.cc.jj   2015-11-23 18:21:10.356619453 +0100
+++ libsanitizer/tsan/tsan_interceptors.cc  2016-01-18 22:36:28.784072137 +0100
@@ -2227,6 +2227,11 @@ static void HandleRecvmsg(ThreadState *t
 // Since the interceptor only initializes memory for msan, the simplest solution
 // is to disable the interceptor in tsan (other sanitizers do not call
 // signal handlers from COMMON_INTERCEPTOR_ENTER).
+// As __tls_get_addr has been intercepted in the past, to avoid breaking
+// libtsan ABI, keep it around, but just call the real function.
+#if SANITIZER_INTERCEPT_TLS_GET_ADDR
+#define NEED_TLS_GET_ADDR
+#endif
 #undef SANITIZER_INTERCEPT_TLS_GET_ADDR
 
 #define COMMON_INTERCEPT_FUNCTION(name) INTERCEPT_FUNCTION(name)
@@ -2446,6 +2451,12 @@ static void syscall_post_fork(uptr pc, i
 
 #include "sanitizer_common/sanitizer_common_syscalls.inc"
 
+#ifdef NEED_TLS_GET_ADDR
+TSAN_INTERCEPTOR(void *, __tls_get_addr, void *arg) {
+  return REAL(__tls_get_addr)(arg);
+}
+#endif
+
 namespace __tsan {
 
 static void finalize(void *arg) {
@@ -2627,6 +2638,10 @@ void InitializeInterceptors() {
   TSAN_INTERCEPT(__cxa_atexit);
   TSAN_INTERCEPT(_exit);
 
+#ifdef NEED_TLS_GET_ADDR
+  TSAN_INTERCEPT(__tls_get_addr);
+#endif
+
 #if !SANITIZER_MAC
   // Need to setup it, because interceptors check that the function is resolved.
   // But atexit is emitted directly into the module, so can't be resolved.

Jakub

Re: Prune BLOCK_VARs lists in free_lang_data

2016-01-19 Thread Jakub Jelinek

On Tue, Jan 19, 2016 at 01:42:00PM +0100, Jan Hubicka wrote:
> > One solution I do envision, which might help, would be to try to figure
> > out which types are unused, and discard those.  Say, scan all variables
> > within a lexical scope (including nested blocks), deciding which ones
> > can be discarded, and then, as we move out of the nesting, we can
> > decide, from last to first, which types are unused, and mark as used
> > types referenced from retained variables and other types that are used,
> > removing those that, when reached during this backward scan, remain
> > marked as unused.  Or something along these lines, taking nested
> > functions into account, if needed, and anything else I may have missed
> > ;-)
> 
> Dropping the TYPE_DECLs definitely buys us memory with LTO and Firefox.  How
> hard would it be to implement the pruning of dead TYPE_DECLs as you suggest?

The problem might be if some types are only referenced from unused variables
for which we emit just debug stmts but no other references to them.
Or would we not prune in that case (i.e. only prune types if no variables
before unused locals pruning refer to those types)?

Jakub

Re: Prune BLOCK_VARs lists in free_lang_data

2016-01-19 Thread Richard Biener

On Tue, 19 Jan 2016, Jan Hubicka wrote:

> > On Tue, 19 Jan 2016, Jan Hubicka wrote:
> > 
> > > Hi,
> > > here is updated patch. It has same effect as the former version.
> > > 
> > > Bootstrapped/regtested x86_64-linux, OK?
> > 
> > But what about the comment?
> > 
> >  We track no
> >  information on whether given type is used or not, so we have
> >  to keep them even when not emitting debug information,
> >  otherwise we may end up remapping variables and their (local)
> >  types in different orders depending on whether debug
> >  information is being generated.  */
> > 
> > which suggests that the TYPE_DECLs somehow "order" remapping
> > of local types and that is somehow important (maybe for VLA
> > types which refer to locals). OTOH local vars are also
> > duplicated in order before copying stmts (which may introduce
> > differences because of seeing debug stmts or not referring to
> > decls/types).
> 
> The original patch is here:
> https://gcc.gnu.org/ml/gcc-patches/2011-01/msg01344.html
> My understanding is that it is all about DECL_UID being stable with -g0
> and -g. My patch does not change that because I drop ignored and redundant
> typedefs even with -g.

Yes, I see that but the comment suggests there are 2nd-order effects
because those TYPE_DECLs you remove serve as "ordering" point for
referred types and decls during inlining (as we remap blocks and
their decls and referred-to vars first).

The thing that would make the order -g dependent would be remapping
a type or decl when remapping a debug stmt.  Not sure if anything
else is required to refer to them "first" (local vars for example).
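
The ordering concern can be illustrated with a toy remapper that hands out new UIDs in first-seen order. This is only a sketch of the reasoning, not of the inliner: struct remapper, remap and uid_of_decl1 are hypothetical, and the "debug stmt" is modelled as nothing more than an extra early reference that exists only with -g.

```c
#include <stdbool.h>

#define N_DECLS 3

/* Toy remapper: hands out new UIDs in the order decls are first seen.  */
struct remapper
{
  int next_uid;
  int new_uid[N_DECLS];   /* 0 means "not remapped yet".  */
};

static int
remap (struct remapper *r, int decl)
{
  if (r->new_uid[decl] == 0)
    r->new_uid[decl] = ++r->next_uid;
  return r->new_uid[decl];
}

/* Simulate copying a body.  If PREMAP_FROM_BLOCK, all decls are remapped in
   block order first, so later references cannot change the numbering.
   If WITH_DEBUG, a debug stmt refers to decl 2 before a real stmt refers to
   decl 1.  Returns the UID decl 1 ends up with.  */
int
uid_of_decl1 (bool premap_from_block, bool with_debug)
{
  struct remapper r = { 0, { 0 } };

  if (premap_from_block)
    for (int d = 0; d < N_DECLS; d++)
      remap (&r, d);

  if (with_debug)
    remap (&r, 2);   /* Only present when compiling with -g.  */
  remap (&r, 1);
  return r.new_uid[1];
}
```

In this model a decl that is listed in the block gets its number before any statement is looked at, so the result is the same with and without the debug reference; a decl that is reachable only through statements gets a number that depends on whether the debug reference came first.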

Richard.

Re: [PATCH] Fix debug info handling in prepare_shrink_wrap (PR debug/65779)

2016-01-19 Thread Jakub Jelinek

On Tue, Jan 19, 2016 at 01:27:32PM +0100, Bernd Schmidt wrote:
> Is there a way to merge these two blocks (e.g. by moving this to the start
> of the loop and testing for insn or BB_HEAD)?

Sure, like this?

2016-01-19  Jakub Jelinek  

PR debug/65779
* shrink-wrap.c: Include valtrack.h.
(move_insn_for_shrink_wrap): Add DEBUG argument.  If
MAY_HAVE_DEBUG_INSNS, call dead_debug_add on DEBUG_INSNs
in between insn and where it will be moved to.  Call
dead_debug_insert_temp.
(prepare_shrink_wrap): Adjust caller.  Call dead_debug_local_init
first and dead_debug_local_finish at the end.
For uses and defs bitmap, handle all regs in between REGNO and
END_REGNO, not just the first one.

* gcc.dg/pr65779.c: New test.
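
The last item of the shrink-wrap.c entry (handling every register between REGNO and END_REGNO) can be pictured with a toy register set. This is a sketch under assumptions: toy_reg_set, struct toy_reg and both functions are hypothetical, and the set is limited to 64 registers; GCC's HARD_REG_SET and END_REGNO are the real counterparts.

```c
#include <stdint.h>

/* Toy hard register set: one bit per register, at most 64 registers.  */
typedef uint64_t toy_reg_set;

struct toy_reg
{
  unsigned regno;   /* First hard register occupied.  */
  unsigned nregs;   /* Number of consecutive hard registers occupied.  */
};

/* Record only the first register (the pre-patch behaviour).  */
toy_reg_set
mark_first_only (toy_reg_set set, struct toy_reg x)
{
  return set | (UINT64_C (1) << x.regno);
}

/* Record every register from regno up to, but not including, the end.
   Assumes regno + nregs does not exceed 64.  */
toy_reg_set
mark_all (toy_reg_set set, struct toy_reg x)
{
  for (unsigned i = x.regno; i < x.regno + x.nregs; i++)
    set |= UINT64_C (1) << i;
  return set;
}
```

For a value occupying two registers starting at register 4, the first function records only register 4, while the second records registers 4 and 5.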

--- gcc/shrink-wrap.c.jj        2016-01-19 09:20:22.525791441 +0100
+++ gcc/shrink-wrap.c   2016-01-19 13:56:18.256816084 +0100
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.
 #include "shrink-wrap.h"
 #include "regcprop.h"
 #include "rtl-iter.h"
+#include "valtrack.h"
 
 
 /* Return true if INSN requires the stack frame to be set up.
@@ -149,7 +150,8 @@ static bool
 move_insn_for_shrink_wrap (basic_block bb, rtx_insn *insn,
   const HARD_REG_SET uses,
   const HARD_REG_SET defs,
-  bool *split_p)
+  bool *split_p,
+  struct dead_debug_local *debug)
 {
   rtx set, src, dest;
   bitmap live_out, live_in, bb_uses, bb_defs;
@@ -158,6 +160,8 @@ move_insn_for_shrink_wrap (basic_block b
   unsigned int end_sregno = FIRST_PSEUDO_REGISTER;
   basic_block next_block;
   edge live_edge;
+  rtx_insn *dinsn;
+  df_ref def;
 
   /* Look for a simple register assignment.  We don't use single_set here
  because we can't deal with any CLOBBERs, USEs, or REG_UNUSED secondary
@@ -302,6 +306,20 @@ move_insn_for_shrink_wrap (basic_block b
  move it as far as we can.  */
   do
 {
+  if (MAY_HAVE_DEBUG_INSNS)
+   {
+ FOR_BB_INSNS_REVERSE (bb, dinsn)
+   if (DEBUG_INSN_P (dinsn))
+ {
+   df_ref use;
+   FOR_EACH_INSN_USE (use, dinsn)
+ if (refers_to_regno_p (dregno, end_dregno,
+DF_REF_REG (use), (rtx *) NULL))
+   dead_debug_add (debug, use, DF_REF_REGNO (use));
+ }
+   else if (dinsn == insn)
+ break;
+   }
   live_out = df_get_live_out (bb);
   live_in = df_get_live_in (next_block);
   bb = next_block;
@@ -384,6 +402,12 @@ move_insn_for_shrink_wrap (basic_block b
SET_REGNO_REG_SET (bb_uses, i);
 }
 
+  /* Insert debug temps for dead REGs used in subsequent debug insns.  */
+  if (debug->used && !bitmap_empty_p (debug->used))
+FOR_EACH_INSN_DEF (def, insn)
+  dead_debug_insert_temp (debug, DF_REF_REGNO (def), insn,
+ DEBUG_TEMP_BEFORE_WITH_VALUE);
+
   emit_insn_after (PATTERN (insn), bb_note (bb));
   delete_insn (insn);
   return true;
@@ -404,6 +428,8 @@ prepare_shrink_wrap (basic_block entry_b
   HARD_REG_SET uses, defs;
   df_ref def, use;
   bool split_p = false;
+  unsigned int i;
+  struct dead_debug_local debug;
 
   if (JUMP_P (BB_END (entry_block)))
 {
@@ -414,19 +440,22 @@ prepare_shrink_wrap (basic_block entry_b
   copyprop_hardreg_forward_bb_without_debug_insn (entry_block);
 }
 
+  dead_debug_local_init (&debug, NULL, NULL);
   CLEAR_HARD_REG_SET (uses);
   CLEAR_HARD_REG_SET (defs);
+
   FOR_BB_INSNS_REVERSE_SAFE (entry_block, insn, curr)
 if (NONDEBUG_INSN_P (insn)
&& !move_insn_for_shrink_wrap (entry_block, insn, uses, defs,
-  &split_p))
+  &split_p, &debug))
   {
/* Add all defined registers to DEFs.  */
FOR_EACH_INSN_DEF (def, insn)
  {
x = DF_REF_REG (def);
if (REG_P (x) && HARD_REGISTER_P (x))
- SET_HARD_REG_BIT (defs, REGNO (x));
+ for (i = REGNO (x); i < END_REGNO (x); i++)
+   SET_HARD_REG_BIT (defs, i);
  }
 
/* Add all used registers to USESs.  */
@@ -434,9 +463,12 @@ prepare_shrink_wrap (basic_block entry_b
  {
x = DF_REF_REG (use);
if (REG_P (x) && HARD_REGISTER_P (x))
- SET_HARD_REG_BIT (uses, REGNO (x));
+ for (i = REGNO (x); i < END_REGNO (x); i++)
+   SET_HARD_REG_BIT (uses, i);
  }
   }
+
+  dead_debug_local_finish (&debug, NULL);
 }
 
 /* Return whether basic block PRO can get the prologue.  It can not if it
--- gcc/testsuite/gcc.dg/pr65779.c.jj   2016-01-19 13:53:13.534358036 +0100
+++ gcc/testsuite/gcc.dg/pr65779.c  2016-01-19 13:53:13.534358036 +0100
@@ -0,0 +1,42 @@
+/* PR debug/65779 */
+/* { dg-do assemble } */
+/* { dg-options "-O2 -fcompare-debug" } */

Re: [PATCH] Fix debug info handling in prepare_shrink_wrap (PR debug/65779)

2016-01-19 Thread Bernd Schmidt


On 01/19/2016 02:08 PM, Jakub Jelinek wrote:

On Tue, Jan 19, 2016 at 01:27:32PM +0100, Bernd Schmidt wrote:

Is there a way to merge these two blocks (e.g. by moving this to the start
of the loop and testing for insn or BB_HEAD)?


Sure, like this?


That's ok. I'm assuming you know best how to use the dead_debug stuff.


Bernd

Re: [C++] Add -fnull-this-pointer

2016-01-19 Thread Trevor Saunders

On Tue, Jan 19, 2016 at 01:11:44PM +0100, Jan Hubicka wrote:
> Hi,
> according to Trevor, the assumption about THIS pointer being non-NULL breaks

That was Markus, not me.

> several bigger C++ packages (definitely including Firefox, but I believe
> kdevelop was mentioned, too).  This patch makes the feature controllable
> by a dedicated flag.  I am not sure about the default. We now have a ubsan check
> for the bug so I would hope the codebases will be updated soon, but it did not
> happen for Firefox for quite a while despite the fact that Martin Liska reported
> it.
> 
> This patch defaults to -fno-null-this-pointer, but I would be OK with changing

FWIW I find the naming a bit confusing; maybe I'm just tired, but it takes
some puzzling for me to know which way is being strict and which way is
allowing this.

> the default and setting it on only in GCC 6. Main point of the patch is to
> avoid need of those packages to be built with -fno-delete-null-pointer-checks
> (which still subsumes the flag).

Personally I'd rather try and be strict.  I suspect it often will be
easy to find and fix the bugs when the optimization is enabled.  Of
course if some projects don't care they can pass flags themselves.

Trev

> 
> The patch is a bit inconsistent, because the C++ FE will still assume that the
> this pointer is non-NULL when expanding multiple inheritance accesses.  We did
> this from the very beginning. I do not know the FE enough to see if it is easy
> to change the behaviour here or if it is desired.
> 
> Bootstrapped/regtested x86_64-linux.
> 
> Honza
> 
>   * c-family/c.opt (fnull-this-pointer): New flag.
>   (nonnull_arg_p): Honor flag_null_this_pointer.
> Index: tree.c
> ===
> --- tree.c(revision 232553)
> +++ tree.c(working copy)
> @@ -14016,6 +14022,7 @@ nonnull_arg_p (const_tree arg)
>/* THIS argument of method is always non-NULL.  */
>if (TREE_CODE (TREE_TYPE (cfun->decl)) == METHOD_TYPE
>&& arg == DECL_ARGUMENTS (cfun->decl)
> +  && flag_null_this_pointer
>&& flag_delete_null_pointer_checks)
>  return true;
>  
> Index: c-family/c.opt
> ===
> --- c-family/c.opt(revision 232553)
> +++ c-family/c.opt(working copy)
> @@ -1321,6 +1321,10 @@ Enum(ivar_visibility) String(public) Val
>  EnumValue
>  Enum(ivar_visibility) String(package) Value(IVAR_VISIBILITY_PACKAGE)
>  
> +fnull-this-pointer
> +C++ ObjC++ Optimization Report Var(flag_null_this_pointer)
> +Allow calling methods of NULL pointer
> +
>  fnonansi-builtins
>  C++ ObjC++ Var(flag_no_nonansi_builtin, 0)
>  
> Index: doc/invoke.texi
> ===
> --- doc/invoke.texi   (revision 232553)
> +++ doc/invoke.texi   (working copy)
> @@ -232,6 +232,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fobjc-std=objc1 @gol
>  -fno-local-ivars @gol
>  -fivar-visibility=@r{[}public@r{|}protected@r{|}private@r{|}package@r{]} @gol
> +-fnull-this-pointer @gol
>  -freplace-objc-classes @gol
>  -fzero-link @gol
>  -gen-decls @gol
> @@ -2361,6 +2362,11 @@ errors if these functions are not inline
>  Disable Wpedantic warnings about constructs used in MFC, such as implicit
>  int and getting a pointer to member function via non-standard syntax.
>  
> +@item -fnull-this-pointer
> +@opindex fnull-this-pointer
> +Disable optimization which take advantage of the fact that calling method
> +of @code{NULL} pointer is undefined.
> +
>  @item -fno-nonansi-builtins
>  @opindex fno-nonansi-builtins
>  Disable built-in declarations of functions that are not mandated by

Re: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c

2016-01-19 Thread Alan Lawrence


On 19/01/16 09:46, Christophe Lyon wrote:

On 19 January 2016 at 04:05, H.J. Lu  wrote:

On Thu, Dec 24, 2015 at 3:55 AM, Alan Lawrence  wrote:

This version changes the test cases to fix failures on some platforms, by
rewriting the initializers so that they aren't pushed out to the constant pool.

gcc/ChangeLog:

 * tree-ssa-scopedtables.c (avail_expr_hash): Hash MEM_REF and ARRAY_REF
 using get_ref_base_and_extent.
 (equal_mem_array_ref_p): New.
 (hashable_expr_equal_p): Add call to previous.



This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69352



Hi Alan,

This patch also caused regressions on arm-none-linux-gnueabihf
with GCC configured as:
--with-thumb --with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8

These tests now fail:
gcc.dg/torture/pr61742.c   -O2  (test for excess errors)
gcc.dg/torture/pr61742.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
gcc.dg/torture/pr61742.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
gcc.dg/torture/pr61742.c   -O3 -g  (test for excess errors)



Hmm, I still see these passing, both natively on arm-none-linux-gnueabihf and 
with a cross-build. hf implies --with-float=hard, right? Do you see what the 
error messages are?


Thanks, Alan

Re: [PATCH][RFC][Offloading] Fix PR68463

2016-01-19 Thread Ilya Verbin

On Tue, Jan 19, 2016 at 10:36:28 +0100, Jakub Jelinek wrote:
> On Tue, Jan 19, 2016 at 09:57:01AM +0100, Richard Biener wrote:
> > On Mon, 18 Jan 2016, Ilya Verbin wrote:
> > > On Fri, Jan 15, 2016 at 09:15:01 +0100, Richard Biener wrote:
> > > > On Fri, 15 Jan 2016, Ilya Verbin wrote:
> > > > > II) The __offload_func_table, __offload_funcs_end, __offload_var_table,
> > > > > __offload_vars_end are now provided by the linker script, instead of
> > > > > crtoffload{begin,end}.o, this allows to surround all offload objects, even
> > > > > those that are not claimed by lto-plugin.
> > > > > Unfortunately it works only with ld, but doesn't work with gold, because
> > > > > https://sourceware.org/bugzilla/show_bug.cgi?id=15373
> > > > > Any thoughts how to enable this linker script for gold?
> > > > 
> > > > The easiest way would probably to add this handling to the default
> > > > "linker script" in gold.  I don't see an easy way around requiring
> > > > changes to gold here - maybe dumping the default linker script from
> > > > bfd and injecting the rules with some scripting so you have a complete
> > > > script.  Though likely gold won't grok that result.
> > > > 
> > > > Really a question for Ian though.
> > > 
> > > Or the gcc driver can add crtoffload{begin,end}.o, but the problem is that it
> > > can't determine whether the program contains offloading or not.  So it can add
> > > them to all -fopenmp/-fopenacc programs, if the compiler was configured with
> > > --enable-offload-targets=...  The overhead would be about 340 bytes for
> > > binaries which don't use offloading.  Is this acceptable?  (Jakub?)
> > 
> > Can lto-wrapper add them as plugin outputs?  Or does that wreck ordering?

Currently it's implemented this way, but it will not work after my patch,
because e.g. offload-without-lto.o and offload-with-lto.o will be linked in
this order:
offload-without-lto.o, crtoffloadbegin.o, offload-with-lto.o, crtoffloadend.o
^
(will not be claimed by the plugin)

But we need this one:
crtoffloadbegin.o, offload-without-lto.o, offload-with-lto.o, crtoffloadend.o
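
Why the order matters can be shown with a toy model of the link line. This is only an illustration: objects_inside_table is a hypothetical helper, and the model simply assumes that table entries contributed by an object are visible only if that object is placed between the begin and end markers.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of a link line: each string is one object file, in link order.
   Entries contributed by objects between the begin and end markers are
   the only ones visible through the table bounds.  */
size_t
objects_inside_table (const char *const *link_order, size_t n)
{
  size_t count = 0;
  int inside = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (strcmp (link_order[i], "crtoffloadbegin.o") == 0)
        inside = 1;
      else if (strcmp (link_order[i], "crtoffloadend.o") == 0)
        inside = 0;
      else if (inside)
        count++;
    }
  return count;
}
```

With the first order listed above only one of the two offload objects ends up inside the table; with the second order both do.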

> Yeah, if that would work, it would be certainly appreciated, one thing is
> wasting .text space and relocations in all -fopenmp programs (for -fopenacc
> programs one kind of assumes there will be some offloading in there),
> another one some extra constructor/destructor or what that would be even
> worse.

They contain only 5 symbols, without constructors/destructors.

  -- Ilya

Re: [PATCH] Fix ICE with asm "m" (stmt-expr) operand (PR middle-end/67653)

2016-01-19 Thread Jakub Jelinek

On Tue, Jan 19, 2016 at 10:00:00AM +0100, Richard Biener wrote:
> On Tue, 19 Jan 2016, Jakub Jelinek wrote:
> > Here is an attempt to fix ICE on statement expression in "m" asm input
> > operand.  The problem is that gimplify_asm_expr attempts to mark it
> > addressable, but that can be just too late, a temporary the stmt-expression
> > gimplifies to might not be addressable and may be used already in the
> > gimplified code.  Normally the C/C++ FEs attempt to mark the operand
> > addressable already, but in case of statement expression the temporaries
> > might not exist yet.
> > The patch turns also the PR29119 testcase into invalid test, but you've
> > already said in that PR it should be invalid and I agree with that.
> 
> Hmm, but can't we detect this in the FE?

We could diagnose a statement expression in "m", but not sure if that is all
that can get wrong, or if all statement expressions are problematic.

> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> What happens if we just do _not_ mark the memory input addressable?
> Shouldn't IRA/LRA in the end satisfy the constraint by spilling
> a non-memory input and using the spill slot?

Well, if you want to make broken testcases work, it is always possible
to call say prepare_gimple_addressable, but I'd think it is preferable
to tell people that what they do is really going to do something different
from what they expect (that the operand, while being a memory input, will
be some temporary containing a copy of the value rather than the
variable itself).

Jakub

Re: [PATCH 2/4] Equate MEM_REFs and ARRAY_REFs in tree-ssa-scopedtables.c

2016-01-19 Thread Christophe Lyon

On 19 January 2016 at 14:22, Alan Lawrence  wrote:
> On 19/01/16 09:46, Christophe Lyon wrote:
>>
>> On 19 January 2016 at 04:05, H.J. Lu  wrote:
>>>
>>> On Thu, Dec 24, 2015 at 3:55 AM, Alan Lawrence 
>>> wrote:

>>>> This version changes the test cases to fix failures on some platforms, by
>>>> rewriting the initializers so that they aren't pushed out to the
>>>> constant pool.
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>  * tree-ssa-scopedtables.c (avail_expr_hash): Hash MEM_REF and ARRAY_REF
>>>>  using get_ref_base_and_extent.
>>>>  (equal_mem_array_ref_p): New.
>>>>  (hashable_expr_equal_p): Add call to previous.

>>>
>>> This caused:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69352
>>>
>>
>> Hi Alan,
>>
>> This patch also caused regressions on arm-none-linux-gnueabihf
>> with GCC configured as:
>> --with-thumb --with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8
>>
>> These tests now fail:
>> gcc.dg/torture/pr61742.c   -O2  (test for excess errors)
>> gcc.dg/torture/pr61742.c   -O2 -flto -fno-use-linker-plugin
>> -flto-partition=none  (test for excess errors)
>> gcc.dg/torture/pr61742.c   -O3 -fomit-frame-pointer -funroll-loops
>> -fpeel-loops -ftracer -finline-functions  (test for excess errors)
>> gcc.dg/torture/pr61742.c   -O3 -g  (test for excess errors)
>>
>
> Hmm, I still see these passing, both natively on arm-none-linux-gnueabihf
> and with a cross-build. hf implies --with-float=hard, right? Do you see what
> the error messages are?
>

Ha! gas complains that "IT blocks containing 32-bit Thumb instructions
are deprecated in ARMv8"

This is PR67591:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67591

So, not related to your patch in fact. Sorry for the noise.

Christophe.

> Thanks, Alan
>

Re: [RFC] [nvptx] Try to cope with cuLaunchKernel returning CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES

2016-01-19 Thread Alexander Monakov

On Tue, 19 Jan 2016, Thomas Schwinge wrote:

> Hi!
> 
> With nvptx offloading, in one OpenACC test case, we're running into the
> following fatal error (GOMP_DEBUG=1 output):
> 
> [...]
> info: Function properties for 'LBM_performStreamCollide$_omp_fn$0':
> info: used 87 registers, 0 stack, 8 bytes smem, 328 bytes cmem[0], 80 bytes cmem[2], 0 bytes lmem
> [...]
>   nvptx_exec: kernel LBM_performStreamCollide$_omp_fn$0: launch gangs=32, workers=32, vectors=32
> 
> libgomp: cuLaunchKernel error: too many resources requested for launch
> 
> Very likely this means that the number of registers used in this function
> ("used 87 registers"), multiplied by the thread block size (workers *
> vectors, "workers=32, vectors=32"), exceeds the hardware maximum.

Yes, today most CUDA GPUs allow 64K regs per block, some allow 32K, so
87*32*32 = 89088 definitely overflows that limit.  A reference is available in
the CUDA C Programming Guide, appendix G, table 13:
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities
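
The arithmetic can be written down as a small helper. This is a sketch, not plugin code: regs_per_block and max_workers are hypothetical names, and the model ignores any allocation granularity the hardware may apply on top of the plain product.

```c
/* Registers needed by one thread block: registers per thread times the
   number of threads (workers * vectors) in the block.  */
unsigned long
regs_per_block (unsigned regs_per_thread, unsigned workers, unsigned vectors)
{
  return (unsigned long) regs_per_thread * workers * vectors;
}

/* Largest worker count that stays within REG_LIMIT registers per block for
   the given vector length.  Returns 0 if not even one worker fits or if the
   inputs are degenerate.  */
unsigned
max_workers (unsigned regs_per_thread, unsigned vectors,
             unsigned long reg_limit)
{
  unsigned long per_worker = (unsigned long) regs_per_thread * vectors;
  return per_worker ? (unsigned) (reg_limit / per_worker) : 0;
}
```

For the kernel in the log, 87 registers with 32 workers and 32 vectors needs 89088 registers, above a 64K (65536) limit; under this simple model at most 23 workers of 32 vectors would fit in 64K.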
 
> (One problem certainly might be that we're currently not doing any
> register allocation for nvptx, as far as I remember based on the idea
> that PTX is only a "virtual ISA", and the PTX JIT compiler would "fix
> this up" for us -- which I'm not sure it actually is doing?)

(well, if you want I can point out that
 1) GCC never emits launch bounds so PTX JIT has to guess limits -- that's
 something I'd like to play with in the future, time permitting
 2) OpenACC register copying at forks increases (pseudo-)register pressure
 3) I think if you inspect PTX code you'll see it used way more than 87 regs)

As for the proposed patch, does the OpenACC spec leave the implementation
freedom to spawn a different number of workers than requested?  (honest
question -- I didn't look at the spec that closely)

> Alternatively/additionally, we could try experimenting with using the
> following of enum CUjit_option "Online compiler and linker options":
[snip]
> ..., to have the PTX JIT reduce the number of live registers (if
> possible; I don't know), and/or could try experimenting with querying the
> active device, enum CUdevice_attribute "Device properties":
> 
> [...]
> CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_BLOCK = 12
> Maximum number of 32-bit registers available per block 
> [...]
> 
> ..., and use that in combination with each function's enum
> CUfunction_attribute "Function properties":
[snip]
> ... to determine an optimal number of threads per block given the number
> of registers (maybe just querying CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK
> would do that already?).

I have implemented that for OpenMP offloading, but also, since CUDA 6.0, there's
a cuOcc* (occupancy query) interface that lets you simply ask the driver about
the per-function launch limit.

Thanks.
Alexander

Re: [RFC] [nvptx] Try to cope with cuLaunchKernel returning CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES

2016-01-19 Thread Nathan Sidwell


On 01/19/16 06:49, Thomas Schwinge wrote:

> (One problem certainly might be that we're currently not doing any
> register allocation for nvptx, as far as I remember based on the idea
> that PTX is only a "virtual ISA", and the PTX JIT compiler would "fix
> this up" for us -- which I'm not sure it actually is doing?)

My understanding is that the JIT compiler does register allocation.

> int axis = get_oacc_ifn_dim_arg (call);
> +  if (axis == GOMP_DIM_WORKER)
> +{
> +  /* libgomp's nvptx plugin might potentially modify
> +dims[GOMP_DIM_WORKER].  */
> +  return NULL_TREE;
> +}

this is almost certainly wrong.  You're preventing constant folding in the
compiler.


nathan

Re: C++ PATCH for c++/68586 (rejects-valid with enum in C++11)

2016-01-19 Thread Jason Merrill


OK, thanks.

Jason

RFA: MIPS: Fix race condition causing PR 69129

2016-01-19 Thread Nick Clifton

Hi Catherine, Hi Eric, Hi Matthew,

  GCC PR 69129 reports a problem with the MIPS backend:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69129

  I traced the problem down to a race condition in
  mips_compute_frame_info.  This calls mips_global_pointer, which
  through a torturous chain of inferior calls can end up with
  mips_get_cprestore_base_and_offset trying to use the information in
  the frame structure which has yet to be computed...

  The attached patch fixes the problem by moving the initialisation of
  the global_pointer field in the frame structure to after the args_size
  and hard_frame_pointer_offset fields have been initialised.

  Tested with no regressions on a mipsisa32-elf toolchain.  (I know that
  there are lots of different possible mips configurations.  I was not
  sure which one(s) I should test, so I chose one at random).

  OK to apply ?

Cheers
  Nick

gcc/ChangeLog
2016-01-19  Nick Clifton  

PR target/69129
* config/mips/mips.c (mips_compute_frame_info): Move the
initialisation of the global_pointer field to after the
initialisation of the hard_frame_pointer_offset and args_size
fields.

gcc/testsuite/ChangeLog
2016-01-19  Nick Clifton  

PR target/69129
* gcc.target/mips/pr69129.c: New testcase.

Index: gcc/config/mips/mips.c
===
--- gcc/config/mips/mips.c	(revision 232560)
+++ gcc/config/mips/mips.c	(working copy)
@@ -10347,8 +10347,6 @@
   memset (frame, 0, sizeof (*frame));
   size = get_frame_size ();
 
-  cfun->machine->global_pointer = mips_global_pointer ();
-
   /* The first two blocks contain the outgoing argument area and the $gp save
  slot.  This area isn't needed in leaf functions.  We can also skip it
  if we know that none of the called functions will use this space.
@@ -10375,6 +10373,26 @@
   frame->args_size = crtl->outgoing_args_size;
   frame->cprestore_size = MIPS_GP_SAVE_AREA_SIZE;
 }
+
+  /* MIPS16 code offsets the frame pointer by the size of the outgoing
+ arguments.  This tends to increase the chances of using unextended
+ instructions for local variables and incoming arguments.  */
+  if (TARGET_MIPS16)
+frame->hard_frame_pointer_offset = frame->args_size;
+
+  /* PR 69129: Beware of a possible race condition.  mips_global_pointer
+ might call mips_cfun_has_inflexible_gp_ref_p which in turn can call
+ mips_find_gp_ref which will iterate over the current insn sequence.
+ If any of these insns use the cprestore_save_slot_operand or
+ cprestore_load_slot_operand predicates in order to be recognised then
+ they will call mips_cprestore_address_p which calls
+ mips_get_cprestore_base_and_offset which expects the frame information
+ to be filled in...  In fact mips_get_cprestore_base_and_offset only
+ needs the args_size and hard_frame_pointer_offset fields to be filled
+ in, which is why the global_pointer field is initialised here and not
+ earlier.  */
+  cfun->machine->global_pointer = mips_global_pointer ();
+
   offset = frame->args_size + frame->cprestore_size;
 
   /* Move above the local variables.  */
@@ -10520,12 +10538,6 @@
 frame->acc_save_offset = frame->acc_sp_offset - offset;
   if (frame->num_cop0_regs > 0)
 frame->cop0_save_offset = frame->cop0_sp_offset - offset;
-
-  /* MIPS16 code offsets the frame pointer by the size of the outgoing
- arguments.  This tends to increase the chances of using unextended
- instructions for local variables and incoming arguments.  */
-  if (TARGET_MIPS16)
-frame->hard_frame_pointer_offset = frame->args_size;
 }
 
 /* Return the style of GP load sequence that is being used for the
--- /dev/null	2016-01-19 08:08:42.474962807 +
+++ gcc/testsuite/gcc.target/mips/pr69129.c	2016-01-19 13:18:58.554155529 +
@@ -0,0 +1,29 @@
+_Noreturn void fn1 (int) __attribute__((__visibility__("hidden")));
+
+void
+fn2 (void *p1)
+{
+  int a[7];
+  float *b;
+  int c, n;
+
+  if (c != p1) /* { dg-warning "comparison between pointer and integer" } */
+fn1 (1);
+
+  n = 0;
+  for (; c; n++)
+{
+  int d;
+  if (a[n] != d)
+	fn1(n);
+}
+
+  b = p1;
+
+  while (1)
+{
+  *b = 3.40282347e38f;
+  if (a[0])
+	return;
+}
+}

Re: [RFC] [nvptx] Try to cope with cuLaunchKernel returning CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES

2016-01-19 Thread Alexander Monakov

On Tue, 19 Jan 2016, Alexander Monakov wrote:
> > ... to determine an optimal number of threads per block given the number
> > of registers (maybe just querying CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK
> > would do that already?).
> 
> I have implemented that for OpenMP offloading, but also since CUDA 6.0 there's
> cuOcc* (occupancy query) interface that allows to simply ask the driver about
> the per-function launch limit.

Sorry, I should have mentioned that CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK is
indeed sufficient for limiting threads per block, which is trivially
translatable into workers per gang in OpenACC.  IMO it's also a cleaner
approach in this case, compared to iterative backoff (if, again, the
implementation is free to do that).
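
A minimal sketch of that translation, under stated assumptions: the function
name is hypothetical and this is not libgomp code.  In a real plugin the
limit would come from cuFuncGetAttribute with
CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK; here it is a plain parameter so the
sketch builds without CUDA:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: clamp the requested number of workers per gang so
// that workers * vectors does not exceed the per-function thread limit.
int clamp_workers(int requested_workers, int vectors,
                  int max_threads_per_block) {
    assert(requested_workers > 0 && vectors > 0);
    // A thread block holds workers * vectors threads.
    int fitting_workers = max_threads_per_block / vectors;
    // Never return zero workers; a kernel whose limit is below one full
    // vector cannot be rescued by shrinking the worker count alone.
    return std::max(1, std::min(requested_workers, fitting_workers));
}
```

Compared to iterative backoff, this computes the worker count once before
the launch instead of retrying after cuLaunchKernel has failed.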

When mentioning cuOcc* I was thinking about finding an optimal number of
blocks per device, which is a different story.

Alexander

Re: C++ PATCH to suppress bogus -Wunused warning for parameter packs (PR c++/68965)

2016-01-19 Thread Marek Polacek

On Mon, Jan 18, 2016 at 11:17:04AM -0500, Jason Merrill wrote:
> But we do currently print
> 
> wa.C:1:24: warning: unused parameter ‘xs#0’ [-Wunused-parameter]
> wa.C:1:24: warning: unused parameter ‘xs#1’ [-Wunused-parameter]
> wa.C:1:24: warning: unused parameter ‘xs#2’ [-Wunused-parameter]
> 
> for that testcase, and I think your patch would remove this warning as well.

Right.  How about this instead?  In this version, I only set TREE_USED when
we actually use the pack, so we'll keep the warning for the testcase above.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-01-19  Marek Polacek  

PR c++/68965
* pt.c (tsubst_copy): Mark elements in expanded vector as used.

* g++.dg/cpp1y/parameter-pack-1.C: New test.
* g++.dg/cpp1y/parameter-pack-2.C: New test.

diff --git gcc/cp/pt.c gcc/cp/pt.c
index 866b4b1..6062ebe 100644
--- gcc/cp/pt.c
+++ gcc/cp/pt.c
@@ -14010,7 +14010,12 @@ tsubst_copy (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
  --c_inhibit_evaluation_warnings;
 
  if (TREE_CODE (expanded) == TREE_VEC)
-   len = TREE_VEC_LENGTH (expanded);
+   {
+ len = TREE_VEC_LENGTH (expanded);
+ /* Set TREE_USED for the benefit of -Wunused.  */
+ for (int i = 0; i < len; i++)
+   TREE_USED (TREE_VEC_ELT (expanded, i)) = true;
+   }
 
  if (expanded == error_mark_node)
return error_mark_node;
diff --git gcc/testsuite/g++.dg/cpp1y/parameter-pack-1.C 
gcc/testsuite/g++.dg/cpp1y/parameter-pack-1.C
index e69de29..27a6bf9 100644
--- gcc/testsuite/g++.dg/cpp1y/parameter-pack-1.C
+++ gcc/testsuite/g++.dg/cpp1y/parameter-pack-1.C
@@ -0,0 +1,23 @@
+// PR c++/68965
+// { dg-do compile { target c++14 } }
+// { dg-options "-Wall -Wextra" }
+
+auto count = [](auto&&... xs)
+{
+return sizeof...(xs);
+};
+
+struct count_struct
+{
+template<typename... Ts>
+auto operator()(Ts&&... xs)
+{
+return sizeof...(xs);
+}
+};
+
+int main()
+{
+count(1,2,3);
+count_struct{}(1,2,3);
+}
diff --git gcc/testsuite/g++.dg/cpp1y/parameter-pack-2.C 
gcc/testsuite/g++.dg/cpp1y/parameter-pack-2.C
index e69de29..9520875 100644
--- gcc/testsuite/g++.dg/cpp1y/parameter-pack-2.C
+++ gcc/testsuite/g++.dg/cpp1y/parameter-pack-2.C
@@ -0,0 +1,21 @@
+// PR c++/68965
+// { dg-do compile { target c++14 } }
+// { dg-options "-Wall -Wextra" }
+
+auto count = [](auto&&... xs) // { dg-warning "unused parameter" }
+{
+};
+
+struct count_struct
+{
+template<typename... Ts>
+auto operator()(Ts&&... xs) // { dg-warning "unused parameter" }
+{
+}
+};
+
+int main()
+{
+count(1,2,3);
+count_struct{}(1,2,3);
+}

Marek

Re: [PATCH] Fix ICE in vectorizable_store ().

2016-01-19 Thread Kirill Yukhin

Hello Richard,
On 15 Jan 12:54, Richard Biener wrote:
> On Fri, 15 Jan 2016, Kirill Yukhin wrote:
> 
> > Hello,
> > Thet patch in the bottom adds check if rhs is "useless_type_conversion_p"
> > in vectorizable_store () to avoid subsequent gcc_assert.
> > 
> > This change is very similar to [1].
> > 
> > Bootstrapped & regtest in progress.
> > 
> > Is it ok for main trunk if regtest pass?
> 
> Ok, but please add a testcase that is fixed.
I've updated the patch to allow RHSes w/o vectypes (like from built-ins),
and now bootstrap & regtest show no issues.

Unfortunately, this issue arose on a huge Fortran workload (POP2)
only w/ LTO and I was unable to extract a case from it (spent a few hours though).

So, maybe this patch is acceptable w/o a regression test?

--
Thanks, K

> 
> Thanks,
> Richard.
> 
> > gcc/
> > * tree-vect-stmts.c (vectorizable_store): Check
> > rhs vectype.
> > 
> > 
> > [1] - https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01551.html
> > 
> > --
> > Thanks, K
> > 

diff --git b/gcc/tree-vect-stmts.c a/gcc/tree-vect-stmts.c
index 872fa07..2aaa335 100644
--- b/gcc/tree-vect-stmts.c
+++ a/gcc/tree-vect-stmts.c
@@ -5282,7 +5282,7 @@ vectorizable_store (gimple *stmt, gimple_stmt_iterator 
*gsi, gimple **vec_stmt,
 
   gcc_assert (gimple_assign_single_p (stmt));
 
-  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  tree vectype = STMT_VINFO_VECTYPE (stmt_info), rhs_vectype = NULL_TREE;
   unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype);
 
   if (loop_vinfo)
@@ -5308,7 +5308,8 @@ vectorizable_store (gimple *stmt, gimple_stmt_iterator 
*gsi, gimple **vec_stmt,
 }
 
   op = gimple_assign_rhs1 (stmt);
-  if (!vect_is_simple_use (op, vinfo, &def_stmt, &dt))
+
+  if (!vect_is_simple_use (op, vinfo, &def_stmt, &dt, &rhs_vectype))
 {
   if (dump_enabled_p ())
 dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5316,6 +5317,9 @@ vectorizable_store (gimple *stmt, gimple_stmt_iterator 
*gsi, gimple **vec_stmt,
   return false;
 }
 
+  if (rhs_vectype && !useless_type_conversion_p (vectype, rhs_vectype))
+return false;
+
   elem_type = TREE_TYPE (vectype);
   vec_mode = TYPE_MODE (vectype);

Re: C++ PATCH to suppress bogus -Wunused warning for parameter packs (PR c++/68965)

2016-01-19 Thread Jason Merrill


OK, thanks.

Jason

[PATCH] Fix bootstrap with older and non-GCC host compilers

2016-01-19 Thread Richard Biener


It also seems we're wrongly using values defined for the host while
looking at GIMPLE IL for the target.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Ok?

Richard.

2016-01-19  Richard Biener  

* hsa-gen.c (get_memory_order_name): Use MEMMODEL_ constants
and name.
(get_memory_order): Likewise.

Index: gcc/hsa-gen.c
===
--- gcc/hsa-gen.c   (revision 232561)
+++ gcc/hsa-gen.c   (working copy)
@@ -4417,18 +4417,18 @@ get_memory_order_name (unsigned memmodel
 {
   switch (memmodel)
 {
-case __ATOMIC_RELAXED:
-  return "__ATOMIC_RELAXED";
-case __ATOMIC_CONSUME:
-  return "__ATOMIC_CONSUME";
-case __ATOMIC_ACQUIRE:
-  return "__ATOMIC_ACQUIRE";
-case __ATOMIC_RELEASE:
-  return "__ATOMIC_RELEASE";
-case __ATOMIC_ACQ_REL:
-  return "__ATOMIC_ACQ_REL";
-case __ATOMIC_SEQ_CST:
-  return "__ATOMIC_SEQ_CST";
+case MEMMODEL_RELAXED:
+  return "MEMMODEL_RELAXED";
+case MEMMODEL_CONSUME:
+  return "MEMMODEL_CONSUME";
+case MEMMODEL_ACQUIRE:
+  return "MEMMODEL_ACQUIRE";
+case MEMMODEL_RELEASE:
+  return "MEMMODEL_RELEASE";
+case MEMMODEL_ACQ_REL:
+  return "MEMMODEL_ACQ_REL";
+case MEMMODEL_SEQ_CST:
+  return "MEMMODEL_SEQ_CST";
 default:
   return NULL;
 }
@@ -4442,13 +4442,13 @@ get_memory_order (unsigned memmodel, loc
 {
   switch (memmodel)
 {
-case __ATOMIC_RELAXED:
+case MEMMODEL_RELAXED:
   return BRIG_MEMORY_ORDER_RELAXED;
-case __ATOMIC_ACQUIRE:
+case MEMMODEL_ACQUIRE:
   return BRIG_MEMORY_ORDER_SC_ACQUIRE;
-case __ATOMIC_RELEASE:
+case MEMMODEL_RELEASE:
   return BRIG_MEMORY_ORDER_SC_RELEASE;
-case __ATOMIC_ACQ_REL:
+case MEMMODEL_ACQ_REL:
   return BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE;
 default:
   HSA_SORRY_ATV (location,

Re: [PATCH] Fix ICE in vectorizable_store ().

2016-01-19 Thread Richard Biener

On Tue, 19 Jan 2016, Kirill Yukhin wrote:

> Hello Richard,
> On 15 Jan 12:54, Richard Biener wrote:
> > On Fri, 15 Jan 2016, Kirill Yukhin wrote:
> > 
> > > Hello,
> > > Thet patch in the bottom adds check if rhs is "useless_type_conversion_p"
> > > in vectorizable_store () to avoid subsequent gcc_assert.
> > > 
> > > This change is very similar to [1].
> > > 
> > > Bootstrapped & regtest in progress.
> > > 
> > > Is it ok for main trunk if regtest pass?
> > 
> > Ok, but please add a testcase that is fixed.
> I've updated the patch to allow RHSes w/o vectypes (like from built-ins)
> And now bootstrap & regtest show no issues.
> 
> Unfortunatelly, this issue arised on huge Fortran workload (POP2)
> only w/ LTO and I was unable to extract a case from it (spent few hours 
> though).
> 
> So, may be this patch is acceptable w/o reg test?

Ok.

Thanks,
Richard.

> --
> Thanks, K
> 
> > 
> > Thanks,
> > Richard.
> > 
> > > gcc/
> > >   * tree-vect-stmts.c (vectorizable_store): Check
> > >   rhs vectype.
> > > 
> > > 
> > > [1] - https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01551.html
> > > 
> > > --
> > > Thanks, K
> > > 
> 
> diff --git b/gcc/tree-vect-stmts.c a/gcc/tree-vect-stmts.c
> index 872fa07..2aaa335 100644
> --- b/gcc/tree-vect-stmts.c
> +++ a/gcc/tree-vect-stmts.c
> @@ -5282,7 +5282,7 @@ vectorizable_store (gimple *stmt, gimple_stmt_iterator 
> *gsi, gimple **vec_stmt,
>  
>gcc_assert (gimple_assign_single_p (stmt));
>  
> -  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> +  tree vectype = STMT_VINFO_VECTYPE (stmt_info), rhs_vectype = NULL_TREE;
>unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype);
>  
>if (loop_vinfo)
> @@ -5308,7 +5308,8 @@ vectorizable_store (gimple *stmt, gimple_stmt_iterator 
> *gsi, gimple **vec_stmt,
>  }
>  
>op = gimple_assign_rhs1 (stmt);
> -  if (!vect_is_simple_use (op, vinfo, &def_stmt, &dt))
> +
> +  if (!vect_is_simple_use (op, vinfo, &def_stmt, &dt, &rhs_vectype))
>  {
>if (dump_enabled_p ())
>  dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> @@ -5316,6 +5317,9 @@ vectorizable_store (gimple *stmt, gimple_stmt_iterator 
> *gsi, gimple **vec_stmt,
>return false;
>  }
>  
> +  if (rhs_vectype && !useless_type_conversion_p (vectype, rhs_vectype))
> +return false;
> +
>elem_type = TREE_TYPE (vectype);
>vec_mode = TYPE_MODE (vectype);
>  
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH][ARM] PR target/69135: Mark ARMv8 vcvt instructions as unconditional

2016-01-19 Thread Ramana Radhakrishnan

On Fri, Jan 15, 2016 at 3:05 PM, Kyrill Tkachov
 wrote:
> Hi all,
>
> In this PR the ARMv8 vcvt instructions end up being conditionalised when
> they don't have a conditional form.
> setting the predicable attribute to "no" is not enough. We need to set the
> "conds" attribute to unconditional as well.
>
> Bootstrapped and tested on arm-none-linux-gnueabihf.
> Ok for trunk and GCC 5?
>

Ok and for all afflicted release branches. Please check 4.9 just in case.

Ramana

> Thanks,
> Kyrill
>
> 2016-01-15  Kyrylo Tkachov  
>
> PR target/69135
> * config/arm/vfp.md (lsi2): Set "conds"
> attribute to unconditional.  Remove %? from output template.
>
> 2016-01-15  Kyrylo Tkachov  
>
> PR target/69135
> * gcc.target/arm/pr69135_1.c: New test.

Re: [C++] Add -fnull-this-pointer

2016-01-19 Thread Richard Biener

On Tue, Jan 19, 2016 at 2:18 PM, Trevor Saunders  wrote:
> On Tue, Jan 19, 2016 at 01:11:44PM +0100, Jan Hubicka wrote:
>> Hi,
>> according to Trevor, the assumption about THIS pointer being non-NULL breaks
>
> That was Markus, not me.
>
>> several bigger C++ packages (definitly including Firefox, but I believe
>> kdevelop was mentioned, too).  This patch makes the feature to be controlable
>> by a dedicated flag.  I am not sure about the default. We now have ubsan 
>> check
>> for the bug so I would hope the codebases to be updated soon, but it did not
>> happen for Firefox for quite a while despite the fact that Martin Liska 
>> reported
>> it.
>>
>> This patch defaults to -fno-null-this-pointer, but I would be OK with 
>> changing
>
> fwiw I find the naming a bit confusing maybe I'm just tired but it takes
> some puzlling for me to know which way is being strict and which way is
> allowing this.
>
>> the default and setting it on only in GCC 6. Main point of the patch is to
>> avoid need of those packages to be built with -fno-delete-null-pointer-checks
>> (which still subsumes the flag).
>
> Personally I'd rather try and be strict.  I suspect it often will be
> easy to find and fix the bugs when the optimization is enabled.  Of
> course if some projects don't care they can pass flags themselves.

Agreed.  As we already have a flag that can be used as a workaround I don't see
a reason to add another more specific one.  That just makes it a lesser
incentive for people to fix their code.
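
For readers unfamiliar with the pattern at issue, here is a hedged sketch.
The Node type is hypothetical and not taken from Firefox or any other
package named in this thread.  The problematic form appears only as a
comment, because calling it through a null pointer is undefined behavior:

```cpp
#include <cassert>

struct Node {
    Node *next = nullptr;

    // Pattern that relies on a null 'this' (do not use):
    //
    //   int length() {
    //     if (this == nullptr) return 0;   // may be deleted by the optimizer
    //     return 1 + next->length();
    //   }

    // One conforming alternative: pass the possibly-null pointer as an
    // ordinary argument, which the compiler may not assume to be non-null.
    static int length(const Node *n) {
        int count = 0;
        for (; n != nullptr; n = n->next)
            ++count;
        return count;
    }
};
```

Code written in the second form does not depend on either
-fno-delete-null-pointer-checks or the proposed flag.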

Richard.

> Trev
>
>>
>> The patch is bit inconsistent, becuase C++ FE wil still assume that this 
>> pointer
>> is non-NULL when expanding multiple inheritance accesses.  We did this from
>> very beginning. I do not know FE enough to see if it is easy to change the
>> behaviour here or if it is desired.
>>
>> Bootstrapped/regtsted x86_64-linux.
>>
>> Honza
>>
>>   * c-family/c.opt (fnull-this-pointer): New flag.
>>   (nonnull_arg_p): Honnor flag_null_this_pointer.
>> Index: tree.c
>> ===
>> --- tree.c(revision 232553)
>> +++ tree.c(working copy)
>> @@ -14016,6 +14022,7 @@ nonnull_arg_p (const_tree arg)
>>/* THIS argument of method is always non-NULL.  */
>>if (TREE_CODE (TREE_TYPE (cfun->decl)) == METHOD_TYPE
>>&& arg == DECL_ARGUMENTS (cfun->decl)
>> +  && flag_null_this_pointer
>>&& flag_delete_null_pointer_checks)
>>  return true;
>>
>> Index: c-family/c.opt
>> ===
>> --- c-family/c.opt(revision 232553)
>> +++ c-family/c.opt(working copy)
>> @@ -1321,6 +1321,10 @@ Enum(ivar_visibility) String(public) Val
>>  EnumValue
>>  Enum(ivar_visibility) String(package) Value(IVAR_VISIBILITY_PACKAGE)
>>
>> +fnull-this-pointer
>> +C++ ObjC++ Optimization Report Var(flag_null_this_pointer)
>> +Allow calling methods of NULL pointer
>> +
>>  fnonansi-builtins
>>  C++ ObjC++ Var(flag_no_nonansi_builtin, 0)
>>
>> Index: doc/invoke.texi
>> ===
>> --- doc/invoke.texi   (revision 232553)
>> +++ doc/invoke.texi   (working copy)
>> @@ -232,6 +232,7 @@ Objective-C and Objective-C++ Dialects}.
>>  -fobjc-std=objc1 @gol
>>  -fno-local-ivars @gol
>>  -fivar-visibility=@r{[}public@r{|}protected@r{|}private@r{|}package@r{]} 
>> @gol
>> +-fnull-this-pointer @gol
>>  -freplace-objc-classes @gol
>>  -fzero-link @gol
>>  -gen-decls @gol
>> @@ -2361,6 +2362,11 @@ errors if these functions are not inline
>>  Disable Wpedantic warnings about constructs used in MFC, such as implicit
>>  int and getting a pointer to member function via non-standard syntax.
>>
>> +@item -fnull-this-pointer
>> +@opindex fnull-this-pointer
>> +Disable optimization which take advantage of the fact that calling method
>> +of @code{NULL} pointer is undefined.
>> +
>>  @item -fno-nonansi-builtins
>>  @opindex fno-nonansi-builtins
>>  Disable built-in declarations of functions that are not mandated by

Re: [hsa merge 00/10] Merge of HSA branch

2016-01-19 Thread Richard Biener

On Tue, Jan 19, 2016 at 11:45 AM, Martin Jambor  wrote:
> Hi,
>
> On Wed, Jan 13, 2016 at 06:39:25PM +0100, Martin Jambor wrote:
>> Hi,
>>
>> this is hopefully the last big re-post of the HSA patches...
>
> I have committed the combined patch as revision 232549 after
> bootstrapping and testing all languages on x86_64-linux and i686-linux
> and verifying I did not break powerpc-aix more than it was before.
>
> I will be updating gcc offloading wiki in a few days, meanwhile you
> can use README.hsa file from the branch:
>
> https://gcc.gnu.org/viewcvs/gcc/branches/hsa/gcc/README.hsa?view=markup
>
> I will be also posting followup testsuite patches.
>
>>
>> Thanks everybody for patience and feedback.  While we are of course
>> opened for mor more of it, let's also hope the approval process will
>> finish soon as it should now.
>
> I can't but repeat my thanks, especially to Jakub for the review and
> help with the many last-minute issues.

I think the merge warrants a NEWS entry on gcc.gnu.org/

Richard.

> Martin

Re: [PATCH] fix gimplification of call parameters (PR cilkplus/69267)

2016-01-19 Thread Ryan Burn

Does this look ok?

> On Jan 15, 2016, at 5:41 PM, Ryan Burn  wrote:
> 
> This patch changes the function cilk_gimplify_call_params_in_spawned_fn to 
> use gimplify_arg instead of gimplify_expr. It fixes an ICE when calling a 
> function with a constructed empty class as the argument.
> 
> Bootstrapped and regression tested on x86_64-linux.
> 
> 2016-01-15  Ryan Burn  
> 
>PR cilkplus/69267
>* cilk.c (cilk_gimplify_call_params_in_spawned_fn): Change to use
>gimplify_arg. Removed superfluous post_p argument.
>* c-family.h (cilk_gimplify_call_params_in_spawned_fn): Removed
>superfluous post_p argument.
>* c-gimplify.c (c_gimplify_expr): Likewise.
> 
> gcc/cp/ChangeLog:
> 
> 2016-01-15  Ryan Burn  
> 
>PR cilkplus/69267
>* cp-gimplify.c (cilk_cp_gimplify_call_params_in_spawned_fn): Removed
>superfluous post_p argument in call to
>cilk_gimplify_call_params_in_spawned_fn.
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-01-15  Ryan Burn
> 
> PR cilkplus/69267
> * g++.dg/cilk-plus/CK/pr69267.cc: New test.
> 
> 
> 
>

Re: [PATCH] Fix bootstrap with older and non-GCC host compilers

2016-01-19 Thread Jakub Jelinek

On Tue, Jan 19, 2016 at 03:19:17PM +0100, Richard Biener wrote:
> 
> It also seems we're wrongly using values defined for the host while
> looking at GIMPLE IL for the target.
> 
> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
> 
> Ok?
> 
> Richard.
> 
> 2016-01-19  Richard Biener  
> 
>   * hsa-gen.c (get_memory_order_name): Use MEMMODEL_ constants
>   and name.
>   (get_memory_order): Likewise.
> 
> Index: gcc/hsa-gen.c
> ===
> --- gcc/hsa-gen.c (revision 232561)
> +++ gcc/hsa-gen.c (working copy)
> @@ -4417,18 +4417,18 @@ get_memory_order_name (unsigned memmodel
>  {
>switch (memmodel)
>  {
> -case __ATOMIC_RELAXED:
> -  return "__ATOMIC_RELAXED";
> -case __ATOMIC_CONSUME:
> -  return "__ATOMIC_CONSUME";
> -case __ATOMIC_ACQUIRE:
> -  return "__ATOMIC_ACQUIRE";
> -case __ATOMIC_RELEASE:
> -  return "__ATOMIC_RELEASE";
> -case __ATOMIC_ACQ_REL:
> -  return "__ATOMIC_ACQ_REL";
> -case __ATOMIC_SEQ_CST:
> -  return "__ATOMIC_SEQ_CST";
> +case MEMMODEL_RELAXED:
> +  return "MEMMODEL_RELAXED";
> +case MEMMODEL_CONSUME:
> +  return "MEMMODEL_CONSUME";
> +case MEMMODEL_ACQUIRE:
> +  return "MEMMODEL_ACQUIRE";
> +case MEMMODEL_RELEASE:
> +  return "MEMMODEL_RELEASE";
> +case MEMMODEL_ACQ_REL:
> +  return "MEMMODEL_ACQ_REL";
> +case MEMMODEL_SEQ_CST:
> +  return "MEMMODEL_SEQ_CST";

The case changes are ok, though it is not handling various
other memory models (MEMMODEL_SYNC_{ACQUIRE,RELEASE,SEQ_CST}).
The returned strings are used in a user-visible context (a warning),
so I think neither is ok; better use "relaxed", "consume", "acquire",
"release", "acq_rel", "seq_cst" (and the rest are internals for __sync_
builtins).
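
A sketch of that suggestion, under stated assumptions: the enumeration below
is a local stand-in for GCC's memory model constants (the SK_ names and
their values are hypothetical), and the function name is made up for
illustration.  It is not the code that was committed to hsa-gen.c:

```cpp
#include <cassert>
#include <cstring>

// Local stand-in for the MEMMODEL_ constants; values are arbitrary here.
enum memmodel_sketch {
    SK_RELAXED,
    SK_CONSUME,
    SK_ACQUIRE,
    SK_RELEASE,
    SK_ACQ_REL,
    SK_SEQ_CST,
    SK_SYNC_SEQ_CST  // stands for the __sync_ internal models
};

// Return the user-visible spelling, or a null pointer for models that have
// no user-facing name (the __sync_ internals).
const char *memory_order_user_name(memmodel_sketch model) {
    switch (model) {
    case SK_RELAXED: return "relaxed";
    case SK_CONSUME: return "consume";
    case SK_ACQUIRE: return "acquire";
    case SK_RELEASE: return "release";
    case SK_ACQ_REL: return "acq_rel";
    case SK_SEQ_CST: return "seq_cst";
    default:         return nullptr;
    }
}
```

The point of the lowercase strings is that a warning then shows the spelling
users know from the language rather than a compiler-internal enumerator.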

>  default:
>return NULL;
>  }
> @@ -4442,13 +4442,13 @@ get_memory_order (unsigned memmodel, loc
>  {
>switch (memmodel)
>  {
> -case __ATOMIC_RELAXED:
> +case MEMMODEL_RELAXED:
>return BRIG_MEMORY_ORDER_RELAXED;
> -case __ATOMIC_ACQUIRE:
> +case MEMMODEL_ACQUIRE:
>return BRIG_MEMORY_ORDER_SC_ACQUIRE;
> -case __ATOMIC_RELEASE:
> +case MEMMODEL_RELEASE:
>return BRIG_MEMORY_ORDER_SC_RELEASE;
> -case __ATOMIC_ACQ_REL:
> +case MEMMODEL_ACQ_REL:
>return BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE;
>  default:
>HSA_SORRY_ATV (location,

This LGTM (though I'm really surprised it doesn't have seq_cst;
e.g. OpenMP #pragma omp atomic will be either relaxed or seq_cst).

Jakub

Re: [C++] Add -fnull-this-pointer

2016-01-19 Thread Markus Trippelsdorf

On 2016.01.19 at 15:23 +0100, Richard Biener wrote:
> On Tue, Jan 19, 2016 at 2:18 PM, Trevor Saunders  
> wrote:
> > On Tue, Jan 19, 2016 at 01:11:44PM +0100, Jan Hubicka wrote:
> >> Hi,
> >> according to Trevor, the assumption about THIS pointer being non-NULL 
> >> breaks
> >
> > That was Markus, not me.
> >
> >> several bigger C++ packages (definitly including Firefox, but I believe
> >> kdevelop was mentioned, too).  This patch makes the feature to be 
> >> controlable
> >> by a dedicated flag.  I am not sure about the default. We now have ubsan 
> >> check
> >> for the bug so I would hope the codebases to be updated soon, but it did 
> >> not
> >> happen for Firefox for quite a while despite the fact that Martin Liska 
> >> reported
> >> it.
> >>
> >> This patch defaults to -fno-null-this-pointer, but I would be OK with 
> >> changing
> >
> > fwiw I find the naming a bit confusing maybe I'm just tired but it takes
> > some puzlling for me to know which way is being strict and which way is
> > allowing this.
> >
> >> the default and setting it on only in GCC 6. Main point of the patch is to
> >> avoid need of those packages to be built with 
> >> -fno-delete-null-pointer-checks
> >> (which still subsumes the flag).
> >
> > Personally I'd rather try and be strict.  I suspect it often will be
> > easy to find and fix the bugs when the optimization is enabled.  Of
> > course if some projects don't care they can pass flags themselves.
> 
> Agreed.  As we already have a flag that can be used as a workaround I don't 
> see
> a reason to add another more specific one.  That just makes it a
> lesser incentive
> for people to fix their code.

Well, -fno-delete-null-pointer-checks is a big hammer that disables way
more optimizations than really necessary.

-- 
Markus

Re: [RFC] [nvptx] Try to cope with cuLaunchKernel returning CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES

2016-01-19 Thread Thomas Schwinge

Hi!

On Tue, 19 Jan 2016 08:47:02 -0500, Nathan Sidwell  wrote:
> On 01/19/16 06:49, Thomas Schwinge wrote:
> > int axis = get_oacc_ifn_dim_arg (call);
> > +  if (axis == GOMP_DIM_WORKER)
> > +{
> > +  /* libgomp's nvptx plugin might potentially modify
> > +dims[GOMP_DIM_WORKER].  */
> > +  return NULL_TREE;
> > +}
> 
> this is almost certainly wrong.   You're preventing constant folding in the 
> compiler.

Yes, because if libgomp can modify dims[GOMP_DIM_WORKER], in the compiler
we can no longer assume it to be constant?  (It did result in a run-time test
verification failure.)  Of course, my hammer might be too big a one
(which is why this is an RFC).

Grüße
 Thomas

Re: [C++] Add -fnull-this-pointer

2016-01-19 Thread Jakub Jelinek

On Tue, Jan 19, 2016 at 03:37:02PM +0100, Markus Trippelsdorf wrote:
> > Agreed.  As we already have a flag that can be used as a workaround I don't 
> > see
> > a reason to add another more specific one.  That just makes it a
> > lesser incentive
> > for people to fix their code.
> 
> Well, -fno-delete-null-pointer-checks is a big hammer, that disables way
> to many optimizations than really necessary.

But then perhaps it will be a better incentive for the projects to fix their
cruft.  With a specialized option they will keep broken code forever.

Jakub

Re: [PATCH] ARM PR68620 (ICE with FP16 on armeb)

2016-01-19 Thread Alan Lawrence


On 19/01/16 11:15, Christophe Lyon wrote:
>>> For neon_vdupn, I chose to implement neon_vdup_nv4hf and
>>> neon_vdup_nv8hf instead of updating the VX iterator because I thought
>>> it was not desirable to impact neon_vrev32.
>>
>> Well, the same instruction will suffice for vrev32'ing vectors of HF just as
>> well as vectors of HI, so I think I'd argue that's harmless enough. To gain the
>> benefit, we'd need to update arm_evpc_neon_vrev with a few new cases, though.
>
> Since this is more intrusive, I'd rather leave that part for later. OK?

Sure.

>>> +#ifdef __ARM_BIG_ENDIAN
>>> +  /* Here, 3 is (4-1) where 4 is the number of lanes. This is also the
>>> + right value for vectors with 8 lanes.  */
>>> +#define __arm_lane(__vec, __idx) (__idx ^ 3)
>>> +#else
>>> +#define __arm_lane(__vec, __idx) __idx
>>> +#endif
>>> +
>>
>> Looks right, but sounds... my concern here is that I'm hoping at some point we
>> will move the *other* vget/set_lane intrinsics to use GCC vector extensions
>> too. At which time (unlike __aarch64_lane which can be used everywhere) this
>> will be the wrong formula. Can we name (and/or comment) it to avoid misleading
>> anyone? The key characteristic seems to be that it is for vectors of 16-bit
>> elements only.
>
> I'm not sure I follow here. Looking at the patterns for
> neon_vget_lane_*internal in neon.md,
> I can see 2 flavours: one for VD, one for VQ2. The latter uses "halfelts".
>
> Do you prefer that I create 2 macros (say __arm_lane and __arm_laneq),
> that would be similar to the aarch64 ones (by computing the number of
> lanes of the input vector), but the "q" one would use half the total
> number of lanes instead?


That works for me! Sthg like:

#define __arm_lane(__vec, __idx) NUM_LANES(__vec) - __idx
#define __arm_laneq(__vec, __idx) (__idx & (NUM_LANES(__vec)/2)) + 
(NUM_LANES(__vec)/2 - __idx)

//or similarly
#define __arm_laneq(__vec, __idx) (__idx ^ (NUM_LANES(__vec)/2 - 1))

Alternatively I'd been thinking

#define __arm_lane_32xN(__idx) __idx ^ 1
#define __arm_lane_16xN(__idx) __idx ^ 3
#define __arm_lane_8xN(__idx) __idx ^ 7

Bear in mind PR64893 that we had on AArch64 :-(
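
A hedged sketch of the two index mappings, written as functions so the
results can be checked.  The names are hypothetical and this is not
arm_neon.h code; the layout assumed is the one described in the quoted patch
comment, i.e. lanes are reversed within each 64-bit half on big-endian:

```cpp
#include <cassert>

// n_lanes must be a power of two in both helpers.

// 64-bit (D register) vector: every lane index is mirrored.
constexpr unsigned be_lane_d(unsigned n_lanes, unsigned idx) {
    return idx ^ (n_lanes - 1);
}

// 128-bit (Q register) vector: lanes are mirrored within each half, so only
// the low bits of the index flip and the half selection is preserved.
constexpr unsigned be_lane_q(unsigned n_lanes, unsigned idx) {
    return idx ^ (n_lanes / 2 - 1);
}
```

For 16-bit elements both helpers reduce to idx ^ 3 (4 lanes in a D register,
8 in a Q register), which matches the constant in the patch and the 16xN
variant above.  Note that a full mirror of n lanes is n - 1 - idx, which
equals idx ^ (n - 1) for power-of-two n.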

Cheers, Alan

Re: [RFC] [nvptx] Try to cope with cuLaunchKernel returning CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES

2016-01-19 Thread Thomas Schwinge

Hi!

On Tue, 19 Jan 2016 17:07:17 +0300, Alexander Monakov  
wrote:
> On Tue, 19 Jan 2016, Alexander Monakov wrote:
> > > ... to determine an optimal number of threads per block given the number
> > > of registers (maybe just querying CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK
> > > would do that already?).
> > 
> > I have implemented that for OpenMP offloading, but also since CUDA 6.0 
> > there's
> > cuOcc* (occupancy query) interface that allows to simply ask the driver 
> > about
> > the per-function launch limit.

You mean you already have implemented something along the lines I
proposed?

> Sorry, I should have mentioned that CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK is
> indeed sufficient for limiting threads per block, which is trivially
> translatable into workers per gang in OpenACC.

That's good to know, thanks!

> IMO it's also a cleaner
> approach in this case, compared to iterative backoff (if, again, the
> implementation is free to do that).

It is not explicitly spelled out in OpenACC 2.0a, but it got clarified in
OpenACC 2.5.  See "2.5.7. num_workers clause": "[...]  The implementation
may use a different value than specified based on limitations imposed by
the target architecture".

> When mentioning cuOcc* I was thinking about finding an optimal number of
> blocks per device, which is a different story.

:-)


Grüße
 Thomas

Re: [PATCH] Fix PR c++/69283 (auto deduction fails when ADL is required)

2016-01-19 Thread Jason Merrill


On 01/18/2016 10:55 PM, Patrick Palka wrote:

mark_used is wrongly diagnosing a use of a TEMPLATE_DECL (e.g. the call
to f1 in function f3 of auto-fn29.C below) for having an undeduced
'auto' return type.  This doesn't make sense, because an 'auto' used
inside a template doesn't get deduced until after the template is
instantiated.  So for a TEMPLATE_DECL we shouldn't diagnose a use of
undeduced 'auto' here.  After instantiation, presumably we will call
mark_used on the resulting FUNCTION_DECL which will check for undeduced
auto appropriately.
@@ -5112,7 +5112,9 @@ mark_used (tree decl, tsubst_flags_t complain)
|| DECL_LANG_SPECIFIC (decl) == NULL
|| DECL_THUNK_P (decl))
  {
-  if (!processing_template_decl && type_uses_auto (TREE_TYPE (decl)))
+  if (!processing_template_decl
+ && TREE_CODE (decl) != TEMPLATE_DECL
+ && type_uses_auto (TREE_TYPE (decl)))


How does a TEMPLATE_DECL get in here?  Does it have null DECL_LANG_SPECIFIC?

I'd think mark_used of a TEMPLATE_DECL should return after setting 
TREE_USED, there's nothing else to do with it.


Jason

Re: [ping] pending patches

2016-01-19 Thread Jason Merrill


On 01/05/2016 04:30 AM, Eric Botcazou wrote:

It doesn't look to me like DW_AT_endianity is applicable to array types
or members in DWARF 3/4; instead, it should be applied to the underlying
base type.


OK, the attached patch does that so is more invasive as expected.


Please document the reverse parameter to modified_type_die.  OK with 
that change.


Jason

Re: [PATCH] Fix bootstrap with older and non-GCC host compilers

2016-01-19 Thread Richard Biener

On Tue, 19 Jan 2016, Jakub Jelinek wrote:

> On Tue, Jan 19, 2016 at 03:19:17PM +0100, Richard Biener wrote:
> > 
> > It also seems we're wrongly using values defined for the host while
> > looking at GIMPLE IL for the target.
> > 
> > Bootstrap / regtest running on x86_64-unknown-linux-gnu.
> > 
> > Ok?
> > 
> > Richard.
> > 
> > 2016-01-19  Richard Biener  
> > 
> > * hsa-gen.c (get_memory_order_name): Use MEMMODEL_ constants
> > and name.
> > (get_memory_order): Likewise.
> > 
> > Index: gcc/hsa-gen.c
> > ===
> > --- gcc/hsa-gen.c   (revision 232561)
> > +++ gcc/hsa-gen.c   (working copy)
> > @@ -4417,18 +4417,18 @@ get_memory_order_name (unsigned memmodel
> >  {
> >switch (memmodel)
> >  {
> > -case __ATOMIC_RELAXED:
> > -  return "__ATOMIC_RELAXED";
> > -case __ATOMIC_CONSUME:
> > -  return "__ATOMIC_CONSUME";
> > -case __ATOMIC_ACQUIRE:
> > -  return "__ATOMIC_ACQUIRE";
> > -case __ATOMIC_RELEASE:
> > -  return "__ATOMIC_RELEASE";
> > -case __ATOMIC_ACQ_REL:
> > -  return "__ATOMIC_ACQ_REL";
> > -case __ATOMIC_SEQ_CST:
> > -  return "__ATOMIC_SEQ_CST";
> > +case MEMMODEL_RELAXED:
> > +  return "MEMMODEL_RELAXED";
> > +case MEMMODEL_CONSUME:
> > +  return "MEMMODEL_CONSUME";
> > +case MEMMODEL_ACQUIRE:
> > +  return "MEMMODEL_ACQUIRE";
> > +case MEMMODEL_RELEASE:
> > +  return "MEMMODEL_RELEASE";
> > +case MEMMODEL_ACQ_REL:
> > +  return "MEMMODEL_ACQ_REL";
> > +case MEMMODEL_SEQ_CST:
> > +  return "MEMMODEL_SEQ_CST";
> 
> The case changes are ok, though it is not handling various
> other memory models (MEMMODEL_SYNC_{ACQUIRE,RELEASE,SEQ_CST}).
> For the returned strings, that is used in user visible context (warning),
> so I think neither is ok, better use "relaxed", "consume", "acquire",
> "release", "acq_rel", "seq_cst" (and the rest are internals for __sync_
> builtins).
> 
> >  default:
> >return NULL;
> >  }
> > @@ -4442,13 +4442,13 @@ get_memory_order (unsigned memmodel, loc
> >  {
> >switch (memmodel)
> >  {
> > -case __ATOMIC_RELAXED:
> > +case MEMMODEL_RELAXED:
> >return BRIG_MEMORY_ORDER_RELAXED;
> > -case __ATOMIC_ACQUIRE:
> > +case MEMMODEL_ACQUIRE:
> >return BRIG_MEMORY_ORDER_SC_ACQUIRE;
> > -case __ATOMIC_RELEASE:
> > +case MEMMODEL_RELEASE:
> >return BRIG_MEMORY_ORDER_SC_RELEASE;
> > -case __ATOMIC_ACQ_REL:
> > +case MEMMODEL_ACQ_REL:
> >return BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE;
> >  default:
> >HSA_SORRY_ATV (location,
> 
> This LGTM (though I'm really surprised it doesn't have seq_cst,
> e.g. OpenMP #pragma omp atomic will be either relaxed, or seq_cst).

I'll defer the diagnostic name change to Martin then and commit the
following to unbreak the build.
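
For reference, the user-visible naming suggested in the review could look roughly like the sketch below. This is not the hsa-gen.c code: `enum sketch_memmodel` and `sketch_memory_order_name` are stand-ins invented here so that the example compiles on its own, and the enumerator values are arbitrary.

```c
#include <stddef.h>

/* Stand-in for GCC's internal memory model enumeration; the enumerator
   names and values here are illustrative only.  */
enum sketch_memmodel
{
  SKETCH_RELAXED,
  SKETCH_CONSUME,
  SKETCH_ACQUIRE,
  SKETCH_RELEASE,
  SKETCH_ACQ_REL,
  SKETCH_SEQ_CST
};

/* Return the name to show in a diagnostic, or NULL for a model that
   has no user-visible spelling.  */
static const char *
sketch_memory_order_name (unsigned memmodel)
{
  switch (memmodel)
    {
    case SKETCH_RELAXED:
      return "relaxed";
    case SKETCH_CONSUME:
      return "consume";
    case SKETCH_ACQUIRE:
      return "acquire";
    case SKETCH_RELEASE:
      return "release";
    case SKETCH_ACQ_REL:
      return "acq_rel";
    case SKETCH_SEQ_CST:
      return "seq_cst";
    default:
      return NULL;
    }
}
```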

Richard.

2016-01-19  Richard Biener  

* hsa-gen.c (get_memory_order_name): Use MEMMODEL_ constants
and name.
(get_memory_order): Likewise.

Index: gcc/hsa-gen.c
===
--- gcc/hsa-gen.c   (revision 232564)
+++ gcc/hsa-gen.c   (working copy)
@@ -4417,17 +4417,17 @@ get_memory_order_name (unsigned memmodel
 {
   switch (memmodel)
 {
-case __ATOMIC_RELAXED:
+case MEMMODEL_RELAXED:
   return "__ATOMIC_RELAXED";
-case __ATOMIC_CONSUME:
+case MEMMODEL_CONSUME:
   return "__ATOMIC_CONSUME";
-case __ATOMIC_ACQUIRE:
+case MEMMODEL_ACQUIRE:
   return "__ATOMIC_ACQUIRE";
-case __ATOMIC_RELEASE:
+case MEMMODEL_RELEASE:
   return "__ATOMIC_RELEASE";
-case __ATOMIC_ACQ_REL:
+case MEMMODEL_ACQ_REL:
   return "__ATOMIC_ACQ_REL";
-case __ATOMIC_SEQ_CST:
+case MEMMODEL_SEQ_CST:
   return "__ATOMIC_SEQ_CST";
 default:
   return NULL;
@@ -4442,13 +4442,13 @@ get_memory_order (unsigned memmodel, loc
 {
   switch (memmodel)
 {
-case __ATOMIC_RELAXED:
+case MEMMODEL_RELAXED:
   return BRIG_MEMORY_ORDER_RELAXED;
-case __ATOMIC_ACQUIRE:
+case MEMMODEL_ACQUIRE:
   return BRIG_MEMORY_ORDER_SC_ACQUIRE;
-case __ATOMIC_RELEASE:
+case MEMMODEL_RELEASE:
   return BRIG_MEMORY_ORDER_SC_RELEASE;
-case __ATOMIC_ACQ_REL:
+case MEMMODEL_ACQ_REL:
   return BRIG_MEMORY_ORDER_SC_ACQUIRE_RELEASE;
 default:
   HSA_SORRY_ATV (location,

Re: [RFC] [nvptx] Try to cope with cuLaunchKernel returning CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES

2016-01-19 Thread Alexander Monakov

On Tue, 19 Jan 2016, Thomas Schwinge wrote:

> Hi!
> 
> On Tue, 19 Jan 2016 17:07:17 +0300, Alexander Monakov wrote:
> > On Tue, 19 Jan 2016, Alexander Monakov wrote:
> > > > ... to determine an optimal number of threads per block given the number
> > > > of registers (maybe just querying CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK
> > > > would do that already?).
> > > 
> > > I have implemented that for OpenMP offloading, but also since CUDA 6.0
> > > there's cuOcc* (occupancy query) interface that allows to simply ask the
> > > driver about the per-function launch limit.
> 
> You mean you already have implemented something along the lines I
> proposed?

Yes, I was implementing OpenMP teams, and it made sense to add warps per block
limiting at the same time (i.e. query CU_FUNC_ATTRIBUTE_... and limit if
default or requested number of threads per team is too high).  I intend to
post that patch as part of a larger series shortly (but the patch itself is
simple enough, although a small tweak will be needed to make it apply to
OpenACC too).
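
The "query and limit" step can be sketched without any CUDA dependency as below. This is only an illustration, not the patch mentioned above: `limit_threads` is a hypothetical helper, and `max_threads_per_block` stands for whatever the driver reports for CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK for the kernel in question.

```c
/* Hypothetical helper: MAX_THREADS_PER_BLOCK stands for the value the
   driver reports for CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK; it is a
   plain parameter here so that the sketch needs no CUDA headers.  */
static int
limit_threads (int requested, int default_threads, int max_threads_per_block)
{
  /* A non-positive request means "use the default".  */
  int threads = requested > 0 ? requested : default_threads;

  /* Clamp to the per-function launch limit.  */
  if (threads > max_threads_per_block)
    threads = max_threads_per_block;
  return threads;
}
```

The same clamp would apply whether the number came from the user or from the default, which is why both are funnelled through one variable before the comparison.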

Alexander

Re: [PATCH][ARM] PR target/69135: Mark ARMv8 vcvt instructions as unconditional

2016-01-19 Thread Kyrill Tkachov



On 19/01/16 14:21, Ramana Radhakrishnan wrote:

On Fri, Jan 15, 2016 at 3:05 PM, Kyrill Tkachov
 wrote:

Hi all,

In this PR the ARMv8 vcvt instructions end up being conditionalised when
they don't have a conditional form.
Setting the predicable attribute to "no" is not enough. We need to set the
"conds" attribute to unconditional as well.

Bootstrapped and tested on arm-none-linux-gnueabihf.
Ok for trunk and GCC 5?


Ok and for all afflicted release branches. Please check 4.9 just in case.


Thanks,
This pattern was introduced for GCC 5, so 4.9 is not afflicted.
I did double-check.
Committed to trunk and GCC 5.

Kyrill


Ramana


Thanks,
Kyrill

2016-01-15  Kyrylo Tkachov  

 PR target/69135
 * config/arm/vfp.md (lsi2): Set "conds"
 attribute to unconditional.  Remove %? from output template.

2016-01-15  Kyrylo Tkachov  

 PR target/69135
 * gcc.target/arm/pr69135_1.c: New test.

Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr

2016-01-19 Thread Richard Biener

On Tue, Jan 19, 2016 at 4:13 PM, Christian Bruel  wrote:
>
>
> On 01/19/2016 04:01 PM, Christian Bruel wrote:
>>
>> Hi Richard,
>>
>> thanks for your input,
>>
>> On 01/18/2016 12:36 PM, Richard Biener wrote:
>>>
>>> On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel 
>>> wrote:

 When compiling code with attribute targets on arm or aarch64,
 vector_type_mode returns different results (e.g. Vmode or BLKmode)
 depending on the current simd flags that are not set between functions.

 for example the following code:

 #include 

 extern int8x8_t a;
 extern int8x8_t b;

 int16x8_t
 __attribute__ ((target("fpu=neon")))
 foo(void)
 {
  return vaddl_s8 (a, b);
 }

 Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins,
 because of the mismatch between DECL_MODE and the current TYPE_MODE used
 in expand_builtin for global variables.

 but the best explanation is in the vector_type_mode:
 /* Vector types need to re-check the target flags each time we report
   the machine mode.  We need to do this because attribute target can
   change the result of vector_mode_supported_p and have_regs_of_mode
   on a per-function basis.  Thus the TYPE_MODE of a VECTOR_TYPE can
   change on a per-function basis.  */

 I first tried to hack the 2 machine descriptions to insert convert_to_mode
 or relayout_decls here and there, but I found this very fragile. Instead,
 a more central relayout of the type while expanding gave good results, as
 proposed here.

 Bootstrapped and tested with no regression for arm, aarch64 and i586.

 Does this look to be the right approach?

 nb: for testing this patch is complementary with

 https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html
 https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html

 thanks for your comments.
>>>
>>> A x86 specific testcase that ICEs as well:
>>>
>>> typedef int v8si __attribute__((vector_size(32)));
>>> v8si a;
>>> v8si __attribute__((target("avx"))) foo()
>>> {
>>> return a;
>>> }
>>>
>>> in your patch not using the shared DECL_RTL of the global var
>>> "fixes" this so I think a conceptually better fix would be to
>>> "adjust" DECL_RTL from globals via a adjust_address (or so).
>>>
>>> Also given that we do
>>>
>>> /* ... fall through ...  */
>>>
>>>   case FUNCTION_DECL:
>>>   case RESULT_DECL:
>>> decl_rtl = DECL_RTL (exp);
>>>   expand_decl_rtl:
>>> gcc_assert (decl_rtl);
>>> decl_rtl = copy_rtx (decl_rtl);
>>>
>>> thus always "unshare" DECL_RTL anyway it might be not so
>>> bad to simply do
>>>
>>>decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
>>>
>>> instead of that to avoid one copy.
>>>
>>> Index: expr.c
>>> ===
>>> --- expr.c  (revision 232496)
>>> +++ expr.c  (working copy)
>>> @@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target
>>>  decl_rtl = DECL_RTL (exp);
>>>expand_decl_rtl:
>>>  gcc_assert (decl_rtl);
>>> -  decl_rtl = copy_rtx (decl_rtl);
>>> +  if (MEM_P (decl_rtl))
>>> +   decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
>>> +  else
>>> +   decl_rtl = copy_rtx (decl_rtl);
>>>  /* Record writes to register variables.  */
>>>  if (modifier == EXPAND_WRITE
>>> && REG_P (decl_rtl)
>>>
>>> untested apart from on the x86_64 testcase (which it fixes).  One could
>>> guard this further to only apply on vector typed decls with mismatched
>>> mode of course.
>>>
>>> I think that re-layouting globals is not very good design.
>>>
>>> Richard.
>>
>> A few other ICEs with this implementation, for instance if the context
>> is not in a function, such as
>>
>> typedef __simd64_int8_t int8x8_t;
>>
>> extern int8x8_t b;
>> int8x8_t *a = &b;
>>
>> So, to avoid a var re-layout and a copy_rtx (implied by adjust_address
>> btw), what about just calling 'change_address'? Like: (very lightly
>> tested)
>>
>> Index: expr.c
>> ===
>> --- expr.c(revision 232564)
>> +++ expr.c(working copy)
>> @@ -9392,7 +9392,8 @@
>>enum expand_modifier modifier, rtx *alt_rtl,
>>bool inner_reference_p)
>>{
>> -  rtx op0, op1, temp, decl_rtl;
>> +  rtx op0, op1, temp;
>> +  rtx decl_rtl = NULL_RTX;
>>  tree type;
>>  int unsignedp;
>>  machine_mode mode, dmode;
>> @@ -9590,11 +9591,22 @@
>>  && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
>>layout_decl (exp, 0);
>>
>> +  decl_rtl = DECL_RTL (exp);
>> +
>> +  if (MEM_P (decl_rtl)
>> +  && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode))
>> +{
>> +  if (current_function_decl
>> +  && (! reload_completed && !reload_in_progress))

RE: [Patch, MIPS] Remove definition of TARGET_PROMOTE_PROTOTYPES

2016-01-19 Thread Maciej W. Rozycki

On Sat, 12 Dec 2015, Matthew Fortune wrote:

> > >   * config/mips/mips.c (mips_promote_function_mode): New function.
> > >   (TARGET_PROMOTE_FUNCTION_MODE): Define as above function.
> > >   (TARGET_PROMOTE_PROTOTYPES): Remove.
> 
> I'm OK with this change on the basis that MIPS has been providing stronger
> guarantees than required by the various standards. I.e. after this change
> MIPS will have undefined behaviour for a mismatch in types between a
> call to an un-prototyped function and its definition:

 Indeed this is exactly what the current ISO C language standard mandates 
-- if an unprototyped call is made to a function whose definition has been 
prototyped and the types of the arguments after promotion are incompatible 
with the types of the respective parameters, then behaviour is undefined.

> extern void foo();
> 
> void caller(int a)
> {
>   foo(a);
> }
> 
> --
> 
> void foo(short a)
> {
>   // the value of 'a' can be out of range of a short because the caller
>   // did not get the right type for the argument.
> }

 Which is exactly the case with the piece of code you quoted.  Behaviour 
of this code would be defined if the `a' parameter of `foo' was of the 
`int' type.  See Section 6.5.2.2 "Function calls", clause 6, for details.
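
As an illustration of the defined alternative, the sketch below gives the callee an `int` parameter, so that a promoted argument matches the definition, and narrows to `short` only after an explicit range check. The names `fits_in_short` and `foo_checked` are invented for this example, and returning 0 for an out-of-range value is an arbitrary choice.

```c
#include <limits.h>

/* Report whether A can be represented in a short.  */
static int
fits_in_short (int a)
{
  return a >= SHRT_MIN && a <= SHRT_MAX;
}

/* The parameter is an `int', so a caller passing a promoted argument
   is compatible with this definition.  The narrowing is explicit and
   only happens once the range has been checked.  */
static short
foo_checked (int a)
{
  return fits_in_short (a) ? (short) a : 0;
}
```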

  Maciej

Re: [RFC][PATCH, ARM 7/8] ARMv8-M Security Extension's cmse_nonsecure_call: use __gnu_cmse_nonsecure_call]

2016-01-19 Thread Andre Vieira (lists)


On 16/01/16 14:49, Senthil Kumar Selvaraj wrote:

Hi,

Apologies for the bad posting style (I don't have the
original email handy), but shouldn't __gnu_cmse_nonsecure_call be defined
with the .global directive in the below hunk (to make it visible when linking)?

diff --git a/libgcc/config/arm/cmse_nonsecure_call.S b/libgcc/config/arm/cmse_nonsecure_call.S
new file mode 100644
index ..bdc140f5bbe87c6599db225b1b9b7bbc7d606710
--- /dev/null
+++ b/libgcc/config/arm/cmse_nonsecure_call.S
@@ -0,0 +1,87 @@
+.syntax unified
+.thumb
+__gnu_cmse_nonsecure_call:

Right now, it ends up as a local symbol, and compiling and linking a
program with cmse_nonsecure_call (say cmse-11.c), results in a linker
error - the linker doesn't find the symbol even if it is present in
libgcc.a. I found the problem that way - dumping symbols for my variant
of libgcc.a and grepping showed the symbol to be available but local.

Regards
Senthil


Hi Senthil,

Thanks for catching that!

Cheers,
Andre

Re: [PATCH] Fix PR c++/69283 (auto deduction fails when ADL is required)

2016-01-19 Thread Patrick Palka

On Tue, Jan 19, 2016 at 9:56 AM, Jason Merrill  wrote:
> On 01/18/2016 10:55 PM, Patrick Palka wrote:
>>
>> mark_used is wrongly diagnosing a use of a TEMPLATE_DECL (e.g. the call
>> to f1 in function f3 of auto-fn29.C below) for having an undeduced
>> 'auto' return type.  This doesn't make sense, because an 'auto' used
>> inside a template doesn't get deduced until after the template is
>> instantiated.  So for a TEMPLATE_DECL we shouldn't diagnose a use of
>> undeduced 'auto' here.  After instantiation, presumably we will call
>> mark_used on the resulting FUNCTION_DECL which will check for undeduced
>> auto appropriately.
>> @@ -5112,7 +5112,9 @@ mark_used (tree decl, tsubst_flags_t complain)
>> || DECL_LANG_SPECIFIC (decl) == NULL
>> || DECL_THUNK_P (decl))
>>   {
>> -  if (!processing_template_decl && type_uses_auto (TREE_TYPE (decl)))
>> +  if (!processing_template_decl
>> + && TREE_CODE (decl) != TEMPLATE_DECL
>> + && type_uses_auto (TREE_TYPE (decl)))
>
>
> How does a TEMPLATE_DECL get in here?  Does it have null DECL_LANG_SPECIFIC?

(In the test case auto-fn29.C,) when instantiating the template
function f3, we call tsubst on the CALL_EXPR "f1 (v);".  There, ADL is
performed on the identifier f1 (which is the CALL_EXPR_FN) which
returns the TEMPLATE_DECL f1.  Then mark_used is called on this
CALL_EXPR_FN, only if it's a decl.

If in the test case the call to "f1 (v);" is replaced with "Ape::f1
(v);" then the CALL_EXPR_FN is then an OVERLOAD (to the TEMPLATE_DECL
f1), i.e. not a decl, so we don't call mark_used on it in tsubst.

The DECL_LANG_SPECIFIC of this decl is not null.

>
> I'd think mark_used of a TEMPLATE_DECL should return after setting
> TREE_USED, there's nothing else to do with it.

Consider it changed.

>
> Jason
>
>

[wwwdocs] Add special math functions to libstdc++ changes

2016-01-19 Thread Jonathan Wakely


This documents Ed's recent addition to trunk.

Committed to cvs.
Index: htdocs/gcc-6/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.46
diff -u -r1.46 changes.html
--- htdocs/gcc-6/changes.html	22 Dec 2015 19:23:31 -	1.46
+++ htdocs/gcc-6/changes.html	19 Jan 2016 15:47:29 -
@@ -110,7 +110,9 @@
 
 Runtime Library (libstdc++)
   
- Experimental support for C++17, including the following
+Extensions to the C++ Library to support mathematical special
+functions (ISO/IEC 29124:2010), thanks to Edward Smith-Rowland. 
+Experimental support for C++17, including the following
   new features:
   
 std::uncaught_exceptions function (this is also
@@ -129,12 +131,12 @@
 
 An experimental implementation of the File System TS.
 Experimental support for most features of the second version of the
-Library Fundamentals TS, including polymorphic memory resources and
-array support in shared_ptr, thanks to Fan You.
+Library Fundamentals TS. This includes polymorphic memory resources
+and array support in shared_ptr, thanks to Fan You.
 Some assertions checked by Debug Mode can now also be enabled by
 _GLIBCXX_ASSERTIONS. The subset of checks enabled by
 the new macro have less run-time overhead than the full
-_GLIBCXX_DEBUG checks and and don't affect the library
+_GLIBCXX_DEBUG checks and don't affect the library
 ABI, so can be enabled per-translation unit.
 
 Timed mutex types are supported on more targets, including Darwin.

Re: [PATCH 4/4][AArch64] Cost CCMP instruction sequences to choose better expand order

2016-01-19 Thread H.J. Lu

On Tue, Dec 15, 2015 at 2:33 AM, Wilco Dijkstra  wrote:
> ping
>
>> -Original Message-
>> From: Wilco Dijkstra [mailto:wilco.dijks...@arm.com]
>> Sent: 13 November 2015 16:03
>> To: 'gcc-patches@gcc.gnu.org'
>> Subject: [PATCH 4/4][AArch64] Cost CCMP instruction sequences to choose better expand order
>>
>> This patch adds CCMP selection based on rtx costs. This is based on Jiong's
>> already approved patch
>> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01434.html with some minor
>> refactoring and the tests updated.
>>
>> OK for commit?
>>
>> ChangeLog:
>> 2015-11-13  Jiong Wang  
>>
>> gcc/
>>   * ccmp.c (expand_ccmp_expr_1): Cost the instruction sequences
>>   generated from different expand order.
>>

It breaks bootstrap on Linux/x86:

https://gcc.gnu.org/ml/gcc-regression/2016-01/msg00332.html

-- ../../src-trunk/gcc/ccmp.c: In function ‘rtx_def*
expand_ccmp_expr_1(gimple*, rtx_def**, rtx_def**)’:
../../src-trunk/gcc/ccmp.c:173:14: error: ‘ret’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
rtx tmp2, ret, ret2;
  ^~~

cc1plus: all warnings being treated as errors
Makefile:1085: recipe for target 'ccmp.o' failed
make[6]: *** [ccmp.o] Error 1

H.J.

[wwwdocs] Update changes.html for LTO and IPA

2016-01-19 Thread Jan Hubicka

Hi,
this patch mentions a few user-visible changes I can think of.  I will
add some quality data on Firefox once stage3 closes.

It seems that the optimization section of changes.html deserves some work :)

Honza

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.46
diff -u -r1.46 changes.html
--- changes.html22 Dec 2015 19:23:31 -  1.46
+++ changes.html19 Jan 2016 15:42:56 -
@@ -43,6 +43,64 @@
of array bounds.  In particular, it enables
-fsanitize=bounds as well as instrumentation of
flexible array member-like arrays.
+Type based alias analysis now disambiguates accesses to different
+   pointers. This improves precision of the alias oracle by about 20-30%
+   on higher-level C++ programs. Programs doing invalid type punning
+   of pointer types may now need -fno-strict-aliasing
+   to work correctly.
+Alias oracle now correctly supports weakref and
+   alias attributes. This makes it possible to access
+   both variable and its alias in one translation unit which is common
+   with link-time optimization.
+Value range propagation now assumes that the this pointer
+   of C++ methods is non-NULL.  This eliminates many NULL pointer checks
+   but also breaks some non-conforming code-bases (such as Qt-5, Chromium,
+   KDevelop). As a temporary work-around
+   -fno-delete-null-pointer-checks can be used. Wrong
+   code can be identified by using -fsanitize=undefined.
+Link-time optimization improvements:
+
+  warning and error attributes are now
+ correctly preserved by the declaration linking and thus
+ -D_FORTIFY_SOURCE=2 is now supported with -flto.
+  Type merging was fixed to handle C and Fortran interoperability
+ rules as defined by the Fortran2005 language standard.
+ 
+ As an exception, CHARACTER(KIND=C_CHAR) is not inter-operable
+ with char in all cases because it is an array while
+ char is scalar.
+ INTEGER(KIND=C_SIGNED_CHAR) should be used instead.
+ In general, this inter-operability can not be implemented, for
+ example, on targets where function passing conventions of arrays
+ differs from scalars.
+  More type information is now preserved at link-time reducing
+ the loss of accuracy of the type based alias analysis oracle compared
+ to builds without link time optimization.
+  Invalid type punning on global variables and declarations is now
+ reported with -Wodr-type-mismatch.
+  The size of LTO object files was reduced by about 11% (measured
+ by compiling Firefox 46.0).
+  Link-time parallelization (enabled using -flto=n)
+ was significantly improved by decreasing the size of streamed
+ data when partitioning program.  The size of streamed
+ IL while compiling Firefox 46.0 was reduced by 66%.
+  Linker plugin was extended to pass information about type of
+ binary produced to GCC back-end (that can be also manually controlled
+ by -flinker-output).  This makes it possible to
+ properly configure code generator and support incremental
+ linking. Incremental linking of LTO objects by gcc -r is
+ now supported on plugin enabled setups. Because code generation happens
+ during the incremental linking step, the whole program optimization
+ is not performed. GCC 7 will support incremental IL linking.
+
+Inter-procedural optimization improvements:
+
+  Basic jump threading is now performed before profile construction
+ and inline analysis resulting in more realistic size and time estimates
+ that drive heuristics of inliner and function cloning passes.
+  Function cloning now more aggressively eliminates unused function
+ parameters.
+
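
As an illustration of the pointer type punning item above: code that stores into a `T *` object through an lvalue of a different pointer type (for instance `*(void **) slot = value;`) is the kind of access that may now be disambiguated. A minimal sketch of one way to avoid that is shown below. `store_pointer` is a hypothetical helper, and the sketch assumes that all object pointers share one representation on the target, which is common but not guaranteed by the C standard.

```c
#include <string.h>

/* Hypothetical helper: SLOT points at some `T *' object.  Copy the
   representation of VALUE into it with memcpy instead of writing
   through a `void **' lvalue.  Assumes every object pointer type has
   the same size and representation as `void *' on this target.  */
static void
store_pointer (void *slot, void *value)
{
  memcpy (slot, &value, sizeof value);
}
```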

[wwwdocs] Document C++17 static_assert feature

2016-01-19 Thread Jonathan Wakely


Patch for the /projects/cxx1z.html page to document the unary
static_assert in C++17.

Committed to CVS.
Index: htdocs/projects/cxx1z.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx1z.html,v
retrieving revision 1.8
diff -u -r1.8 cxx1z.html
--- htdocs/projects/cxx1z.html	20 Oct 2015 02:50:54 -	1.8
+++ htdocs/projects/cxx1z.html	19 Jan 2016 16:11:28 -
@@ -86,6 +86,12 @@
6 
   __cpp_nontype_template_args >= 201411 
 
+
+   Extending static_assert 
+  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3928.html";>N3928 
+  6
+   __cpp_static_assert >= 201411 
+

Re: [PATCH 4/4][AArch64] Cost CCMP instruction sequences to choose better expand order

2016-01-19 Thread Wilco Dijkstra

H.J. Lu  wrote:
> It breaks bootstrap on Linux/x86:

Sorry about that - it looks like the warning levels have changed since
that patch was tested...

I have a trivial fix which I'll get checked in soon.
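
The fix itself is not shown in this thread. As a generic illustration only (this is not the ccmp.c change), a -Wmaybe-uninitialized report of this kind is commonly addressed by giving the variable a defined value at its declaration, so that every path to the use sees an initialised object. The function `pick` below is invented for the example.

```c
#include <stddef.h>

/* Illustrative only: without the initialiser, RET would be read
   uninitialised when neither condition holds.  */
static const char *
pick (int first_ok, int second_ok)
{
  /* Initialising at the declaration gives every path a defined value.  */
  const char *ret = NULL;

  if (first_ok)
    ret = "first";
  else if (second_ok)
    ret = "second";

  return ret;
}
```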

Wilco

Do not redirect calls to cxa_pure_virtual into bultin_unreachable in some cases

2016-01-19 Thread Jan Hubicka

Hi,
currently we optimize calls to pure virtual functions to __builtin_unreachable.
While this is correct it does make code harder to debug.  This patch makes
ipa-devirt treat cxa_pure_virtual specially and devirtualize to it when
it is the only possible target.

Bootstrapped/regtested x86_64-linux. There are no calls to cxa_pure_virtual
in firefox, so I suppose dropping this optimization is quite cheap.

Honza

PR ipa/66223
* ipa-devirt.c (is_cxa_pure_virtual_p): New function.
(maybe_record_node): Record cxa_pure_virtual as the only possible
target if there are no other candidates.
(possible_polymorphic_call_target_p): Accept cxa_pure_virtual.

* g++.dg/ipa/devirt-50.C: New testcase.
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 232555)
+++ ipa-devirt.c(working copy)
@@ -2327,6 +2327,17 @@ referenced_from_vtable_p (struct cgraph_
   return found;
 }
 
+/* Return if TARGET is cxa_pure_virtual.  */
+
+static bool
+is_cxa_pure_virtual_p (tree target)
+{
+  return target && TREE_CODE (TREE_TYPE (target)) != METHOD_TYPE
+&& DECL_NAME (target)
+&& !strcmp (IDENTIFIER_POINTER (DECL_NAME (target)),
+"__cxa_pure_virtual");
+}
+
 /* If TARGET has associated node, record it in the NODES array.
CAN_REFER specify if program can refer to the target directly.
if TARGET is unknown (NULL) or it can not be inserted (for example because
@@ -2341,11 +2352,12 @@ maybe_record_node (vec  &
 {
   struct cgraph_node *target_node, *alias_target;
   enum availability avail;
+  bool pure_virtual = is_cxa_pure_virtual_p (target);
 
-  /* cxa_pure_virtual and __builtin_unreachable do not need to be added into
+  /* __builtin_unreachable do not need to be added into
  list of targets; the runtime effect of calling them is undefined.
  Only "real" virtual methods should be accounted.  */
-  if (target && TREE_CODE (TREE_TYPE (target)) != METHOD_TYPE)
+  if (target && TREE_CODE (TREE_TYPE (target)) != METHOD_TYPE && !pure_virtual)
 return;
 
   if (!can_refer)
@@ -2388,6 +2400,7 @@ maybe_record_node (vec  &
  ??? Maybe it would make sense to be more aggressive for LTO even
  elsewhere.  */
   if (!flag_ltrans
+  && !pure_virtual
   && type_in_anonymous_namespace_p (DECL_CONTEXT (target))
   && (!target_node
   || !referenced_from_vtable_p (target_node)))
@@ -2401,6 +2414,20 @@ maybe_record_node (vec  &
 {
   gcc_assert (!target_node->global.inlined_to);
   gcc_assert (target_node->real_symbol_p ());
+  /* Only add pure virtual if it is the only possible target.  This way
+we will preserve the diagnostics about pure virtual called in many
+cases without disabling optimization in other.  */
+  if (pure_virtual)
+   {
+ if (nodes.length ())
+   return;
+   }
+  /* If we found a real target, take away cxa_pure_virtual.  */
+  else if (!pure_virtual && nodes.length () == 1
+  && is_cxa_pure_virtual_p (nodes[0]->decl))
+   nodes.pop ();
+  if (pure_virtual && nodes.length ())
+   return;
   if (!inserted->add (target))
{
  cached_polymorphic_call_targets->add (target_node);
@@ -3328,6 +3355,9 @@ possible_polymorphic_call_target_p (tree
   || fcode == BUILT_IN_TRAP))
 return true;
 
+  if (is_cxa_pure_virtual_p (n->decl))
+return true;
+
   if (!odr_hash)
 return true;
   targets = possible_polymorphic_call_targets (otr_type, otr_token, ctx, &final);
Index: testsuite/g++.dg/ipa/devirt-50.C
===
--- testsuite/g++.dg/ipa/devirt-50.C(revision 0)
+++ testsuite/g++.dg/ipa/devirt-50.C(revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized"  } */
+struct B {
+B* self;
+B() : self( this ) { self->f(); }
+virtual void f() = 0;
+};
+
+struct D : B
+{
+void f() {}
+};
+
+int main()
+{
+D d;
+}
+
+/* { dg-final { scan-tree-dump "cxa_pure_virtual" "optimized"} } */
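
The recording policy described in the patch (keep __cxa_pure_virtual only while it is the sole candidate, and drop it as soon as a real target turns up) can be modelled outside GCC roughly as follows. This is a simplified sketch: `struct target_list`, `is_pure_virtual_name` and `record_target` are invented for the example, targets are plain strings rather than cgraph nodes, and duplicate filtering is left out.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TARGETS 8

/* Toy stand-in for the vector of possible call targets.  */
struct target_list
{
  const char *names[MAX_TARGETS];
  size_t length;
};

static int
is_pure_virtual_name (const char *name)
{
  return name != NULL && strcmp (name, "__cxa_pure_virtual") == 0;
}

static void
record_target (struct target_list *list, const char *name)
{
  if (is_pure_virtual_name (name))
    {
      /* Keep __cxa_pure_virtual only while nothing else is recorded.  */
      if (list->length != 0)
        return;
    }
  else if (list->length == 1 && is_pure_virtual_name (list->names[0]))
    /* A real target displaces a previously recorded __cxa_pure_virtual.  */
    list->length = 0;

  if (list->length < MAX_TARGETS)
    list->names[list->length++] = name;
}
```

With this policy a list that ends up holding only __cxa_pure_virtual still names a real symbol to call, which is the debugging benefit the patch is after.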

Re: Remove outdated text from lto.texi

2016-01-19 Thread Jeff Law


On 01/19/2016 03:56 AM, Kugan wrote:

Hi,

lto.texi has "Currently, the linker plugin works only in combination
with the Gold linker, but a GNU ld implementation is under development".
I don't think this is true any more. Attached patch removes this. Is
this OK for trunk?

Thanks,
Kugan

gcc/ChangeLog:

2016-01-19  Kugan Vivekanandarajah  

* doc/lto.texi: Remove text that says only Gold has linker plugin
support.

OK.

Thanks,
jeff

Re: Do not redirect calls to cxa_pure_virtual into bultin_unreachable in some cases

2016-01-19 Thread Bernd Schmidt


On 01/19/2016 05:50 PM, Jan Hubicka wrote:


+static bool
+is_cxa_pure_virtual_p (tree target)
+{
+  return target && TREE_CODE (TREE_TYPE (target)) != METHOD_TYPE
+&& DECL_NAME (target)
+&& !strcmp (IDENTIFIER_POINTER (DECL_NAME (target)),
+"__cxa_pure_virtual");
+}


Formatting.


Bernd

Re: [PATCH, AArch64] Fix for PR67896 (C++ FE cannot distinguish __Poly{8,16,64,128}_t types)

2016-01-19 Thread James Greenhalgh

On Tue, Jan 19, 2016 at 06:28:18AM +0100, Roger Ferrer Ibáñez wrote:
> Hi,
> 
> aarch64-builtins.c defines several SIMD builtin types. Among these
> SIMD types there are the polynomials __Poly{8,16,64,128}_t. These are
> built by a call to build_distinct_type_copy
> (unsigned_int{Q,H,D,T}I_type_node), respectively, i.e. they are not
> VECTOR_TYPEs. A later loop, traverses an array containing all the SIMD
> types and skips those types for which a VECTOR_TYPE does not have to
> be built: this is, types __Poly{8,16,64,128}_t. That same loop does
> SET_TYPE_STRUCTURAL_EQUALITY on the newly created vector type, but it
> does this unconditionally, thus setting TYPE_CANONICAL of types
> __Poly{8,16,64,128}_t to NULL.
> 
> The net effect of this is that the C++ FE is unable to distinguish
> between all __Poly{8,16,64,128}_t and between vector types with the
> same number of elements but different polynomial type as element type
> (like __Poly8x8_t vs __Poly16x8_t). Note that sizeof (correctly)
> returns different values for all these types. This patch simply
> protects SET_TYPE_STRUCTURAL_EQUALITY inside the branch that creates
> the vector type.
> 
> I have bootstrapped and regression tested this on a small board
> aarch64-unknown-linux-gnu host without new regressions.

This patch looks technically correct to me, though there is a small
style issue to correct (in-line below), and your ChangeLogs don't fit
our usual style.

> P.S.: I haven't signed the copyright assignment to the FSF. The change
> is really small but I can do the paperwork if required.

I have no experience with making the call as to which sort of change
is "small enough" to include in GCC without copyright assignment, so
I've CC'ed the AArch64 maintainers, who may be able to give more advice.

> gcc/ChangeLog:
> 
> 2016-01-19 Roger Ferrer Ibáñez 

Two spaces between the date and your name, and between your name and your
email, as so:

2016-01-19  Roger Ferrer Ibáñez  


> 
> PR aarch64/67896

PR target/67896


> * aarch64-builtins.c (aarch64_init_simd_builtin_types): Do not set

This should be the path from the directory in which the ChangeLog resides:

* config/aarch64/aarch64-builtins.c

And a full-stop at the end of the line. So this whole entry would look like:

2016-01-19  Roger Ferrer Ibáñez  

PR target/67896
* config/aarch64/aarch64-builtins.c
(aarch64_init_simd_builtin_types): Do not set structural
equality to __Poly{8,16,64,128}_t types.

> structural equality to __Poly{8,16,64,128}_t types
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-01-19 Roger Ferrer Ibáñez 
> 
> PR aarch64/67896
> * pr67896.C: New test

Same comments apply here:

2016-01-19  Roger Ferrer Ibáñez  
 
PR target/67896
* gcc.target/aarch64/simd/pr67896.C: New.
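
As a rough illustration of the header-line convention described above (two
spaces between the date and the name, and between the name and the email),
here is a minimal C++ sketch of a checker.  The function name
changelog_header_ok is hypothetical and not part of any GCC tooling; it only
checks the spacing and the overall shape of the line, nothing else.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of GCC): returns true if `line` looks like
//   "YYYY-MM-DD  Name  <email>"
// with exactly two spaces around the name.
bool changelog_header_ok(const std::string& line) {
  // Date: ten characters, digits with '-' at positions 4 and 7.
  if (line.size() < 10) return false;
  for (std::size_t i = 0; i < 10; ++i) {
    unsigned char c = static_cast<unsigned char>(line[i]);
    if (i == 4 || i == 7) {
      if (c != '-') return false;
    } else if (!std::isdigit(c)) {
      return false;
    }
  }
  // Exactly two spaces between the date and the first character of the name.
  if (line.compare(10, 2, "  ") != 0) return false;
  if (line.size() <= 12 || line[12] == ' ') return false;
  // Exactly two spaces between the end of the name and the '<' of the email.
  std::size_t lt = line.find("  <", 12);
  if (lt == std::string::npos) return false;
  if (line[lt - 1] == ' ') return false;
  // At least one character between '<' and the closing '>'.
  return line.size() > lt + 4 && line.back() == '>';
}
```

For example, a call with "2016-01-19  Jane Doe  <jane@example.com>" (a
made-up name and address) is accepted, while the same line with single
spaces is rejected.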

> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index bd7a8dd..edacf10 100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -610,14 +610,16 @@ aarch64_init_simd_builtin_types (void)
>enum machine_mode mode = aarch64_simd_types[i].mode;
>  
>if (aarch64_simd_types[i].itype == NULL)
> - aarch64_simd_types[i].itype =
> -   build_distinct_type_copy
> + {
> +   aarch64_simd_types[i].itype =
> + build_distinct_type_copy
>   (build_vector_type (eltype, GET_MODE_NUNITS (mode)));

I'd rewrap this as:

  aarch64_simd_types[i].itype
= build_distinct_type_copy
  (build_vector_type (eltype, GET_MODE_NUNITS (mode)));

Thanks for the patch!
James

> +   SET_TYPE_STRUCTURAL_EQUALITY (aarch64_simd_types[i].itype);
> + }
>  
>tdecl = add_builtin_type (aarch64_simd_types[i].name,
>   aarch64_simd_types[i].itype);
>TYPE_NAME (aarch64_simd_types[i].itype) = tdecl;
> -  SET_TYPE_STRUCTURAL_EQUALITY (aarch64_simd_types[i].itype);
>  }
>  
>  #define AARCH64_BUILD_SIGNED_TYPE(mode)  \
> diff --git a/gcc/testsuite/gcc.target/aarch64/simd/pr67896.C 
> b/gcc/testsuite/gcc.target/aarch64/simd/pr67896.C
> new file mode 100644
> index 000..1f916e0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/simd/pr67896.C
> @@ -0,0 +1,7 @@
> +typedef __Poly8_t A;
> +typedef __Poly16_t A; /* { dg-error "conflicting declaration" } */
> +typedef __Poly64_t A; /* { dg-error "conflicting declaration" } */
> +typedef __Poly128_t A; /* { dg-error "conflicting declaration" } */
> +
> +typedef __Poly8x8_t B;
> +typedef __Poly16x8_t B; /* { dg-error "conflicting declaration" } */ 
> -- 
> 2.1.4
>

Re: [wwwdocs] gcc-6/changes.html: diagnostics, Levenshtein, -Wmisleading-indentation, jit (v2)

2016-01-19 Thread Gerald Pfeifer

On Fri, 15 Jan 2016, David Malcolm wrote:
> Here's an updated version of the above, which the W3C validator
> reports as being clean (fixing various "&" and "<" and a missing
> end-tag).

Nice - and a lot of nice changes you implemented since GCC 5!

> OK to commit?

Yep.  I was going to flag "misspelled" as being misspelt, but
my dictionary tells me you can actually say both. ;-)

Gerald

RE: [PATCH] [ARC] Add basic support for double load and store instructions

2016-01-19 Thread Claudiu Zissulescu

Hi,

I've prepared a new patch based on the review comments (attached). I also added 
the mll64 documentation to invoke.texi, which was missing from the first patch.

I have tested it with dg.exp for arc700, archs and archs+ll64.

Please let me know if everything is alright.

Thank you,
Claudiu

gcc/
2016-01-19  Claudiu Zissulescu  

* config/arc/arc.c (TARGET_DWARF_REGISTER_SPAN): Define.
(arc_init): Check validity of mll64 option.
(arc_save_restore): Use double load/store instruction.
(arc_expand_movmem): Likewise.
(arc_split_move): Don't split if we have double load/store
instructions. Returns a boolean.
(arc_process_double_reg_moves): Change function to return boolean
instead of a sequence of instructions.
(arc_dwarf_register_span): New function.
* config/arc/arc-protos.h (arc_split_move): Change prototype.
* config/arc/arc.h (TARGET_CPU_CPP_BUILTINS): Define __ARC_LL64__.
* config/arc/arc.md (*movdi_insn): Emit ldd/std instructions.
(*movdf_insn): Likewise.
* config/arc/arc.opt (mll64): New option.
* config/arc/predicates.md (even_register_operand): New predicate.
* doc/invoke.texi (ARC Options): Add mll64 documentation.

> -----Original Message-----
> From: Joern Wolfgang Rennecke [mailto:g...@amylaar.uk]
> Sent: Sunday, January 17, 2016 7:21 AM
> To: Claudiu Zissulescu; gcc-patches@gcc.gnu.org
> Cc: Francois Bedard; jeremy.benn...@embecosm.com
> Subject: Re: [PATCH] [ARC] Add basic support for double load and store
> instructions
> 
> 
> 
> On 15/01/16 12:40, Claudiu Zissulescu wrote:
> 
>   (arc_save_restore): Use double load/store instruction.
>   (arc_expand_movmem): Likewise.
> 
> 
>  >if (n_pieces >= (unsigned int) (optimize_size ? 3 : 15))
>  >  return false;
>  > -  if (piece > 4)
>  > +  if (TARGET_LL64 && (piece != 8) && (align >= 4))
>  > +piece = 8;
>  > +  else if (piece > 4)
>  >  piece = 4;
>  >dst_addr = force_offsettable (XEXP (operands[0], 0), size, 0);
> 
> That bit doesn't make sense to me.
> Assume the alignment is 8.  Thus, piece becomes 8 too.  Then the above
> conditional gets processed, and it sets piece to 4.
> I think instead of "(piece != 8) && (align >= 4)" it should be:
> "(piece >= 8)"
> 
>   * config/arc/arc.md (*movdi_insn): Emit ldd/std instructions.
> 
> 
>  > -  "&& reload_completed && optimize"
>  > -  [(set (match_dup 2) (match_dup 3)) (set (match_dup 4) (match_dup 5))]
> > -  "arc_split_move (operands);"
>  > +  "reload_completed"
>  > +  [(match_dup 2)]
>  > +  "operands[2] = arc_split_move (operands);"
> 
> arc_split_move uses, inter alia,  operands[2]..operands[5].
> Thus, it is not safe to stop mentioning these in the pattern.
> 
> > (*movdf_insn): Likewise.
> Likewise.
> 
> When you say 'basic support', I suppose you have a plan to re-visit this later
> to get the register allocator to use register pairs, and stop regrename
> breaking them up?

0001-ARC-Add-basic-support-for-double-load-and-store-inst.patch
Description: 0001-ARC-Add-basic-support-for-double-load-and-store-inst.patch
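
The review comment on the arc_expand_movmem hunk quoted above can be checked
with a small stand-alone C++ sketch.  The two functions below are hypothetical
stand-ins, not the real arc.c code; they assume, as the reviewer does, that
piece starts out equal to the alignment, and they model TARGET_LL64 as a plain
boolean parameter.

```cpp
// Hypothetical stand-ins for the piece-size selection discussed in the
// review; these are not the real arc_expand_movmem code.

// Selection as written in the reviewed patch hunk.
unsigned piece_as_posted(bool target_ll64, unsigned piece, unsigned align) {
  if (target_ll64 && piece != 8 && align >= 4)
    piece = 8;
  else if (piece > 4)
    piece = 4;
  return piece;
}

// Selection with the condition the reviewer suggests instead.
unsigned piece_as_suggested(bool target_ll64, unsigned piece) {
  if (target_ll64 && piece >= 8)
    piece = 8;
  else if (piece > 4)
    piece = 4;
  return piece;
}
```

With an 8-byte alignment (piece == 8) and LL64 enabled, the posted condition
falls through to the else branch and yields 4, while the suggested condition
keeps 8.  With a 4-byte alignment the posted condition also bumps piece up
to 8, which the suggested form does not.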

[PATCH] c++/59759 - ICE in unify, using std::enable_if on classes

2016-01-19 Thread Martin Sebor


Attached is the patch to avoid the ICE that Kai posted below
with the test case Marek asked for in his response.  I didn't
see any further followup on the list.

  https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02325.html

Martin
gcc/testsuite/ChangeLog:
2016-01-19  Martin Sebor  

	PR c++/59759
	* g++.dg/template/pr59759.C: New test.

gcc/cp/ChangeLog:
2015-05-26  Kai Tietz  

	PR c++/69277
	* pt.c (unify): Don't ICE on VAR_DECL.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 6062ebe..3361796 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19928,11 +19928,11 @@ unify (tree tparms, tree targs, tree parm, tree arg, int strict,
   return unify_template_argument_mismatch (explain_p, parm, arg);
 
 case VAR_DECL:
-  /* A non-type template parameter that is a variable should be a
+  /* A non-type template parameter that is a variable should be
 	 an integral constant, in which case, it whould have been
-	 folded into its (constant) value. So we should not be getting
-	 a variable here.  */
-  gcc_unreachable ();
+	 folded into its (constant) value.  So we should not see
+	 a variable here except for ill-formed programs.  */
+  return unify_template_argument_mismatch (explain_p, parm, arg);
 
 case TYPE_ARGUMENT_PACK:
 case NONTYPE_ARGUMENT_PACK:
diff --git a/gcc/testsuite/g++.dg/template/pr59759.C b/gcc/testsuite/g++.dg/template/pr59759.C
new file mode 100644
index 000..c6a0b04
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/pr59759.C
@@ -0,0 +1,23 @@
+// PR c++/59759 - internal compiler error: in unify, using std::enable_if
+//on classes
+// { dg-do compile }
+
+template 
+struct B { };
+
+template 
+struct C {
+  typedef T U;
+};
+
+const int x = 0;
+
+// The default argument below is invalid for A.
+template , int>::U = x>
+struct A;
+
+template 
+void f (A*) {
+  A* map;   // { dg-error "not a class type" }
+  f (map);   // { dg-error "no matching function" }
+}

Re: [PATCH 4/4][AArch64] Cost CCMP instruction sequences to choose better expand order

2016-01-19 Thread Wilco Dijkstra

H.J. Lu  wrote:
> It breaks bootstrap on Linux/x86:

Committed trivial fix as r232576:

Index: gcc/ChangeLog
===
--- gcc/ChangeLog(revision 232575)
+++ gcc/ChangeLog(working copy)
@@ -1,3 +1,7 @@
+2016-01-19  Wilco Dijkstra  
+
+* ccmp.c (expand_ccmp_expr_1): Avoid spurious unused warnings.
+
  2016-01-19  Jan Hubicka  

  PR ipa/66223
Index: gcc/ccmp.c
===
--- gcc/ccmp.c(revision 232575)
+++ gcc/ccmp.c(working copy)
@@ -170,7 +170,7 @@
int unsignedp0, unsignedp1;
rtx_code rcode0, rcode1;
int speed_p = optimize_insn_for_speed_p ();
-  rtx tmp2, ret, ret2;
+  rtx tmp2, ret = NULL_RTX, ret2 = NULL_RTX;
unsigned cost1 = MAX_COST;
unsigned cost2 = MAX_COST;

Re: [C++] Add -fnull-this-pointer

2016-01-19 Thread Mike Stump

On Jan 19, 2016, at 6:41 AM, Jakub Jelinek  wrote:
> But then perhaps it will be better incentive for the projects to fix their
> cruft.  With a specialized option they will keep broken code forever.

Flags are forever.

C++ PATCH for c++/59759 (ICE with variable as default template argument)

2016-01-19 Thread Jason Merrill

The comment removed by this patch asserts that we can't see a VAR_DECL 
in unify, because any use of a variable as a template argument will have 
been adjusted to be either its constant value or its address.  But that 
isn't actually true when the type of the non-type template parameter 
depends on a type template argument, so we need to adjust based on what 
we end up getting.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit fb06c19fb2adc8d69e924db2072437904a4e9c58
Author: Jason Merrill 
Date:   Tue Jan 19 12:09:17 2016 -0500

	PR c++/59759
	* pt.c (convert_template_argument): Handle VAR_DECL properly.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 866b4b1..9305e1d 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19923,11 +19923,20 @@ unify (tree tparms, tree targs, tree parm, tree arg, int strict,
   return unify_template_argument_mismatch (explain_p, parm, arg);
 
 case VAR_DECL:
-  /* A non-type template parameter that is a variable should be a
-	 an integral constant, in which case, it whould have been
-	 folded into its (constant) value. So we should not be getting
-	 a variable here.  */
-  gcc_unreachable ();
+  /* We might get a variable as a non-type template argument in parm if the
+	 corresponding parameter is type-dependent.  Make any necessary
+	 adjustments based on whether arg is a reference.  */
+  if (CONSTANT_CLASS_P (arg))
+	parm = fold_non_dependent_expr (parm);
+  else if (REFERENCE_REF_P (arg))
+	{
+	  tree sub = TREE_OPERAND (arg, 0);
+	  STRIP_NOPS (sub);
+	  if (TREE_CODE (sub) == ADDR_EXPR)
+	arg = TREE_OPERAND (sub, 0);
+	}
+  /* Now use the normal expression code to check whether they match.  */
+  goto expr;
 
 case TYPE_ARGUMENT_PACK:
 case NONTYPE_ARGUMENT_PACK:
@@ -19960,7 +19969,7 @@ unify (tree tparms, tree targs, tree parm, tree arg, int strict,
   if (is_overloaded_fn (parm) || type_unknown_p (parm))
 	return unify_success (explain_p);
   gcc_assert (EXPR_P (parm));
-
+expr:
   /* We must be looking at an expression.  This can happen with
 	 something like:
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/temp_default6.C b/gcc/testsuite/g++.dg/cpp0x/temp_default6.C
new file mode 100644
index 000..10cde2d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/temp_default6.C
@@ -0,0 +1,27 @@
+// PR c++/59759
+// { dg-do compile { target c++11 } }
+
+namespace std {
+template 
+struct B {
+  static constexpr _Tp value = 0;
+};
+typedef B false_type;
+struct C : false_type {};
+template 
+struct is_integral : C {};
+template 
+struct enable_if {
+  typedef _Tp type;
+};
+}
+enum class enabled;
+extern constexpr enabled dummy{};
+template ::value,
+  T>::type = dummy>
+class A;
+template 
+void f(A*) {
+  A* map;
+  f(map);
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/temp_default7.C b/gcc/testsuite/g++.dg/cpp0x/temp_default7.C
new file mode 100644
index 000..c517aad
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/temp_default7.C
@@ -0,0 +1,27 @@
+// PR c++/59759
+// { dg-do compile { target c++11 } }
+
+namespace std {
+template 
+struct B {
+  static constexpr _Tp value = 0;
+};
+typedef B false_type;
+struct C : false_type {};
+template 
+struct is_integral : C {};
+template 
+struct enable_if {
+  typedef _Tp type;
+};
+}
+enum class enabled;
+constexpr enabled dummy{};
+template ::value,
+  enabled>::type = dummy>
+class A;
+template 
+void f(A*) {
+  A* map;
+  f(map);
+}

Re: [patch, fortran] Inline MATMUL(A,TRANSPOSE(B)), PR 66094

2016-01-19 Thread Toon Moene


On 01/18/2016 08:55 PM, Toon Moene wrote:
> On 01/17/2016 01:44 PM, Thomas Koenig wrote:
>> So... comments?  Toon, would this help you?  Could you maybe give this
>> a spin?
>
> Thanks, the nightly test at my home computer will build with your patch.

That was the plan; unfortunately, the system crashed while doing this 
(due to an unrelated problem).


However, today I *did* run the test harness with your modification:

https://gcc.gnu.org/ml/gcc-testresults/2016-01/msg01795.html

Looks good.  These are the messages related to your new test cases:

/home/toon/compilers/trunk/gcc/testsuite/gfortran.dg/inline_matmul_13.f90:34:2: 
Warning: Code for reallocating the allocatable array at (1) will be 
added [-Wrealloc-lhs]
/home/toon/compilers/trunk/gcc/testsuite/gfortran.dg/inline_matmul_13.f90:41:2: 
Warning: Code for reallocating the allocatable array at (1) will be 
added [-Wrealloc-lhs]


PASS: gfortran.dg/inline_matmul_13.f90   -O0   (test for warnings, line 34)
PASS: gfortran.dg/inline_matmul_13.f90   -O0   (test for warnings, line 41)
PASS: gfortran.dg/inline_matmul_13.f90   -O0  (test for excess errors)
...
PASS: gfortran.dg/inline_matmul_13.f90   -O0  execution test
PASS: gfortran.dg/inline_matmul_13.f90   -O0   scan-tree-dump-times 
original "_gfortran_matmul" 0


and ditto for higher optimization levels.

The bounds tests added also completed correctly.

Thanks !

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news

Re: [wwwdocs] gcc-6/changes.html: diagnostics, Levenshtein, -Wmisleading-indentation, jit (v2)

2016-01-19 Thread Manuel López-Ibáñez


On 19/01/16 17:08, Gerald Pfeifer wrote:
> On Fri, 15 Jan 2016, David Malcolm wrote:
>> Here's an updated version of the above, which the W3C validator
>> reports as being clean (fixing various "&" and "<" and a missing
>> end-tag).
>
> Nice - and a lot of nice changes you implemented since GCC 5!

Am I the only one who doesn't see the colors at 
https://gcc.gnu.org/gcc-6/changes.html#c-family nor 
https://gcc.gnu.org/gcc-5/changes.html#fortran ?


Firefox 43.0.4 says "Content Security Policy: The page's settings blocked the 
loading of a resource at self ("default-src https://gcc.gnu.org http: https:")."


Cheers,

Manuel.

Re: [PATCH v2] libstdc++: Make certain exceptions transaction_safe.

2016-01-19 Thread Torvald Riegel

On Sat, 2016-01-16 at 10:57 +0100, Dominique d'Humières wrote:
> > Addressed these, fixed a problem with using GLIBCXX_WEAK_DEFINITION
> > (which is only set on Darwin despite the generic-sounding name -- so
> > just use __attribute__((weak)) directly), and also updated
> > testsuite_abi.cc so that it knows about CXXABI_1.3.10.
> >
> > Approved by Jonathan Wakely.  Committed as r232454.
> This breaks bootstrap on darwin, see 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69310.

Please give this patch a try.  I've only tested it on x86_64-linux.
Jon, okay from your side if Darwin testing succeeds?
commit 6987f84f278d2cbf5b828a8c81c1be84b292b1af
Author: Torvald Riegel 
Date:   Tue Jan 19 18:36:14 2016 +0100

libstdc++: Use weak alias instead of just alias in TM support.

	PR libstdc++/69310
	* src/c++11/cow-stdexcept.cc: Use weak alias instead of just alias
	to make Darwin happy.

diff --git a/libstdc++-v3/src/c++11/cow-stdexcept.cc b/libstdc++-v3/src/c++11/cow-stdexcept.cc
index a0f505c..a070747 100644
--- a/libstdc++-v3/src/c++11/cow-stdexcept.cc
+++ b/libstdc++-v3/src/c++11/cow-stdexcept.cc
@@ -364,7 +364,9 @@ _ZGTtNSt##NAME##C1EPKc (CLASS* that, const char* s)			\
  construct the COW string in the latter manually.  Note that the	\
  exception classes will not be declared transaction_safe if the	\
  shared empty _Rep is disabled with --enable-fully-dynamic-string	\
- (in which case _GLIBCXX_FULLY_DYNAMIC_STRING is nonzero).  */	\
+ (in which case _GLIBCXX_FULLY_DYNAMIC_STRING is nonzero).		\
+ The alias declarations are also declared weak because Darwin	\
+ doesn't support non-weak aliases.  */\
   CLASS e("");\
   _ITM_memcpyRnWt(that, &e, sizeof(CLASS));\
   _txnal_cow_string_C1_for_exceptions(_txnal_##BASE##_get_msg(that),	\
@@ -372,7 +374,7 @@ _ZGTtNSt##NAME##C1EPKc (CLASS* that, const char* s)			\
 }	\
 void	\
 _ZGTtNSt##NAME##C2EPKc (CLASS*, const char*)\
-  __attribute__((alias ("_ZGTtNSt" #NAME "C1EPKc")));			\
+  __attribute__((weak, alias ("_ZGTtNSt" #NAME "C1EPKc")));		\
 void	\
 _ZGTtNSt##NAME##C1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE( \
 CLASS* that, const std::__sso_string& s)\
@@ -393,7 +395,7 @@ _ZGTtNSt##NAME##D1Ev(CLASS* that)	\
 { _txnal_cow_string_D1(_txnal_##BASE##_get_msg(that)); }		\
 void	\
 _ZGTtNSt##NAME##D2Ev(CLASS*)		\
-__attribute__((alias ("_ZGTtNSt" #NAME "D1Ev")));			\
+__attribute__((weak, alias ("_ZGTtNSt" #NAME "D1Ev")));			\
 void	\
 _ZGTtNSt##NAME##D0Ev(CLASS* that)	\
 {	\

Re: [PATCH v2] libstdc++: Make certain exceptions transaction_safe.

2016-01-19 Thread Jonathan Wakely


On 19/01/16 20:10 +0100, Torvald Riegel wrote:
> On Sat, 2016-01-16 at 10:57 +0100, Dominique d'Humières wrote:
> > > Addressed these, fixed a problem with using GLIBCXX_WEAK_DEFINITION
> > > (which is only set on Darwin despite the generic-sounding name -- so
> > > just use __attribute__((weak)) directly), and also updated
> > > testsuite_abi.cc so that it knows about CXXABI_1.3.10.
> > >
> > > Approved by Jonathan Wakely.  Committed as r232454.
> > This breaks bootstrap on darwin, see 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69310.
>
> Please give this patch a try.  I've only tested it on x86_64-linux.
> Jon, okay from your side if Darwin testing succeeds?

Yes, OK.

Re: [wwwdocs] gcc-6/changes.html: diagnostics, Levenshtein, -Wmisleading-indentation, jit (v2)

2016-01-19 Thread Mike Stump

On Jan 19, 2016, at 11:05 AM, Manuel López-Ibáñez  wrote:
> 
> Am I the only one who doesn't see the colors at 
> https://gcc.gnu.org/gcc-6/changes.html#c-family nor 
> https://gcc.gnu.org/gcc-5/changes.html#fortran ?

Yes.  The darkslategrey of the headings is very close to black, but the links 
should be blue.

> Firefox 43.0.4 says "Content Security Policy: The page's settings blocked the 
> loading of a resource at self ("default-src https://gcc.gnu.org http: 
> https:").”

Just hit clear and then reload, and see if it comes back.  Don’t think it will.

Re: [wwwdocs] gcc-6/changes.html: diagnostics, Levenshtein, -Wmisleading-indentation, jit (v2)

2016-01-19 Thread Manuel López-Ibáñez

On 19 January 2016 at 19:31, Mike Stump  wrote:
> On Jan 19, 2016, at 11:05 AM, Manuel López-Ibáñez  
> wrote:
>>
>> Am I the only one who doesn't see the colors at 
>> https://gcc.gnu.org/gcc-6/changes.html#c-family nor 
>> https://gcc.gnu.org/gcc-5/changes.html#fortran ?
>
> Yes.  The darkslategrey of the headings is very close to black, but the links 
> should be blue.

Those colors are fine but the example diagnostics should also have
colors. They did when I added them some months ago.

[PATCH] Fix c/68513 for GCC5 (match.pd and SAVE_EXPRs)

2016-01-19 Thread Marek Polacek

Recently on IRC we've concluded that for GCC 5 the simplest solution
will be to just disable the problematic pattern on GENERIC.  So done
in the following.  (The problem was that the match.pd pattern created
SAVE_EXPRs which then leaked into gimplification.)

Bootstrapped/regtested on x86_64-linux, ok for 5?

2016-01-19  Marek Polacek  

PR c/68513
* match.pd ((x & ~m) | (y & m)): Only perform on GIMPLE.

* gcc.dg/pr68513.c: New test.

diff --git gcc/match.pd gcc/match.pd
index e40720e..0b557e6 100644
--- gcc/match.pd
+++ gcc/match.pd
@@ -385,8 +385,9 @@ along with GCC; see the file COPYING3.  If not see
 /* (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x */
 (simplify
   (bit_ior:c (bit_and:c@3 @0 (bit_not @2)) (bit_and:c@4 @1 @2))
-  (if ((TREE_CODE (@3) != SSA_NAME || has_single_use (@3))
-   && (TREE_CODE (@4) != SSA_NAME || has_single_use (@4)))
+  (if (GIMPLE
+   && (TREE_CODE (@3) != SSA_NAME || has_single_use (@3))
+   && (TREE_CODE (@4) != SSA_NAME || has_single_use (@4)))
(bit_xor (bit_and (bit_xor @0 @1) @2) @0)))
 
 
diff --git gcc/testsuite/gcc.dg/pr68513.c gcc/testsuite/gcc.dg/pr68513.c
index e69de29..86f878d 100644
--- gcc/testsuite/gcc.dg/pr68513.c
+++ gcc/testsuite/gcc.dg/pr68513.c
@@ -0,0 +1,125 @@
+/* PR c/68513 */
+/* { dg-do compile } */
+/* { dg-options "-funsafe-math-optimizations -fno-math-errno -O -Wno-div-by-zero" } */
+
+int i;
+unsigned u;
+volatile int *e;
+
+#define E (i ? *e : 0)
+
+/* Can't trigger some of them because operand_equal_p will return false
+   for side-effects.  */
+
+/* (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x */
+int
+fn1 (void)
+{
+  int r = 0;
+  r += (short) (E & ~u | i & u);
+  r += -(short) (E & ~u | i & u);
+  r += (short) -(E & ~u | i & u);
+  return r;
+}
+
+/* sqrt(x) < y is x >= 0 && x != +Inf, when y is large.  */
+double
+fn2 (void)
+{
+  double r;
+  r = __builtin_sqrt (E) < __builtin_inf ();
+  return r;
+}
+
+/* sqrt(x) < c is the same as x >= 0 && x < c*c.  */
+double
+fn3 (void)
+{
+  double r;
+  r = __builtin_sqrt (E) < 1.3;
+  return r;
+}
+
+/* copysign(x,y)*copysign(x,y) -> x*x.  */
+double
+fn4 (double y, double x)
+{
+  return __builtin_copysign (E, y) * __builtin_copysign (E, y);
+}
+
+/* x <= +Inf is the same as x == x, i.e. !isnan(x).  */
+int
+fn5 (void)
+{
+  return E <= __builtin_inf ();
+}
+
+/* Fold (A & ~B) - (A & B) into (A ^ B) - B.  */
+int
+fn6 (void)
+{
+  return (i & ~E) - (i & E);
+}
+
+/* Fold (A & B) - (A & ~B) into B - (A ^ B).  */
+int
+fn7 (void)
+{
+  return (i & E) - (i & ~E);
+}
+
+/* x + (x & 1) -> (x + 1) & ~1 */
+int
+fn8 (void)
+{
+  return E + (E & 1);
+}
+
+/* Simplify comparison of something with itself.  */
+int
+fn9 (void)
+{
+  return E <= E | E >= E;
+}
+
+/* Fold (A & ~B) - (A & B) into (A ^ B) - B.  */
+int
+fn10 (void)
+{
+  return (i & ~E) - (i & E);
+}
+
+/* abs(x)*abs(x) -> x*x.  Should be valid for all types.  */
+int
+fn11 (void)
+{
+  return __builtin_abs (E) * __builtin_abs (E);
+}
+
+/* (x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2) */
+int
+fn12 (void)
+{
+  return (E | 11) & 12;
+}
+
+/* fold_range_test */
+int
+fn13 (const char *s)
+{
+  return s[E] != '\0' && s[E] != '/';
+}
+
+/* fold_comparison */
+int
+fn14 (void)
+{
+  return (!!i ? : (u *= E / 0)) >= (u = E);
+}
+
+/* fold_mult_zconjz */
+_Complex int
+fn15 (_Complex volatile int *z)
+{
+  return *z * ~*z;
+}

Marek
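
For reference, the rewrite performed by the pattern that is restricted to
GIMPLE above, (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x, is a plain bitwise
identity.  The following C++ sketch checks it exhaustively for 8-bit operands.
The function names are made up for illustration, and the sketch says nothing
about the SAVE_EXPR problem itself; it only shows that the two forms compute
the same value.

```cpp
#include <cstdint>

// Original form: take the bits of y where m is set, the bits of x elsewhere.
constexpr std::uint8_t select_original(std::uint8_t x, std::uint8_t y,
                                       std::uint8_t m) {
  return static_cast<std::uint8_t>((x & ~m) | (y & m));
}

// Rewritten form produced by the match.pd pattern.
constexpr std::uint8_t select_rewritten(std::uint8_t x, std::uint8_t y,
                                        std::uint8_t m) {
  return static_cast<std::uint8_t>(((x ^ y) & m) ^ x);
}

// Compares both forms for every combination of 8-bit x, y and m.
bool identity_holds_for_all_bytes() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      for (unsigned m = 0; m < 256; ++m) {
        const std::uint8_t bx = static_cast<std::uint8_t>(x);
        const std::uint8_t by = static_cast<std::uint8_t>(y);
        const std::uint8_t bm = static_cast<std::uint8_t>(m);
        if (select_original(bx, by, bm) != select_rewritten(bx, by, bm))
          return false;
      }
  return true;
}
```

Bit by bit the reasoning is short: where a bit of m is 1, both forms yield the
bit of y (since (x ^ y) ^ x == y), and where it is 0, both yield the bit of x.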

[PATCH][committed] libitm: Remove dead code.

2016-01-19 Thread Torvald Riegel

I missed dead code when I removed the cacheline stuff.
local_type_traits hasn't been updated either, apparently leading to
bootstrap issues.  So we just remove more dead code.

Tested fine on x86_64-linux.  Committed.
commit c608b69c3c49c7d29033faf328fd4d117f31fd9f
Author: Torvald Riegel 
Date:   Tue Jan 19 20:43:10 2016 +0100

libitm: Remove dead code.

	* local_type_traits: Remove file.
	* libitm_i.h: Don't include it anymore.
	(sized_integral): Remove.

diff --git a/libitm/libitm_i.h b/libitm/libitm_i.h
index 751b4ab..ae88ff0 100644
--- a/libitm/libitm_i.h
+++ b/libitm/libitm_i.h
@@ -36,7 +36,6 @@
 #include 
 #include 
 #include 
-#include "local_type_traits"
 #include "local_atomic"
 
 /* Don't require libgcc_s.so for exceptions.  */
@@ -49,13 +48,6 @@ namespace GTM HIDDEN {
 
 using namespace std;
 
-// A helper template for accessing an unsigned integral of SIZE bytes.
-template struct sized_integral { };
-template<> struct sized_integral<1> { typedef uint8_t type; };
-template<> struct sized_integral<2> { typedef uint16_t type; };
-template<> struct sized_integral<4> { typedef uint32_t type; };
-template<> struct sized_integral<8> { typedef uint64_t type; };
-
 typedef unsigned int gtm_word __attribute__((mode (word)));
 
 // These values are given to GTM_restart_transaction and indicate the
diff --git a/libitm/local_type_traits b/libitm/local_type_traits
deleted file mode 100644
index 131e8d2..000
--- a/libitm/local_type_traits
+++ /dev/null
@@ -1,1901 +0,0 @@
-// C++0x type_traits -*- C++ -*-
-
-// Copyright (C) 2007-2016 Free Software Foundation, Inc.
-//
-// This file is part of the GNU ISO C++ Library.  This library is free
-// software; you can redistribute it and/or modify it under the
-// terms of the GNU General Public License as published by the
-// Free Software Foundation; either version 3, or (at your option)
-// any later version.
-
-// This library is distributed in the hope that it will be useful,
-// but WITHOUT ANY WARRANTY; without even the implied warranty of
-// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-// GNU General Public License for more details.
-
-// Under Section 7 of GPL version 3, you are granted additional
-// permissions described in the GCC Runtime Library Exception, version
-// 3.1, as published by the Free Software Foundation.
-
-// You should have received a copy of the GNU General Public License and
-// a copy of the GCC Runtime Library Exception along with this program;
-// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-// .
-
-// 
-//
-// This is a copy of the libstdc++ header, with the trivial modification
-// of ignoring the c++config.h include.  If and when the top-level build is
-// fixed so that target libraries can be built using the newly built, we can
-// delete this file.
-//
-// 
-
-/** @file include/type_traits
- *  This is a Standard C++ Library header.
- */
-
-#ifndef _GLIBCXX_TYPE_TRAITS
-#define _GLIBCXX_TYPE_TRAITS 1
-
-// #pragma GCC system_header
-
-// #ifndef __GXX_EXPERIMENTAL_CXX0X__
-// # include 
-// #else
-// #include 
-
-namespace std // _GLIBCXX_VISIBILITY(default)
-{
-// _GLIBCXX_BEGIN_NAMESPACE_VERSION
-
-  /**
-   * @addtogroup metaprogramming
-   * @{
-   */
-
-  /// integral_constant
-  template
-struct integral_constant
-{
-  static constexpr _Tp  value = __v;
-  typedef _Tp   value_type;
-  typedef integral_constant<_Tp, __v>   type;
-  constexpr operator value_type() { return value; }
-};
-  
-  /// typedef for true_type
-  typedef integral_constant true_type;
-
-  /// typedef for false_type
-  typedef integral_constantfalse_type;
-
-  template
-constexpr _Tp integral_constant<_Tp, __v>::value;
-
-  // Meta programming helper types.
-
-  template
-struct conditional;
-
-  template
-struct __or_;
-
-  template<>
-struct __or_<>
-: public false_type
-{ };
-
-  template
-struct __or_<_B1>
-: public _B1
-{ };
-
-  template
-struct __or_<_B1, _B2>
-: public conditional<_B1::value, _B1, _B2>::type
-{ };
-
-  template
-struct __or_<_B1, _B2, _B3, _Bn...>
-: public conditional<_B1::value, _B1, __or_<_B2, _B3, _Bn...>>::type
-{ };
-
-  template
-struct __and_;
-
-  template<>
-struct __and_<>
-: public true_type
-{ };
-
-  template
-struct __and_<_B1>
-: public _B1
-{ };
-
-  template
-struct __and_<_B1, _B2>
-: public conditional<_B1::value, _B2, _B1>::type
-{ };
-
-  template
-struct __and_<_B1, _B2, _B3, _Bn...>
-: public conditional<_B1::value, __and_<_B2, _B3, _Bn...>, _B1>::type
-{ };
-
-  template
-struct __not_
-: public integral_constant
-{ };
-
-  struct __sfinae_types
-  {
-typedef char __one;
-
