Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-29 Thread Andreas Schwab
Jakub Jelinek  writes:

> For guality, the most effective test for regressions is simply always
> running contrib/test_summary after all your bootstraps and then just
> diffing up that against the same from earlier bootstrap.

Or use contrib/compare_tests.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] Fix PR tree-optimization/59124 (bogus -Warray-bounds warning)

2016-03-29 Thread Richard Biener
On Sun, Mar 27, 2016 at 11:37 PM, Patrick Palka  wrote:
> On Sun, 27 Mar 2016, Patrick Palka wrote:
>
>> In unrolling of the inner loop in the test case below we introduce
>> unreachable code that otherwise contains out-of-bounds array accesses.
>> This is because the estimation of the maximum number of iterations of
>> the inner loop is too conservative: we assume 6 iterations instead of
>> the actual 4.
>>
>> Nonetheless, VRP should be able to tell that the code is unreachable so
>> that it doesn't warn about it.  The only thing holding VRP back is that
>> it doesn't look through conditionals of the form
>>
>>    if (j_10 != CST1)    where j_10 = j_9 + CST2
>>
>> so that it could add the assertion
>>
>>    j_9 != (CST1 - CST2)
>>
>> This patch teaches VRP to detect such conditionals and to add such
>> assertions, so that it could remove instead of warn about the
>> unreachable code created during loop unrolling.
>>
>> What this addition does with the test case below is something like this:
>>
>> ASSERT_EXPR (i <= 5);
>> for (i = 1; i < 6; i++)
>>   {
>>     j = i - 1;
>>     if (j == 0)
>>       break;
>>     // ASSERT_EXPR (i != 1)
>>     bar[j] = baz[j];
>>
>>     j = i - 2;
>>     if (j == 0)
>>       break;
>>     // ASSERT_EXPR (i != 2)
>>     bar[j] = baz[j];
>>
>>     j = i - 3;
>>     if (j == 0)
>>       break;
>>     // ASSERT_EXPR (i != 3)
>>     bar[j] = baz[j];
>>
>>     j = i - 4;
>>     if (j == 0)
>>       break;
>>     // ASSERT_EXPR (i != 4)
>>     bar[j] = baz[j];
>>
>>     j = i - 5;
>>     if (j == 0)
>>       break;
>>     // ASSERT_EXPR (i != 5)
>>     bar[j] = baz[j];
>>
>>     j = i - 6;
>>     if (j == 0)
>>       break;
>>     // ASSERT_EXPR (i != 6)
>>     bar[j] = baz[j]; // unreachable because (i != 6 && i <= 5) is always false
>>   }
>>
>> (I think the patch I sent a year ago that improved the
>>  register_edge_assert stuff would have fixed this too.  I'll try to
>>  post it again during next stage 1.
>>  https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00908.html)
>>
>> Bootstrap + regtest in progress on x86_64-pc-linux-gnu, does this look
>> OK to commit after testing?
>>
>> gcc/ChangeLog:
>>
>>   PR tree-optimization/59124
>>   * tree-vrp.c (register_edge_assert_for): For NAME != CST1
>>   where NAME = A + CST2 add the assertion A != (CST1 - CST2).
>>
>> gcc/testsuite/ChangeLog:
>>
>>   PR tree-optimization/59124
>>   * gcc.dg/Warray-bounds-19.c: New test.
>> ---
>>  gcc/testsuite/gcc.dg/Warray-bounds-19.c | 17 +
>>  gcc/tree-vrp.c  | 22 ++
>>  2 files changed, 39 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.dg/Warray-bounds-19.c
>>
>> diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-19.c b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
>> new file mode 100644
>> index 000..e2f9661
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
>> @@ -0,0 +1,17 @@
>> +/* PR tree-optimization/59124 */
>> +/* { dg-options "-O3 -Warray-bounds" } */
>> +
>> +unsigned baz[6];
>> +
>> +void foo(unsigned *bar, unsigned n)
>> +{
>> +  unsigned i, j;
>> +
>> +  if (n > 6)
>> +    n = 6;
>> +
>> +  for (i = 1; i < n; i++)
>> +    for (j = i - 1; j > 0; j--)
>> +      bar[j - 1] = baz[j - 1];
>> +}
>> +
>> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
>> index b5654c5..31bd575 100644
>> --- a/gcc/tree-vrp.c
>> +++ b/gcc/tree-vrp.c
>> @@ -5820,6 +5820,28 @@ register_edge_assert_for (tree name, edge e, gimple_stmt_iterator si,
>>   }
>>  }
>>
>> +  /* In the case of NAME != CST1 where NAME = A + CST2 we can
>> + assert that NAME != (CST1 - CST2).  */
>
> This should say A != (...) not NAME != (...)
>
>> +  if ((comp_code == EQ_EXPR || comp_code == NE_EXPR)
>> +  && TREE_CODE (val) == INTEGER_CST)
>> +{
>> +  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
>> +
>> +  if (is_gimple_assign (def_stmt)
>> +   && gimple_assign_rhs_code (def_stmt) == PLUS_EXPR)
>> + {
>> +   tree op0 = gimple_assign_rhs1 (def_stmt);
>> +   tree op1 = gimple_assign_rhs2 (def_stmt);
>> +   if (TREE_CODE (op0) == SSA_NAME
>> +   && TREE_CODE (op1) == INTEGER_CST)
>> + {
>> +   op1 = int_const_binop (MINUS_EXPR, val, op1);
>> +   register_edge_assert_for_2 (op0, e, si, comp_code,
>> +   op0, op1, is_else_edge);
>
> The last argument to register_edge_assert_for_2() should be false not
> is_else_edge since comp_code is already inverted.
>
> Consider these two things fixed.  Also I moved down the new code so that
> it's at the very bottom of register_edge_assert_for.  Here's an updated
> patch that passes bootstrap + regtest.
>
> -- 8< --
>
> gcc/ChangeLog:
>
> PR tree-optimization/59124
> * tree-vrp.c (register_edge_assert_for): For NAME != CST1
> where NAME = A + CST2 add the assertion A != (CST1 - CST2).
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/59124
> * gcc.
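
The arithmetic behind the new assertion can be checked in isolation.  The
following is only a sketch, assuming unsigned (wrapping) operands as in the
test case; excluded_value, assertion_agrees and check_range are hypothetical
names and are not part of tree-vrp.c.  The subtraction mirrors what
int_const_binop (MINUS_EXPR, val, op1) computes in the patch.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code: given NAME = A + CST2 and the
   condition NAME != CST1, return the one value that A cannot take.
   Unsigned arithmetic wraps, so the subtraction is well defined even
   when CST2 > CST1.  */
static unsigned
excluded_value (unsigned cst1, unsigned cst2)
{
  return cst1 - cst2;
}

/* Return 1 if the condition on NAME (a + cst2 != cst1) and the derived
   condition on A (a != cst1 - cst2) give the same answer for this A.  */
static int
assertion_agrees (unsigned a, unsigned cst1, unsigned cst2)
{
  int on_name = (a + cst2) != cst1;
  int on_a = a != excluded_value (cst1, cst2);
  return on_name == on_a;
}

/* Check the equivalence for every A in the half-open range [lo, hi).  */
static void
check_range (unsigned lo, unsigned hi, unsigned cst1, unsigned cst2)
{
  for (unsigned a = lo; a < hi; a++)
    assert (assertion_agrees (a, cst1, cst2));
}
```

For the last unrolled copy in the example, j = i - 6 is i + (unsigned) -6 and
the test is j != 0, so the excluded value for i is 0 - (unsigned) -6, which
wraps around to 6.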

Re: [PATCH] Fix in-tree gmp/mpfr/mpc generation (PR 67728)

2016-03-29 Thread Richard Biener
On Mon, Mar 28, 2016 at 2:44 PM, Bernd Edlinger
 wrote:
>
> Hi,
>
> as described in the tracker we have bootstrap problems with in-tree gmp-6.1.0
> on certain targets, and also a linker issue with check-mpc due to the changed
> mpfr library path.

Hum, in-tree gmp 6.1.0 is not supported (only the version downloaded by
download_prerequisites is).

> These are triggered by overriding CFLAGS and LDFLAGS in in-tree builds.
> It did not happen with the gmp/mpfr/mpc versions that download_prerequisites
> installs, but the latest versions of these libraries use CFLAGS to pass
> -DNO_ASM, which is overridden by gcc and causes gmp-6.1.0 to be
> mis-compiled.

So you pass down AM_CFLAGS=-DNO_ASM but how does that reliably work
for all gmp versions?

>  And the mpc issue is triggered by overriding LDFLAGS
> and the changed mpfr library path.  So this started with mpfr v3.1.0 which
> moved the sources into a src sub-directory.
>
> The proposed patch fixes these problems by passing -DNO_ASM in AM_CFLAGS,
> and adding both possible mpfr library paths to HOST_LIB_PATH_mpfr.
> I've also adjusted HOST_LIB_PATH_mpc although it did not yet create problems.

But you remove a possibly good .libs lib_path.

> Boot-strapped and regression tested on x86_64-pc-linux-gnu, with different
> gmp versions including the latest snapshot.
> I have additionally built arm cross compilers, which was not working before.
>
> Is this OK for trunk?

I don't think so.  Supporting an arbitrary mix of in-tree versions is
a nightmare.

If you really want to go down this route see @extra_mpc_gmp_configure_flags@
and add a variant for the lib-paths.  I don't have a better answer for -DNO_ASM
than to fix gmp/mpfr to not pass down this kind of configuration via CFLAGS.

Please instead do the testing required to ensure that bumping the versions
downloaded by download_prerequisites works during next stage1.

Richard.

>
> Thanks
> Bernd.


Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop behavior if possible

2016-03-29 Thread Richard Biener
On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng  wrote:
> Sorry, Should have replied to gcc-patches list.
>
> Thanks,
> bin
>
> -- Forwarded message --
> From: "Bin.Cheng" 
> Date: Tue, 29 Mar 2016 03:55:04 +0800
> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
> DR against its innermost loop behavior if possible
> To: Richard Biener 
>
> On 3/17/16, Richard Biener  wrote:
>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng  wrote:
>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>>  wrote:

 Hmm.
>>> Hi,
>>> Thanks for reviewing.

 +  equal_p = true;
 +  if (e1->base_address && e2->base_address)
 +equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
 +  if (e1->offset && e2->offset)
 +equal_p &= operand_equal_p (e1->offset, e2->offset, 0);

 surely better to return false early.

 I think we don't want this in tree-data-refs.h also because of ...

 @@ -615,15 +619,29 @@
 hash_memrefs_baserefs_and_store_DRs_read_written_info
 (data_reference_p a)
data_reference_p *master_dr, *base_master_dr;
tree ref = DR_REF (a);
tree base_ref = DR_BASE_OBJECT (a);
 +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
bool exist1, exist2;

 -  while (TREE_CODE (ref) == COMPONENT_REF
 -|| TREE_CODE (ref) == IMAGPART_EXPR
 -|| TREE_CODE (ref) == REALPART_EXPR)
 -ref = TREE_OPERAND (ref, 0);
 +  /* If reference in DR has innermost loop behavior and it is not
 + a compound memory reference, we store it to innermost_DR_map,
 + otherwise to ref_DR_map.  */
 +  if (TREE_CODE (ref) == COMPONENT_REF
 +  || TREE_CODE (ref) == IMAGPART_EXPR
 +  || TREE_CODE (ref) == REALPART_EXPR
 +  || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
 +  || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
 +{
 +  while (TREE_CODE (ref) == COMPONENT_REF
 +|| TREE_CODE (ref) == IMAGPART_EXPR
 +|| TREE_CODE (ref) == REALPART_EXPR)
 +   ref = TREE_OPERAND (ref, 0);
 +
 +  master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
 +}
 +  else
 +master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);

 we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd
 need to strip outermost non-variant handled-components (COMPONENT_REF,
 IMAGPART and REALPART) before creating the DR (or adjust the equality
 function and hashing to disregard them, which means subtracting their
 offset from DR_INIT).
>>> I am not sure if I understand correctly.  But for component reference,
>>> it is the base object that we want to record/track.  For example,
>>>
>>>   for (i = 0; i < N; i++) {
>>> m = *data++;
>>>
>>> m1 = p1->x - m;
>>> m2 = p2->x + m;
>>>
>>> p3->y = (m1 >= m2) ? p1->y : p2->y;
>>>
>>> p1++;
>>> p2++;
>>> p3++;
>>>   }
>>> We want to infer that reads of p1/p2 in condition statement won't trap
>>> because there are unconditional reads of the structures, though the
>>> unconditional reads are actually of other sub-objects.  Here it is the
>>> invariant part of address that we want to track.
>>
>> Well, the variant parts - we want to strip invariant parts as far as we can
>> (offsetof (x) and offsetof (y))
>>
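
The loop quoted above can be written out as a self-contained sketch to show
why the conditional reads are safe to speculate.  This is an illustration
only, assuming a two-field struct and array-style indexing instead of pointer
increments; struct pt, select_branchy and select_ifcvt are hypothetical names
and do not come from tree-if-conv.c.  Because p1[i].x and p2[i].x are read
unconditionally, the objects p1[i] and p2[i] are known to be accessible, so
loading their .y fields unconditionally cannot introduce a new trap.

```c
#include <assert.h>
#include <stddef.h>

struct pt { int x; int y; };

/* Original shape: the reads of p1[i].y and p2[i].y are conditional.  */
static void
select_branchy (const int *data, const struct pt *p1, const struct pt *p2,
                struct pt *p3, size_t n)
{
  assert (n == 0 || (data && p1 && p2 && p3));
  for (size_t i = 0; i < n; i++)
    {
      int m = data[i];
      int m1 = p1[i].x - m;   /* unconditional read of p1[i] */
      int m2 = p2[i].x + m;   /* unconditional read of p2[i] */
      if (m1 >= m2)
        p3[i].y = p1[i].y;
      else
        p3[i].y = p2[i].y;
    }
}

/* If-converted shape: both .y loads are hoisted out of the condition,
   leaving a select that a vectorizer can handle.  */
static void
select_ifcvt (const int *data, const struct pt *p1, const struct pt *p2,
              struct pt *p3, size_t n)
{
  assert (n == 0 || (data && p1 && p2 && p3));
  for (size_t i = 0; i < n; i++)
    {
      int m = data[i];
      int m1 = p1[i].x - m;
      int m2 = p2[i].x + m;
      int y1 = p1[i].y;       /* speculated load, same object as p1[i].x */
      int y2 = p2[i].y;       /* speculated load, same object as p2[i].x */
      p3[i].y = (m1 >= m2) ? y1 : y2;
    }
}
```

The two functions store the same values; only the placement of the loads
differs, which is the property the DR tracking has to prove.
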
>>> Also illustrated by this example, we can't rely on data-ref analyzer
>>> here.  Because in gathering/scattering cases, the address could be not
>>> affine at all.
>>
>> Sure, but that's a different issue.
>>

 To adjust the references we collect you could maybe use a callback
 to get_references_in_stmt
 to adjust them.

 OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
 as
>>> Is this a part of the method you suggested above, or is it an
>>> alternative one?  If it's the latter, then I have below questions
>>> embedded.
>>
>> It is an alternative to adding a hook to get_references_in_stmt and
>> probably "easier".
>>

 Index: tree-if-conv.c
 ===
 --- tree-if-conv.c  (revision 234215)
 +++ tree-if-conv.c  (working copy)
 @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo

for (i = 0; refs->iterate (i, &dr); i++)
  {
 +  tree *refp = &DR_REF (dr);
 +  while ((TREE_CODE (*refp) == COMPONENT_REF
 + && TREE_OPERAND (*refp, 2) == NULL_TREE)
 +|| TREE_CODE (*refp) == IMAGPART_EXPR
 +|| TREE_CODE (*refp) == REALPART_EXPR)
 +   refp = &TREE_OPERAND (*refp, 0);
 +  if (refp != &DR_REF (dr))
 +   {
 + tree saved_base = *refp;
 +  

[genmatch] reject duplicate captures used as arguments in user-defined predicates

2016-03-29 Thread Prathamesh Kulkarni
Hi,
I suppose we should reject duplicate captures used as "arguments" in user
defined predicates ?
eg:
(match (foo @0 @0)
  match-template)
The attached patch prints error "duplicate capture id" for above pattern.
Bootstrapped+tested on x86_64-pc-linux-gnu.
Ok for trunk ?

Thanks,
Prathamesh
diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 1f5f45c..eca5508 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -3602,7 +3602,7 @@ private:
   const char *get_number ();
 
   id_base *parse_operation ();
-  operand *parse_capture (operand *, bool);
+  operand *parse_capture (operand *, bool, bool);
   operand *parse_expr ();
   c_expr *parse_c_expr (cpp_ttype);
   operand *parse_op ();
@@ -3832,7 +3832,7 @@ parser::parse_operation ()
  capture = '@'  */
 
 struct operand *
-parser::parse_capture (operand *op, bool require_existing)
+parser::parse_capture (operand *op, bool require_existing, bool error_on_existing = false)
 {
   source_location src_loc = eat_token (CPP_ATSIGN)->src_loc;
   const cpp_token *token = peek ();
@@ -3852,6 +3852,8 @@ parser::parse_capture (operand *op, bool require_existing)
fatal_at (src_loc, "unknown capture id");
   num = next_id;
 }
+  else if (error_on_existing)
+fatal_at (src_loc, "duplicate capture id");
   return new capture (src_loc, num, op);
 }
 
@@ -4530,7 +4532,7 @@ parser::parse_pattern ()
  capture_ids = new cid_map_t;
  e = new expr (p, e_loc);
  while (peek ()->type == CPP_ATSIGN)
-   e->append_op (parse_capture (NULL, false));
+   e->append_op (parse_capture (NULL, false, true));
  eat_token (CPP_CLOSE_PAREN);
}
   if (p->nargs != -1




Re: RFA: PATCH to tree-inline.c:remap_decls for c++/70353 (ICE with __func__ and constexpr)

2016-03-29 Thread Richard Biener
On Mon, Mar 28, 2016 at 11:26 PM, Jason Merrill  wrote:
> The constexpr evaluation code uses the inlining code to remap the constexpr
> function body for evaluation so that recursion works properly.  In this
> testcase __func__ is declared as a static local variable, so rather than
> remap it, remap_decls tries to add it to the local_decls list for the
> function we're inlining into.  But there is no such function in this case,
> so we crash.
>
> Avoiding the add_local_decl call when cfun is null avoids the ICE (thanks
> Jakub), but results in an undefined symbol.  Calling
> varpool_node::finalize_decl instead allows cgraph to handle the reference
> from 'c' properly.
>
> OK if testing passes?

So ce will never be instantiated?

And 'c' will have a DECL_INITIAL of __func__ so I wonder why the cgraph
code when finalizing 'c' does not end up seeing __func__ and finalizing it?
Ah, it only creates a varpool-node it seems but never finalizes it itself.
Honza?

Richard.


Re: [genmatch] reject duplicate captures used as arguments in user-defined predicates

2016-03-29 Thread Richard Biener
On Tue, 29 Mar 2016, Prathamesh Kulkarni wrote:

> Hi,
> I suppose we should reject duplicate captures used as "arguments" in user
> defined predicates ?
> eg:
> (match (foo @0 @0)
>   match-template)
> The attached patch prints error "duplicate capture id" for above pattern.
> Bootstrapped+tested on x86_64-pc-linux-gnu.
> Ok for trunk ?

Using a duplicate probably doesn't make sense but it works just fine.
You get res_args[0] == res_args[1] == @0 in the above case.

Richard.
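
For readers following the thread, the lookup rule that the proposed patch
adds can be modelled outside genmatch.  The sketch below is not genmatch
code: struct capture_table, enum capture_status and lookup_capture are
hypothetical, and a fixed-size array stands in for the real cid_map_t.  It
only mirrors the three outcomes of parse_capture in the patch: an unknown id
when an existing one is required, a duplicate id when duplicates are
rejected, and otherwise the id's number.

```c
#include <assert.h>
#include <string.h>

#define MAX_CAPTURES 16

/* Hypothetical stand-in for the capture id map.  */
struct capture_table
{
  const char *ids[MAX_CAPTURES];
  int count;
};

enum capture_status
{
  CAPTURE_OK,
  CAPTURE_UNKNOWN,    /* "unknown capture id" in the patch */
  CAPTURE_DUPLICATE,  /* "duplicate capture id" in the patch */
  CAPTURE_FULL        /* sketch-only: table capacity exhausted */
};

/* Look up ID.  On CAPTURE_OK, *NUM holds the capture number.  */
static enum capture_status
lookup_capture (struct capture_table *t, const char *id,
                int require_existing, int error_on_existing, int *num)
{
  assert (t && id && num);
  for (int i = 0; i < t->count; i++)
    if (strcmp (t->ids[i], id) == 0)
      {
        if (error_on_existing)
          return CAPTURE_DUPLICATE;
        *num = i;
        return CAPTURE_OK;
      }

  if (require_existing)
    return CAPTURE_UNKNOWN;
  if (t->count == MAX_CAPTURES)
    return CAPTURE_FULL;

  /* First use of this id: assign the next free number.  */
  t->ids[t->count] = id;
  *num = t->count++;
  return CAPTURE_OK;
}
```

With error_on_existing set to 0, a repeated id returns the same number both
times, which corresponds to the res_args[0] == res_args[1] behaviour
described above.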


Re: [PATCH, libgomp] Rewire OpenACC async

2016-03-29 Thread Chung-Lin Tang
I've updated this patch for trunk (as attached), and re-tested without
regressions. This patch is still a fix for
libgomp.oacc-c-c++-common/asyncwait-1.c, which FAILs right now.

ChangeLog is still as before. Is this okay for trunk?

Thanks,
Chung-Lin

On 2015/12/22 4:58 PM, Chung-Lin Tang wrote:
> Ping.
> 
> On 2015/11/24 6:27 PM, Chung-Lin Tang wrote:
>> Hi, this patch reworks some of the way that asynchronous copyouts are
>> implemented for OpenACC in libgomp.
>>
>> Before this patch, we had a somewhat confusing way of implementing this
>> by having two refcounts for each mapping: refcount and async_refcount,
>> which I never got working again after the last wave of async regressions
>> showed up.
>>
>> So this patch implements what I believe to be a simplification:
>> async_refcount is removed, and instead of trying to queue the async
>> copyouts during unmapping we actually do that during the plugin event
>> handling. This requires an addition of the async stream integer as an
>> argument to the register_async_cleanup plugin hook, but overall I think
>> this should be more elegant than before.
>>
>> This patch fixes the libgomp.oacc-c-c++-common/asyncwait-1.c regression.
>> It also fixed data-[23].c regressions before, but some other recent check-in
>> happened to have already fixed those.
>>
>> Tested without regressions, is this okay for trunk?
>>
>> Thanks,
>> Chung-Lin
>>
>> 2015-11-24  Chung-Lin Tang  
>>
>> * oacc-plugin.h (GOMP_PLUGIN_async_unmap_vars): Add int parameter.
>> * oacc-plugin.c (GOMP_PLUGIN_async_unmap_vars): Add 'int async'
>> parameter, use to set async stream around call to gomp_unmap_vars,
>> call gomp_unmap_vars() with 'do_copyfrom' set to true.
>> * plugin/plugin-nvptx.c (struct ptx_event): Add 'int val' field.
>> (event_gc): Adjust event handling loop, collect PTX_EVT_ASYNC_CLEANUP
>> events and call GOMP_PLUGIN_async_unmap_vars() for each of them.
>> (event_add): Add int parameter, initialize 'val' field when
>> adding new ptx_event struct.
>> (nvptx_evec): Adjust event_add() call arguments.
>> (nvptx_host2dev): Likewise.
>> (nvptx_dev2host): Likewise.
>> (nvptx_wait_async): Likewise.
>> (nvptx_wait_all_async): Likewise.
>> (GOMP_OFFLOAD_openacc_register_async_cleanup): Add async parameter,
>> pass to event_add() call.
>> * oacc-host.c (host_openacc_register_async_cleanup): Add 'int async'
>> parameter.
>> * oacc-mem.c (gomp_acc_remove_pointer): Adjust async case to
>> call openacc.register_async_cleanup_func() hook.
>> * oacc-parallel.c (GOACC_parallel_keyed): Likewise.
>> * target.c (gomp_copy_from_async): Delete function.
>> (gomp_map_vars): Remove async_refcount.
>> (gomp_unmap_vars): Likewise.
>> (gomp_load_image_to_device): Likewise.
>> (omp_target_associate_ptr): Likewise.
>> * libgomp.h (struct splay_tree_key_s): Remove async_refcount.
>> (acc_dispatch_t.register_async_cleanup_func): Add int parameter.
>> (gomp_copy_from_async): Remove.
>>
> 
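
The event-driven cleanup described in the ChangeLog above can be pictured
with a small stand-alone model.  This is a sketch under stated assumptions,
not the nvptx plugin: struct event, event_gc_sketch, record_unmap and the
enum values are hypothetical, and a plain array replaces the plugin's event
list.  It shows the idea of storing the async stream id in the event ('val')
and invoking the unmap callback from event collection rather than at
unmapping time.

```c
#include <assert.h>
#include <stddef.h>

enum evt_type { EVT_MEM, EVT_KERNEL, EVT_ASYNC_CLEANUP };

struct event
{
  enum evt_type type;
  int done;     /* nonzero once the device reports completion */
  void *addr;   /* mapping descriptor, used by cleanup events */
  int val;      /* async stream id recorded when the event was added */
};

typedef void (*unmap_fn) (void *desc, int async);

/* Drop completed events from EVTS, compacting the array in place.  For
   each completed cleanup event, call UNMAP with the descriptor and the
   async id stored in the event.  Returns the number of events kept.  */
static size_t
event_gc_sketch (struct event *evts, size_t n, unmap_fn unmap)
{
  assert (n == 0 || evts != NULL);
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (!evts[i].done)
        {
          evts[kept++] = evts[i];
          continue;
        }
      if (evts[i].type == EVT_ASYNC_CLEANUP && unmap != NULL)
        unmap (evts[i].addr, evts[i].val);
    }
  return kept;
}

/* Example callback that records the last cleanup it was asked to do.  */
static void *last_desc;
static int last_async = -1;
static int cleanup_calls;

static void
record_unmap (void *desc, int async)
{
  last_desc = desc;
  last_async = async;
  cleanup_calls++;
}
```

Keeping the stream id in the event is what lets the callback set the right
async stream around the unmap, which is the role of the new int parameter.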

Index: oacc-host.c
===
--- oacc-host.c	(revision 234516)
+++ oacc-host.c	(working copy)
@@ -144,7 +144,8 @@ host_openacc_exec (void (*fn) (void *),
 }
 
 static void
-host_openacc_register_async_cleanup (void *targ_mem_desc __attribute__ ((unused)))
+host_openacc_register_async_cleanup (void *targ_mem_desc __attribute__ ((unused)),
+ int async __attribute__ ((unused)))
 {
 }
 
Index: oacc-mem.c
===
--- oacc-mem.c	(revision 234516)
+++ oacc-mem.c	(working copy)
@@ -661,10 +661,7 @@ gomp_acc_remove_pointer (void *h, bool force_copyf
   if (async < acc_async_noval)
 gomp_unmap_vars (t, true);
   else
-{
-  gomp_copy_from_async (t);
-  acc_dev->openacc.register_async_cleanup_func (t);
-}
+t->device_descr->openacc.register_async_cleanup_func (t, async);
 
   gomp_debug (0, "  %s: mappings restored\n", __FUNCTION__);
 }
Index: oacc-parallel.c
===
--- oacc-parallel.c	(revision 234516)
+++ oacc-parallel.c	(working copy)
@@ -186,10 +186,7 @@ GOACC_parallel_keyed (int device, void (*fn) (void
   if (async < acc_async_noval)
 gomp_unmap_vars (tgt, true);
   else
-{
-  gomp_copy_from_async (tgt);
-  acc_dev->openacc.register_async_cleanup_func (tgt);
-}
+tgt->device_descr->openacc.register_async_cleanup_func (tgt, async);
 
   acc_dev->openacc.async_set_async_func (acc_async_sync);
 }
Index: target.c
===
--- target.c	(revision 234516)
+++ target.c	(working copy)
@@ -663,7 +663,6 @@ gomp_map_vars (struct gomp_device_descr *devicep,
 		tgt->list[i].offset = 0;
 		tgt->list[i].length = k->host_end - k->h

Re: [PATCH] Fix in-tree gmp/mpfr/mpc generation (PR 67728)

2016-03-29 Thread Bernd Edlinger
On 29.03.2016 at 10:32, Richard Biener wrote:
> On Mon, Mar 28, 2016 at 2:44 PM, Bernd Edlinger
>  wrote:
>>
>> Hi,
>>
>> as described in the tracker we have bootstrap problems with in-tree gmp-6.1.0
>> on certain targets, and also a linker issue with check-mpc due to the changed
>> mpfr library path.
>
> Hum, in-tree gmp 6.1.0 is not supported (only the version downloaded by
> download_prerequisites is).
>

Yes, that is what I thought too, but people out there expect something
different, and run into problems.

>> These are triggered by overriding CFLAGS and LDFLAGS in in-tree builds.
>> It did not happen with the gmp/mpfr/mpc versions that download_prerequisites
>> installs, but the latest versions of these libraries use CFLAGS to pass
>> -DNO_ASM, which is overridden by gcc and causes gmp-6.1.0 to be
>> mis-compiled.
>
> So you pass down AM_CFLAGS=-DNO_ASM but how does that reliably work
> for all gmp versions?
>

gmp-4.3.2 used CPPFLAGS=-DNO_ASM, which is not overridden by the build
machinery; when I pass AM_CFLAGS=-DNO_ASM it is simply defined twice.
gmp-6.1.0 added -DNO_ASM to CFLAGS, and we break it by overriding
CFLAGS.  By passing -DNO_ASM in AM_CFLAGS we would un-break this
version.
Marc Glisse recently moved -DNO_ASM from CFLAGS to config.h, so that
version is immune to CFLAGS overrides, and defining
AM_CFLAGS=-DNO_ASM there is redundant and does nothing.

>>   And the mpc issue is triggered by overriding LDFLAGS
>> and the changed mpfr library path.  So this started with mpfr v3.1.0 which
>> moved the sources into a src sub-directory.
>>
>> The proposed patch fixes these problems by passing -DNO_ASM in AM_CFLAGS,
>> and adding both possible mpfr library paths to HOST_LIB_PATH_mpfr.
>> I've also adjusted HOST_LIB_PATH_mpc although it did not yet create problems.
>
> But you remove a possibly good .libs lib_path.

No, .libs was always wrong as far as I can see.
At least mpc-0.8.1 and mpc-1.0.3 use src/.libs.
However, we do not use that path to link mpc; instead we use:

HOST_GMPLIBS = -L$$r/$(HOST_SUBDIR)/gmp/.libs -L$$r/$(HOST_SUBDIR)/mpfr/.libs -L$$r/$(HOST_SUBDIR)/mpc/src/.libs -lmpc -lmpfr -lgmp

which does it right.


>
>> Boot-strapped and regression tested on x86_64-pc-linux-gnu, with different
>> gmp versions including the latest snapshot.
>> I have additionally built arm cross compilers, which was not working before.
>>
>> Is this OK for trunk?
>
> I don't think so.  Supporting an arbitrary mix of in-tree versions is
> a nightmare.
>
> If you really want to go down this route see @extra_mpc_gmp_configure_flags@
> and add a variant for the lib-paths.  I don't have a better answer for -DNO_ASM
> than to fix gmp/mpfr to not pass down this kind of configuration via CFLAGS.
>

I am afraid I don't have the power to do that.
gmp-6.2.0 will not be affected, but is not yet released.
All versions of mpc pass the configure --with-mpfr-lib option
to LDFLAGS, which was broken in-tree but still worked before
mpfr 3.1.0 because LD_LIBRARY_PATH contained mpfr/.libs.

> Please instead do the testing required to ensure that bumping the versions
> downloaded by download_prerequisites works during next stage1.
>

Sure if we only have to support a single gmp-version that would allow us
to remove a lot of kludges that are only necessary for gmp-4.3.2 or
mpfr before 3.1.0 instead of adding new ones.

But which versions will that be?

gmp-6.1.0 or gmp-6.2.0 (not yet released) ?
mpfr-3.1.4 ?
mpc-1.0.3 ?


Bernd.

> Richard.
>
>>
>> Thanks
>> Bernd.


Re: [RS6000, PATCH] PR70052, ICE compiling _Decimal128 test case

2016-03-29 Thread Alan Modra
On Fri, Mar 25, 2016 at 07:36:34PM +1030, Alan Modra wrote:
> +2016-03-25  Alan Modra  
> +
> + PR target/70052
> + * config/rs6000/constraints.md (j): Simplify.
> + * config/rs6000/predicates.md (easy_fp_constant): Exclude
> + decimal float 0.D.
> + * config/rs6000/rs6000.md (zero_fp): New mode_attr.
> + (mov_hardfloat, mov_hardfloat32, mov_hardfloat64,
> +  mov_64bit_dm, mov_32bit): Use zero_fp in place of j
> + in all constraint alternatives.
> + (movtd_64bit_nodm): Delete "j" constraint alternative.
> +
[snip]
> +2016-03-25  Alan Modra  
> +
> + * gcc.dg/dfp/pr70052.c: New test.
> +

Testing showed that this problem exists on the gcc-5 branch too.  I've
backported the above and bootstrapped plus regression tested on
powerpc64le-linux.  OK for gcc-5?

-- 
Alan Modra
Australia Development Lab, IBM


Re: RFA: PATCH to tree-inline.c:remap_decls for c++/70353 (ICE with __func__ and constexpr)

2016-03-29 Thread Jan Hubicka
> On Mon, Mar 28, 2016 at 11:26 PM, Jason Merrill  wrote:
> > The constexpr evaluation code uses the inlining code to remap the constexpr
> > function body for evaluation so that recursion works properly.  In this
> > testcase __func__ is declared as a static local variable, so rather than
> > remap it, remap_decls tries to add it to the local_decls list for the
> > function we're inlining into.  But there is no such function in this case,
> > so we crash.
> >
> > Avoiding the add_local_decl call when cfun is null avoids the ICE (thanks
> > Jakub), but results in an undefined symbol.  Calling
> > varpool_node::finalize_decl instead allows cgraph to handle the reference
> > from 'c' properly.
> >
> > OK if testing passes?
> 
> So ce will never be instantiated?
> 
> And 'c' will have a DECL_INITIAL of __func__ so I wonder why the cgraph
> code when finalizing 'c' does not end up seeing __func__ and finalizing it?
> Ah, it only creates a varpool-node it seems but never finalizes it itself.
> Honza?

While we walk DECL_INITIAL to populate symbol table, we want explicit
cgraph_finalize/varpool_finalize on every symbol that needs to be output.
Otherwise they count just as external references.

Honza
> 
> Richard.


[PATCH] Fix PR hsa/70402

2016-03-29 Thread Martin Liška
Hello.

As reported here:
https://github.com/HSAFoundation/gccbrig/issues/3#issuecomment-199887172,
our current HSA back-end emits an SBR instruction that is followed by a jump
instruction for cases where the index is outside of the array of labels.
However, as mentioned here:
http://www.hsafoundation.com/html/HSA_Library.htm#PRM/Topics/08_Branch/branch.htm?Highlight=sbr:
"The program execution is undefined if the number of labels in labelList is
less than or equal to the index value."

Thus, the patch generates a guard for the 'index' value before an SBR insn is
executed.
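
The guard can be illustrated in plain C.  This is only a sketch of the
control flow, assuming a dense jump table whose first entry corresponds to
the lowest case value; guarded_table_slot is a hypothetical name and does not
appear in hsa-gen.c.  It models the two comparisons (index >= lowest and
index <= highest) combined with AND, and the subtraction that forms the SBR
index.

```c
#include <assert.h>

/* Return the slot in a dense jump table covering case values
   LOW .. HIGH inclusive, or -1 when INDEX is out of range and control
   must go to the default label without executing the table branch.  */
static long
guarded_table_slot (long index, long low, long high)
{
  assert (low <= high);

  int ge_low = index >= low;    /* models the BRIG_COMPARE_GE insn */
  int le_high = index <= high;  /* models the BRIG_COMPARE_LE insn */
  if (!(ge_low && le_high))     /* models the AND and conditional branch */
    return -1;

  /* Only reached for in-range values, so the slot is always a valid
     label index and the subtraction cannot overflow.  */
  return index - low;
}
```

The point of the rework is that the out-of-range case never reaches the
table branch at all, instead of relying on a jump placed after it.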

The patch survives the hsa.c.exp test case on x86_64-linux-gnu, and a bootstrap
on an x86_64-linux-gnu machine is running.

Ready to be installed as soon as it finishes?
Thanks,
Martin

gcc/ChangeLog:

2016-03-29  Martin Liska  

PR hsa/70402
* hsa-gen.c (gen_hsa_insns_for_switch_stmt): Guard index
value that is really in range handled by SBR instruction.
* hsa-brig.c (emit_switch_insn): Do not emit unconditional
jump.
* hsa-dump.c (dump_hsa_insn_1): Do not dump default BB.
* hsa.h (hsa_insn_sbr::m_default_bb): Remove field.
---
 gcc/hsa-brig.c |  4 
 gcc/hsa-dump.c |  3 ---
 gcc/hsa-gen.c  | 52 ++--
 gcc/hsa.h  |  3 ---
 4 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index 9b6c0b8..14c1d3f 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -1537,10 +1537,6 @@ emit_switch_insn (hsa_insn_sbr *sbr)
 
   brig_code.add (&repr, sizeof (repr));
   brig_insn_count++;
-
-  /* Emit jump to default label.  */
-  hsa_bb *hbb = hsa_bb_for_bb (sbr->m_default_bb);
-  emit_unconditional_jump (&hbb->m_label_ref);
 }
 
 /* Emit a HSA convert instruction and all necessary directives, schedule
diff --git a/gcc/hsa-dump.c b/gcc/hsa-dump.c
index b69b34d..0885bc8 100644
--- a/gcc/hsa-dump.c
+++ b/gcc/hsa-dump.c
@@ -931,9 +931,6 @@ dump_hsa_insn_1 (FILE *f, hsa_insn_basic *insn, int *indent)
  if (i != sbr->m_jump_table.length () - 1)
fprintf (f, ", ");
}
-
-  fprintf (f, "] /* default: BB %i */",
-  hsa_bb_for_bb (sbr->m_default_bb)->m_index);
 }
   else if (is_a  (insn))
 {
diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 72eecf9..4467650 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -1530,7 +1530,7 @@ hsa_insn_br::operator new (size_t)
 
 hsa_insn_sbr::hsa_insn_sbr (hsa_op_reg *index, unsigned jump_count)
   : hsa_insn_basic (1, BRIG_OPCODE_SBR, BRIG_TYPE_B1, index),
-m_width (BRIG_WIDTH_1), m_jump_table (vNULL), m_default_bb (NULL),
+m_width (BRIG_WIDTH_1), m_jump_table (vNULL),
 m_label_code_list (new hsa_op_code_list (jump_count))
 {
 }
@@ -3398,11 +3398,56 @@ get_switch_size (gswitch *s)
 static void
 gen_hsa_insns_for_switch_stmt (gswitch *s, hsa_bb *hbb)
 {
+  gimple_stmt_iterator it = gsi_for_stmt (s);
+  gsi_prev (&it);
+
+  /* Create preamble that verifies that index - lowest_label >= 0.  */
+  edge e = split_block (hbb->m_bb, gsi_stmt (it));
+  hbb = hsa_init_new_bb (e->dest);
+
+  /* As gimple validator expects true/false edges just in case
+ last STMT of a basic block is a GIMPLE_COND, we need to
+ create an empty BB that will contain the comparison insns.  */
+  e = split_block (e->dest, (gimple *) NULL);
+
+  e->flags &= ~EDGE_FALLTHRU;
+  e->flags |= EDGE_TRUE_VALUE;
+
   function *func = DECL_STRUCT_FUNCTION (current_function_decl);
   tree index_tree = gimple_switch_index (s);
   tree lowest = get_switch_low (s);
+  tree highest = get_switch_high (s);
 
   hsa_op_reg *index = hsa_cfun->reg_for_gimple_ssa (index_tree);
+
+  hsa_op_reg *cmp1_reg = new hsa_op_reg (BRIG_TYPE_B1);
+  hsa_op_immed *cmp1_immed = new hsa_op_immed (lowest);
+  hbb->append_insn (new hsa_insn_cmp (BRIG_COMPARE_GE, cmp1_reg->m_type,
+ cmp1_reg, index, cmp1_immed));
+
+  hsa_op_reg *cmp2_reg = new hsa_op_reg (BRIG_TYPE_B1);
+  hsa_op_immed *cmp2_immed = new hsa_op_immed (highest);
+  hbb->append_insn (new hsa_insn_cmp (BRIG_COMPARE_LE, cmp2_reg->m_type,
+ cmp2_reg, index, cmp2_immed));
+
+  hsa_op_reg *cmp_reg = new hsa_op_reg (BRIG_TYPE_B1);
+  hbb->append_insn (new hsa_insn_basic (3, BRIG_OPCODE_AND, cmp_reg->m_type,
+   cmp_reg, cmp1_reg, cmp2_reg));
+
+  hbb->append_insn (new hsa_insn_br (cmp_reg));
+
+  tree default_label = gimple_switch_default_label (s);
+  basic_block default_label_bb = label_to_block_fn (func,
+   CASE_LABEL (default_label));
+
+  make_edge (e->src, default_label_bb, EDGE_FALSE_VALUE);
+
+  free_dominance_info (CDI_DOMINATORS);
+  calculate_dominance_info (CDI_DOMINATORS);
+
+  /* Basic block with the SBR instruction.  */
+  hbb = hsa_init_new_bb (e->dest);
+
   hsa_op_reg *sub_index = new hsa_op_reg (index->m_type);
   hbb->append_insn (new hsa_insn_basic (3, BRIG_OP

Re: [PATCH] Fix PR tree-optimization/59124 (bogus -Warray-bounds warning)

2016-03-29 Thread Patrick Palka
On Tue, 29 Mar 2016, Richard Biener wrote:

> On Sun, Mar 27, 2016 at 11:37 PM, Patrick Palka  wrote:
> > On Sun, 27 Mar 2016, Patrick Palka wrote:
> >
> >> In unrolling of the inner loop in the test case below we introduce
> >> unreachable code that otherwise contains out-of-bounds array accesses.
> >> This is because the estimation of the maximum number of iterations of
> >> the inner loop is too conservative: we assume 6 iterations instead of
> >> the actual 4.
> >>
> >> Nonetheless, VRP should be able to tell that the code is unreachable so
> >> that it doesn't warn about it.  The only thing holding VRP back is that
> >> it doesn't look through conditionals of the form
> >>
> >>    if (j_10 != CST1)    where j_10 = j_9 + CST2
> >>
> >> so that it could add the assertion
> >>
> >>    j_9 != (CST1 - CST2)
> >>
> >> This patch teaches VRP to detect such conditionals and to add such
> >> assertions, so that it could remove instead of warn about the
> >> unreachable code created during loop unrolling.
> >>
> >> What this addition does with the test case below is something like this:
> >>
> >> ASSERT_EXPR (i <= 5);
> >> for (i = 1; i < 6; i++)
> >>   {
> >>     j = i - 1;
> >>     if (j == 0)
> >>       break;
> >>     // ASSERT_EXPR (i != 1)
> >>     bar[j] = baz[j];
> >>
> >>     j = i - 2;
> >>     if (j == 0)
> >>       break;
> >>     // ASSERT_EXPR (i != 2)
> >>     bar[j] = baz[j];
> >>
> >>     j = i - 3;
> >>     if (j == 0)
> >>       break;
> >>     // ASSERT_EXPR (i != 3)
> >>     bar[j] = baz[j];
> >>
> >>     j = i - 4;
> >>     if (j == 0)
> >>       break;
> >>     // ASSERT_EXPR (i != 4)
> >>     bar[j] = baz[j];
> >>
> >>     j = i - 5;
> >>     if (j == 0)
> >>       break;
> >>     // ASSERT_EXPR (i != 5)
> >>     bar[j] = baz[j];
> >>
> >>     j = i - 6;
> >>     if (j == 0)
> >>       break;
> >>     // ASSERT_EXPR (i != 6)
> >>     bar[j] = baz[j]; // unreachable because (i != 6 && i <= 5) is always false
> >>   }
> >>
> >> (I think the patch I sent a year ago that improved the
> >>  register_edge_assert stuff would have fixed this too.  I'll try to
> >>  post it again during next stage 1.
> >>  https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00908.html)
> >>
> >> Bootstrap + regtest in progress on x86_64-pc-linux-gnu, does this look
> >> OK to commit after testing?
> >>
> >> gcc/ChangeLog:
> >>
> >>   PR tree-optimization/59124
> >>   * tree-vrp.c (register_edge_assert_for): For NAME != CST1
> >>   where NAME = A + CST2 add the assertion A != (CST1 - CST2).
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>   PR tree-optimization/59124
> >>   * gcc.dg/Warray-bounds-19.c: New test.
> >> ---
> >>  gcc/testsuite/gcc.dg/Warray-bounds-19.c | 17 +
> >>  gcc/tree-vrp.c  | 22 ++
> >>  2 files changed, 39 insertions(+)
> >>  create mode 100644 gcc/testsuite/gcc.dg/Warray-bounds-19.c
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-19.c 
> >> b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
> >> new file mode 100644
> >> index 000..e2f9661
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
> >> @@ -0,0 +1,17 @@
> >> +/* PR tree-optimization/59124 */
> >> +/* { dg-options "-O3 -Warray-bounds" } */
> >> +
> >> +unsigned baz[6];
> >> +
> >> +void foo(unsigned *bar, unsigned n)
> >> +{
> >> +  unsigned i, j;
> >> +
> >> +  if (n > 6)
> >> +n = 6;
> >> +
> >> +  for (i = 1; i < n; i++)
> >> +for (j = i - 1; j > 0; j--)
> >> +  bar[j - 1] = baz[j - 1];
> >> +}
> >> +
> >> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> >> index b5654c5..31bd575 100644
> >> --- a/gcc/tree-vrp.c
> >> +++ b/gcc/tree-vrp.c
> >> @@ -5820,6 +5820,28 @@ register_edge_assert_for (tree name, edge e, 
> >> gimple_stmt_iterator si,
> >>   }
> >>  }
> >>
> >> +  /* In the case of NAME != CST1 where NAME = A + CST2 we can
> >> + assert that NAME != (CST1 - CST2).  */
> >
> > This should say A != (...) not NAME != (...)
> >
> >> +  if ((comp_code == EQ_EXPR || comp_code == NE_EXPR)
> >> +  && TREE_CODE (val) == INTEGER_CST)
> >> +{
> >> +  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
> >> +
> >> +  if (is_gimple_assign (def_stmt)
> >> +   && gimple_assign_rhs_code (def_stmt) == PLUS_EXPR)
> >> + {
> >> +   tree op0 = gimple_assign_rhs1 (def_stmt);
> >> +   tree op1 = gimple_assign_rhs2 (def_stmt);
> >> +   if (TREE_CODE (op0) == SSA_NAME
> >> +   && TREE_CODE (op1) == INTEGER_CST)
> >> + {
> >> +   op1 = int_const_binop (MINUS_EXPR, val, op1);
> >> +   register_edge_assert_for_2 (op0, e, si, comp_code,
> >> +   op0, op1, is_else_edge);
> >
> > The last argument to register_edge_assert_for_2() should be false not
> > is_else_edge since comp_code is already inverted.
> >
> > Consider these two things fixed.  Also I moved down the new code so that
> > it's at the very bottom of register_edge_assert_for.  Here's a

Re: Goodbye REG_LIVE_LENGTH

2016-03-29 Thread Bernd Schmidt

On 03/25/2016 11:00 PM, Alan Modra wrote:

I'll also prepare a patch to delete REG_LIVE_LENGTH everywhere.


Like this.  Bootstrapped and regression tested x86_64-linux.
OK for stage1?


Oh wow that's a lot of stuff removed. Ok for this and the 
FREQ_CALLS_CROSSED patch.



Bernd



[PATCH ARM v3] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc generated by -pg

2016-03-29 Thread Charles Baylis
On 29 March 2016 at 02:16, Kugan  wrote:
>
> Hi Charles,
>
> +static void
> +arm_emit_long_call_profile_insn ()
> +{
> +  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
> +  /* if movt/movw are not available, use a constant pool */
> +  if (!arm_arch_thumb2)
>
> Should this be !TARGET_USE_MOVT?

Hi Kugan,

Thanks for the review.

TARGET_USE_MOVT has additional conditions which mean that it can be
false on targets with MOVW/MOVT depending on the tuning parameters for
the target CPU. Because this patch works in a slightly odd way, I
think it is better to use MOVW/MOVT where possible so that the
slightly hacky use of the literal pool is avoided. Since this only
happens when profiling, it is not essential to have the fully
optimised code sequence here. I'm happy to change it if anybody feels
strongly though.

I've noticed in the quoted snippet that there are some GNU coding
style errors, so I've respun the patch with those corrected.

gcc/ChangeLog:

2016-03-29  Charles Baylis  

* config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
* config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
(arm_expand_prologue): Likewise.
(thumb1_expand_prologue): Likewise.
(arm_output_long_call_to_profile_func): Likewise.
(arm_emit_long_call_profile): Likewise.
* config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
* config/arm/arm.md (arm_long_call_profile): New pattern.
* config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
define.
* config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
* config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.

gcc/testsuite/ChangeLog:

2016-03-29  Charles Baylis  

* gcc.target/arm/pr69770.c: New test.
From 5785ddcfd518c44cf87b0fc74b4397fd98d1b0c1 Mon Sep 17 00:00:00 2001
From: Charles Baylis 
Date: Tue, 29 Mar 2016 12:28:25 +0100
Subject: [PATCH] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc
 generated by -pg

gcc/ChangeLog:

2016-03-29  Charles Baylis  

* config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
* config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
(arm_expand_prologue): Likewise.
(thumb1_expand_prologue): Likewise.
(arm_output_long_call_to_profile_func): Likewise.
(arm_emit_long_call_profile): Likewise.
* config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
* config/arm/arm.md (arm_long_call_profile): New pattern.
* config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
	define.
* config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
* config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.

gcc/testsuite/ChangeLog:

2016-03-29  Charles Baylis  

* gcc.target/arm/pr69770.c: New test.

Change-Id: I9b8de01fea083f17f729c3801f83174bedb3b0c6

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 0083673..324c9f4 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -343,6 +343,7 @@ extern void arm_register_target_pragmas (void);
 extern void arm_cpu_cpp_builtins (struct cpp_reader *);
 
 extern bool arm_is_constant_pool_ref (rtx);
+void arm_emit_long_call_profile ();
 
 /* Flags used to identify the presence of processor capabilities.  */
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c868490..885657a 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21426,6 +21426,21 @@ output_probe_stack_range (rtx reg1, rtx reg2)
   return "";
 }
 
+static void
+arm_emit_long_call_profile_insn ()
+{
+  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
+  /* If movt/movw are not available, use a constant pool.  */
+  if (!arm_arch_thumb2)
+  {
+sym_ref = force_const_mem (Pmode, sym_ref);
+  }
+  rtvec vec = gen_rtvec (1, sym_ref);
+  rtx tmp = gen_rtx_UNSPEC_VOLATILE (VOIDmode, vec, VUNSPEC_LONG_CALL_PROFILE);
+  emit_insn (tmp);
+}
+
+
 /* Generate the prologue instructions for entry into an ARM or Thumb-2
function.  */
 void
@@ -21789,6 +21804,10 @@ arm_expand_prologue (void)
   arm_load_pic_register (mask);
 }
 
+  if (crtl->profile && TARGET_LONG_CALLS
+  && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
+arm_emit_long_call_profile_insn ();
+
   /* If we are profiling, make sure no instructions are scheduled before
  the call to mcount.  Similarly if the user has requested no
  scheduling in the prolog.  Similarly if we want non-call exceptions
@@ -24985,6 +25004,10 @@ thumb1_expand_prologue (void)
   if (frame_pointer_needed)
 thumb_set_frame_pointer (offsets);
 
+  if (crtl->profile && TARGET_LONG_CALLS
+  && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
+arm_emit_long_call_profile_insn ();
+
   /* If we are profiling, make sure no instructions are scheduled before
  the call to mcount.  Similarly if the user has 

[PATCH 1/2] Do not verify CFG if a function is discarded (PR

2016-03-29 Thread Martin Liška
Hello.

The problem with the original patch is that I'm forced to produce
an empty BB to produce true/false edge needed for the 'index' check:

/home/marxin/Programming/testhsa/run_tests/012-switch/switch-5.c:28:9: error: 
true/false edge after a non-GIMPLE_COND in bb 4
/home/marxin/Programming/testhsa/run_tests/012-switch/switch-5.c:28:9: internal 
compiler error: verify_flow_info failed
0x93121a verify_flow_info()
../../gcc/cfghooks.c:260
0xd5ae4e execute_function_todo
../../gcc/passes.c:1971
0xd59ea6 do_per_function
../../gcc/passes.c:1645
0xd5afc2 execute_todo
../../gcc/passes.c:2011

It would be nicer not to produce an empty block for that purpose, but the question
is whether the change is acceptable during stage4.

Thanks,
Martin
From 02f574e46565c70e56bcb07f2a5d1b9371e008fc Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 29 Mar 2016 13:33:16 +0200
Subject: [PATCH 1/2] Do not verify CFG if a function is discarded (PR
 hsa/70402)

gcc/ChangeLog:

2016-03-29  Martin Liska  

	* passes.c (execute_function_todo): Do not verify CFG
	if a function is discarded.
---
 gcc/passes.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/passes.c b/gcc/passes.c
index 9d90251..daa0d7f 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1964,9 +1964,10 @@ execute_function_todo (function *fn, void *data)
 	   not verify SSA operands whose verifier will choke on that.  */
 	verify_ssa (true, !from_ipa_pass);
 	  /* IPA passes leave basic-blocks unsplit, so make sure to
-	 not trip on that.  */
+	 not trip on that.  Do not verify CFG if a function is marked
+	 to be discarded (e.g. by HSA gen pass).  */
 	  if ((cfun->curr_properties & PROP_cfg)
-	  && !from_ipa_pass)
+	  && !from_ipa_pass && !(flags & TODO_discard_function))
 	verify_flow_info ();
 	  if (current_loops
 	  && loops_state_satisfies_p (LOOP_CLOSED_SSA))
-- 
2.7.1



[PATCH 2/2] Fix PR hsa/70402

2016-03-29 Thread Martin Liška
Second part of the patch set, which omits one split_block call (compared to the 
original patch).
Acceptable only if the first part is accepted.

Thanks
Martin
From 2a9a8f11ea1ecd04c3b9915ff77fc791c55632da Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 29 Mar 2016 12:06:20 +0200
Subject: [PATCH 2/2] Fix PR hsa/70402

gcc/ChangeLog:

2016-03-29  Martin Liska  

	PR hsa/70402
	* hsa-gen.c (gen_hsa_insns_for_switch_stmt): Guard index
	value that is really in range handled by SBR instruction.
	* hsa-brig.c (emit_switch_insn): Do not emit unconditional
	jump.
	* hsa-dump.c (dump_hsa_insn_1): Do not dump default BB.
	* hsa.h (hsa_insn_sbr::m_default_bb): Remove field.
---
 gcc/hsa-brig.c |  4 
 gcc/hsa-dump.c |  3 ---
 gcc/hsa-gen.c  | 45 +++--
 gcc/hsa.h  |  3 ---
 4 files changed, 39 insertions(+), 16 deletions(-)

diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index 9b6c0b8..14c1d3f 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -1537,10 +1537,6 @@ emit_switch_insn (hsa_insn_sbr *sbr)
 
   brig_code.add (&repr, sizeof (repr));
   brig_insn_count++;
-
-  /* Emit jump to default label.  */
-  hsa_bb *hbb = hsa_bb_for_bb (sbr->m_default_bb);
-  emit_unconditional_jump (&hbb->m_label_ref);
 }
 
 /* Emit a HSA convert instruction and all necessary directives, schedule
diff --git a/gcc/hsa-dump.c b/gcc/hsa-dump.c
index b69b34d..0885bc8 100644
--- a/gcc/hsa-dump.c
+++ b/gcc/hsa-dump.c
@@ -931,9 +931,6 @@ dump_hsa_insn_1 (FILE *f, hsa_insn_basic *insn, int *indent)
 	  if (i != sbr->m_jump_table.length () - 1)
 	fprintf (f, ", ");
 	}
-
-  fprintf (f, "] /* default: BB %i */",
-	   hsa_bb_for_bb (sbr->m_default_bb)->m_index);
 }
   else if (is_a  (insn))
 {
diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 72eecf9..a8b868e 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -1530,7 +1530,7 @@ hsa_insn_br::operator new (size_t)
 
 hsa_insn_sbr::hsa_insn_sbr (hsa_op_reg *index, unsigned jump_count)
   : hsa_insn_basic (1, BRIG_OPCODE_SBR, BRIG_TYPE_B1, index),
-m_width (BRIG_WIDTH_1), m_jump_table (vNULL), m_default_bb (NULL),
+m_width (BRIG_WIDTH_1), m_jump_table (vNULL),
 m_label_code_list (new hsa_op_code_list (jump_count))
 {
 }
@@ -3398,11 +3398,49 @@ get_switch_size (gswitch *s)
 static void
 gen_hsa_insns_for_switch_stmt (gswitch *s, hsa_bb *hbb)
 {
+  gimple_stmt_iterator it = gsi_for_stmt (s);
+  gsi_prev (&it);
+
+  /* Create preambule that verifies that index - lowest_label >= 0.  */
+  edge e = split_block (hbb->m_bb, gsi_stmt (it));
+  e->flags &= ~EDGE_FALLTHRU;
+  e->flags |= EDGE_TRUE_VALUE;
+
   function *func = DECL_STRUCT_FUNCTION (current_function_decl);
   tree index_tree = gimple_switch_index (s);
   tree lowest = get_switch_low (s);
+  tree highest = get_switch_high (s);
 
   hsa_op_reg *index = hsa_cfun->reg_for_gimple_ssa (index_tree);
+
+  hsa_op_reg *cmp1_reg = new hsa_op_reg (BRIG_TYPE_B1);
+  hsa_op_immed *cmp1_immed = new hsa_op_immed (lowest);
+  hbb->append_insn (new hsa_insn_cmp (BRIG_COMPARE_GE, cmp1_reg->m_type,
+  cmp1_reg, index, cmp1_immed));
+
+  hsa_op_reg *cmp2_reg = new hsa_op_reg (BRIG_TYPE_B1);
+  hsa_op_immed *cmp2_immed = new hsa_op_immed (highest);
+  hbb->append_insn (new hsa_insn_cmp (BRIG_COMPARE_LE, cmp2_reg->m_type,
+  cmp2_reg, index, cmp2_immed));
+
+  hsa_op_reg *cmp_reg = new hsa_op_reg (BRIG_TYPE_B1);
+  hbb->append_insn (new hsa_insn_basic (3, BRIG_OPCODE_AND, cmp_reg->m_type,
+	cmp_reg, cmp1_reg, cmp2_reg));
+
+  hbb->append_insn (new hsa_insn_br (cmp_reg));
+
+  tree default_label = gimple_switch_default_label (s);
+  basic_block default_label_bb = label_to_block_fn (func,
+		CASE_LABEL (default_label));
+
+  make_edge (e->src, default_label_bb, EDGE_FALSE_VALUE);
+
+  free_dominance_info (CDI_DOMINATORS);
+  calculate_dominance_info (CDI_DOMINATORS);
+
+  /* Basic block with the SBR instruction.  */
+  hbb = hsa_init_new_bb (e->dest);
+
   hsa_op_reg *sub_index = new hsa_op_reg (index->m_type);
   hbb->append_insn (new hsa_insn_basic (3, BRIG_OPCODE_SUB, sub_index->m_type,
 	sub_index, index,
@@ -3414,11 +3452,6 @@ gen_hsa_insns_for_switch_stmt (gswitch *s, hsa_bb *hbb)
   unsigned HOST_WIDE_INT size = tree_to_uhwi (get_switch_size (s));
 
   hsa_insn_sbr *sbr = new hsa_insn_sbr (sub_index, size + 1);
-  tree default_label = gimple_switch_default_label (s);
-  basic_block default_label_bb = label_to_block_fn (func,
-		CASE_LABEL (default_label));
-
-  sbr->m_default_bb = default_label_bb;
 
   /* Prepare array with default label destination.  */
   for (unsigned HOST_WIDE_INT i = 0; i <= size; i++)
diff --git a/gcc/hsa.h b/gcc/hsa.h
index 1d6baab..1d55cef 100644
--- a/gcc/hsa.h
+++ b/gcc/hsa.h
@@ -562,9 +562,6 @@ public:
   /* Jump table.  */
   vec  m_jump_table;
 
-  /* Default label basic block.  */
-  basic_block m_default_bb;
-
   /* Code list for label references.  */
   hsa_op_code_list *m_label_code_list;
 
-- 
2.7.1



Re: [DOC Patch] Add sample for @cc constraint

2016-03-29 Thread Bernd Schmidt

On 03/28/2016 12:03 AM, David Wohlferd wrote:

On 3/24/2016 8:00 AM, Bernd Schmidt wrote:
 > More problematic than a lack of documentation is that I haven't been
able to find an executable testcase. If you could adapt your example for
use in gcc.target/i386, that would be even more important.

It looks like Richard included some "scan-assembler" statements in the
suites with the original checkin
(https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=225122). Is that
not sufficient?  If not, I'm certainly prepared to create a couple
executable cases for the next rev of this patch.


I don't think it's sufficient. I would like executable code that 
verifies that this feature is indeed working as intended.
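
A minimal sketch of what such an executable check might look like, assuming an
x86 target and a compiler that defines __GCC_ASM_FLAG_OUTPUTS__.  The function
names add_carries and add_carries_demo are hypothetical and not taken from the
patch under review; the asm deliberately lists no "cc" clobber next to the
flag output, and a plain C fallback is used on other targets.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if the 32-bit addition a + b carries out.  */
static int add_carries(unsigned a, unsigned b)
{
#if defined(__GCC_ASM_FLAG_OUTPUTS__) \
    && (defined(__i386__) || defined(__x86_64__))
    int carry;
    /* "=@ccc" asks for the carry flag as an output operand; operand 1 (a)
       is read and written, operand 2 (b) is only read.  */
    __asm__ ("addl %2, %1" : "=@ccc" (carry), "+r" (a) : "r" (b));
    return carry;
#else
    /* Portable fallback: unsigned wrap-around means the sum is below a.  */
    return (unsigned) (a + b) < a;
#endif
}

/* Executable check: aborts if the flag output does not match the
   arithmetic result.  */
static void add_carries_demo(void)
{
    assert(add_carries(0xffffffffu, 1u) == 1);
    assert(add_carries(1u, 2u) == 0);
}
```

Calling add_carries_demo () from main would turn this into a run-time test
rather than a scan-assembler one.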



 > I don't think the manual should point out the obvious. I'd be
surprised if this wasn't documented or at least strongly implied
elsewhere for normal operands.

Well, *I* thought it was obvious, because it is both documented and
implied elsewhere.

However, the compiler doesn't see it that way.  Normally, attempting to
overlap 'clobbers' and 'outputs' generates compile errors, but not when
outputting and clobbering flags.  I filed pr68095 about this (including
a rough draft at a patch), but apparently not everyone sees this the way
I do.


Is there any _actual_ problem here? Like, if you combine the output and 
the clobber you run into problems? Looks to me like an explicit "cc" 
clobber is just ignored on x86. We just need to make sure this stays 
working (testcases).



 >> +Note: On the x86 platform, flags are normally considered clobbered by
 >> +extended asm whether the @code{"cc"} clobber is specified or not.
 >
 > Is it really necessary or helpful to mention that here? Not only is
it not strictly correct (an output operand is not also considered
clobbered), but to me it breaks the flow because you're left wondering
how that sentence relates to the example (it doesn't).

The problem I am trying to fix here is that on x86, the "cc" is implicit
for all extended asm statements, whether it is specified or not and
whether there is a flags output or not.  However, that fact isn't
documented anywhere.  So, where does that info go?  It could go right by
the docs for "cc", but since this behavior only applies to x86, that
would make the docs there messy.


My question would be, can this information ever be relevant to users? 
They may notice that their code still works if they omit the "cc", but 
that's not really a habit we want to encourage. I think this is an 
internal implementation detail that doesn't necessarily even have to be 
documented.



Bernd


Re: [PATCH] Fix PR tree-optimization/59124 (bogus -Warray-bounds warning)

2016-03-29 Thread Richard Biener
On Tue, Mar 29, 2016 at 1:23 PM, Patrick Palka  wrote:
> On Tue, 29 Mar 2016, Richard Biener wrote:
>
>> On Sun, Mar 27, 2016 at 11:37 PM, Patrick Palka  wrote:
>> > On Sun, 27 Mar 2016, Patrick Palka wrote:
>> >
>> >> In unrolling of the inner loop in the test case below we introduce
>> >> unreachable code that otherwise contains out-of-bounds array accesses.
>> >> This is because the estimation of the maximum number of iterations of
>> >> the inner loop is too conservative: we assume 6 iterations instead of
>> >> the actual 4.
>> >>
>> >> Nonetheless, VRP should be able to tell that the code is unreachable so
>> >> that it doesn't warn about it.  The only thing holding VRP back is that
>> >> it doesn't look through conditionals of the form
>> >>
>> >>if (j_10 != CST1)where j_10 = j_9 + CST2
>> >>
>> >> so that it could add the assertion
>> >>
>> >>j_9 != (CST1 - CST2)
>> >>
>> >> This patch teaches VRP to detect such conditionals and to add such
>> >> assertions, so that it could remove instead of warn about the
>> >> unreachable code created during loop unrolling.
>> >>
>> >> What this addition does with the test case below is something like this:
>> >>
>> >> ASSERT_EXPR (i <= 5);
>> >> for (i = 1; i < 6; i++)
>> >>   {
>> >> j = i - 1;
>> >> if (j == 0)
>> >>   break;
>> >> // ASSERT_EXPR (i != 1)
>> >> bar[j] = baz[j];
>> >>
>> >> j = i - 2
>> >> if (j == 0)
>> >>   break;
>> >> // ASSERT_EXPR (i != 2)
>> >> bar[j] = baz[j];
>> >>
>> >> j = i - 3
>> >> if (j == 0)
>> >>   break;
>> >> // ASSERT_EXPR (i != 3)
>> >> bar[j] = baz[j];
>> >>
>> >> j = i - 4
>> >> if (j == 0)
>> >>   break;
>> >> // ASSERT_EXPR (i != 4)
>> >> bar[j] = baz[j];
>> >>
>> >> j = i - 5
>> >> if (j == 0)
>> >>   break;
>> >> // ASSERT_EXPR (i != 5)
>> >> bar[j] = baz[j];
>> >>
>> >> j = i - 6
>> >> if (j == 0)
>> >>   break;
>> >> // ASSERT_EXPR (i != 6)
>> >> bar[j] = baz[j]; // unreachable because (i != 6 && i <= 5) is always 
>> >> false
>> >>   }
>> >>
>> >> (I think the patch I sent a year ago that improved the
>> >>  register_edge_assert stuff would have fixed this too.  I'll try to
>> >>  post it again during next stage 1.
>> >>  https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00908.html)
>> >>
>> >> Bootstrap + regtest in progress on x86_64-pc-linux-gnu, does this look
>> >> OK to commit after testing?
>> >>
>> >> gcc/ChangeLog:
>> >>
>> >>   PR tree-optimization/59124
>> >>   * tree-vrp.c (register_edge_assert_for): For NAME != CST1
>> >>   where NAME = A + CST2 add the assertion A != (CST1 - CST2).
>> >>
>> >> gcc/testsuite/ChangeLog:
>> >>
>> >>   PR tree-optimization/59124
>> >>   * gcc.dg/Warray-bounds-19.c: New test.
>> >> ---
>> >>  gcc/testsuite/gcc.dg/Warray-bounds-19.c | 17 +
>> >>  gcc/tree-vrp.c  | 22 ++
>> >>  2 files changed, 39 insertions(+)
>> >>  create mode 100644 gcc/testsuite/gcc.dg/Warray-bounds-19.c
>> >>
>> >> diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-19.c 
>> >> b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
>> >> new file mode 100644
>> >> index 000..e2f9661
>> >> --- /dev/null
>> >> +++ b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
>> >> @@ -0,0 +1,17 @@
>> >> +/* PR tree-optimization/59124 */
>> >> +/* { dg-options "-O3 -Warray-bounds" } */
>> >> +
>> >> +unsigned baz[6];
>> >> +
>> >> +void foo(unsigned *bar, unsigned n)
>> >> +{
>> >> +  unsigned i, j;
>> >> +
>> >> +  if (n > 6)
>> >> +n = 6;
>> >> +
>> >> +  for (i = 1; i < n; i++)
>> >> +for (j = i - 1; j > 0; j--)
>> >> +  bar[j - 1] = baz[j - 1];
>> >> +}
>> >> +
>> >> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
>> >> index b5654c5..31bd575 100644
>> >> --- a/gcc/tree-vrp.c
>> >> +++ b/gcc/tree-vrp.c
>> >> @@ -5820,6 +5820,28 @@ register_edge_assert_for (tree name, edge e, 
>> >> gimple_stmt_iterator si,
>> >>   }
>> >>  }
>> >>
>> >> +  /* In the case of NAME != CST1 where NAME = A + CST2 we can
>> >> + assert that NAME != (CST1 - CST2).  */
>> >
>> > This should say A != (...) not NAME != (...)
>> >
>> >> +  if ((comp_code == EQ_EXPR || comp_code == NE_EXPR)
>> >> +  && TREE_CODE (val) == INTEGER_CST)
>> >> +{
>> >> +  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
>> >> +
>> >> +  if (is_gimple_assign (def_stmt)
>> >> +   && gimple_assign_rhs_code (def_stmt) == PLUS_EXPR)
>> >> + {
>> >> +   tree op0 = gimple_assign_rhs1 (def_stmt);
>> >> +   tree op1 = gimple_assign_rhs2 (def_stmt);
>> >> +   if (TREE_CODE (op0) == SSA_NAME
>> >> +   && TREE_CODE (op1) == INTEGER_CST)
>> >> + {
>> >> +   op1 = int_const_binop (MINUS_EXPR, val, op1);
>> >> +   register_edge_assert_for_2 (op0, e, si, comp_code,
>> >> +   op0, op1, is_else_edge);
>> >
>> > The last argument to register_edge_assert_for_2() should be false n

Re: [PATCH 1/2] Do not verify CFG if a function is discarded (PR

2016-03-29 Thread Richard Biener
On Tue, Mar 29, 2016 at 1:42 PM, Martin Liška  wrote:
> Hello.
>
> The problem with the original patch is that I'm forced to produce
> an empty BB to produce true/false edge needed for the 'index' check:
>
> /home/marxin/Programming/testhsa/run_tests/012-switch/switch-5.c:28:9: error: 
> true/false edge after a non-GIMPLE_COND in bb 4
> /home/marxin/Programming/testhsa/run_tests/012-switch/switch-5.c:28:9: 
> internal compiler error: verify_flow_info failed
> 0x93121a verify_flow_info()
> ../../gcc/cfghooks.c:260
> 0xd5ae4e execute_function_todo
> ../../gcc/passes.c:1971
> 0xd59ea6 do_per_function
> ../../gcc/passes.c:1645
> 0xd5afc2 execute_todo
> ../../gcc/passes.c:2011
>
> It would nicer to not produce empty block for that purpose, but the question
> is if the change is acceptable during the stage4?

Hmm, why don't we short-cut things earlier in execute_one_pass, where
we handle TODO_discard_function?
That is, something like

Index: gcc/passes.c
===
--- gcc/passes.c(revision 234453)
+++ gcc/passes.c(working copy)
@@ -2334,6 +2334,33 @@ execute_one_pass (opt_pass *pass)

   /* Do it!  */
   todo_after = pass->execute (cfun);
+
+  if (todo_after & TODO_discard_function)
+{
+  pass_fini_dump_file (pass);
+
+  gcc_assert (cfun);
+  /* As cgraph_node::release_body expects release dominators info,
+we have to release it.  */
+  if (dom_info_available_p (CDI_DOMINATORS))
+   free_dominance_info (CDI_DOMINATORS);
+
+  if (dom_info_available_p (CDI_POST_DOMINATORS))
+   free_dominance_info (CDI_POST_DOMINATORS);
+
+  tree fn = cfun->decl;
+  pop_cfun ();
+  gcc_assert (!cfun);
+  cgraph_node::get (fn)->release_body ();
+
+  current_pass = NULL;
+  redirect_edge_var_map_empty ();
+
+  ggc_collect ();
+
+  return true;
+}
+
   do_per_function (clear_last_verified, NULL);

   /* Stop timevar.  */
@@ -2373,23 +2400,6 @@ execute_one_pass (opt_pass *pass)
   current_pass = NULL;
   redirect_edge_var_map_empty ();

-  if (todo_after & TODO_discard_function)
-{
-  gcc_assert (cfun);
-  /* As cgraph_node::release_body expects release dominators info,
-we have to release it.  */
-  if (dom_info_available_p (CDI_DOMINATORS))
-   free_dominance_info (CDI_DOMINATORS);
-
-  if (dom_info_available_p (CDI_POST_DOMINATORS))
-   free_dominance_info (CDI_POST_DOMINATORS);
-
-  tree fn = cfun->decl;
-  pop_cfun ();
-  gcc_assert (!cfun);
-  cgraph_node::get (fn)->release_body ();
-}
-
   /* Signal this is a suitable GC collection point.  */
   if (!((todo_after | pass->todo_flags_finish) & TODO_do_not_ggc_collect))
 ggc_collect ();


> Thanks,
> Martin


Re: [PATCH] Fix in-tree gmp/mpfr/mpc generation (PR 67728)

2016-03-29 Thread Richard Biener
On Tue, 29 Mar 2016, Bernd Edlinger wrote:

> On 29.03.2016 at 10:32, Richard Biener wrote:
> > On Mon, Mar 28, 2016 at 2:44 PM, Bernd Edlinger
> >  wrote:
> >>
> >> Hi,
> >>
> >> as described in the tracker we have bootstrap problems with in-tree 
> >> gmp-6.1.0
> >> on certain targets, and also a linker issue with check-mpc due to the 
> >> changed
> >> mpfr library path.
> >
> > Hum, in-tree gmp 6.1.0 is not supported (only the version downloaded by
> > > download_prerequisites is).
> >
> 
> Yes, that is what I thought too, but people out there expect something
> different, and run into problems.
> 
> >> These are triggered by overriding CFLAGS and LDFLAGS in in-tree builds.
> >> It did not happen with the gmp/mpfr/mpc versions that 
> >> download_prerequisites
> >> installs, but the currently latest version of these libraries use CFLAGS 
> >> to pass
> >> -DNO_ASM which is overridden by gcc and causes the gmp-6.1.0 to be
> >> mis-compiled.
> >
> > So you pass down AM_CFLAGS=-DNO_ASM but how does that reliably work
> > for all gmp versions?
> >
> 
> gmp-4.3.2 did use CPPFLAGS=-DNO_ASM, that is not overridden by the build
> machinery, when I pass AM_CFLAGS=-DNO_ASM it is simply defined twice.
> gmp-6.1.0 did add -DNO_ASM to CFLAGS, and we break it by overriding
> CFLAGS.  By passing -DNO_ASM in AM_CFLAGS we would un-break this
> version.
> Marc Glisse recently moved the -DNO_ASM from CFLAGS to config.h,
> so that version is immune against overriding CFLAGS and defining
> AM_CFLAGS=-DNO_ASM is redundant, and does nothing.
> 
> >>   And the mpc issue is triggered by overriding LDFLAGS
> >> and the changed mpfr library path.  So this started with mpfr v3.1.0 which
> >> moved the sources into a src sub-directory.
> >>
> >> The proposed patch fixes these problems by passing -DNO_ASM in AM_CFLAGS,
> >> and adding both possible mpfr library paths to HOST_LIB_PATH_mpfr.
> >> I've also adjusted HOST_LIB_PATH_mpc although it did not yet create 
> >> problems.
> >
> > But you remove a possibly good .libs lib_path.
> 
> No, it looks like .libs was always wrong.
> At least mpc-0.8.1 and mpc-1.0.3 use src/.libs
> However we do not use that path to link mpc, instead we use:
> 
> HOST_GMPLIBS = -L$$r/$(HOST_SUBDIR)/gmp/.libs 
> -L$$r/$(HOST_SUBDIR)/mpfr/.libs -L$$r/$(HOST_SUBDIR)/mpc/src/.libs -lmpc 
> -lmpfr -lgmp
> 
> which does it right.
> 
> 
> >
> >> Boot-strapped and regression tested on x86_64-pc-linux-gnu, with different
> >> gmp versions including the latest snapshot.
> >> I have additionally built arm cross compilers, which was not working 
> >> before.
> >>
> >> Is this OK for trunk?
> >
> > I don't think so.  Supporting an arbitrary mix of in-tree versions is
> > a nightmare.
> >
> > If you really want to go down this route see @extra_mpc_gmp_configure_flags@
> > > and add a variant for the lib-paths.  I don't have a better answer for 
> > > -DNO_ASM
> > > than to fix gmp/mpfr to not pass down this kind of configuration via CFLAGS.
> >
> 
> I am afraid, I don't have the power to do that.
> gmp-6.2.0 will not be affected, but is not yet released.
> All versions of mpc did pass the configure --with-mpfr-lib option
> to LDFLAGS which was broken in-tree, but did still work before
> mpfr 3.1.0 because the LD_LIBRARY_PATH contained mpfr/.libs
> 
> > Please instead do the testing required to ensure bumping the versions 
> > downloaded
> > by download_prerequisites works during next stage1.
> >
> 
> Sure if we only have to support a single gmp-version that would allow us
> to remove a lot of kludges that are only necessary for gmp-4.3.2 or
> mpfr before 3.1.0 instead of adding new ones.

In-tree builds are already "kludges" themselves and I don't see the need
to support a variety of versions people want to put in there.  So maybe
this is simply a documentation issue.

We _do_ support building against a variety of installed versions.

We don't necessarily need to support building with in-tree gmp but not 
mpfr or mpc, etc. (I think we do, at least kind-of, but these libs
have requirements on their versions as well).

> But which versions will that be?
> 
> gmp-6.1.0 or gmp-6.2.0 (not yet released) ?
> mpfr-3.1.4 ?
> mpc-1.0.3 ?

Very simple - any set of versions that fulfill our needs in
passing in-tree builds for all primary and secondary targets
(crosses where those are not platforms used as hosts).

Of course choosing the "latest" when changing versions makes sense
(unless they broke things too badly).  Note that in theory we
can also put slightly modified versions in infrastructure/ or
download a patch in addition to a tarball which we can apply
(like moving NO_ASM to config.h).

Richard.

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: Proposed Patch for Bug 69687

2016-03-29 Thread Bernd Schmidt

On 03/03/2016 03:55 PM, Marcel Böhme wrote:

@@ -4254,7 +4255,9 @@


Please use "diff -p" so that we get information about which function is 
being patched. Are all the places being patched really problematic ones 
where an input file could realistically cause an overflow, or just the 
string functions?



 }
else
 {
- work -> typevec_size *= 2;
+ if (work -> typevec_size > INT_MAX / 2)
+return;


I'm concerned about just returning without any kind of error indication. 
Not sure what we should be calling from libiberty, but I was thinking 
maybe xmalloc_failed.



@@ -4765,11 +4776,14 @@
  {
tem = s->p - s->b;
n += tem;
+  if ( n > INT_MAX / 2)
+return 0;
n *= 2;
s->b = XRESIZEVEC (char, s->b, n);
s->p = s->b + tem;
s->e = s->b + n;
  }


Might also want to guard against overflow from the first addition.
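
A minimal sketch of guarding both steps, assuming int-sized capacities as in
the quoted hunk.  The helper grow_capacity and its -1 error return are
hypothetical; what libiberty should actually do on failure (for example,
call xmalloc_failed) is left open here.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: compute the doubled capacity for a buffer that
   holds USED bytes and needs NEED more.  Returns 0 and stores the result
   in *OUT, or returns -1 if the addition or the doubling would overflow.  */
static int grow_capacity(int used, int need, int *out)
{
    if (used < 0 || need < 0)
        return -1;
    if (need > INT_MAX - used)      /* guards used + need */
        return -1;
    int n = used + need;
    if (n > INT_MAX / 2)            /* guards n * 2 */
        return -1;
    *out = n * 2;
    return 0;
}

/* Small self-check of the boundary cases.  */
static void grow_capacity_demo(void)
{
    int cap = 0;
    assert(grow_capacity(10, 6, &cap) == 0 && cap == 32);
    assert(grow_capacity(INT_MAX, 1, &cap) == -1);      /* addition overflows */
    assert(grow_capacity(INT_MAX / 2, 1, &cap) == -1);  /* doubling overflows */
}
```

Checking need > INT_MAX - used before adding avoids ever evaluating a signed
sum that could overflow, which would be undefined behavior in C.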


Bernd


[PATCH] PR70424 Relax alignment conservativeness

2016-03-29 Thread Richard Biener

Since 4.9 we're quite "strict" about "less alignment" that we might
possibly know about when seeing dereferences.  But that implementation
is quite inconsistent in treating 1-byte alignment as never
"possibly known" (unless we see a decl).

So the following patch makes us only consider the case where we
see a decl directly as "known" and override the alignment used
on the access.

This makes

int f(long a)
{
int *p=(int*)(a<<1);
return *p;
}

use alignof(int) for the access *p rather than 2-byte alignment
as derived from the left-shift by 1.
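
As a rough illustration only, here is a toy model of that decision.  It is an
assumption-laden sketch and not the code in builtins.c or ipa-prop.c: the
names align_from_shift and access_align are hypothetical, and the literal 4
stands in for alignof(int) on a typical target.

```c
#include <assert.h>

/* Byte alignment provable for the value (x << shift): the low SHIFT bits
   are zero, so the result is a multiple of 2^shift.  */
static unsigned long align_from_shift(unsigned shift)
{
    return 1ul << shift;
}

/* Toy policy (an assumption, not GCC's code): a derived alignment
   overrides the type's alignment only when it comes directly from a
   decl; otherwise it can only raise the alignment, never lower it.  */
static unsigned long access_align(unsigned long type_align,
                                  unsigned long derived_align,
                                  int from_decl)
{
    if (from_decl)
        return derived_align;
    return derived_align > type_align ? derived_align : type_align;
}

static void access_align_demo(void)
{
    unsigned long derived = align_from_shift(1);   /* (a << 1): 2 bytes */
    assert(derived == 2);
    assert(access_align(4, derived, 0) == 4);      /* no decl: type wins */
    assert(access_align(4, derived, 1) == 2);      /* decl seen: override */
}
```

In this model the access *p in f keeps the 4-byte type alignment, because the
2-byte fact comes from value analysis and not from a decl.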

This drops a few "legacy" cases where a pointer derived from a decl
is no longer detected.  But if we figure we still need to care about
this we should add a more proper machinery to detect it.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied but
not planning to backport.

Richard.

2016-03-29  Richard Biener  

PR middle-end/70424
* ipa-prop.c (ipa_compute_jump_functions_for_edge): Always
use alignment returned by get_pointer_alignment_1 if it is
bigger than BITS_PER_UNIT.
* builtins.c (get_pointer_alignment_1): Do not return true
for alignment extracted from SSA info.

Index: gcc/ipa-prop.c
===
*** gcc/ipa-prop.c  (revision 234453)
--- gcc/ipa-prop.c  (working copy)
*** ipa_compute_jump_functions_for_edge (str
*** 1639,1649 
  unsigned HOST_WIDE_INT hwi_bitpos;
  unsigned align;
  
! if (get_pointer_alignment_1 (arg, &align, &hwi_bitpos)
  && align % BITS_PER_UNIT == 0
  && hwi_bitpos % BITS_PER_UNIT == 0)
{
- gcc_checking_assert (align != 0);
  jfunc->alignment.known = true;
  jfunc->alignment.align = align / BITS_PER_UNIT;
  jfunc->alignment.misalign = hwi_bitpos / BITS_PER_UNIT;
--- 1639,1649 
  unsigned HOST_WIDE_INT hwi_bitpos;
  unsigned align;
  
! get_pointer_alignment_1 (arg, &align, &hwi_bitpos);
! if (align > BITS_PER_UNIT
  && align % BITS_PER_UNIT == 0
  && hwi_bitpos % BITS_PER_UNIT == 0)
{
  jfunc->alignment.known = true;
  jfunc->alignment.align = align / BITS_PER_UNIT;
  jfunc->alignment.misalign = hwi_bitpos / BITS_PER_UNIT;
Index: gcc/builtins.c
===
*** gcc/builtins.c  (revision 234453)
--- gcc/builtins.c  (working copy)
*** get_pointer_alignment_1 (tree exp, unsig
*** 463,469 
  if (*alignp == 0)
*alignp = 1u << (HOST_BITS_PER_INT - 1);
  /* We cannot really tell whether this result is an approximation.  */
! return true;
}
else
{
--- 463,469 
  if (*alignp == 0)
*alignp = 1u << (HOST_BITS_PER_INT - 1);
  /* We cannot really tell whether this result is an approximation.  */
! return false;
}
else
{



[C++/70393] constexpr constructor

2016-03-29 Thread Nathan Sidwell

This patch fixes 70393  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70393

'ab's construction used to be dynamic but becomes static with C++11 constexpr 
constructors (I'm not sure whether we're doing more than the std requires, but 
that's not important).  However, 'AB's bases are not laid out in program 
declaration order.  B is chosen as the primary base, and A is placed after that.


But the constructor creates A first and then B.  We're simply appending new 
elements onto AB's CONSTRUCTOR, and that's not something varasm is prepared to 
deal with.
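
A small sketch of the layout in question, assuming the Itanium C++ ABI
(as used on x86_64-linux); the types are illustrative and are not the
testcase from the PR:

```cpp
#include <cstdint>

struct A { int a; constexpr A () : a (1) {} };
struct B { int b; constexpr B () : b (2) {} virtual int f () const { return b; } };

// A is declared first, but B has a vtable pointer and becomes the
// primary base, so B is placed at offset 0 and A after it.
struct AB : A, B { constexpr AB () {} };

AB ab;

// True when the B subobject sits at a lower address than the A subobject.
bool b_laid_out_before_a ()
{
  auto pa = reinterpret_cast<std::uintptr_t> (static_cast<A *> (&ab));
  auto pb = reinterpret_cast<std::uintptr_t> (static_cast<B *> (&ab));
  return pb < pa;
}
```

The constructor still initializes A before B, which is why the
initializers arrive in a different order from the layout.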


First, I add an assert to output_constructor_regular_field so the silent bad
code generation turns into an ICE.  This exposed the same problem in the
g++.dg/cpp0x/constexpr-virtual[34].C testcases too.  I moved the flushing of
bitfields earlier so as to move the offset calculation nearer its use (and to
permit some jump threading in output_constructor_regular_field's code).


Second, the fix itself is in cxx_eval_store_expression.  Currently we scan the
CONSTRUCTOR ELTS to see if we've met this field before, and append if we
haven't.  I modified the scanning loop to also iterate over the FIELD_DECLS of
the record itself, and to detect when we find the wanted field in the struct
before the current CTOR ELT.  Then we do an insert at the current point.
(Inserting at the end is valid, btw.)
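
The scan can be sketched outside the compiler as follows.  This is only
a model: fields are plain ints rather than FIELD_DECLs, and
find_or_insert is a made-up name.

```cpp
#include <cstddef>
#include <vector>

// fields: every field of the record, in layout order.
// elts:   the fields initialized so far, kept in layout order.
// Returns the index in elts of `field`, inserting it if necessary.
std::size_t find_or_insert (const std::vector<int> &fields,
                            std::vector<int> &elts, int field)
{
  std::size_t f = 0;
  std::size_t idx = 0;
  for (; idx < elts.size (); ++idx)
    {
      if (elts[idx] == field)
        return idx;                       // already present
      // Walk the field list up to the field this element initializes;
      // meeting the wanted field first means it belongs before elts[idx].
      for (; f < fields.size () && fields[f] != elts[idx]; ++f)
        if (fields[f] == field)
          {
            elts.insert (elts.begin () + idx, field);
            return idx;
          }
    }
  // Fell off the end of the elements, so append.
  elts.push_back (field);
  return idx;
}
```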


I did consider just checking 'int_byte_position (local->field) == 0' to detect 
the primary base case, but that struck me as rather fragile, and as constexprs 
get more powerful, likely to break for cases with virtual bases.  (right now 
virtual bases still cause dynamic initialization)


built & tested on x86_64-linux

nathan

2016-03-29  Nathan Sidwell  

	PR c++/70393
	* varasm.c (output_constructor_regular_field): Flush bitfield
	earlier.  Assert we don't want to move backwards.

	gcc/
	PR c++/70393
	* constexpr.c (cxx_eval_store_expression): Keep CONSTRUCTOR
	elements in field order.

	PR c++/70393
	* g++.dg/cpp0x/constexpr-virtual6.C: New.

Index: varasm.c
===
--- varasm.c	(revision 234503)
+++ varasm.c	(working copy)
@@ -4929,6 +4929,14 @@ output_constructor_regular_field (oc_loc
 
   unsigned int align2;
 
+  /* Output any buffered-up bit-fields preceding this element.  */
+  if (local->byte_buffer_in_use)
+{
+  assemble_integer (GEN_INT (local->byte), 1, BITS_PER_UNIT, 1);
+  local->total_bytes++;
+  local->byte_buffer_in_use = false;
+}
+
   if (local->index != NULL_TREE)
 {
   /* Perform the index calculation in modulo arithmetic but
@@ -4945,22 +4953,19 @@ output_constructor_regular_field (oc_loc
   else
 fieldpos = 0;
 
-  /* Output any buffered-up bit-fields preceding this element.  */
-  if (local->byte_buffer_in_use)
-{
-  assemble_integer (GEN_INT (local->byte), 1, BITS_PER_UNIT, 1);
-  local->total_bytes++;
-  local->byte_buffer_in_use = false;
-}
-
   /* Advance to offset of this element.
  Note no alignment needed in an array, since that is guaranteed
  if each element has the proper size.  */
-  if ((local->field != NULL_TREE || local->index != NULL_TREE)
-  && fieldpos > local->total_bytes)
+  if (local->field != NULL_TREE || local->index != NULL_TREE)
 {
-  assemble_zeros (fieldpos - local->total_bytes);
-  local->total_bytes = fieldpos;
+  if (fieldpos > local->total_bytes)
+	{
+	  assemble_zeros (fieldpos - local->total_bytes);
+	  local->total_bytes = fieldpos;
+	}
+  else
+	/* Must not go backwards.  */
+	gcc_assert (fieldpos == local->total_bytes);
 }
 
   /* Find the alignment of this element.  */
Index: cp/constexpr.c
===
--- cp/constexpr.c	(revision 234503)
+++ cp/constexpr.c	(working copy)
@@ -2959,16 +2959,39 @@ cxx_eval_store_expression (const constex
   else
 	{
 	  gcc_assert (TREE_CODE (index) == FIELD_DECL);
-	  for (unsigned HOST_WIDE_INT idx = 0;
+
+	  /* We must keep the CONSTRUCTOR's ELTS in FIELD order.
+	 Usually we meet initializers in that order, but it is
+	 possible for base types to be placed not in program
+	 order.  */
+	  tree fields = TYPE_FIELDS (DECL_CONTEXT (index));
+	  unsigned HOST_WIDE_INT idx;
+
+	  for (idx = 0;
 	   vec_safe_iterate (CONSTRUCTOR_ELTS (*valp), idx, &cep);
 	   idx++)
-	if (index == cep->index)
-	  break;
-	  if (!cep)
 	{
-	  constructor_elt ce = { index, NULL_TREE };
-	  cep = vec_safe_push (CONSTRUCTOR_ELTS (*valp), ce);
+	  if (index == cep->index)
+		goto found;
+
+	  /* The field we're initializing must be on the field
+		 list.  Look to see if it is present before the
+		 field the current ELT initializes.  */
+	  for (; fields != cep->index; fields = DECL_CHAIN (fields))
+		if (index == fields)
+		  goto insert;
 	}
+
+	  /* We fell off the end of the CONSTRUCTOR, so insert a new
+	 entry at the end.  */
+	insert:

[Patch, Fortran, pr70397, v1] [5/6 Regression] ice while allocating ultimate polymorphic

2016-03-29 Thread Andre Vehreschild
Hi all,

here is the trunk version of the patch for the regression reported in
pr70397. Applying the gcc-5 patch to trunk led to a regression, which
the modified patch now resolves. The technique to solve the ICE is
the same as for gcc-5:

> The routine gfc_copy_class_to_class() assumed that both the source
> and destination object's type is unlimited polymorphic, but in this
> case it is true for the destination only, which made gfortran look
> for a non-existent _len component in the source object and therefore
> ICE. This is fixed by the patch by adding a function to return either
> the _len component, when it exists, or a constant zero node to init
> the destination object's _len component with.

Bootstrapped and regtested on x86_64-linux-gnu/F23. Ok for trunk?

Regards,
Andre

PS: Yes, Paul, I know you accepted the patch for gcc-5 for trunk
also, but I feel safer when the changes made get additional approval.
-- 
Andre Vehreschild * Email: vehre ad gmx dot de

gcc/fortran/ChangeLog:

2016-03-27  Andre Vehreschild  

PR fortran/70397
* trans-expr.c (gfc_class_len_or_zero_get): Add function to return a
constant zero tree, when the class to get the _len component from is
not unlimited polymorphic.
(gfc_copy_class_to_class): Use the new function.
* trans.h: Added interface of new function gfc_class_len_or_zero_get.

gcc/testsuite/ChangeLog:

2016-03-27  Andre Vehreschild  

PR fortran/70397
* gfortran.dg/unlimited_polymorphic_25.f90: New test.
* gfortran.dg/unlimited_polymorphic_26.f90: New test.


diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 4baadc8..8d039a6 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -173,6 +173,29 @@ gfc_class_len_get (tree decl)
 }
 
 
+/* Try to get the _len component of a class.  When the class is not unlimited
+   poly, i.e. no _len field exists, then return a zero node.  */
+
+tree
+gfc_class_len_or_zero_get (tree decl)
+{
+  tree len;
+  /* For class arrays decl may be a temporary descriptor handle, the vptr is
+ then available through the saved descriptor.  */
+  if (TREE_CODE (decl) == VAR_DECL && DECL_LANG_SPECIFIC (decl)
+  && GFC_DECL_SAVED_DESCRIPTOR (decl))
+decl = GFC_DECL_SAVED_DESCRIPTOR (decl);
+  if (POINTER_TYPE_P (TREE_TYPE (decl)))
+decl = build_fold_indirect_ref_loc (input_location, decl);
+  len = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (decl)),
+			   CLASS_LEN_FIELD);
+  return len != NULL_TREE ? fold_build3_loc (input_location, COMPONENT_REF,
+	 TREE_TYPE (len), decl, len,
+	 NULL_TREE)
+			  : integer_zero_node;
+}
+
+
 /* Get the specified FIELD from the VPTR.  */
 
 static tree
@@ -250,6 +273,7 @@ gfc_vptr_size_get (tree vptr)
 
 #undef CLASS_DATA_FIELD
 #undef CLASS_VPTR_FIELD
+#undef CLASS_LEN_FIELD
 #undef VTABLE_HASH_FIELD
 #undef VTABLE_SIZE_FIELD
 #undef VTABLE_EXTENDS_FIELD
@@ -1120,7 +1144,7 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
   if (unlimited)
 {
   if (from != NULL_TREE && unlimited)
-	from_len = gfc_class_len_get (from);
+	from_len = gfc_class_len_or_zero_get (from);
   else
 	from_len = integer_zero_node;
 }
diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index add0cea..512615a 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -365,6 +365,7 @@ tree gfc_class_set_static_fields (tree, tree, tree);
 tree gfc_class_data_get (tree);
 tree gfc_class_vptr_get (tree);
 tree gfc_class_len_get (tree);
+tree gfc_class_len_or_zero_get (tree);
 gfc_expr * gfc_find_and_cut_at_last_class_ref (gfc_expr *);
 /* Get an accessor to the class' vtab's * field, when a class handle is
available.  */
diff --git a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90 b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90
new file mode 100644
index 000..d0b2a2e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90
@@ -0,0 +1,40 @@
+! { dg-do run }
+!
+! Test contributed by Valery Weber  
+
+module mod
+
+  TYPE, PUBLIC :: base_type
+  END TYPE base_type
+
+  TYPE, PUBLIC :: dict_entry_type
+ CLASS( * ), ALLOCATABLE :: key
+ CLASS( * ), ALLOCATABLE :: val
+  END TYPE dict_entry_type
+
+
+contains
+
+  SUBROUTINE dict_put ( this, key, val )
+CLASS(dict_entry_type), INTENT(INOUT) :: this
+CLASS(base_type), INTENT(IN) :: key, val
+INTEGER  :: istat
+ALLOCATE( this%key, SOURCE=key, STAT=istat )
+  end SUBROUTINE dict_put
+end module mod
+
+program test
+  use mod
+  type(dict_entry_type) :: t
+  type(base_type) :: a, b
+  call dict_put(t, a, b)
+
+  if (.NOT. allocated(t%key)) call abort()
+  select type (x => t%key)
+type is (base_type)
+class default
+  call abort()
+  end select
+  deallocate(t%key)
+end
+
diff --git a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_26.f90 b/gcc/testsuite/gfortran.dg/unlimited_polymorp

Re: [PATCH ARM v3] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc generated by -pg

2016-03-29 Thread Christophe Lyon
On 29 March 2016 at 13:41, Charles Baylis  wrote:
> On 29 March 2016 at 02:16, Kugan  wrote:
>>
>> Hi Charles,
>>
>> +static void
>> +arm_emit_long_call_profile_insn ()
>> +{
>> +  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
>> +  /* if movt/movw are not available, use a constant pool */
>> +  if (!arm_arch_thumb2)
>>
>> Should this be !TARGET_USE_MOVT?
>
> Hi Kugan,
>
> Thanks for the review.
>
> TARGET_USE_MOVT has additional conditions which mean that it can be
> false on targets with MOVW/MOVT depending on the tuning parameters for
> the target CPU. Because this patch works in a slightly odd way, I
> think it is better to use MOVW/MOVT where possible so that the
> slightly hacky use of the literal pool is avoided. Since this only
> happens when profiling, it is not essential to have the fully
> optimised code sequence here. I'm happy to change it if anybody feels
> strongly though.
>
> I've noticed in the quoted snippet that there are some GNU coding
> style errors, so I've respun the patch with those corrected.
>
> gcc/ChangeLog:
>
> 2016-03-29  Charles Baylis  
>
> * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
> * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
> (arm_expand_prologue): Likewise.
> (thumb1_expand_prologue): Likewise.
> (arm_output_long_call_to_profile_func): Likewise.
> (arm_emit_long_call_profile): Likewise.
> * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
> * config/arm/arm.md (arm_long_call_profile): New pattern.
> * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
> define.
> * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.

Hi Charles,

In thumb1.md, I noticed:
@@ -1798,7 +1798,7 @@
   [(unspec_volatile [(match_operand:SI 0 "s_register_operand" "l")]
 VUNSPEC_EH_RETURN)
(clobber (match_scratch:SI 1 "=&l"))]
-  "TARGET_THUMB1"
+  "TARGET_THUMB1 && 0"
   "#"
   "&& reload_completed"
   [(const_int 0)]

which looks like an artifact of WIP.

> * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.
>
> gcc/testsuite/ChangeLog:
>
> 2016-03-29  Charles Baylis  
>
> * gcc.target/arm/pr69770.c: New test.


Re: [PATCH 1/2] Do not verify CFG if a function is discarded (PR

2016-03-29 Thread Martin Liška
On 03/29/2016 02:10 PM, Richard Biener wrote:
> On Tue, Mar 29, 2016 at 1:42 PM, Martin Liška  wrote:
>> Hello.
>>
>> The problem with the original patch is that I'm forced to produce
>> an empty BB to produce true/false edge needed for the 'index' check:
>>
>> /home/marxin/Programming/testhsa/run_tests/012-switch/switch-5.c:28:9: 
>> error: true/false edge after a non-GIMPLE_COND in bb 4
>> /home/marxin/Programming/testhsa/run_tests/012-switch/switch-5.c:28:9: 
>> internal compiler error: verify_flow_info failed
>> 0x93121a verify_flow_info()
>> ../../gcc/cfghooks.c:260
>> 0xd5ae4e execute_function_todo
>> ../../gcc/passes.c:1971
>> 0xd59ea6 do_per_function
>> ../../gcc/passes.c:1645
>> 0xd5afc2 execute_todo
>> ../../gcc/passes.c:2011
>>
>> It would be nicer not to produce an empty block for that purpose, but the
>> question is whether the change is acceptable during stage4.
> 
> Hmm, why don't we short-cut things earlier in execute_one_pass where
> we handle TODO_discard_function?
> That is, sth like
> 
> Index: gcc/passes.c
> ===
> --- gcc/passes.c(revision 234453)
> +++ gcc/passes.c(working copy)
> @@ -2334,6 +2334,33 @@ execute_one_pass (opt_pass *pass)
> 
>/* Do it!  */
>todo_after = pass->execute (cfun);
> +
> +  if (todo_after & TODO_discard_function)
> +{
> +  pass_fini_dump_file (pass);
> +
> +  gcc_assert (cfun);
> +  /* As cgraph_node::release_body expects release dominators info,
> +we have to release it.  */
> +  if (dom_info_available_p (CDI_DOMINATORS))
> +   free_dominance_info (CDI_DOMINATORS);
> +
> +  if (dom_info_available_p (CDI_POST_DOMINATORS))
> +   free_dominance_info (CDI_POST_DOMINATORS);
> +
> +  tree fn = cfun->decl;
> +  pop_cfun ();
> +  gcc_assert (!cfun);
> +  cgraph_node::get (fn)->release_body ();
> +
> +  current_pass = NULL;
> +  redirect_edge_var_map_empty ();
> +
> +  ggc_collect ();
> +
> +  return true;
> +}
> +
>do_per_function (clear_last_verified, NULL);
> 
>/* Stop timevar.  */
> @@ -2373,23 +2400,6 @@ execute_one_pass (opt_pass *pass)
>current_pass = NULL;
>redirect_edge_var_map_empty ();
> 
> -  if (todo_after & TODO_discard_function)
> -{
> -  gcc_assert (cfun);
> -  /* As cgraph_node::release_body expects release dominators info,
> -we have to release it.  */
> -  if (dom_info_available_p (CDI_DOMINATORS))
> -   free_dominance_info (CDI_DOMINATORS);
> -
> -  if (dom_info_available_p (CDI_POST_DOMINATORS))
> -   free_dominance_info (CDI_POST_DOMINATORS);
> -
> -  tree fn = cfun->decl;
> -  pop_cfun ();
> -  gcc_assert (!cfun);
> -  cgraph_node::get (fn)->release_body ();
> -}
> -
>/* Signal this is a suitable GC collection point.  */
>if (!((todo_after | pass->todo_flags_finish) & TODO_do_not_ggc_collect))
>  ggc_collect ();
> 
> 
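
The control flow suggested above can be modelled in isolation.  This is
a rough sketch only: the flag values, function_state and run_pass are
hypothetical stand-ins, not the real passes.c interfaces.

```cpp
#include <functional>

// Hypothetical flags; they only model the control flow.
enum : unsigned { TODO_VERIFY = 1u << 0, TODO_DISCARD_FUNCTION = 1u << 1 };

struct function_state
{
  bool body_released = false;
  int verifications = 0;
};

// Runs one pass.  If the pass asks for the function to be discarded,
// release the body and return before any verification can run.
bool run_pass (function_state &fn,
               const std::function<unsigned (function_state &)> &execute)
{
  unsigned todo_after = execute (fn);
  if (todo_after & TODO_DISCARD_FUNCTION)
    {
      fn.body_released = true;
      return true;
    }
  if (todo_after & TODO_VERIFY)
    ++fn.verifications;
  return true;
}
```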

Hello Richi.

Thanks for coming up with the cleaner version of the patch.
I've incorporated it, and bootstrap and regression testing are running.

Installable as soon as it finishes?

Thanks,
Martin

>> Thanks,
>> Martin



Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-29 Thread Bill Schmidt
Hi Jakub,

On Tue, 2016-03-29 at 08:53 +0200, Jakub Jelinek wrote:
> On Mon, Mar 28, 2016 at 07:38:46PM -0500, Bill Schmidt wrote:
> > For a long time we've had hundreds of failing guality tests.  These
> > failures don't seem to have any correlation with gdb functionality for
> > POWER, which is working fine.  At this point the value of these tests to
> > us seems questionable.  Fixing these is such low priority that it is
> > unlikely we will ever get around to it.  In the meanwhile, the failures
> > simply clutter up our regression test reports.  Thus I'd like to disable
> > them, and that's what this test does.
> > 
> > Verified to remove hundreds of failure messages on
> > powerpc64le-unknown-linux-gnu. :)  Is this ok for trunk?
> 
> This is IMNSHO very wrong, you then lose tracking of regressions in the
> debug info quality.  It is true that the debug info quality is already
> pretty bad  on powerpc*, it would be really very much desirable if
> anyone had time to analyze some of them and improve stuff,
> but we at least shouldn't regress.  Guality testsuite has various FAILs
> and/or XFAILs on lots of architectures, the problem is that the testing
> matrix is simply too large to have them in the testcases
> - it depends on the target, various ISA settings on the target, on the
> optimization level (most of the guality tests are torture tested through
> -O0 up to -O3 with extra flags), and in some cases also on the version of
> the used GDB.
> 
> For guality, the most effective test for regressions is simply always
> running contrib/test_summary after all your bootstraps and then just
> diffing up that against the same from earlier bootstrap.

And of course we do this, and we can keep doing it.  My main purpose in
opening this issue is to try to understand whether we are getting any
benefit from these tests, rather than just noise.

When you say that "the debug info quality is already pretty bad on
powerpc*," do you mean that it is known to be bad, or simply that we
have a lot of guality failures that may or may not indicate that the
debug info is bad?  I don't have experiential evidence of bad debug info
that shows up during debugging sessions.  Perhaps these are corner cases
that I will never encounter in practice?  Or perhaps the tests are just
badly formed?

The failing tests have all been bit-rotten (or never worked) since
before I joined this project, and from what others tell me, for at least
a decade.  As you suggest here, others have always told me just to
ignore the existing guality failures.  However, this can easily lead to
a culture of "ignore any guality failure, that stuff is junk" which can
cause regressions to be missed.  (I can't say that I've actually
observed this, but it is a concern I have.)

I have been consistently told that the same situation exists on most of
the supported targets, again because of the size of the testing matrix.
I'd be interested in knowing if this is true, or just anecdotal.

The other point, "it would be really very much desirable if
anyone had time to analyze some of them and improve stuff," has to be
answered by "apparently nobody does."  I am currently tracking well over
200 improvements I would like to see made to the powerpc64le target
alone.  Investigating old guality failures isn't even on that list.  Our
team won't have time for it, and if we have bounty money to spend, it
will be spent on more important things.  That's just the economic
reality, not a desire to disrespect the guality tests or anyone
associated with them.

From my limited perspective, it seems like the guality tests are unique
within the test suite as a set of tests that everyone just expects to
have lots of failures.  Is that healthy?  Will it ever change?

That said, it is clear that you feel the guality tests provide at least
some value in their present state, so we can continue to live with
things as they are.  I'm just curious how others feel about the state of
these tests.

Thanks,
Bill

> 
>   Jakub
> 




Re: Also test -O0 for OpenACC C, C++ offloading test cases

2016-03-29 Thread Thomas Schwinge
Hi!

On Thu, 24 Mar 2016 22:31:29 +0100, I wrote:
> --- libgomp/testsuite/libgomp.oacc-c++/c++.exp
> +++ libgomp/testsuite/libgomp.oacc-c++/c++.exp

>  # Initialize dg.
>  dg-init
> +torture-init
>  
>  # Turn on OpenACC.
>  lappend ALWAYS_CFLAGS "additional_flags=-fopenacc"
> @@ -104,7 +101,26 @@ if { $lang_test_file_found } {
>  
>   setenv ACC_DEVICE_TYPE $offload_target_openacc
>  
> - dg-runtest $tests "$tagopt" "$libstdcxx_includes $DEFAULT_CFLAGS"
> + # To get better test coverage for device-specific code that is only
> + # ever used in offloading configurations, we'd like more thorough
> + # testing for test cases that deal with offloading, which most of all
> + # OpenACC test cases are.  We enable torture testing, but limit it to
> + # -O0 and -O2 only, to avoid testing times exploding too much, under
> + # the assumption that between -O0 and -O[something] there is the
> + # biggest difference in the overall structure of the generated code.
> + switch $offload_target_openacc {
> + host {
> + set-torture-options [list \
> +  { -O2 } ]
> + }
> + default {
> + set-torture-options [list \
> +  { -O0 } \
> +  { -O2 } ]
> + }
> + }
> +
> + gcc-dg-runtest $tests "$tagopt" "$libstdcxx_includes"
>  }
>  }
>  
> @@ -112,4 +128,5 @@ if { $lang_test_file_found } {
>  set GCC_UNDER_TEST "$SAVE_GCC_UNDER_TEST"
>  
>  # All done.
> +torture-finish
>  dg-finish

In a nvptx-none configuration (that is, without libstdc++), this caused:

 Running [...]/libgomp/testsuite/libgomp.oacc-c++/c++.exp ...
+ERROR: tcl error sourcing [...]/libgomp/testsuite/libgomp.oacc-c++/c++.exp
+ERROR: torture-finish: torture_without_loops is not defined
+while executing
+"error "torture-finish: torture_without_loops is not defined""
+invoked from within
+"if [info exists torture_without_loops] {
+  unset torture_without_loops
+} else {
+  error "torture-finish: torture_without_loops is not defined"
+}"
+(procedure "torture-finish" line 4)
+invoked from within
+"torture-finish"
+(file "[...]/libgomp/testsuite/libgomp.oacc-c++/c++.exp" line 131)
+invoked from within
+"source [...]/libgomp/testsuite/libgomp.oacc-c++/c++.exp"
+("uplevel" body line 1)
+invoked from within
+"uplevel #0 source [...]/libgomp/testsuite/libgomp.oacc-c++/c++.exp"
+invoked from within
+"catch "uplevel #0 source $test_file_name""
 Running [...]/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp ...

torture_with_loops and torture_without_loops are set in
gcc/testsuite/lib/torture-options.exp:set-torture-options -- which we
don't call in libgomp.oacc-c++/c++.exp if skipping C++ testing.
Committed as obvious in r234519, as follows:

commit 53c452eee566d997bdef3ee0b20e7bb4485d77a4
Author: tschwinge 
Date:   Tue Mar 29 13:24:22 2016 +

Avoid ERROR in libgomp.oacc-c++/c++.exp in non-C++ configurations

libgomp/
* testsuite/libgomp.oacc-c++/c++.exp [!lang_test_file_found]: Call
set-torture-options.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234519 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog  |5 +
 libgomp/testsuite/libgomp.oacc-c++/c++.exp |4 
 2 files changed, 9 insertions(+)

diff --git libgomp/ChangeLog libgomp/ChangeLog
index e0cd567..f4f30fb 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,3 +1,8 @@
+2016-03-29  Thomas Schwinge  
+
+   * testsuite/libgomp.oacc-c++/c++.exp [!lang_test_file_found]: Call
+   set-torture-options.
+
 2016-03-24  Thomas Schwinge  
 
* testsuite/libgomp.oacc-c++/c++.exp: Set up torture testing, use
diff --git libgomp/testsuite/libgomp.oacc-c++/c++.exp 
libgomp/testsuite/libgomp.oacc-c++/c++.exp
index bbdbe2f..608b298 100644
--- libgomp/testsuite/libgomp.oacc-c++/c++.exp
+++ libgomp/testsuite/libgomp.oacc-c++/c++.exp
@@ -122,6 +122,10 @@ if { $lang_test_file_found } {
 
gcc-dg-runtest $tests "$tagopt" "$libstdcxx_includes"
 }
+} else {
+# Call this once, which placates the subsequent torture-finish.
+set-torture-options [list \
+{ INVALID } ]
 }
 
 # See above.


Regards
 Thomas


Re: [Patch, Fortran, pr70397, v1] [5/6 Regression] ice while allocating ultimate polymorphic

2016-03-29 Thread Paul Richard Thomas
Hi Andre,

Yes, it is better to play safe :-) OK for trunk.

Thanks

Paul

On 29 March 2016 at 14:55, Andre Vehreschild  wrote:
> Hi all,
>
> here is the trunk version of the patch for the regression reported in
> pr70397. Applying the gcc-5 patch to trunk led to a regression, which
> the modified patch now resolves. The technique to solve the ICE is
> the same as for gcc-5:
>
>> The routine gfc_copy_class_to_class() assumed that both the source
>> and destination object's type is unlimited polymorphic, but in this
>> case it is true for the destination only, which made gfortran look
>> for a non-existent _len component in the source object and therefore
>> ICE. This is fixed by the patch by adding a function to return either
>> the _len component, when it exists, or a constant zero node to init
>> the destination object's _len component with.
>
> Bootstrapped and regtested on x86_64-linux-gnu/F23. Ok for trunk?
>
> Regards,
> Andre
>
> PS: Yes, Paul, I know you accepted the patch for gcc-5 for trunk
> also, but I feel safer when the changes made get additional approval.
> --
> Andre Vehreschild * Email: vehre ad gmx dot de



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein


Re: [PATCH 3/4, libgomp] Resolve deadlock on plugin exit, HSA plugin parts

2016-03-29 Thread Martin Jambor
Hi,

On Sun, Mar 27, 2016 at 06:26:29PM +0800, Chung-Lin Tang wrote:
> On 2016/3/25 02:40 AM, Martin Jambor wrote:
> > On the whole, I am fine with the patch but there are two issues:
> > 
> > First, and generally, when you change the return type of a function,
> > you must document what return values mean in the comment of the
> > function.  Most importantly, it must be immediately apparent whether a
> > function returns true or false on failure from its comment.  So please
> > fix that.
> 
> Thanks, I'll update on that.
> 
> >> >  /* Callback of dispatch queues to report errors.  */
> >> > @@ -454,7 +471,7 @@ queue_callback (hsa_status_t status,
> >> >  hsa_queue_t *queue __attribute__ ((unused)),
> >> >  void *data __attribute__ ((unused)))
> >> >  {
> >> > -  hsa_fatal ("Asynchronous queue error", status);
> >> > +  hsa_error ("Asynchronous queue error", status);
> >> >  }
> > ...I believe this hunk is wrong.  Errors reported in this way mean
> > that something is very wrong and generally happen during execution of
> > code on HSA GPU, i.e. within GOMP_OFFLOAD_run.  And since you left
> > calls in create_single_kernel_dispatch, which is called as a part of
> > GOMP_OFFLOAD_run, intact, I believe you actually want to leave
> > hsa_fatal here too.
> 
> Yes, a fatal exit is okay within the 'run' hook, since we're not holding
> the device lock there. I was only trying to audit the 
> GOMP_OFFLOAD_init_device()
> function, where the queues are created.
> 
> I'm not familiar with the HSA runtime API; will the callback only be triggered
> during GPU kernel execution (inside the 'run' hook), and not for example,
> within hsa_queue_create()? If so, then yes as you advised, the above change to
> queue_callback() should be reverted.
> 

The documentation says the callback is "invoked by the HSA runtime for
every asynchronous event related to the newly created queue."  All
enumerated situations when the callback is called happen at command
launch time (i.e. inside a run hook).

Since creation of the queue is a synchronous event, the callback should
not be invoked if it fails.  Of course, the description does not rule
out such failures occurring out of the blue at any arbitrary time, but
I think this is as improbable as a GOMP_PLUGIN_malloc ending up in a
fatal error, which is something you do not seem to be worried about.

So please revert the hunk.

Thanks,

Martin


Re: [RS6000, PATCH] PR70052, ICE compiling _Decimal128 test case

2016-03-29 Thread David Edelsohn
On Tue, Mar 29, 2016 at 6:14 AM, Alan Modra  wrote:
> On Fri, Mar 25, 2016 at 07:36:34PM +1030, Alan Modra wrote:
>> +2016-03-25  Alan Modra  
>> +
>> + PR target/70052
>> + * config/rs6000/constraints.md (j): Simplify.
>> + * config/rs6000/predicates.md (easy_fp_constant): Exclude
>> + decimal float 0.D.
>> + * config/rs6000/rs6000.md (zero_fp): New mode_attr.
>> + (mov_hardfloat, mov_hardfloat32, mov_hardfloat64,
>> +  mov_64bit_dm, mov_32bit): Use zero_fp in place of j
>> + in all constraint alternatives.
>> + (movtd_64bit_nodm): Delete "j" constraint alternative.
>> +
> [snip]
>> +2016-03-25  Alan Modra  
>> +
>> + * gcc.dg/dfp/pr70052.c: New test.
>> +
>
> Testing showed that this problem exists on the gcc-5 branch too.  I've
> backported the above and bootstrapped plus regression tested on
> powerpc64le-linux.  OK for gcc-5?

Okay.

Thanks, David


Re: [PATCH 1/2] Do not verify CFG if a function is discarded (PR

2016-03-29 Thread Richard Biener
On Tue, Mar 29, 2016 at 3:07 PM, Martin Liška  wrote:
> On 03/29/2016 02:10 PM, Richard Biener wrote:
>> On Tue, Mar 29, 2016 at 1:42 PM, Martin Liška  wrote:
>>> Hello.
>>>
>>> The problem with the original patch is that I'm forced to produce
>>> an empty BB to produce true/false edge needed for the 'index' check:
>>>
>>> /home/marxin/Programming/testhsa/run_tests/012-switch/switch-5.c:28:9: 
>>> error: true/false edge after a non-GIMPLE_COND in bb 4
>>> /home/marxin/Programming/testhsa/run_tests/012-switch/switch-5.c:28:9: 
>>> internal compiler error: verify_flow_info failed
>>> 0x93121a verify_flow_info()
>>> ../../gcc/cfghooks.c:260
>>> 0xd5ae4e execute_function_todo
>>> ../../gcc/passes.c:1971
>>> 0xd59ea6 do_per_function
>>> ../../gcc/passes.c:1645
>>> 0xd5afc2 execute_todo
>>> ../../gcc/passes.c:2011
>>>
>>> It would be nicer not to produce an empty block for that purpose, but the
>>> question is whether the change is acceptable during stage4.
>>
>> Hmm, why don't we short-cut things earlier in execute_one_pass where
>> we handle TODO_discard_function?
>> That is, sth like
>>
>> Index: gcc/passes.c
>> ===
>> --- gcc/passes.c(revision 234453)
>> +++ gcc/passes.c(working copy)
>> @@ -2334,6 +2334,33 @@ execute_one_pass (opt_pass *pass)
>>
>>/* Do it!  */
>>todo_after = pass->execute (cfun);
>> +
>> +  if (todo_after & TODO_discard_function)
>> +{
>> +  pass_fini_dump_file (pass);
>> +
>> +  gcc_assert (cfun);
>> +  /* As cgraph_node::release_body expects release dominators info,
>> +we have to release it.  */
>> +  if (dom_info_available_p (CDI_DOMINATORS))
>> +   free_dominance_info (CDI_DOMINATORS);
>> +
>> +  if (dom_info_available_p (CDI_POST_DOMINATORS))
>> +   free_dominance_info (CDI_POST_DOMINATORS);
>> +
>> +  tree fn = cfun->decl;
>> +  pop_cfun ();
>> +  gcc_assert (!cfun);
>> +  cgraph_node::get (fn)->release_body ();
>> +
>> +  current_pass = NULL;
>> +  redirect_edge_var_map_empty ();
>> +
>> +  ggc_collect ();
>> +
>> +  return true;
>> +}
>> +
>>do_per_function (clear_last_verified, NULL);
>>
>>/* Stop timevar.  */
>> @@ -2373,23 +2400,6 @@ execute_one_pass (opt_pass *pass)
>>current_pass = NULL;
>>redirect_edge_var_map_empty ();
>>
>> -  if (todo_after & TODO_discard_function)
>> -{
>> -  gcc_assert (cfun);
>> -  /* As cgraph_node::release_body expects release dominators info,
>> -we have to release it.  */
>> -  if (dom_info_available_p (CDI_DOMINATORS))
>> -   free_dominance_info (CDI_DOMINATORS);
>> -
>> -  if (dom_info_available_p (CDI_POST_DOMINATORS))
>> -   free_dominance_info (CDI_POST_DOMINATORS);
>> -
>> -  tree fn = cfun->decl;
>> -  pop_cfun ();
>> -  gcc_assert (!cfun);
>> -  cgraph_node::get (fn)->release_body ();
>> -}
>> -
>>/* Signal this is a suitable GC collection point.  */
>>if (!((todo_after | pass->todo_flags_finish) & TODO_do_not_ggc_collect))
>>  ggc_collect ();
>>
>>
>
> Hello Richi.
>
> Thanks for coming with the cleaner version of the patch.
> I've incorporated that patch and reg&bootstrap is running.
>
> Installable as soon as it finishes?

Yes.

Richard.

> Thanks,
> Martin
>
>>> Thanks,
>>> Martin
>


Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-29 Thread Richard Biener
On Tue, Mar 29, 2016 at 3:19 PM, Bill Schmidt
 wrote:
> Hi Jakub,
>
> On Tue, 2016-03-29 at 08:53 +0200, Jakub Jelinek wrote:
>> On Mon, Mar 28, 2016 at 07:38:46PM -0500, Bill Schmidt wrote:
>> > For a long time we've had hundreds of failing guality tests.  These
>> > failures don't seem to have any correlation with gdb functionality for
>> > POWER, which is working fine.  At this point the value of these tests to
>> > us seems questionable.  Fixing these is such low priority that it is
>> > unlikely we will ever get around to it.  In the meanwhile, the failures
>> > simply clutter up our regression test reports.  Thus I'd like to disable
>> > them, and that's what this test does.
>> >
>> > Verified to remove hundreds of failure messages on
>> > powerpc64le-unknown-linux-gnu. :)  Is this ok for trunk?
>>
>> This is IMNSHO very wrong, you then lose tracking of regressions in the
>> debug info quality.  It is true that the debug info quality is already
>> pretty bad on powerpc*, it would be really very much desirable if
>> anyone had time to analyze some of them and improve stuff,
>> but we at least shouldn't regress.  Guality testsuite has various FAILs
>> and/or XFAILs on lots of architectures, the problem is that the testing
>> matrix is simply too large to have them in the testcases
>> - it depends on the target, various ISA settings on the target, on the
>> optimization level (most of the guality tests are torture tested through
>> -O0 up to -O3 with extra flags), and in some cases also on the version of
>> the used GDB.
>>
>> For guality, the most effective test for regressions is simply always
>> running contrib/test_summary after all your bootstraps and then just
>> diffing up that against the same from earlier bootstrap.
>
> And of course we do this, and we can keep doing it.  My main purpose in
> opening this issue is to try to understand whether we are getting any
> benefit from these tests, rather than just noise.
>
> When you say that "the debug info quality is already pretty bad on
> powerpc*," do you mean that it is known to be bad, or simply that we
> have a lot of guality failures that may or may not indicate that the
> debug info is bad?  I don't have experiential evidence of bad debug info
> that shows up during debugging sessions.  Perhaps these are corner cases
> that I will never encounter in practice?  Or perhaps the tests are just
> badly formed?
>
> The failing tests have all been bit-rotten (or never worked) since
> before I joined this project, and from what others tell me, for at least
> a decade.  As you suggest here, others have always told me just to
> ignore the existing guality failures.  However, this can easily lead to
> a culture of "ignore any guality failure, that stuff is junk" which can
> cause regressions to be missed.  (I can't say that I've actually
> observed this, but it is a concern I have.)
>
> I have been consistently told that the same situation exists on most of
> the supported targets, again because of the size of the testing matrix.
> I'd be interested in knowing if this is true, or just anecdotal.
>
> The other point, "it would be really very much desirable if
> anyone had time to analyze some of them and improve stuff," has to be
> answered by "apparently nobody does."  I am currently tracking well over
> 200 improvements I would like to see made to the powerpc64le target
> alone.  Investigating old guality failures isn't even on that list.  Our
> team won't have time for it, and if we have bounty money to spend, it
> will be spent on more important things.  That's just the economic
> reality, not a desire to disrespect the guality tests or anyone
> associated with them.
>
> From my limited perspective, it seems like the guality tests are unique
> within the test suite as a set of tests that everyone just expects to
> have lots of failures.  Is that healthy?  Will it ever change?
>
> That said, it is clear that you feel the guality tests provide at least
> some value in their present state, so we can continue to live with
> things as they are.  I'm just curious how others feel about the state of
> these tests.

I agree with Jakub that disabling the tests is not good.  Just look at
a random testcase that FAILs on powerpc but not on x86_64-linux
for all optimization levels.  You can literally "debug" this manually
as the guality would - there may be ABI issues that make handling
the case hard or there may be simple bugs (like a target reorg pass
not properly caring for debug insns).

Richard.

> Thanks,
> Bill
>
>>
>>   Jakub
>>
>
>
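
The workflow discussed in this thread (save the test results after each
bootstrap and diff them against an earlier run) can be sketched as follows.
This is neither contrib/test_summary nor contrib/compare_tests, only a minimal
illustration of the idea.  The names parse_results and new_failures are
hypothetical, and the only assumption about the input is the DejaGnu-style
"STATUS: test name" line format; only a few of the DejaGnu statuses are
recognized.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Parse DejaGnu-style result lines ("PASS: name", "FAIL: name", ...)
// into a map from test name to status.  Lines that do not start with
// one of the listed statuses are ignored.
std::map<std::string, std::string>
parse_results (const std::string &text)
{
  static const char *const statuses[]
    = { "PASS", "FAIL", "XPASS", "XFAIL", "UNSUPPORTED", "UNRESOLVED" };
  std::map<std::string, std::string> results;
  std::istringstream in (text);
  std::string line;
  while (std::getline (in, line))
    for (const char *status : statuses)
      {
        const std::string prefix = std::string (status) + ": ";
        if (line.compare (0, prefix.size (), prefix) == 0)
          {
            results[line.substr (prefix.size ())] = status;
            break;
          }
      }
  return results;
}

// Return the tests that FAIL in CURRENT but did not FAIL in BASELINE
// (including tests that are new in CURRENT).  Long-standing failures
// are deliberately not reported.
std::vector<std::string>
new_failures (const std::map<std::string, std::string> &baseline,
              const std::map<std::string, std::string> &current)
{
  std::vector<std::string> regressions;
  for (const auto &entry : current)
    {
      if (entry.second != "FAIL")
        continue;
      auto old = baseline.find (entry.first);
      if (old == baseline.end () || old->second != "FAIL")
        regressions.push_back (entry.first);
    }
  return regressions;
}
```

With a comparison like this, hundreds of known guality FAILs stay out of the
report and only a test that changes from non-FAIL to FAIL shows up, which is
the property the "diff against an earlier bootstrap" approach relies on.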


Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-29 Thread David Edelsohn
On Mon, Mar 28, 2016 at 8:38 PM, Bill Schmidt
 wrote:
> Hi,
>
> For a long time we've had hundreds of failing guality tests.  These
> failures don't seem to have any correlation with gdb functionality for
> POWER, which is working fine.  At this point the value of these tests to
> us seems questionable.  Fixing these is such low priority that it is
> unlikely we will ever get around to it.  In the meanwhile, the failures
> simply clutter up our regression test reports.  Thus I'd like to disable
> them, and that's what this test does.
>
> Verified to remove hundreds of failure messages on
> powerpc64le-unknown-linux-gnu. :)  Is this ok for trunk?
>
> Thanks,
> Bill
>
>
> 2016-03-28  Bill Schmidt  
>
> * g++.dg/guality/guality.exp: Disable for powerpc*-linux*.
> * gcc.dg/guality/guality.exp: Likewise.

Thanks for everyone else's suggestions.

As far as we understand, debugging quality on POWER is equivalent to
other targets.

There is an issue with PPC64 BE and AIX requiring an extra frame push
when debugging is enabled, which will cause differences between code
with debugging enabled and debugging disabled.  THIS WILL NOT BE
CHANGED.

We have no plans to make code generation a slave to the testsuite.
The testsuite is a tool; successful results from the testsuite are not
a goal unto themselves.

This patch is okay.

Thanks, David


[PATCH] New flag in order to dump information about template instantiations.

2016-03-29 Thread Andres Tiraboschi
Hi,
the attached patch adds a new compilation flag,
'ftemplate-instantiations', to allow dumping debug information for
template instantiations.
This flag has two possible values: none (the default) and hreadable, which
prints which template instantiations have been made in a human-readable way.
The patch is also designed so that options can be added easily and so that
it can interact with plugins.
For example, a plugin can define a class derived from
template_instantiations_callbacks that implements _function_instantiation,
_class_instantiation and _using_instantiation, and then call
add_template_instantiations_callbacks to access information about which
template instantiations have been made.
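
The plugin usage described in this message can be sketched outside of GCC
roughly like this.  It is only an illustration under stated assumptions:
"tree" is a stand-in typedef so that the code compiles on its own, the base
class is a trimmed copy of the one in the patch (function instantiations
only), and recording_callbacks is a hypothetical name.  A real plugin would
include the GCC headers and register its object through
add_template_instantiations_callbacks instead.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for GCC's tree type so the sketch is self-contained.
typedef const char *tree;

// Trimmed copy of the callback chain from the patch (function
// instantiations only).
class template_instantiations_callbacks
{
public:
  template_instantiations_callbacks () : next (NULL) {}

  // Run this callback, then forward to the rest of the chain.
  void function_instantiation (tree tmpl, tree args, tree spec)
  {
    _function_instantiation (tmpl, args, spec);
    if (next != NULL)
      next->function_instantiation (tmpl, args, spec);
  }

  // Append NEW_NEXT at the end of the chain; the chain takes ownership.
  void add_callbacks (template_instantiations_callbacks *new_next)
  {
    if (next)
      next->add_callbacks (new_next);
    else
      next = new_next;
  }

  virtual ~template_instantiations_callbacks () { delete next; }

private:
  template_instantiations_callbacks *next;

  virtual void _function_instantiation (tree, tree, tree) {}
};

// Hypothetical plugin-side class: records the name of every function
// template specialization it is told about.
class recording_callbacks : public template_instantiations_callbacks
{
public:
  explicit recording_callbacks (std::vector<std::string> *out) : seen (out) {}

private:
  std::vector<std::string> *seen;

  virtual void _function_instantiation (tree, tree, tree spec)
  {
    seen->push_back (spec);
  }
};
```

Because each public entry point forwards to "next", several independent
consumers (the built-in human-readable dumper plus any number of plugins) can
observe the same instantiation without knowing about each other.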

Changelog
2016-03-29  Andres Tiraboschi  

* gcc/c-family/c.opt (ftemplate-instantiations): New flag.
* gcc/flag-types.h (ti_dump_options): New type.
* gcc/cp/decl2.c (cp_write_global_declarations): Added code to
dump information.
* gcc/cp/cp-tree.h (template_instantiations_callbacks): New type.
(call_template_instantiation_callbacks): Declare.
(add_template_instantiations_callbacks): Likewise.
(clean_up_callbacks): Likewise.
* gcc/cp/pt.c (human_readable_template_instantiations): New type.
(instantiation_callbacks): Declare.
(call_template_instantiation_callback): New function.
(call_template_instantiation_callbacks): Likewise.
(add_template_instantiations_callbacks): Likewise.
(initialize_instantiations_callbacks): Likewise.
(clean_up_callbacks): Likewise.
(init_template_processing): Added code to initialize instantiation_callbacks.
(register_specialization): Added code to dump information.
* gcc/doc/invoke.texi (ftemplate-instantiations): Added documentation.


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 7c5f6c7..a0ebcdc 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1487,6 +1487,19 @@ fstats
 C++ ObjC++ Var(flag_detailed_statistics)
 Display statistics accumulated during compilation.

+ftemplate-instantiations=
+C++ Joined RejectNegative Enum(ti_dump_options) Var(ti_dump_option) Init(TI_NONE)
+Dump information about which templates have been instantiated
+
+Enum
+Name(ti_dump_options) Type(enum ti_dump_options) UnknownError(unrecognized template instantiation dumping option %qs)
+
+EnumValue
+Enum(ti_dump_options) String(none) Value(TI_NONE)
+
+EnumValue
+Enum(ti_dump_options) String(hreadable) Value(TI_HREADABLE)
+
 fstrict-enums
 C++ ObjC++ Optimization Var(flag_strict_enums)
 Assume that values of enumeration type are always within the minimum range of that type.
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 15b004d..f682b4a 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -4816,6 +4816,61 @@ struct local_specialization_stack
   hash_map *saved;
 };

+class template_instantiations_callbacks
+{
+public:
+  template_instantiations_callbacks () : next(NULL){}
+
+  void function_instantiation (tree tmpl, tree args, tree spec)
+  {
+_function_instantiation (tmpl, args, spec);
+if (next != NULL)
+  next->function_instantiation (tmpl, args, spec);
+  }
+
+  void class_instantiation (tree tmpl, tree args, tree spec)
+  {
+_class_instantiation (tmpl, args, spec);
+if (next != NULL)
+  next->class_instantiation (tmpl, args, spec);
+  }
+
+  void using_instantiation (tree tmpl, tree args, tree spec)
+  {
+_using_instantiation (tmpl, args, spec);
+if (next != NULL)
+  next->using_instantiation (tmpl, args, spec);
+  }
+
+  void add_callbacks (template_instantiations_callbacks* new_next)
+  {
+if (next)
+  next->add_callbacks (new_next);
+else
+  next = new_next;
+  }
+
+  virtual ~template_instantiations_callbacks ()
+  {
+delete next;
+  }
+
+private:
+  template_instantiations_callbacks* next;
+
+  virtual void _function_instantiation (tree, tree, tree)
+  {
+  }
+
+  virtual void _class_instantiation (tree, tree, tree)
+  {
+  }
+
+  virtual void _using_instantiation (tree, tree, tree)
+  {
+  }
+};
+
 /* in class.c */

 extern int current_class_depth;
@@ -6199,6 +6254,9 @@ extern void register_local_specialization (tree, tree);
 extern tree retrieve_local_specialization   (tree);
 extern tree extract_fnparm_pack (tree, tree *);
 extern tree template_parm_to_arg(tree);
+extern void call_template_instantiation_callbacks (void);
+extern void add_template_instantiations_callbacks (template_instantiations_callbacks* new_callback);
+extern void clean_up_callbacks (void);

 /* in repo.c */
 extern void init_repo(void);
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 73b0d28..097e3564 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -4914,6 +4914,9 @@ c_parse_final_cleanups (void)
   dump_time_statistics ();
 }

+  call_template_instantiation_callbacks ();
+  clean_up_callbacks ();
+
   timevar_stop (TV_PHASE_DEFERRED);
   timevar_start (TV_PHASE_PARSING);

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index e8

Re: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C

2016-03-29 Thread Thomas Schwinge
Hi!

On Mon, 28 Mar 2016 19:40:22 +0300, Ilya Verbin  wrote:
> Do you plan to commit this patch? :)

Well, I'm also still waiting for you guys to merge (via the upstream
Intel sources repository) my GNU Hurd portability patches; submitted to
GCC in

and the following messages, dated 2014-09-26.  Upon request of Barry M
Tannenbaum then submitted to the Intel web site, and then never heard of
again...  ;-(

> On Mon, Sep 29, 2014 at 09:24:40 -0600, Jeff Law wrote:
> > On 09/29/14 08:26, Thomas Schwinge wrote:
> > > Audit Cilk Plus tests for CILK_NWORKERS=1.
> > >
> > >   gcc/testsuite/
> > >   * c-c++-common/cilk-plus/CK/spawning_arg.c (main): Call
> > >   __cilkrts_set_param to set two workers.
> > >   * c-c++-common/cilk-plus/CK/steal_check.c (main): Likewise.
> > >   * g++.dg/cilk-plus/CK/catch_exc.cc (main): Likewise.

Thanks for reminding me about this.  I confirmed that the problem still
reproduces, and the very same patch still fixes it; now committed in
r234523:

commit 4abd94105ecb1d026406648a37ff2fb43bb26d7c
Author: tschwinge 
Date:   Tue Mar 29 14:39:33 2016 +

[PR testsuite/64177] Audit Cilk Plus tests for CILK_NWORKERS=1

PR testsuite/64177
gcc/testsuite/
* c-c++-common/cilk-plus/CK/spawning_arg.c (main): Call
__cilkrts_set_param to set two workers.
* c-c++-common/cilk-plus/CK/steal_check.c (main): Likewise.
* g++.dg/cilk-plus/CK/catch_exc.cc (main): Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234523 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog   |8 
 .../c-c++-common/cilk-plus/CK/spawning_arg.c  |   15 +++
 gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c |   17 ++---
 gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc|   14 ++
 4 files changed, 51 insertions(+), 3 deletions(-)

diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog
index 11d6863..f9b4b00 100644
--- gcc/testsuite/ChangeLog
+++ gcc/testsuite/ChangeLog
@@ -1,3 +1,11 @@
+2016-03-29  Thomas Schwinge  
+
+   PR testsuite/64177
+   * c-c++-common/cilk-plus/CK/spawning_arg.c (main): Call
+   __cilkrts_set_param to set two workers.
+   * c-c++-common/cilk-plus/CK/steal_check.c (main): Likewise.
+   * g++.dg/cilk-plus/CK/catch_exc.cc (main): Likewise.
+
 2016-03-28  Dominique d'Humieres  
 
g++.dg/ext/fnname5.C: Update the test for Darwin.
diff --git gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c 
gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c
index 95e6cab..138b82c 100644
--- gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c
+++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawning_arg.c
@@ -2,6 +2,17 @@
 /* { dg-options "-fcilkplus" } */
 /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+extern int __cilkrts_set_param (const char *, const char *);
+
+#ifdef __cplusplus
+}
+#endif
+
+
 void f0(volatile int *steal_flag)
 { 
   int i = 0;
@@ -32,6 +43,10 @@ void f3()
 
 int main()
 {
+  /* Ensure more than one worker.  */
+  if (__cilkrts_set_param("nworkers", "2") != 0)
+__builtin_abort();
+
   f3();
   return 0;
 }
diff --git gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c 
gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c
index 6e28765..6b41c7f 100644
--- gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c
+++ gcc/testsuite/c-c++-common/cilk-plus/CK/steal_check.c
@@ -2,8 +2,16 @@
 /* { dg-options "-fcilkplus" } */
 /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
 
-// #include 
-extern void __cilkrts_set_param (char *, char *);
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+extern int __cilkrts_set_param (const char *, const char *);
+
+#ifdef __cplusplus
+}
+#endif
+
 
 void foo(volatile int *);
 
@@ -11,7 +19,10 @@ void main2(void);
 
 int main(void)
 {
- //  __cilkrts_set_param ((char *)"nworkers", (char *)"2");
+  /* Ensure more than one worker.  */
+  if (__cilkrts_set_param("nworkers", "2") != 0)
+__builtin_abort();
+
   main2();
   return 0;
 }
diff --git gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc 
gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc
index 0633d19..09ddf8b 100644
--- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc
+++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc
@@ -10,6 +10,16 @@
 #endif
 #include 
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+extern int __cilkrts_set_param (const char *, const char *);
+
+#ifdef __cplusplus
+}
+#endif
+
 
 void func(int volatile* steal_me) 
 {
@@ -59,6 +69,10 @@ void my_test()
 
 int main() 
 {
+  /* Ensure more than one worker.  */
+  if (__cilkrts_set_param("nworkers", "2") != 0)
+__builtin_abort();
+
   my_test();
 #if HAVE_IO
   printf("PASSED\n");


Grüße
 Thomas

Re: FW: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C

2016-03-29 Thread Ilya Verbin
On Tue, Mar 29, 2016 at 17:15:11 +0200, Thomas Schwinge wrote:
> On Mon, 28 Mar 2016 19:40:22 +0300, Ilya Verbin  wrote:
> > Do you plan to commit this patch? :)
> 
> Well, I'm also still waiting for you guys to merge (via the upstream
> Intel sources repository) my GNU Hurd portability patches; submitted to
> GCC in
> 
> and the following messages, dated 2014-09-26.  Upon request of Barry M
> Tannenbaum then submitted to the Intel web site, and then never heard of
> again...  ;-(

I'm going to merge libcilkrts from upstream at stage1.  Your patch is there:
https://bitbucket.org/intelcilkruntime/intel-cilk-runtime/commits/2b33a7bfcbcd1def8108287475755b68b4aef2f7

  -- Ilya


Fix bogus vtable mismatch warnings

2016-03-29 Thread Jan Hubicka
Hi,
this patch fixes a bogus warning.  While building libreoffice we get:
/aux/hubicka/libreoffice2/core/sw/source/core/attr/calbck.cxx:27:1: note: 
virtual method ‘_ZN2sw16LegacyModifyHintD2Ev.localalias.7’
 sw::LegacyModifyHint::~LegacyModifyHint() {}
 ^
/aux/hubicka/libreoffice2/core/sw/source/core/attr/calbck.cxx:27:1: note: ought 
to match virtual method ‘__comp_dtor ’ but does not

This patch makes the comparison accept local aliases.  Sadly one can't look
for the alias target because it may get confused by ICF (I think), so the
patch strips the .localalias suffix instead.

The patch also fixes the warning to come out right (at least in most cases
when ICF did not happen) and commonizes the suffix handling.
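
The comparison described above (treat "name" and "name.localalias.N" as the
same method) can be sketched on plain strings as follows.  This is a
standalone illustration, not the code from the patch: strip_suffix and
names_equal_modulo_suffix are hypothetical names, and the separator is passed
in explicitly rather than obtained from
symbol_table::symbol_suffix_separator.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Return the part of NAME before the first occurrence of SEP
// (the whole name if SEP does not occur).
static std::string
strip_suffix (const char *name, char sep)
{
  assert (name != NULL);
  const char *ptr = std::strchr (name, sep);
  return ptr ? std::string (name, ptr - name) : std::string (name);
}

// Consider "name" equivalent to "name<sep>localalias<sep>N": two names
// match if they are identical up to the first separator.
bool
names_equal_modulo_suffix (const char *name1, const char *name2, char sep)
{
  return strip_suffix (name1, sep) == strip_suffix (name2, sep);
}
```

One caveat of this sketch: cutting at the first separator is only meaningful
when the separator cannot occur in the base name.  With '.' or '$' that holds
for typical mangled names, but with '_' every name starting with "_Z" would be
cut down to the empty string and all such names would compare equal.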

Bootstrapped/regtested x86_64-linux and tested on libreoffice.

Honza

PR ipa/70283
* ipa-devirt.c (methods_equal_p): New function.
(compare_virtual_tables): Use it.
* cgraph.h (symbol_table::symbol_suffix_separator): Declare.
* cgraphclones.c (clone_function_name_1): Use
symbol_table::symbol_suffix_separator.
* coverage.c (build_var): Likewise.
* symtab.c (symbol_table::symbol_suffix_separator): New.

Index: cgraph.h
===
--- cgraph.h(revision 234516)
+++ cgraph.h(working copy)
@@ -2173,6 +2173,9 @@ public:
 
   FILE* GTY ((skip)) dump_file;
 
+  /* Return symbol used to separate symbol name from suffix.  */
+  static char symbol_suffix_separator ();
+
 private:
   /* Allocate new callgraph node.  */
   inline cgraph_node * allocate_cgraph_symbol (void);
Index: cgraphclones.c
===
--- cgraphclones.c  (revision 234516)
+++ cgraphclones.c  (working copy)
@@ -512,13 +512,7 @@ clone_function_name_1 (const char *name,
   prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
   memcpy (prefix, name, len);
   strcpy (prefix + len + 1, suffix);
-#ifndef NO_DOT_IN_LABEL
-  prefix[len] = '.';
-#elif !defined NO_DOLLAR_IN_LABEL
-  prefix[len] = '$';
-#else
-  prefix[len] = '_';
-#endif
+  prefix[len] = symbol_table::symbol_suffix_separator ();
   ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
   return get_identifier (tmp_name);
 }
Index: coverage.c
===
--- coverage.c  (revision 234516)
+++ coverage.c  (working copy)
@@ -745,11 +745,7 @@ build_var (tree fn_decl, tree type, int
   else
 sprintf (buf, "__gcov%u_", counter);
   len = strlen (buf);
-#ifndef NO_DOT_IN_LABEL
-  buf[len - 1] = '.';
-#elif !defined NO_DOLLAR_IN_LABEL
-  buf[len - 1] = '$';
-#endif
+  buf[len - 1] = symbol_table::symbol_suffix_separator ();
   memcpy (buf + len, fn_name, fn_name_len + 1);
   DECL_NAME (var) = get_identifier (buf);
   TREE_STATIC (var) = 1;
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 234516)
+++ ipa-devirt.c(working copy)
@@ -705,6 +705,29 @@ odr_subtypes_equivalent_p (tree t1, tree
   return odr_types_equivalent_p (t1, t2, false, NULL, visited, loc1, loc2);
 }
 
+/* Return true if DECL1 and DECL2 are identical methods.  Consider
+   name equivalent to name.localalias.xyz.  */
+
+static bool
+methods_equal_p (tree decl1, tree decl2)
+{
+  if (DECL_ASSEMBLER_NAME (decl1) == DECL_ASSEMBLER_NAME (decl2))
+return true;
+  const char sep = symbol_table::symbol_suffix_separator ();
+
+  const char *name1 = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl1));
+  const char *ptr1 = strchr (name1, sep);
+  int len1 = ptr1 ? ptr1 - name1 : strlen (name1);
+
+  const char *name2 = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl2));
+  const char *ptr2 = strchr (name2, sep);
+  int len2 = ptr2 ? ptr2 - name2 : strlen (name2);
+
+  if (len1 != len2)
+return false;
+  return !strncmp (name1, name2, len1);
+}
+
 /* Compare two virtual tables, PREVAILING and VTABLE and output ODR
violation warnings.  */
 
@@ -758,8 +781,8 @@ compare_virtual_tables (varpool_node *pr
 accept the other case.  */
   while (!end2
 && (end1
-|| (DECL_ASSEMBLER_NAME (ref1->referred->decl)
-!= DECL_ASSEMBLER_NAME (ref2->referred->decl)
+|| (methods_equal_p (ref1->referred->decl,
+ ref2->referred->decl)
 && TREE_CODE (ref1->referred->decl) == FUNCTION_DECL))
 && TREE_CODE (ref2->referred->decl) != FUNCTION_DECL)
{
@@ -785,8 +808,7 @@ compare_virtual_tables (varpool_node *pr
}
   while (!end1
 && (end2
-|| (DECL_ASSEMBLER_NAME (ref2->referred->decl)
-!= DECL_ASSEMBLER_NAME (ref1->referred->decl)
+|| (methods_equal_p (ref2->referred->decl, 
ref1->referred->decl)
 && TREE_CODE (ref2->referred->decl) == FUNCTI

[COMMITTED] Add myself as GCC maintainer

2016-03-29 Thread Kelvin Nilsen


I've added myself to the "Write After Approval" maintainers (Committed 
revision 234526):


2016-03-29  Kelvin Nilsen  

* MAINTAINERS (Write After Approval): Add myself.

--
Kelvin Nilsen, Ph.D.  kdnil...@linux.vnet.ibm.com
home office: 801-756-4821, cell: 520-991-6727
IBM Linux Technology Center - PPC Toolchain
Index: ChangeLog
===
--- ChangeLog   (revision 234524)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2016-03-29  Kelvin Nilsen  
+
+   * MAINTAINERS (Write After Approval): Add myself.
+
 2016-03-17  Cary Coutant  
 
Sync with binutils-gdb:
Index: MAINTAINERS
===
--- MAINTAINERS (revision 234524)
+++ MAINTAINERS (working copy)
@@ -517,6 +517,7 @@ Quentin Neill   

 Adam Nemet 
 Thomas Neumann 
 Dan Nicolaescu 
+Kelvin Nilsen  
 James Norris   
 Diego Novillo  
 Dorit Nuzman   


a patch for PR68695

2016-03-29 Thread Vladimir Makarov

  The following patch improves the code in 2 out of 3 cases in

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68695

  The patch uses more accurate costs for the RA cost improvement 
optimization after colouring.


  The patch was tested and bootstrapped on x86-64.  It is hard to 
create a test to check the correct code generation, so there is 
no test.  As the patch changes heuristics, the generated code will be 
different in some cases, but at least the x86-64 tests expecting specific 
code are not broken by the patch.


  Committed as rev.  234527
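
  The idea behind the new cost term can be sketched on simplified data 
structures as follows.  This is only an illustration, not the IRA code: 
reg_copy and copy_cost_saving are hypothetical names, the copies are kept 
in one flat list instead of per-allocno lists, and a single move_cost 
parameter stands in for the ira_register_move_cost table lookup.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for an IRA copy: a move between two allocnos
// (identified by index) that is executed FREQ times.
struct reg_copy
{
  int first;
  int second;
  int freq;
};

// Return the cost saved on copies involving ALLOCNO if it is assigned
// HARD_REGNO: every copy whose other end already lives in HARD_REGNO
// becomes a removable no-op move.  ASSIGNED maps an allocno index to
// its hard register (-1 for memory); MOVE_COST is the cost of one
// register-to-register move.
int
copy_cost_saving (const std::vector<reg_copy> &copies,
                  const std::vector<int> &assigned,
                  int allocno, int hard_regno, int move_cost)
{
  assert (move_cost >= 0);
  int saving = 0;
  for (const reg_copy &cp : copies)
    {
      int other;
      if (cp.first == allocno)
        other = cp.second;
      else if (cp.second == allocno)
        other = cp.first;
      else
        continue;  // This copy does not involve ALLOCNO.
      if (assigned[other] == hard_regno)
        saving += cp.freq * move_cost;
    }
  return saving;
}
```

  Subtracting such a saving from the cost of a candidate hard register 
makes registers that eliminate frequent copies look cheaper, which is what 
the adjusted base_cost and costs[hregno] computations in the patch do.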

Index: ChangeLog
===
--- ChangeLog	(revision 234526)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2016-03-29  Vladimir Makarov  
+
+	PR rtl-optimization/68695
+	* ira-color.c (allocno_copy_cost_saving): New.
+	(improve_allocation): Use it.
+
 2016-03-29  Richard Henderson  
 
 	PR middle-end/70355
Index: ira-color.c
===
--- ira-color.c	(revision 234526)
+++ ira-color.c	(working copy)
@@ -2728,6 +2728,37 @@ allocno_cost_compare_func (const void *v
   return ALLOCNO_NUM (p1) - ALLOCNO_NUM (p2);
 }
 
+/* Return savings on removed copies when ALLOCNO is assigned to
+   HARD_REGNO.  */
+static int
+allocno_copy_cost_saving (ira_allocno_t allocno, int hard_regno)
+{
+  int cost = 0;
+  enum reg_class rclass;
+  ira_copy_t cp, next_cp;
+
+  rclass = REGNO_REG_CLASS (hard_regno);
+  for (cp = ALLOCNO_COPIES (allocno); cp != NULL; cp = next_cp)
+{
+  if (cp->first == allocno)
+	{
+	  next_cp = cp->next_first_allocno_copy;
+	  if (ALLOCNO_HARD_REGNO (cp->second) != hard_regno)
+	continue;
+	}
+  else if (cp->second == allocno)
+	{
+	  next_cp = cp->next_second_allocno_copy;
+	  if (ALLOCNO_HARD_REGNO (cp->first) != hard_regno)
+	continue;
+	}
+  else
+	gcc_unreachable ();
+  cost += cp->freq * ira_register_move_cost[ALLOCNO_MODE (allocno)][rclass][rclass];
+}
+  return cost;
+}
+
 /* We used Chaitin-Briggs coloring to assign as many pseudos as
possible to hard registers.  Let us try to improve allocation with
cost point of view.  This function improves the allocation by
@@ -2768,9 +2799,7 @@ improve_allocation (void)
 	continue;
   check++;
   aclass = ALLOCNO_CLASS (a);
-  allocno_costs = ALLOCNO_UPDATED_HARD_REG_COSTS (a);
-  if (allocno_costs == NULL)
-	allocno_costs = ALLOCNO_HARD_REG_COSTS (a);
+  allocno_costs = ALLOCNO_HARD_REG_COSTS (a);
   if ((hregno = ALLOCNO_HARD_REGNO (a)) < 0)
 	base_cost = ALLOCNO_UPDATED_MEMORY_COST (a);
   else if (allocno_costs == NULL)
@@ -2779,7 +2808,8 @@ improve_allocation (void)
 	   case).  */
 	continue;
   else
-	base_cost = allocno_costs[ira_class_hard_reg_index[aclass][hregno]];
+	base_cost = (allocno_costs[ira_class_hard_reg_index[aclass][hregno]]
+		 - allocno_copy_cost_saving (a, hregno));
   try_p = false;
   get_conflict_and_start_profitable_regs (a, false,
 	  conflicting_regs,
@@ -2797,6 +2827,7 @@ improve_allocation (void)
 	  k = allocno_costs == NULL ? 0 : j;
 	  costs[hregno] = (allocno_costs == NULL
 			   ? ALLOCNO_UPDATED_CLASS_COST (a) : allocno_costs[k]);
+	  costs[hregno] -= allocno_copy_cost_saving (a, hregno);
 	  costs[hregno] -= base_cost;
 	  if (costs[hregno] < 0)
 	try_p = true;
@@ -2835,14 +2866,13 @@ improve_allocation (void)
 	  k = (ira_class_hard_reg_index
 		   [ALLOCNO_CLASS (conflict_a)][conflict_hregno]);
 	  ira_assert (k >= 0);
-	  if ((allocno_costs = ALLOCNO_UPDATED_HARD_REG_COSTS (conflict_a))
+	  if ((allocno_costs = ALLOCNO_HARD_REG_COSTS (conflict_a))
 		  != NULL)
 		spill_cost -= allocno_costs[k];
-	  else if ((allocno_costs = ALLOCNO_HARD_REG_COSTS (conflict_a))
-		   != NULL)
-		spill_cost -= allocno_costs[k];
 	  else
 		spill_cost -= ALLOCNO_UPDATED_CLASS_COST (conflict_a);
+	  spill_cost
+		+= allocno_copy_cost_saving (conflict_a, conflict_hregno);
 	  conflict_nregs
 		= hard_regno_nregs[conflict_hregno][ALLOCNO_MODE (conflict_a)];
 	  for (r = conflict_hregno;


Re: [RFC][PATCH v2, ARM 5/8] ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers

2016-03-29 Thread Andre Vieira (lists)
On 29/01/16 17:07, Andre Vieira (lists) wrote:
> On 26/12/15 01:54, Thomas Preud'homme wrote:
>> [Sending on behalf of Andre Vieira]
>>
>> Hello,
>>
>> This patch extends support for the ARMv8-M Security Extensions
>> 'cmse_nonsecure_entry' attribute to safeguard against leak of
>> information through unbanked registers.
>>
>> When returning from a nonsecure entry function we clear all
>> caller-saved registers that are not used to pass return values, by
>> writing either the LR, in case of general purpose registers, or the
>> value 0, in case of FP registers. We use the LR to write to APSR and
>> FPSCR too. We currently only support 32 FP registers as in we only
>> clear D0-D7.
>> We currently do not support entry functions that pass arguments or
>> return variables on the stack and we diagnose this. This patch relies
>> on the existing code to make sure callee-saved registers used in
>> cmse_nonsecure_entry functions are saved and restored thus retaining
>> their nonsecure mode value, this should be happening already as it is
>> required by AAPCS.
>>
>>
>> *** gcc/ChangeLog ***
>> 2015-10-27  Andre Vieira
>>  Thomas Preud'homme  
>>
>>  * gcc/config/arm/arm.c (output_return_instruction): Clear
>>registers.
>>(thumb2_expand_return): Likewise.
>>(thumb1_expand_epilogue): Likewise.
>>(arm_expand_epilogue): Likewise.
>>(cmse_nonsecure_entry_clear_before_return): New.
>>  * gcc/config/arm/arm.h (TARGET_DSP_ADD): New macro define.
>>  * gcc/config/arm/thumb1.md (*epilogue_insns): Change length
>> attribute.
>>  * gcc/config/arm/thumb2.md (*thumb2_return): Likewise.
>>
>> *** gcc/testsuite/ChangeLog ***
>> 2015-10-27  Andre Vieira
>>  Thomas Preud'homme  
>>
>>  * gcc.target/arm/cmse/cmse.exp: Test different multilibs
>> separate.
>>  * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers
>> are cleared.
>>  * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
>>  * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
>>  * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
>>  * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
>>  * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.
>>
>>
>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>> index
>> f12e3c93bbe24b10ed8eee6687161826773ef649..b06e0586a3da50f57645bda13629bc4dbd3d53b7
>> 100644
>> --- a/gcc/config/arm/arm.h
>> +++ b/gcc/config/arm/arm.h
>> @@ -230,6 +230,9 @@ extern void
>> (*arm_lang_output_object_attributes_hook)(void);
>>   /* Integer SIMD instructions, and extend-accumulate instructions.  */
>>   #define TARGET_INT_SIMD \
>> (TARGET_32BIT && arm_arch6 && (arm_arch_notm || arm_arch7em))
>> +/* Parallel addition and subtraction instructions.  */
>> +#define TARGET_DSP_ADD \
>> +  (TARGET_ARM_ARCH >= 6 && (arm_arch_notm || arm_arch7em))
>>
>>   /* Should MOVW/MOVT be used in preference to a constant pool.  */
>>   #define TARGET_USE_MOVT \
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index
>> e530b772e3cc053c16421a2a2861d815d53ebb01..0700478ca38307f35d0cb01f83ea182802ba28fa
>> 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -19755,6 +19755,24 @@ output_return_instruction (rtx operand, bool
>> really_return, bool reverse,
>>   default:
>> if (IS_CMSE_ENTRY (func_type))
>>   {
>> +  char flags[12] = "APSR_nzcvq";
>> +  /* Check if we have to clear the 'GE bits' which is only
>> used if
>> + parallel add and subtraction instructions are available.  */
>> +  if (TARGET_DSP_ADD)
>> +{
>> +  /* If so also clear the ge flags.  */
>> +  flags[10] = 'g';
>> +  flags[11] = '\0';
>> +}
>> +  snprintf (instr, sizeof (instr),  "msr%s\t%s, %%|lr",
>> conditional,
>> +flags);
>> +  output_asm_insn (instr, & operand);
>> +  if (TARGET_HARD_FLOAT && TARGET_VFP)
>> +{
>> +  snprintf (instr, sizeof (instr), "vmsr%s\tfpscr, %%|lr",
>> +conditional);
>> +  output_asm_insn (instr, & operand);
>> +}
>> snprintf (instr, sizeof (instr), "bxns%s\t%%|lr",
>> conditional);
>>   }
>> /* Use bx if it's available.  */
>> @@ -23999,6 +24017,17 @@ thumb_pop (FILE *f, unsigned long mask)
>>   static void
>>   thumb1_cmse_nonsecure_entry_return (FILE *f, int
>> reg_containing_return_addr)
>>   {
>> +  char flags[12] = "APSR_nzcvq";
>> +  /* Check if we have to clear the 'GE bits' which is only used if
>> + parallel add and subtraction instructions are available.  */
>> +  if (TARGET_DSP_ADD)
>> +{
>> +  flags[10] = 'g';
>> +  flags[11] = '\0';
>> +}
>> +  asm_fprintf (f, "\tmsr\t%s, %r\n", flags, reg_containing_return_addr);
>> +  if (TARGET_HARD_FLOAT && TARGET_VFP)
>> +asm_fprintf (f, "\tvmsr\tfpscr, %r\n", r

Re: [RFC][PATCH, ARM 0/8] ARMv8-M Security Extensions

2016-03-29 Thread Andre Vieira (lists)
On 26/12/15 01:39, Thomas Preud'homme wrote:
> [Sending on behalf of Andre Vieira]
> 
> Hello,
> 
> This patch series aims at implementing an alpha status support for ARMv8-M's 
> Security Extensions. It is only posted as RFC at this stage. You can find the 
> specification of ARMV8-M Security Extensions in: ARM®v8-M Security 
> Extensions: Requirements on Development Tools 
> (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).
> 
> We currently:
> - do not support passing arguments or returning on the stack for 
> cmse_nonsecure_{call,entry} functions,
> - do not guarantee padding bits are cleared for arguments or return variables 
> of cmse_nonsecure_{call,entry} functions,
> - only test Security Extensions for -mfpu=fpv5-d16 and fpv5-sp-d16 and only 
> support single and double precision FPU's with d16.
> 
> 
> Andre Vieira (8):
>  Add support for ARMv8-M's Security Extensions flag and intrinsics
>  Add RTL patterns for thumb1 push/pop
>  Handling ARMv8-M Security Extension's cmse_nonsecure_entry attribute
>  ARMv8-M Security Extension's cmse_nonsecure_entry: __acle_se label and bxns 
> return
>  ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers
>  Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute
>  ARMv8-M Security Extension's cmse_nonsecure_call: use 
> __gnu_cmse_nonsecure_call
>  Added support for ARMV8-M Security Extension cmse_nonsecure_caller intrinsic
> 
> 
> Cheers,
> 
> Andre
> 

Hi there, with the second version of the patch to clear registers when
returning from cmse_nonsecure_entry functions, we guarantee that padding
bits are cleared when returning from a cmse_nonsecure_entry function.
However, we still do not guarantee this when passing compound types as
arguments to cmse_nonsecure_call functions.

Furthermore, patch 2/8 has been dropped since it was no longer relevant.

Andre Vieira (8):
 Add support for ARMv8-M's Security Extensions flag and intrinsics
 Add RTL patterns for thumb1 push/pop (DROPPED)
 Handling ARMv8-M Security Extension's cmse_nonsecure_entry attribute
 ARMv8-M Security Extension's cmse_nonsecure_entry: __acle_se label and
bxns return
 ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers
 Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute
 ARMv8-M Security Extension's cmse_nonsecure_call: use
__gnu_cmse_nonsecure_call
 Added support for ARMV8-M Security Extension cmse_nonsecure_caller
intrinsic


Cheers,
Andre


Re: [Patch, Fortran, pr70397, v1] [5/6 Regression] ice while allocating ultimate polymorphic

2016-03-29 Thread Andre Vehreschild
Hi Paul, hi Dominique

thanks for the fast review and error check, respectively. Committed as
r234528.

Regards,
Andre

On Tue, 29 Mar 2016 15:34:13 +0200
Paul Richard Thomas  wrote:

> Hi Andre,
> 
> Yes, it is better to play safe :-) OK for trunk.
> 
> Thanks
> 
> Paul
> 
> On 29 March 2016 at 14:55, Andre Vehreschild  wrote:
> > Hi all,
> >
> > here is the trunk version of the patch for the regression reported in
> > pr70397. Applying the gcc-5 patch to trunk led to a regression, which
> > the modified patch resolves now. The technique to solve the ice is
> > the same as for gcc-5:
> >  
> >> The routine gfc_copy_class_to_class() assumed that both the source
> >> and destination object's type is unlimited polymorphic, but in this
> >> case it is true for the destination only, which made gfortran look
> >> for a non-existent _len component in the source object and therefore
> >> ICE. This is fixed by the patch by adding a function to return either
> >> the _len component, when it exists, or a constant zero node to init
> >> the destination object's _len component with.  
> >
> > Bootstrapped and regtested on x86_64-linux-gnu/F23. Ok for trunk?
> >
> > Regards,
> > Andre
> >
> > PS: Yes, Paul, I know you accepted the patch for gcc-5 for trunk
> > also, but I feel safer when the changes made get additional approval.
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  
> 
> 
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(Revision 234523)
+++ gcc/fortran/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,12 @@
+2016-03-29  Andre Vehreschild  
+
+	PR fortran/70397
+	* trans-expr.c (gfc_class_len_or_zero_get): Add function to return a
+	constant zero tree, when the class to get the _len component from is
+	not unlimited polymorphic.
+	(gfc_copy_class_to_class): Use the new function.
+	* trans.h: Added interface of new function gfc_class_len_or_zero_get.
+
 2016-03-28  Alessandro Fanfarillo  
 
 	* trans-decl.c (gfc_build_builtin_function_decls):
Index: gcc/fortran/trans-expr.c
===
--- gcc/fortran/trans-expr.c	(Revision 234523)
+++ gcc/fortran/trans-expr.c	(Arbeitskopie)
@@ -173,6 +173,29 @@
 }
 
 
+/* Try to get the _len component of a class.  When the class is not unlimited
+   poly, i.e. no _len field exists, then return a zero node.  */
+
+tree
+gfc_class_len_or_zero_get (tree decl)
+{
+  tree len;
+  /* For class arrays decl may be a temporary descriptor handle, the vptr is
+ then available through the saved descriptor.  */
+  if (TREE_CODE (decl) == VAR_DECL && DECL_LANG_SPECIFIC (decl)
+  && GFC_DECL_SAVED_DESCRIPTOR (decl))
+decl = GFC_DECL_SAVED_DESCRIPTOR (decl);
+  if (POINTER_TYPE_P (TREE_TYPE (decl)))
+decl = build_fold_indirect_ref_loc (input_location, decl);
+  len = gfc_advance_chain (TYPE_FIELDS (TREE_TYPE (decl)),
+			   CLASS_LEN_FIELD);
+  return len != NULL_TREE ? fold_build3_loc (input_location, COMPONENT_REF,
+	 TREE_TYPE (len), decl, len,
+	 NULL_TREE)
+			  : integer_zero_node;
+}
+
+
 /* Get the specified FIELD from the VPTR.  */
 
 static tree
@@ -250,6 +273,7 @@
 
 #undef CLASS_DATA_FIELD
 #undef CLASS_VPTR_FIELD
+#undef CLASS_LEN_FIELD
 #undef VTABLE_HASH_FIELD
 #undef VTABLE_SIZE_FIELD
 #undef VTABLE_EXTENDS_FIELD
@@ -1120,7 +1144,7 @@
   if (unlimited)
 {
   if (from != NULL_TREE && unlimited)
-	from_len = gfc_class_len_get (from);
+	from_len = gfc_class_len_or_zero_get (from);
   else
 	from_len = integer_zero_node;
 }
Index: gcc/fortran/trans.h
===
--- gcc/fortran/trans.h	(Revision 234523)
+++ gcc/fortran/trans.h	(Arbeitskopie)
@@ -365,6 +365,7 @@
 tree gfc_class_data_get (tree);
 tree gfc_class_vptr_get (tree);
 tree gfc_class_len_get (tree);
+tree gfc_class_len_or_zero_get (tree);
 gfc_expr * gfc_find_and_cut_at_last_class_ref (gfc_expr *);
 /* Get an accessor to the class' vtab's * field, when a class handle is
available.  */
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog	(Revision 234523)
+++ gcc/testsuite/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,9 @@
+2016-03-29  Andre Vehreschild  
+
+	PR fortran/70397
+	* gfortran.dg/unlimited_polymorphic_25.f90: New test.
+	* gfortran.dg/unlimited_polymorphic_26.f90: New test.
+
 2016-03-29  Thomas Schwinge  
 
 	PR testsuite/64177
Index: gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90
===
--- gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90	(nicht existent)
+++ gcc/testsuite/gfortran.dg/unlimited_polymorphic_25.f90	(Arbeitskopie)
@@ -0,0 +1,40 @@
+! { dg-do run }
+!
+! Test contributed by Valery Weber  
+
+module mod
+
+  TYPE, PUBLIC :: base_type
+  END TYPE base_type
+

[PATCH] Fix ix86_expand_vector_set (PR target/70421)

2016-03-29 Thread Jakub Jelinek
Hi!

The various blendm expanders look like:
(define_insn "_blendm"
  [(set (match_operand:V48_AVX512VL 0 "register_operand" "=v")
(vec_merge:V48_AVX512VL
  (match_operand:V48_AVX512VL 2 "nonimmediate_operand" "vm")
  (match_operand:V48_AVX512VL 1 "register_operand" "v")
  (match_operand: 3 "register_operand" "Yk")))]
  "TARGET_AVX512F"
  "vblendm\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "evex")
   (set_attr "mode" "")])
(i.e. their operands[1] is the second argument of VEC_MERGE (aka the value
to take elements from for bits cleared in the mask), while operands[2]
is the first argument of VEC_MERGE (aka the value to take elements from for
bits set in the mask)), so the call to gen_blendm, which wants to insert a
single element (with index elt) into target by broadcasting val into a
temporary and using a mask of 1 << elt, uses the wrong order of arguments.
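
To make the operand order concrete, here is a minimal scalar model in C of
the VEC_MERGE semantics described above (bit set in the mask: take the
element from the first argument; bit clear: take it from the second).  The
helper names and the fixed element count are illustrative assumptions, not
code from i386.c.

```c
#include <stdint.h>

#define NELTS 16  /* Assumed lane count, e.g. sixteen 32-bit elements.  */

/* Scalar model of (vec_merge a b mask): lane I comes from A when bit I
   of MASK is set and from B when it is clear.  */
static void
vec_merge_model (uint32_t *dst, const uint32_t *a, const uint32_t *b,
                 uint32_t mask)
{
  for (int i = 0; i < NELTS; i++)
    dst[i] = ((mask >> i) & 1) ? a[i] : b[i];
}

/* Intended behavior: broadcast VAL, then take only lane ELT from the
   broadcast, so the broadcast has to be the first VEC_MERGE argument.  */
static void
vector_set_model (uint32_t *target, uint32_t val, int elt)
{
  uint32_t tmp[NELTS];
  for (int i = 0; i < NELTS; i++)
    tmp[i] = val;
  vec_merge_model (target, tmp, target, (uint32_t) 1 << elt);
}

/* Swapped order, modelling the bug: lane ELT keeps its old value and
   every other lane is overwritten with VAL.  */
static void
vector_set_model_swapped (uint32_t *target, uint32_t val, int elt)
{
  uint32_t tmp[NELTS];
  for (int i = 0; i < NELTS; i++)
    tmp[i] = val;
  vec_merge_model (target, target, tmp, (uint32_t) 1 << elt);
}
```

In this model, setting lane 3 of a vector to 7 changes only lane 3 with
vector_set_model, while vector_set_model_swapped leaves lane 3 alone and
writes 7 everywhere else.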

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2016-03-27  Jakub Jelinek  

PR target/70421
* config/i386/i386.c (ix86_expand_vector_set): Fix up argument order
in gen_blendm expander.

* gcc.dg/torture/pr70421.c: New test.
* gcc.target/i386/avx512f-pr70421.c: New test.

--- gcc/config/i386/i386.c.jj   2016-03-23 10:41:12.0 +0100
+++ gcc/config/i386/i386.c  2016-03-27 22:32:51.748280358 +0200
@@ -46930,7 +46930,7 @@ half:
 {
   tmp = gen_reg_rtx (mode);
   emit_insn (gen_rtx_SET (tmp, gen_rtx_VEC_DUPLICATE (mode, val)));
-  emit_insn (gen_blendm (target, tmp, target,
+  emit_insn (gen_blendm (target, target, tmp,
 force_reg (mmode,
 gen_int_mode (1 << elt, mmode))));
 }
--- gcc/testsuite/gcc.dg/torture/pr70421.c.jj   2016-03-29 09:25:37.015469084 +0200
+++ gcc/testsuite/gcc.dg/torture/pr70421.c  2016-03-29 09:25:13.0 +0200
@@ -0,0 +1,22 @@
+/* PR target/70421 */
+/* { dg-do run } */
+/* { dg-additional-options "-Wno-psabi -w" } */
+
+typedef unsigned V __attribute__ ((vector_size (64)));
+
+unsigned __attribute__ ((noinline, noclone))
+foo (unsigned x, V u, V v)
+{
+  v[1] ^= v[2];
+  x ^= ((V) v)[u[0]];
+  return x;
+}
+
+int
+main ()
+{
+  unsigned x = foo (0x10, (V) { 1 }, (V) { 0x100, 0x1000, 0x1 });
+  if (x != 0x11010)
+__builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/i386/avx512f-pr70421.c.jj  2016-03-29 09:26:23.380837148 +0200
+++ gcc/testsuite/gcc.target/i386/avx512f-pr70421.c 2016-03-29 09:27:37.066832846 +0200
@@ -0,0 +1,15 @@
+/* PR target/70421 */
+/* { dg-do run } */
+/* { dg-require-effective-target avx512f } */
+/* { dg-options "-O2 -mavx512f" } */
+
+#include "avx512f-check.h"
+
+#define main() do_main()
+#include "../../gcc.dg/torture/pr70421.c"
+
+static void
+avx512f_test (void)
+{
+  do_main ();
+}

Jakub


Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-29 Thread Richard Biener
On March 29, 2016 4:45:44 PM GMT+02:00, David Edelsohn  
wrote:
>On Mon, Mar 28, 2016 at 8:38 PM, Bill Schmidt
> wrote:
>> Hi,
>>
>> For a long time we've had hundreds of failing guality tests.  These
>> failures don't seem to have any correlation with gdb functionality for
>> POWER, which is working fine.  At this point the value of these tests to
>> us seems questionable.  Fixing these is such low priority that it is
>> unlikely we will ever get around to it.  In the meanwhile, the failures
>> simply clutter up our regression test reports.  Thus I'd like to disable
>> them, and that's what this test does.
>>
>> Verified to remove hundreds of failure messages on
>> powerpc64le-unknown-linux-gnu. :)  Is this ok for trunk?
>>
>> Thanks,
>> Bill
>>
>>
>> 2016-03-28  Bill Schmidt  
>>
>> * g++.dg/guality/guality.exp: Disable for powerpc*-linux*.
>> * gcc.dg/guality/guality.exp: Likewise.
>
>Thanks for everyone else's suggestions.
>
>As far as we understand, debugging quality on POWER is equivalent to
>other targets.
>
>There is an issue with PPC64 BE and AIX requiring an extra frame push
>when debugging is enabled, which will cause differences between code
>with debugging enabled and debugging disabled.  THIS WILL NOT BE
>CHANGED.

Guality does not check for this, but guality tests are in essence debug info
tests (using gdb).  So for the failing test cases, debug quality is definitely
_not_ on par with x86 Linux.

Richard.

>We have no plans to make code generation a slave to the testsuite.
>The testsuite is a tool, successful results from the testsuite is not
>a goal unto itself.
>
>This patch is okay.
>
>Thanks, David




Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-29 Thread Bill Schmidt
Hi Jakub,

Thanks for the information; I really do appreciate it!

On Tue, 2016-03-29 at 17:33 +0200, Jakub Jelinek wrote:
> On Tue, Mar 29, 2016 at 08:19:39AM -0500, Bill Schmidt wrote:
> > When you say that "the debug info quality is already pretty bad on
> > powerpc*," do you mean that it is known to be bad, or simply that we
> > have a lot of guality failures that may or may not indicate that the
> > debug info is bad?  I don't have experiential evidence of bad debug info
> > that shows up during debugging sessions.  Perhaps these are corner cases
> 
> A lot of effort has been spent on x86_64/i?86 to improve the debug info
> for optimized code, while far less effort has been spent on it for say
> powerpc* or s390*.  Many of the guality testcases are derived from
> real-world programs, such as the Linux kernel or python or other packages
> where the lack of QoI of debug info (or sometimes even wrong debug info)
> caused some tool failures or has been a major obstacle to users so that
> they couldn't debug something important.
> And for evidence, we have e.g. in redhat.com bugzilla, significantly more
> complaints about debug info on powerpc*/s390* than on i?86/x86_64.

This is good information.  Unfortunately, this is the first time I've
been made aware of it.  If these bugs aren't posted to the FSF bugzilla,
or mirrored to us, we are ignorant that there is even a problem.  At the
moment I'm not aware of any bug reports about debug info on powerpc*.
Please pass these along to me as they arise.  We can't prioritize what
we can't see.

> > before I joined this project, and from what others tell me, for at least
> > a decade.  As you suggest here, others have always told me just to
> > ignore the existing guality failures.  However, this can easily lead to
> 
> Then you've been told wrong suggestions.  You should just keep comparing
> the results against older ones.

That is what I meant.  I apologize for the unclear language.

> 
> > The other point, "it would be really very much desirable if
> > anyone had time to analyze some of them and improve stuff," has to be
> > answered by "apparently nobody does."  I am currently tracking well over
> 
> That would be a wrong answer, several man-years have been spent on analyzing
> and improving those, by Alex, myself, Richi, various others.

Again, this is good information to know about.  But the "stuff" we were
talking about was the failures on powerpc*, and I took what you said to
mean that nobody was working on those.  It sounds like you're saying
that the community has spent time on debug improvements for optimized
code on x86_64/i?86, but only for that target.  Is that a fair
statement?  If so, it seems unsurprising that you would get more bug
reports for the debug information on powerpc* and s/390.

I'm not trying to be critical here.  I'm trying to understand the value
offered by these tests (which I do much better now, thanks), since we
have to prioritize our work carefully for the resources that we have.

Thanks,
Bill



Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-29 Thread Jakub Jelinek
On Tue, Mar 29, 2016 at 12:01:20PM -0500, Bill Schmidt wrote:
> Again, this is good information to know about.  But the "stuff" we were
> talking about was the failures on powerpc*, and I took what you said to
> mean that nobody was working on those.  It sounds like you're saying
> that the community has spent time on debug improvements for optimized
> code on x86_64/i?86, but only for that target.  Is that a fair
> statement?  If so, it seems unsurprising that you would get more bug

Well, most of the analysis has been done on x86_64/i?86.  The bug fixes,
DWARF enhancements etc. were then in various areas, if something has been
improved through some GIMPLE change, then likely all targets benefited,
if it was something at the RTL level (or var-tracking pass itself), then
it really depends on the various properties of the machine descriptions,
argument passing etc.
I'm not saying it is possible to have all the guality tests pass at all
optimization levels on all targets, sometimes the value of some variable
is really lost through optimizations and can't be reconstructed in any way,
sometimes it is too costly to track it, etc.
In other cases we have yet to create new DWARF extensions; a known example
is debugging vectorized loops: what kind of user experience do we want if a
single set of instructions handles multiple iterations of the loop?
Do we want the user to think he is seeing e.g. the first iteration, then the
fifth one, then the ninth etc., or provide enough info for the debuggers so
that the user could find out he is in a vectorized loop and explicitly
request that he is e.g. interested in the 3rd iteration instead of the 1st?
Then there will certainly be cases where even without adding any extensions
one can just add some smarts to var-tracking, or change other GCC internals
to handle some stuff better.

Jakub


[PATCH] Fix num_imm_uses (PR tree-optimization/70405)

2016-03-29 Thread Jakub Jelinek
Hi!

The recent change to num_imm_uses (to add support for NULL USE_STMT)
broke it totally, fortunately we have just one user of this function
right now.  I've filed a PR for GCC 7 so that we get a warning on this.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2016-03-29  Jakub Jelinek  

PR tree-optimization/70405
* ssa-iterators.h (num_imm_uses): Add missing braces.

* gcc.dg/pr70405.c: New test.

--- gcc/ssa-iterators.h.jj  2016-01-04 14:55:53.0 +0100
+++ gcc/ssa-iterators.h 2016-03-29 14:31:16.773551024 +0200
@@ -448,9 +448,11 @@ num_imm_uses (const_tree var)
   unsigned int num = 0;
 
   if (!MAY_HAVE_DEBUG_STMTS)
-for (ptr = start->next; ptr != start; ptr = ptr->next)
-  if (USE_STMT (ptr))
-   num++;
+{
+  for (ptr = start->next; ptr != start; ptr = ptr->next)
+   if (USE_STMT (ptr))
+ num++;
+}
   else
 for (ptr = start->next; ptr != start; ptr = ptr->next)
   if (USE_STMT (ptr) && !is_gimple_debug (USE_STMT (ptr)))
--- gcc/testsuite/gcc.dg/pr70405.c.jj   2016-03-29 14:49:18.252808104 +0200
+++ gcc/testsuite/gcc.dg/pr70405.c  2016-03-29 14:33:41.0 +0200
@@ -0,0 +1,15 @@
+/* PR tree-optimization/70405 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcompare-debug" } */
+/* { dg-additional-options "-mavx512f" { target i?86-*-* x86_64-*-* } } */
+
+typedef short V __attribute__ ((vector_size (32)));
+
+int
+foo (V *p)
+{
+  V v = *p;
+  v >>= v;
+  v -= v[0];
+  return v[3];
+}

Jakub


Re: [PATCH] Fix num_imm_uses (PR tree-optimization/70405)

2016-03-29 Thread Jeff Law

On 03/29/2016 11:23 AM, Jakub Jelinek wrote:

Hi!

The recent change to num_imm_uses (to add support for NULL USE_STMT)
broke it totally, fortunately we have just one user of this function
right now.  I've filed a PR for GCC 7 so that we get a warning on this.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2016-03-29  Jakub Jelinek  

PR tree-optimization/70405
* ssa-iterators.h (num_imm_uses): Add missing braces.

* gcc.dg/pr70405.c: New test.
Not caught by -Wmisleading-indentation?  Seems like it'd be worth a bug 
report for that.


OK for the trunk.
jeff



[PATCH] Fix simplify_shift_const_1 once more (PR rtl-optimization/70429)

2016-03-29 Thread Jakub Jelinek
Hi!

This is a case similar to the LSHIFTRT I've fixed recently.
But, unlike LSHIFTRT, which can be handled by masking at the outer level,
ASHIFTRT would need outer sign extension, so most of the time 2 outer
operations in addition to the kept two inner shifts, which is IMHO very
unlikely to ever be successfully combined on any target nor actually
beneficial.  So this patch just avoids that optimization for ASHIFTRT
if there are different modes.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-03-29  Jakub Jelinek  

PR rtl-optimization/70429
* combine.c (simplify_shift_const_1): For ASHIFTRT don't optimize
(cst1 >> count) >> cst2 into (cst1 >> cst2) >> count if
mode != result_mode.

* gcc.c-torture/execute/pr70429.c: New test.

--- gcc/combine.c.jj    2016-03-15 17:11:17.0 +0100
+++ gcc/combine.c   2016-03-29 10:40:11.835477469 +0200
@@ -10533,6 +10533,11 @@ simplify_shift_const_1 (enum rtx_code co
   >> orig_count, result_mode,
   &complement_p))
break;
+ /* For ((int) (cstLL >> count)) >> cst2 just give up.  Queing up
+outer sign extension (often left and right shift) is hardly
+more efficient than the original.  See PR70429.  */
+ if (code == ASHIFTRT && mode != result_mode)
+   break;
 
  rtx new_rtx = simplify_const_binary_operation (code, mode,
 XEXP (varop, 0),
--- gcc/testsuite/gcc.c-torture/execute/pr70429.c.jj    2016-03-29 10:42:07.517901546 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr70429.c   2016-03-29 10:41:52.0 +0200
@@ -0,0 +1,17 @@
+/* PR rtl-optimization/70429 */
+
+__attribute__((noinline, noclone)) int
+foo (int a)
+{
+  return (int) (0x14ff6e2207db5d1fLL >> a) >> 4;
+}
+
+int
+main ()
+{
+  if (sizeof (int) != 4 || sizeof (long long) != 8 || __CHAR_BIT__ != 8)
+return 0;
+  if (foo (1) != 0x3edae8 || foo (2) != -132158092)
+__builtin_abort ();
+  return 0;
+}

Jakub


Re: [PATCH] Fix num_imm_uses (PR tree-optimization/70405)

2016-03-29 Thread Jakub Jelinek
On Tue, Mar 29, 2016 at 11:28:20AM -0600, Jeff Law wrote:
> On 03/29/2016 11:23 AM, Jakub Jelinek wrote:
> >The recent change to num_imm_uses (to add support for NULL USE_STMT)
> >broke it totally, fortunately we have just one user of this function
> >right now.  I've filed a PR for GCC 7 so that we get a warning on this.
> >
> >Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> >trunk?
> >
> >2016-03-29  Jakub Jelinek  
> >
> > PR tree-optimization/70405
> > * ssa-iterators.h (num_imm_uses): Add missing braces.
> >
> > * gcc.dg/pr70405.c: New test.
> Not caught by -Wmisleading-indentation?  Seems like it'd be worth a bug
> report for that.

Not caught.  I've filed PR70436 for that.

Jakub


Re: [RFA][PATCH][tree-optimization/64058] Add new coalescing tie breaker heuristic V2

2016-03-29 Thread Peter Bergner
On Wed, 2016-03-23 at 01:49 -0600, Jeff Law wrote:
>  
> +/* This represents a conflict graph.  Implemented as an array of bitmaps.
> +   A full matrix is used for conflicts rather than just upper triangular 
> form.
> +   this make sit much simpler and faster to perform conflict merges.  */

s/make sit/makes it/

Peter




Re: [PATCH] Fix simplify_shift_const_1 once more (PR rtl-optimization/70429)

2016-03-29 Thread Jeff Law

On 03/29/2016 11:21 AM, Jakub Jelinek wrote:

Hi!

This is a case similar to the LSHIFTRT I've fixed recently.
But, unlike LSHIFTRT, which can be handled by masking at the outer level,
ASHIFTRT would need outer sign extension, so most of the time 2 outer
operations in addition to the kept two inner shifts, which is IMHO very
unlikely to ever be successfully combined on any target nor actually
beneficial.  So this patch just avoids that optimization for ASHIFTRT
if there are different modes.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-03-29  Jakub Jelinek  

PR rtl-optimization/70429
* combine.c (simplify_shift_const_1): For ASHIFTRT don't optimize
(cst1 >> count) >> cst2 into (cst1 >> cst2) >> count if
mode != result_mode.

* gcc.c-torture/execute/pr70429.c: New test.
But isn't the point of this code that cst1 >> cst2 turns into a compile 
time constant just leaving one runtime shift of the result by count?



Jeff


Re: [PATCH] Fix simplify_shift_const_1 once more (PR rtl-optimization/70429)

2016-03-29 Thread Jakub Jelinek
On Tue, Mar 29, 2016 at 11:34:29AM -0600, Jeff Law wrote:
> >This is a case similar to the LSHIFTRT I've fixed recently.
> >But, unlike LSHIFTRT, which can be handled by masking at the outer level,
> >ASHIFTRT would need outer sign extension, so most of the time 2 outer
> >operations in addition to the kept two inner shifts, which is IMHO very
> >unlikely to ever be successfully combined on any target nor actually
> >beneficial.  So this patch just avoids that optimization for ASHIFTRT
> >if there are different modes.
> >
> >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> >2016-03-29  Jakub Jelinek  
> >
> > PR rtl-optimization/70429
> > * combine.c (simplify_shift_const_1): For ASHIFTRT don't optimize
> > (cst1 >> count) >> cst2 into (cst1 >> cst2) >> count if
> > mode != result_mode.
> >
> > * gcc.c-torture/execute/pr70429.c: New test.
> But isn't the point of this code that cst1 >> cst2 turns into a compile time
> constant just leaving one runtime shift of the result by count?

But with the mode change then you are changing
(cst1 >> count) >> cst2
into
((((cst1 >> cst2) >> count) << (bitsz - cst2)) >> (bitsz - cst2))
so even when cst1 >> cst2 from there is a constant, the rest is not.
Even on targets that have instructions that let you sign extend
in one insn (more often only for selected values of cst2), it is still
trading two shifts for one shift and one sign extension.

If mode == result_mode, then we'll still optimize
(cst1 >> count) >> cst2, it is just the
((narrowing_conversion) (cst1 >> count)) >> cst2 which this patch changes.

Jakub


Re: [PATCH] Fix ix86_expand_vector_set (PR target/70421)

2016-03-29 Thread Jeff Law

On 03/29/2016 11:05 AM, Jakub Jelinek wrote:

Hi!

The various blendm expanders look like:
(define_insn "_blendm"
   [(set (match_operand:V48_AVX512VL 0 "register_operand" "=v")
 (vec_merge:V48_AVX512VL
   (match_operand:V48_AVX512VL 2 "nonimmediate_operand" "vm")
   (match_operand:V48_AVX512VL 1 "register_operand" "v")

One could argue this ordering is just asking for trouble.

Ultimately, I'll defer to Uros and Kirill.

Jeff



Re: [PATCH] Fix num_imm_uses (PR tree-optimization/70405)

2016-03-29 Thread Bernd Schmidt

On 03/29/2016 07:28 PM, Jeff Law wrote:

On 03/29/2016 11:23 AM, Jakub Jelinek wrote:

Hi!

The recent change to num_imm_uses (to add support for NULL USE_STMT)
broke it totally, fortunately we have just one user of this function
right now.  I've filed a PR for GCC 7 so that we get a warning on this.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for
trunk?

2016-03-29  Jakub Jelinek  

PR tree-optimization/70405
* ssa-iterators.h (num_imm_uses): Add missing braces.

* gcc.dg/pr70405.c: New test.

Not caught by -Wmisleading-indentation?  Seems like it'd be worth a bug
report for that.


Actually this looks like the dangling-else regression I've complained 
about previously. When I added that warning, I intentionally made it catch


if (foo)
  for (..)
    if (bar)
      ...
else
  

but at some point the code was changed so as to no longer warn for this 
case.
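
A reduced C sketch of the pattern may help show why the missing braces
matter.  The function and parameter names below are hypothetical stand-ins
for the loop in num_imm_uses, not the real code; the point is only that the
else pairs with the innermost if, so the indentation of the first function
is deliberately misleading.

```c
/* Buggy shape: despite the indentation, the `else' pairs with
   `if (use[i])', not with `if (!have_debug)'.  When HAVE_DEBUG is
   nonzero, nothing is counted at all.  */
static unsigned
count_uses_buggy (int have_debug, const int *use, const int *is_debug, int n)
{
  unsigned num = 0;
  int i;

  if (!have_debug)
    for (i = 0; i < n; i++)
      if (use[i])
        num++;
  else
    for (i = 0; i < n; i++)
      if (use[i] && !is_debug[i])
        num++;
  return num;
}

/* Fixed shape: the braces force the `else' onto the outer `if'.  */
static unsigned
count_uses_fixed (int have_debug, const int *use, const int *is_debug, int n)
{
  unsigned num = 0;
  int i;

  if (!have_debug)
    {
      for (i = 0; i < n; i++)
        if (use[i])
          num++;
    }
  else
    for (i = 0; i < n; i++)
      if (use[i] && !is_debug[i])
        num++;
  return num;
}
```

With three uses of which one is a debug statement, the fixed version counts
2 when have_debug is set, while the buggy version counts 0.  A compiler may
well warn about the first function; that is the diagnostic being discussed.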



Bernd



Re: [PATCH] Fix simplify_shift_const_1 once more (PR rtl-optimization/70429)

2016-03-29 Thread Jeff Law

On 03/29/2016 11:43 AM, Jakub Jelinek wrote:

On Tue, Mar 29, 2016 at 11:34:29AM -0600, Jeff Law wrote:

This is a case similar to the LSHIFTRT I've fixed recently.
But, unlike LSHIFTRT, which can be handled by masking at the outer level,
ASHIFTRT would need outer sign extension, so most of the time 2 outer
operations in addition to the kept two inner shifts, which is IMHO very
unlikely to ever be successfully combined on any target nor actually
beneficial.  So this patch just avoids that optimization for ASHIFTRT
if there are different modes.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-03-29  Jakub Jelinek  

PR rtl-optimization/70429
* combine.c (simplify_shift_const_1): For ASHIFTRT don't optimize
(cst1 >> count) >> cst2 into (cst1 >> cst2) >> count if
mode != result_mode.

* gcc.c-torture/execute/pr70429.c: New test.

But isn't the point of this code that cst1 >> cst2 turns into a compile time
constant just leaving one runtime shift of the result by count?


But with the mode change then you are changing
(cst1 >> count) >> cst2
into
((((cst1 >> cst2) >> count) << (bitsz - cst2)) >> (bitsz - cst2))
Why can't we sign extend cst1 >> cst2 at compile time and use a ASHIFTRT 
for the >> count shift?   Even if we've got a mode change to deal with, 
we can generate the constant in whatever mode we want.


I must  be missing something here.

jeff


Re: [PATCH] Fix ix86_expand_vector_set (PR target/70421)

2016-03-29 Thread Jakub Jelinek
On Tue, Mar 29, 2016 at 11:44:15AM -0600, Jeff Law wrote:
> On 03/29/2016 11:05 AM, Jakub Jelinek wrote:
> >Hi!
> >
> >The various blendm expanders look like:
> >(define_insn "_blendm"
> >   [(set (match_operand:V48_AVX512VL 0 "register_operand" "=v")
> > (vec_merge:V48_AVX512VL
> >   (match_operand:V48_AVX512VL 2 "nonimmediate_operand" "vm")
> >   (match_operand:V48_AVX512VL 1 "register_operand" "v")
> One could argue this ordering is just asking for trouble.

I bet the reasons for this ordering are both the x86 intrinsics and
the HW behavior (see e.g. the order of arguments in the insn template
of the define_insn, etc.).
I think VEC_MERGE's definition on which argument you pick the elements from
for 0 bits in the mask vs. 1 bits in the mask is the exact opposite of what
the x86 HW wants and the intrinsics expect.

Jakub


Re: [PATCH] Fix num_imm_uses (PR tree-optimization/70405)

2016-03-29 Thread Jakub Jelinek
On Tue, Mar 29, 2016 at 07:46:32PM +0200, Bernd Schmidt wrote:
> On 03/29/2016 07:28 PM, Jeff Law wrote:
> >On 03/29/2016 11:23 AM, Jakub Jelinek wrote:
> >>Hi!
> >>
> >>The recent change to num_imm_uses (to add support for NULL USE_STMT)
> >>broke it totally, fortunately we have just one user of this function
> >>right now.  I've filed a PR for GCC 7 so that we get a warning on this.
> >>
> >>Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> >>ok for
> >>trunk?
> >>
> >>2016-03-29  Jakub Jelinek  
> >>
> >>PR tree-optimization/70405
> >>* ssa-iterators.h (num_imm_uses): Add missing braces.
> >>
> >>* gcc.dg/pr70405.c: New test.
> >Not caught by -Wmisleading-indentation?  Seems like it'd be worth a bug
> >report for that.
> 
> Actually this looks like the dangling-else regression I've complained about
> previously. When I added that warning, I intentionally made it catch
> 
> if (foo)
>   for (..)
>     if (bar)
>       ...
> else
>   
> 
> but at some point the code was changed so as to no longer warn for this
> case.

Indeed, GCC 3.4 warns about this:
pr70405-3.c:7: warning: suggest explicit braces to avoid ambiguous `else'
That warning is still in there under -Wparentheses, but doesn't trigger
anymore.

Jakub


Re: [PATCH] Fix simplify_shift_const_1 once more (PR rtl-optimization/70429)

2016-03-29 Thread Jakub Jelinek
On Tue, Mar 29, 2016 at 11:47:57AM -0600, Jeff Law wrote:
> >>>2016-03-29  Jakub Jelinek  
> >>>
> >>>   PR rtl-optimization/70429
> >>>   * combine.c (simplify_shift_const_1): For ASHIFTRT don't optimize
> >>>   (cst1 >> count) >> cst2 into (cst1 >> cst2) >> count if
> >>>   mode != result_mode.
> >>>
> >>>   * gcc.c-torture/execute/pr70429.c: New test.
> >>But isn't the point of this code that cst1 >> cst2 turns into a compile time
> >>constant just leaving one runtime shift of the result by count?
> >
> >But with the mode change then you are changing
> >(cst1 >> count) >> cst2
> >into
> >((((cst1 >> cst2) >> count) << (bitsz - cst2)) >> (bitsz - cst2))
> Why can't we sign extend cst1 >> cst2 at compile time and use a ASHIFTRT for
> the >> count shift?   Even if we've got a mode change to deal with, we can
> generate the constant in whatever mode we want.

I don't understand how you could do that.
In the original source there is a variable shift count first, then narrowing
cast, then further arithmetic shift by constant.
So sure, you can shift the cst1 by cst2, but which bit you want to sign
extend on depends on the count value, only known at runtime.

Consider the testcase I've posted in the patch:
__attribute__((noinline, noclone)) int
foo (int a)
{
  return (int) (0x14ff6e2207db5d1fLL >> a) >> 4;
}

if a is 1, 0x14ff6e2207db5d1fLL >> a is
0xa7fb71103edae8f
and bit 31 of this is 0, so in the end you get
0x03edae8f >> 4
If a is 2, 0x14ff6e2207db5d1fLL >> a is
0x53fdb8881f6d747
and bit 31 of this is 1, so in the end you get
0x81f6d747 >> 4
Now, if you want to shift by 4 first, you have cst1 >> cst2, i.e.
0x14ff6e2207db5d1LL, and you need to sign extend this, but the bit to sign
extend from depends on count (and on the difference between the bitsizes of
mode and result_mode).
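
The same arithmetic can be checked with a small C sketch.  It assumes, like
the testcase, a 32-bit int and a 64-bit long long, and it relies on GCC's
implementation-defined behavior for narrowing conversions and for right
shifts of negative values; the function names are illustrative only.

```c
#include <stdint.h>

#define CST1 0x14ff6e2207db5d1fLL

/* Original order: variable shift, narrowing cast, then arithmetic
   shift by the constant.  */
static int32_t
shift_narrow_shift (int count, int cst2)
{
  return (int32_t) (CST1 >> count) >> cst2;
}

/* Naive reassociation: constant shift first, then the variable shift
   and the narrowing cast, with no fix-up of the upper bits.  */
static int32_t
reassociated_naively (int count, int cst2)
{
  return (int32_t) ((CST1 >> cst2) >> count);
}
```

For cst2 = 4, shift_narrow_shift gives 0x3edae8 for count 1 and -132158092
for count 2, as in the testcase.  reassociated_naively gives 0x103edae8 for
count 1, because bits from above bit 31 end up in the result, so the two
orders only agree after an extra sign extension whose position depends on
count.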

Jakub


Re: [PATCH] Fix simplify_shift_const_1 once more (PR rtl-optimization/70429)

2016-03-29 Thread Segher Boessenkool
On Tue, Mar 29, 2016 at 07:21:29PM +0200, Jakub Jelinek wrote:
> This is a case similar to the LSHIFTRT I've fixed recently.
> But, unlike LSHIFTRT, which can be handled by masking at the outer level,
> ASHIFTRT would need outer sign extension, so most of the time 2 outer
> operations in addition to the kept two inner shifts, which is IMHO very
> unlikely to ever be successfully combined on any target nor actually
> beneficial.  So this patch just avoids that optimization for ASHIFTRT
> if there are different modes.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Okay with spello fixed ("queuing").  Thanks,


Segher
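
For contrast, a sketch of why the logical-shift case can be repaired by
masking at the outer level, as the quoted text says.  This is only an
illustration under assumptions: 64-bit inner mode, 32-bit result mode,
count + cst2 below 64, and helper names invented for this note.

```c
#include <assert.h>
#include <stdint.h>

/* Original order: variable shift in 64 bits, narrow to 32 bits, then a
   logical shift by the constant. */
static uint32_t lshr_original(uint64_t cst1, unsigned count, unsigned cst2)
{
    uint32_t narrowed = (uint32_t) (cst1 >> count);
    return narrowed >> cst2;
}

/* Reordered: fold cst1 >> cst2 first, do the variable shift, narrow, and
   clear the top cst2 bits with a single outer mask. */
static uint32_t lshr_reordered(uint64_t cst1, unsigned count, unsigned cst2)
{
    uint64_t folded = cst1 >> cst2;       /* a constant in the real case */
    uint32_t mask = 0xffffffffu >> cst2;  /* drops bits from above bit 31 */
    return (uint32_t) (folded >> count) & mask;
}

void demo_lshiftrt(void)
{
    const uint64_t cst1 = 0x14ff6e2207db5d1fULL;
    for (unsigned count = 0; count < 60; count++)
        assert(lshr_original(cst1, count, 4)
               == lshr_reordered(cst1, count, 4));
}
```

The mask is the same for every count, which is what makes the logical case
cheap; the arithmetic case has no such count-independent fixup.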


Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-29 Thread Mike Stump
On Mar 29, 2016, at 7:45 AM, David Edelsohn  wrote:
> We have no plans to make code generation a slave to the testsuite.
> The testsuite is a tool, successful results from the testsuite is not
> a goal unto itself.
> 
> This patch is okay.

We look forward to the day when someone can find the time and energy and desire 
to make subsets of this work better and reenable those as they bring them back 
online.  I view it as I do for turning off C++ testing on a PIC target, if no 
one wants to make it work nicely, then it is better to just turn it off.  
Anyone with the desire to make these tests work nicely will step forward and 
donate as they are able to.  If someone would like that work done, you can edit 
up a TODO or projects file to describe the work you’d like done, and try and 
find someone that would like to do the work, or, just do the work yourself.  If 
someone has the free time, and wants to tackle this project, merely step 
forward and let others know.  This is how we make progress.

Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-29 Thread Jakub Jelinek
On Tue, Mar 29, 2016 at 11:34:17AM -0700, Mike Stump wrote:
> On Mar 29, 2016, at 7:45 AM, David Edelsohn  wrote:
> > We have no plans to make code generation a slave to the testsuite.
> > The testsuite is a tool, successful results from the testsuite is not
> > a goal unto itself.
> > 
> > This patch is okay.
> 
> We look forward to the day when someone can find the time and energy and
> desire to make subsets of this work better and reenable those as they
> bring them back online.  I view it as I do for turning off C++ testing on
> a PIC target, if no one wants to make it work nicely, then it is better to
> just turn it off.  Anyone with the desire to make these tests work nicely
> will step forward and donate as they are able to.  If someone would like
> that work done, you can edit up a TODO or projects file to describe the
> work you’d like done, and try and find someone that would like to do the
> work, or, just do the work yourself.  If someone has the free time, and
> wants to tackle this project, merely step forward and let others know. 
> This is how we make progress.

The problem with the disabling is not in those tests that don't pass right
now on whatever target you are testing on, but with any regressions in tests
that pass right now but will not pass in half a year or year because of GCC
changes; if the tests are disabled, nobody will notice that, one can't look
at gcc-regressions or elsewhere to find out quickly where it regressed, etc.

Jakub


Re: RFA: PATCH to tree-inline.c:remap_decls for c++/70353 (ICE with __func__ and constexpr)

2016-03-29 Thread Jason Merrill

On 03/29/2016 06:37 AM, Jan Hubicka wrote:
> On Mon, Mar 28, 2016 at 11:26 PM, Jason Merrill  wrote:
> > The constexpr evaluation code uses the inlining code to remap the constexpr
> > function body for evaluation so that recursion works properly.  In this
> > testcase __func__ is declared as a static local variable, so rather than
> > remap it, remap_decls tries to add it to the local_decls list for the
> > function we're inlining into.  But there is no such function in this case,
> > so we crash.
> >
> > Avoiding the add_local_decl call when cfun is null avoids the ICE (thanks
> > Jakub), but results in an undefined symbol.  Calling
> > varpool_node::finalize_decl instead allows cgraph to handle the reference
> > from 'c' properly.
> >
> > OK if testing passes?
>
> So ce will never be instantiated?


Right, because no calls to it survive constexpr evaluation.  And the 
front end avoids finalizing it in make_rtl_for_nonlocal_decl...which is 
another place I could fix this.  Thus.


Tested x86_64-pc-linux-gnu, applying to trunk.

Jason

commit 28b2d2bfe2c55aa41e1d540a30595357f000279c
Author: Jason Merrill 
Date:   Mon Mar 28 17:14:43 2016 -0400

	PR c++/70353

gcc/
	* tree-inline.c (remap_decls): Don't add_local_decl if
	cfun is null.
gcc/cp/
	* decl.c (make_rtl_for_nonlocal_decl): Don't defer local statics
	in constexpr functions.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index cd5db3f..cfae210 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6251,8 +6251,11 @@ make_rtl_for_nonlocal_decl (tree decl, tree init, const char* asmspec)
 return;
 
   /* We defer emission of local statics until the corresponding
- DECL_EXPR is expanded.  */
-  defer_p = DECL_FUNCTION_SCOPE_P (decl) || DECL_VIRTUAL_P (decl);
+ DECL_EXPR is expanded.  But with constexpr its function might never
+ be expanded, so go ahead and tell cgraph about the variable now.  */
+  defer_p = ((DECL_FUNCTION_SCOPE_P (decl)
+	  && !DECL_DECLARED_CONSTEXPR_P (DECL_CONTEXT (decl)))
+	 || DECL_VIRTUAL_P (decl));
 
   /* Defer template instantiations.  */
   if (DECL_LANG_SPECIFIC (decl)
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C
new file mode 100644
index 0000000..e678290
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-__func__2.C
@@ -0,0 +1,13 @@
+// PR c++/70353
+// { dg-do link { target c++11 } }
+
+constexpr const char* ce ()
+{
+  return __func__;
+}
+
+const char *c = ce();
+
+int main()
+{
+}
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 9d4f8f7..5206d20 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -616,7 +616,8 @@ remap_decls (tree decls, vec **nonlocalized_list,
 	  /* We need to add this variable to the local decls as otherwise
 	 nothing else will do so.  */
 	  if (TREE_CODE (old_var) == VAR_DECL
-	  && ! DECL_EXTERNAL (old_var))
+	  && ! DECL_EXTERNAL (old_var)
+	  && cfun)
 	add_local_decl (cfun, old_var);
 	  if ((!optimize || debug_info_level > DINFO_LEVEL_TERSE)
 	  && !DECL_IGNORED_P (old_var)
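
A small model of the defer_p change in the decl.c hunk above, as a sketch
only.  struct decl_model and its fields are hypothetical stand-ins for the
GCC predicates named in the comments; none of this is GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the properties the patch consults. */
struct decl_model {
    bool function_scope;        /* DECL_FUNCTION_SCOPE_P (decl) */
    bool context_is_constexpr;  /* DECL_DECLARED_CONSTEXPR_P (DECL_CONTEXT (decl)) */
    bool is_virtual;            /* DECL_VIRTUAL_P (decl) */
};

/* Condition before the patch. */
static bool defer_before(const struct decl_model *d)
{
    return d->function_scope || d->is_virtual;
}

/* Condition after the patch: local statics of constexpr functions are
   no longer deferred. */
static bool defer_after(const struct decl_model *d)
{
    return (d->function_scope && !d->context_is_constexpr) || d->is_virtual;
}

void demo_defer(void)
{
    struct decl_model func_name_in_ce = { true, true, false };
    struct decl_model ordinary_local_static = { true, false, false };

    /* __func__ inside constexpr ce (): was deferred, now is not. */
    assert(defer_before(&func_name_in_ce));
    assert(!defer_after(&func_name_in_ce));

    /* A local static in a non-constexpr function: unchanged. */
    assert(defer_before(&ordinary_local_static));
    assert(defer_after(&ordinary_local_static));
}
```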


Re: [PATCH][ARM][4.9 Backport] PR target/69875 Fix atomic_loaddi expansion

2016-03-29 Thread Christophe Lyon
On 16 March 2016 at 16:54, Ramana Radhakrishnan
 wrote:
> On Wed, Feb 24, 2016 at 11:23 AM, Kyrill Tkachov
>  wrote:
>> Hi all,
>>
>> This is the GCC 4.9 backport of
>> https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01338.html.
>> The differences are that TARGET_HAVE_LPAE has to be defined in arm.h in a
>> different way because
>> the ARM_FSET_HAS_CPU1 mechanism doesn't exist on this branch. Also, due to
>> the location of insn_flags
>> and the various FL_* (on the 4.9 branch they're defined locally in arm.c
>> rather than in arm-protos.h)
>> I chose to define TARGET_HAVE_LPAE in terms of hardware divide instruction
>> availability. This should be
>> an equivalent definition.
>>
>> Also, the scan-assembler tests that check for the DMB instruction are
>> updated to check for
>> "dmb sy" rather than "dmb ish", because the memory barrier instruction
>> changed on trunk for GCC 6.
>>
>> Bootstrapped and tested on the GCC 4.9 branch on arm-none-linux-gnueabihf.
>>
>>
>> Ok for the branch after the trunk patch has had a few days to bake?
>
>
> OK.
>
Hi Kyrylo,

Since you backported this to branches 4.9 and 5, I've noticed cross-GCC build
failures:
--target arm-none-linux-gnueabihf
--with-mode=arm
--with-cpu=cortex-a57
--with-fpu=crypto-neon-fp-armv8

The build succeeds --with-mode=thumb.

The error message I'm seeing is:
/tmp/6190285_22.tmpdir/ccuX17sh.s: Assembler messages:
/tmp/6190285_22.tmpdir/ccuX17sh.s:34: Error: bad instruction `ldrdeq r0,r1,[r0]'
make[4]: *** [load_8_.lo] Error 1

while building libatomic

Christophe


> Ramana
>>
>> Thanks,
>> Kyrill
>>
>> 2016-02-24  Kyrylo Tkachov  
>>
>> PR target/69875
>> * config/arm/arm.h (TARGET_HAVE_LPAE): Define.
>> * config/arm/unspecs.md (VUNSPEC_LDRD_ATOMIC): New value.
>> * config/arm/sync.md (arm_atomic_loaddi2_ldrd): New pattern.
>> (atomic_loaddi_1): Delete.
>> (atomic_loaddi): Rewrite expander using the above changes.
>>
>> 2016-02-24  Kyrylo Tkachov  
>>
>> PR target/69875
>> * gcc.target/arm/atomic_loaddi_acquire.x: New file.
>> * gcc.target/arm/atomic_loaddi_relaxed.x: Likewise.
>> * gcc.target/arm/atomic_loaddi_seq_cst.x: Likewise.
>> * gcc.target/arm/atomic_loaddi_1.c: New test.
>> * gcc.target/arm/atomic_loaddi_2.c: Likewise.
>> * gcc.target/arm/atomic_loaddi_3.c: Likewise.
>> * gcc.target/arm/atomic_loaddi_4.c: Likewise.
>> * gcc.target/arm/atomic_loaddi_5.c: Likewise.
>> * gcc.target/arm/atomic_loaddi_6.c: Likewise.
>> * gcc.target/arm/atomic_loaddi_7.c: Likewise.
>> * gcc.target/arm/atomic_loaddi_8.c: Likewise.
>> * gcc.target/arm/atomic_loaddi_9.c: Likewise.


Re: [PATCH] c++/67376 Comparison with pointer to past-the-end, of array fails inside constant expression

2016-03-29 Thread Jason Merrill

On 03/28/2016 06:04 PM, Martin Sebor wrote:
> +  && compare_tree_int (arg1, 0) == 0)

This can be integer_zerop.

> +   case GE_EXPR:
> +   case EQ_EXPR:
> +   case LE_EXPR:
> + return boolean_false_node;
> +   case GT_EXPR:
> +   case LT_EXPR:
> +   case NE_EXPR:
> + return boolean_true_node;


EQ and NE make sense, but I would expect both > and >= to be true, < and 
<= to be false.


Are we confident that arr[0] won't make it here as POINTER_PLUS_EXPR or 
some such?


Jason
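
A sketch of the ordering being asked for, assuming the case under review
has the one-past-the-end address on the left and the start of the same
array on the right (the excerpt does not show the operands, so that
reading is a guess).  struct relations and compare_end_to_start are names
invented here.

```c
#include <assert.h>
#include <stdbool.h>

struct relations { bool eq, ne, lt, le, gt, ge; };

static struct relations compare_end_to_start(void)
{
    static int arr[4];
    const int *start = &arr[0];
    const int *end = arr + 4;  /* one past the end: may be formed and
                                  compared, but not dereferenced */
    struct relations r = {
        end == start, end != start,
        end < start,  end <= start,
        end > start,  end >= start
    };
    return r;
}

void demo_relations(void)
{
    struct relations r = compare_end_to_start();
    assert(!r.eq && r.ne);   /* EQ false, NE true */
    assert(r.gt && r.ge);    /* both > and >= true */
    assert(!r.lt && !r.le);  /* both < and <= false */
}
```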



[PATCH] PR target/70439: Properly check conflict between DRAP register and __builtin_eh_return

2016-03-29 Thread H.J. Lu
Since %ecx can't be used for both DRAP register and __builtin_eh_return,
we need to check if crtl->drap_reg uses %ecx before using %ecx for
__builtin_eh_return.

Testing on x86-64.  OK for trunk if there are no regressions?


H.J.
---
PR target/70439
* config/i386/i386.c (ix86_expand_epilogue): Properly check
conflict between DRAP register and __builtin_eh_return.
---
 gcc/config/i386/i386.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1639704..aafe171 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13463,8 +13463,9 @@ ix86_expand_epilogue (int style)
  rtx sa = EH_RETURN_STACKADJ_RTX;
  rtx_insn *insn;
 
- /* Stack align doesn't work with eh_return.  */
- gcc_assert (!stack_realign_drap);
+ /* %ecx can't be used for both DRAP register and eh_return.  */
+ gcc_assert (!crtl->drap_reg
+ || REGNO (crtl->drap_reg) != CX_REG);
  /* Neither does regparm nested functions.  */
  gcc_assert (!ix86_static_chain_on_stack);
 
-- 
2.5.5



Re: [PATCH] Disable guality tests for powerpc*-linux*

2016-03-29 Thread H.J. Lu
On Tue, Mar 29, 2016 at 11:42 AM, Jakub Jelinek  wrote:
> On Tue, Mar 29, 2016 at 11:34:17AM -0700, Mike Stump wrote:
>> On Mar 29, 2016, at 7:45 AM, David Edelsohn  wrote:
>> > We have no plans to make code generation a slave to the testsuite.
>> > The testsuite is a tool, successful results from the testsuite is not
>> > a goal unto itself.
>> >
>> > This patch is okay.
>>
>> We look forward to the day when someone can find the time and energy and
>> desire to make subsets of this work better and reenable those as they
>> bring them back online.  I view it as I do for turning off C++ testing on
>> a PIC target, if no one wants to make it work nicely, then it is better to
>> just turn it off.  Anyone with the desire to make these tests work nicely
>> will step forward and donate as they are able to.  If someone would like
>> that work done, you can edit up a TODO or projects file to describe the
>> work you’d like done, and try and find someone that would like to do the
>> work, or, just do the work yourself.  If someone has the free time, and
>> wants to tackle this project, merely step forward and let others know.
>> This is how we make progress.
>
> The problem with the disabling is not in those tests that don't pass right
> now on whatever target you are testing on, but with any regressions in tests
> that pass right now but will not pass in half a year or year because of GCC
> changes; if the tests are disabled, nobody will notice that, one can't look
> at gcc-regressions or elsewhere to find out quickly where it regressed, etc.
>
> Jakub

One issue with gcc.dg/guality/guality.exp is that when there is an ICE
regression during guality init, all of a sudden there are no failures in
the guality tests:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68545

Next time, when the ICE is fixed, a bunch of guality failures show up.

-- 
H.J.


[PATCH] PR testsuite/70364: Properly align stack in gcc.target/i386/cleanup-[12].c

2016-03-29 Thread H.J. Lu
Tested on x86-64.  OK for trunk?

H.J.
---
PR testsuite/70364
* gcc.target/i386/cleanup-1.c: Include .
(check): New function.
(bar): Call check.
(foo): Align stack to 16 bytes when calling bar.
* gcc.target/i386/cleanup-2.c: Likewise.
---
 gcc/testsuite/gcc.target/i386/cleanup-1.c | 17 ++---
 gcc/testsuite/gcc.target/i386/cleanup-2.c | 17 ++---
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/cleanup-1.c b/gcc/testsuite/gcc.target/i386/cleanup-1.c
index fc82f35..dcfcc4e 100644
--- a/gcc/testsuite/gcc.target/i386/cleanup-1.c
+++ b/gcc/testsuite/gcc.target/i386/cleanup-1.c
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -47,6 +48,14 @@ handler (void *p __attribute__((unused)))
   _exit (0);
 }
 
+static void
+__attribute__((noinline))
+check (intptr_t p)
+{
+  if ((p & 15) != 0)
+abort ();
+}
+
 static int __attribute__((noinline))
 fn5 (void)
 {
@@ -59,6 +68,8 @@ void
 bar (void)
 {
   char dummy __attribute__((cleanup (counter)));
+  unsigned long tmp[4] __attribute__((aligned(16)));
+  check ((intptr_t) tmp);
   fn5 ();
 }
 
@@ -133,9 +144,9 @@ foo (int x)
".type  _L_mutex_lock_%=, @function\n"
 "_L_mutex_lock_%=:\n"
 "1:\t" "leaq   %1, %%rdi\n"
-"2:\t" "subq   $128, %%rsp\n"
+"2:\t" "subq   $136, %%rsp\n"
 "3:\t" "call   bar\n"
-"4:\t" "addq   $128, %%rsp\n"
+"4:\t" "addq   $136, %%rsp\n"
 "5:\t" "jmp24f\n"
 "6:\t" ".size _L_mutex_lock_%=, .-_L_mutex_lock_%=\n\t"
".previous\n\t"
@@ -179,7 +190,7 @@ foo (int x)
".sleb128 4b-3b\n"
 "16:\t"".byte  0x40 + (4b-3b-1) # DW_CFA_advance_loc\n\t"
".byte  0x0e# DW_CFA_def_cfa_offset\n\t"
-   ".uleb128 128\n\t"
+   ".uleb128 136\n\t"
".byte  0x16# DW_CFA_val_expression\n\t"
".uleb128 0x10\n\t"
".uleb128 20f-17f\n"
diff --git a/gcc/testsuite/gcc.target/i386/cleanup-2.c b/gcc/testsuite/gcc.target/i386/cleanup-2.c
index 0ec7c31..7e603233 100644
--- a/gcc/testsuite/gcc.target/i386/cleanup-2.c
+++ b/gcc/testsuite/gcc.target/i386/cleanup-2.c
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -47,6 +48,14 @@ handler (void *p __attribute__((unused)))
   _exit (0);
 }
 
+static void
+__attribute__((noinline))
+check (intptr_t p)
+{
+  if ((p & 15) != 0)
+abort ();
+}
+
 static int __attribute__((noinline))
 fn5 (void)
 {
@@ -59,6 +68,8 @@ void
 bar (void)
 {
   char dummy __attribute__((cleanup (counter)));
+  unsigned long tmp[4] __attribute__((aligned(16)));
+  check ((intptr_t) tmp);
   fn5 ();
 }
 
@@ -74,9 +85,9 @@ foo (int x)
".type  _L_mutex_lock_%=, @function\n"
 "_L_mutex_lock_%=:\n"
 "1:\t" "leaq   %1, %%rdi\n"
-"2:\t" "subq   $128, %%rsp\n"
+"2:\t" "subq   $136, %%rsp\n"
 "3:\t" "call   bar\n"
-"4:\t" "addq   $128, %%rsp\n"
+"4:\t" "addq   $136, %%rsp\n"
 "5:\t" "jmp21f\n"
 "6:\t" ".size _L_mutex_lock_%=, .-_L_mutex_lock_%=\n\t"
".previous\n\t"
@@ -160,7 +171,7 @@ foo (int x)
".uleb128 6b-5b-1\n"
 "19:\t"".byte  0x40 + (3b-1b) # DW_CFA_advance_loc\n\t"
".byte  0xe # DW_CFA_def_cfa_offset\n\t"
-   ".uleb128 128\n\t"
+   ".uleb128 136\n\t"
".byte  0x40 + (5b-3b) # DW_CFA_advance_loc\n\t"
".byte  0xe # DW_CFA_def_cfa_offset\n\t"
".uleb128 0\n\t"
-- 
2.5.5
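
The 128 -> 136 change can be checked with a little arithmetic.  This sketch
assumes that %rsp is 8 bytes past a 16-byte boundary just before the subq
(the patch does not say so explicitly; it is what makes 136 the aligning
value), and the starting address is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same test as the check () helper in the patch: low four bits clear. */
static bool aligned16(uintptr_t p)
{
    return (p & 15) == 0;
}

/* Stack pointer after "subq $adjust, %rsp"; the stack grows downwards. */
static uintptr_t after_subq(uintptr_t rsp, uintptr_t adjust)
{
    return rsp - adjust;
}

void demo_alignment(void)
{
    uintptr_t rsp = 0x7ffc1008u;  /* hypothetical value, rsp % 16 == 8 */

    assert(!aligned16(after_subq(rsp, 128)));  /* old: off by 8 at the call */
    assert(aligned16(after_subq(rsp, 136)));   /* new: 16-byte aligned */
}
```

Since 128 is a multiple of 16, the old adjustment preserved whatever
misalignment was already there; 136 adds the missing 8 bytes.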



Re: [PATCH,boehm-gc] Use mmap instead of brk on kfreebsd & hurd too

2016-03-29 Thread Thomas Schwinge
Hi!

On Sun, 31 Aug 2014 17:20:04 +0200, Samuel Thibault  wrote:
> Please use mmap instead of brk on kfreebsd and hurd too.
> Also, using anonymous memory is faster on the Hurd.

> [patch]

Thanks; finally committed in r234534:

commit 04a4d1ce0425912054b6f8db5bc15029bf87e055
Author: tschwinge 
Date:   Tue Mar 29 21:05:07 2016 +

[Hurd, kFreeBSD] boehm-gc: Use mmap instead of brk

boehm-gc/
* configure.host: Set gc_use_mmap on *-kfreebsd-gnu* and *-gnu*.
* include/private/gcconfig.h [HURD && USE_MMAP]: Define
USE_MMAP_ANON.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234534 138bc75d-0d04-0410-961f-82ee72b054a4
---
 boehm-gc/ChangeLog  | 6 ++
 boehm-gc/configure.host | 2 +-
 boehm-gc/include/private/gcconfig.h | 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git boehm-gc/ChangeLog boehm-gc/ChangeLog
index c41734a..6896c67 100644
--- boehm-gc/ChangeLog
+++ boehm-gc/ChangeLog
@@ -1,3 +1,9 @@
+2016-03-29  Samuel Thibault  
+
+   * configure.host: Set gc_use_mmap on *-kfreebsd-gnu* and *-gnu*.
+   * include/private/gcconfig.h [HURD && USE_MMAP]: Define
+   USE_MMAP_ANON.
+
 2016-03-16  Andreas Schwab  
 
* include/private/gcconfig.h [AARCH64] (ALIGNMENT, CPP_WORDSZ):
diff --git boehm-gc/configure.host boehm-gc/configure.host
index 97f4dac..229a038 100644
--- boehm-gc/configure.host
+++ boehm-gc/configure.host
@@ -41,7 +41,7 @@ else
 fi
 
 case "${host}" in
-  *-linux*)
+  *-linux*|*-kfreebsd-gnu*|*-gnu*)
 gc_use_mmap=yes
 ;;
 esac
diff --git boehm-gc/include/private/gcconfig.h boehm-gc/include/private/gcconfig.h
index aa81f15..44b9d7d 100644
--- boehm-gc/include/private/gcconfig.h
+++ boehm-gc/include/private/gcconfig.h
@@ -2137,7 +2137,7 @@
 #   endif
 # endif
 
-#if defined(LINUX) && defined(USE_MMAP)
+#if (defined(LINUX) || defined(HURD)) && defined(USE_MMAP)
 /* The kernel may do a somewhat better job merging mappings etc.   */
 /* with anonymous mappings.
*/
 #   define USE_MMAP_ANON


Grüße
 Thomas
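
For readers unfamiliar with the distinction, here is a minimal sketch of
obtaining heap memory through an anonymous mmap instead of brk.  It is not
boehm-gc code; anon_alloc and anon_free are names invented for this note,
and MAP_ANONYMOUS is assumed to be available (it is on Linux and the GNU
systems discussed here).

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS with glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Ask the kernel for size bytes of zero-filled memory with no backing
   file.  Returns NULL on failure. */
static void *anon_alloc(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Give the mapping back; returns 0 on success. */
static int anon_free(void *p, size_t size)
{
    return munmap(p, size);
}

void demo_anon_mmap(void)
{
    size_t size = (size_t) 1 << 16;
    unsigned char *heap = anon_alloc(size);
    int rc;

    assert(heap != NULL);
    assert(heap[0] == 0 && heap[size - 1] == 0);  /* pages start zeroed */
    heap[0] = 42;
    assert(heap[0] == 42);

    rc = anon_free(heap, size);
    assert(rc == 0);
    (void) rc;
}
```

Unlike brk, each mapping can be placed and released independently of the
others, which is one reason a collector may prefer it.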




Re: [PATCH] hurd: align -p and -pg behavior on Linux

2016-03-29 Thread Thomas Schwinge
Hi!

On Wed, 24 Feb 2016 23:46:36 +0100, I wrote:
> On Sat, 19 Sep 2015 14:00:23 +0200, Samuel Thibault  
> wrote:
> > On Linux, -p and -pg do not make gcc link against libc_p.a, only
> > -profile does (as documented in r11246), and thus people expect -p
> 
> (Yo, 20 years ago...)
> 
> > and -pg to work without libc_p.a installed (it is actually even not
> > available any more in Debian).  We should thus rather make the Hurd port
> > do the same to avoid build failures.
> 
> Conceptually, ACK.

> I'm now testing the following patch:

Now committed in r234535:

commit 9b2eb5d3268cf674f9a6964479f20428e0b43500
Author: tschwinge 
Date:   Tue Mar 29 21:17:53 2016 +

[Hurd] Specs maintenance

gcc/
* config/gnu.h (CPP_SPEC, LIB_SPEC): Don't override.
* config/i386/gnu.h (STARTFILE_SPEC): Use gcrt1.o instead of
gcrt0.o if linking dynamically.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234535 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog | 6 ++
 gcc/config/gnu.h  | 8 
 gcc/config/i386/gnu.h | 4 ++--
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git gcc/ChangeLog gcc/ChangeLog
index 866531f..37e2504 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,3 +1,9 @@
+2016-03-29  Thomas Schwinge  
+
+   * config/gnu.h (CPP_SPEC, LIB_SPEC): Don't override.
+   * config/i386/gnu.h (STARTFILE_SPEC): Use gcrt1.o instead of
+   gcrt0.o if linking dynamically.
+
 2016-03-10  Jan Hubicka  
 
PR ipa/70283
diff --git gcc/config/gnu.h gcc/config/gnu.h
index 1d98ec8..1dbecda 100644
--- gcc/config/gnu.h
+++ gcc/config/gnu.h
@@ -19,14 +19,6 @@ You should have received a copy of the GNU General Public License
 along with GCC.  If not, see .
 */
 
-/* Provide GCC options for standard feature-test macros.  */
-#undef CPP_SPEC
-#define CPP_SPEC "%{posix:-D_POSIX_SOURCE}"
-
-/* Default C library spec.  */
-#undef LIB_SPEC
-#define LIB_SPEC "%{pthread:-lpthread} %{pg|p|profile:-lc_p;:-lc}"
-
 #undef GNU_USER_TARGET_OS_CPP_BUILTINS
 #define GNU_USER_TARGET_OS_CPP_BUILTINS()  \
 do {   \
diff --git gcc/config/i386/gnu.h gcc/config/i386/gnu.h
index c726d31..9d2f94f 100644
--- gcc/config/i386/gnu.h
+++ gcc/config/i386/gnu.h
@@ -27,11 +27,11 @@ along with GCC.  If not, see .
 #undef STARTFILE_SPEC
 #if defined HAVE_LD_PIE
 #define STARTFILE_SPEC \
-  "%{!shared: %{pg|p|profile:gcrt0.o%s;pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}} \
+  "%{!shared: %{pg|p|profile:%{static:gcrt0.o%s;:gcrt1.o%s};pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}} \
crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
 #else
 #define STARTFILE_SPEC \
-  "%{!shared: %{pg|p|profile:gcrt0.o%s;static:crt0.o%s;:crt1.o%s}} \
+  "%{!shared: 
%{pg|p|profile:%{static:gcrt0.o%s;:gcrt1.o%s};static:crt0.o%s;:crt1.o%s}} \
crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
 #endif
 


Grüße
 Thomas
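
The new STARTFILE_SPEC is easier to read as a decision chain.  The sketch
below models the first line of the HAVE_LD_PIE variant from the patch
above; struct link_flags and startfile are hypothetical names, and the
crti.o/crtbegin part of the spec is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Which of the relevant driver options were given (model only). */
struct link_flags {
    bool shared;     /* -shared */
    bool profiling;  /* any of -pg, -p, -profile */
    bool pie;        /* -pie */
    bool is_static;  /* -static */
};

/* Mirrors %{!shared: %{pg|p|profile:%{static:gcrt0.o%s;:gcrt1.o%s};
   pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}}.  Alternatives separated by
   ';' are tried in order and the empty one is the default. */
static const char *startfile(const struct link_flags *f)
{
    if (f->shared)
        return NULL;  /* no startup object at all */
    if (f->profiling)
        return f->is_static ? "gcrt0.o" : "gcrt1.o";
    if (f->pie)
        return "Scrt1.o";
    if (f->is_static)
        return "crt0.o";
    return "crt1.o";
}

void demo_startfile(void)
{
    struct link_flags pg_dynamic = { false, true, false, false };
    struct link_flags pg_static  = { false, true, false, true };

    assert(strcmp(startfile(&pg_dynamic), "gcrt1.o") == 0);  /* new case */
    assert(strcmp(startfile(&pg_static), "gcrt0.o") == 0);   /* as before */
}
```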




Re: [PATCH] hurd: align -p and -pg behavior on Linux

2016-03-29 Thread Samuel Thibault
Hello,

Thomas Schwinge, on Tue 29 Mar 2016 23:19:09 +0200, wrote:
> On Wed, 24 Feb 2016 23:46:36 +0100, I wrote:
> > On Sat, 19 Sep 2015 14:00:23 +0200, Samuel Thibault 
> >  wrote:
> > > On Linux, -p and -pg do not make gcc link against libc_p.a, only
> > > -profile does (as documented in r11246), and thus people expect -p
> > 
> > (Yo, 20 years ago...)
> > 
> > > and -pg to work without libc_p.a installed (it is actually even not
> > > available any more in Debian).  We should thus rather make the Hurd port
> > > do the same to avoid build failures.
> > 
> > Conceptually, ACK.
> 
> > I'm now testing the following patch:
> 
> Now committed in r234535:

Groovy :)

Could you also commit to gcc 5 branches so we get it in Debian without
having to poke Doko?

Samuel


Re: [PATCH v2] sanitize paths used in regular expression

2016-03-29 Thread Mike Stump
On Feb 8, 2016, at 2:26 PM, Zach Welch  wrote:
> 
> Ping.  From what I see, my patch has not yet been committed.  Can I talk
> someone into taking care of that for me?

I had hoped that someone would commit it for you.

Committed revision 234533.


Fix overflow in loop peeling code

2016-03-29 Thread Jan Hubicka
Hi,
this patch fixes a stupid overflow in tree-ssa-loop-ivcanon.c.
If the estimated number of executions of the loop is INT_MAX+1, it will get
peeled incorrectly.

Bootstrapped/regtested x86_64-linux and committed (it is a regression WRT the
RTL implementation)

Honza

* tree-ssa-loop-ivcanon.c (try_peel_loop): Change type of peel
to HOST_WIDE_INT
Index: tree-ssa-loop-ivcanon.c
===
--- tree-ssa-loop-ivcanon.c (revision 234516)
+++ tree-ssa-loop-ivcanon.c (working copy)
@@ -935,7 +935,7 @@ try_peel_loop (struct loop *loop,
   edge exit, tree niter,
   HOST_WIDE_INT maxiter)
 {
-  int npeel;
+  HOST_WIDE_INT npeel;
   struct loop_size size;
   int peeled_size;
   sbitmap wont_exit;
@@ -990,7 +990,7 @@ try_peel_loop (struct loop *loop,
 {
   if (dump_file)
 fprintf (dump_file, "Not peeling: rolls too much "
-"(%i + 1 > --param max-peel-times)\n", npeel);
+"(%i + 1 > --param max-peel-times)\n", (int) npeel);
   return false;
 }
   npeel++;
@@ -998,7 +998,7 @@ try_peel_loop (struct loop *loop,
   /* Check peeled loops size.  */
   tree_estimate_loop_size (loop, exit, NULL, &size,
   PARAM_VALUE (PARAM_MAX_PEELED_INSNS));
-  if ((peeled_size = estimated_peeled_sequence_size (&size, npeel))
+  if ((peeled_size = estimated_peeled_sequence_size (&size, (int) npeel))
   > PARAM_VALUE (PARAM_MAX_PEELED_INSNS))
 {
   if (dump_file)
@@ -1032,7 +1032,7 @@ try_peel_loop (struct loop *loop,
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
   fprintf (dump_file, "Peeled loop %d, %i times.\n",
-  loop->num, npeel);
+  loop->num, (int) npeel);
 }
   if (loop->any_upper_bound)
 loop->nb_iterations_upper_bound -= npeel;
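
A generic illustration of the hazard, not the actual code path in
try_peel_loop: a count that is correct in a 64-bit HOST_WIDE_INT changes
value when kept in a 32-bit int, so a later limit check no longer sees it.
The narrowing is written out arithmetically (two's-complement wraparound is
assumed as the typical behaviour), and the limit of 16 is a made-up value.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Value a 32-bit int would typically hold after being assigned v. */
static int64_t narrow_to_int32(int64_t v)
{
    uint64_t low = (uint64_t) v & 0xffffffffu;  /* keep the low 32 bits */
    return low >= 0x80000000u ? (int64_t) low - 0x100000000LL
                              : (int64_t) low;
}

/* Limit check done on the full-width count. */
static bool rolls_too_much_wide(int64_t npeel, int64_t limit)
{
    return npeel > limit;
}

/* Same check after the count has been squeezed through an int. */
static bool rolls_too_much_narrow(int64_t npeel, int64_t limit)
{
    return narrow_to_int32(npeel) > limit;
}

void demo_narrowing(void)
{
    const int64_t limit = 16;                    /* hypothetical limit */
    const int64_t estimate = (int64_t) INT32_MAX + 1;

    assert(narrow_to_int32(estimate) == INT32_MIN);
    assert(rolls_too_much_wide(estimate, limit));
    assert(!rolls_too_much_narrow(estimate, limit));  /* check is fooled */
}
```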


Re: C PATCH for c/70297 (crash with duplicate typedefs and -g)

2016-03-29 Thread Jeff Law

On 03/23/2016 07:52 AM, Marek Polacek wrote:
> On Mon, Mar 21, 2016 at 09:57:54PM +0100, Richard Biener wrote:
> > On March 21, 2016 6:55:28 PM GMT+01:00, Marek Polacek  wrote:
> > > This PR points to a GC problem: when we freed a duplicate typedef, we
> > > were leaving its type in the variants list, with its TYPE_NAME still
> > > pointing to the now-freed TYPE_DECL, leading to a crash.  I was lucky to
> > > discover that the exact same problem was fixed in November for the C++
> > > FE, so I could just follow suit here.  And because that change didn't
> > > add a new testcase, I'm putting the new test into c-c++-common/.
> > >
> > > Bootstrapped/regtested on x86_64-linux, ok for trunk/5?
> >
> > But IIRC this will drop the aligned attribute effect as that applied to
> > the new type?
>
> Yes.  So this metamorphosed into another issue: what to do with
>
> typedef int T;
> typedef int T __attribute__((aligned (16)));
> typedef int T __attribute__((aligned (32)));
> ?  Either we can reject this, or do what clang (and other compilers) do, that
> is pick the strictest alignment - 32 in this case.  The following patch does
> that.
> I've also checked what the C FE does for a few other attributes:
> packed
> visibility
>   - ignored on TYPE_DECLs
> used
> unused
> deprecated
> scalar_storage_order
>   - we seem to do the right thing
> transparent_union
>   - typedef int T;
> typedef int T __attribute__((transparent_union));
> gives an error
> vector_size
> mode
>   - typedef int T;
> typedef int T __attribute__((mode (SI)));
> works while
> typedef int T;
> typedef int T __attribute__((mode (DI)));
> results in "conflicts" error
> similarly for vector_size
>
> Well, here's the patch.  Thoughts?
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2016-03-23  Marek Polacek  
>
> PR c/70297
> * c-decl.c (merge_decls): Also set TYPE_ALIGN and TYPE_USER_ALIGN.
>
> * decl.c (duplicate_decls): Also set TYPE_ALIGN and TYPE_USER_ALIGN.
>
> * c-c++-common/pr70297.c: New test.
> * g++.dg/cpp0x/typedef-redecl.C: New test.
> * gcc.dg/typedef-redecl2.c: New test.

OK.
jeff
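
A sketch of the "pick the strictest alignment" rule described above.  It
deliberately avoids redeclaring one typedef with different attributes,
since whether that compiles depends on the compiler version; instead it
uses two separately named typedefs (T16 and T32, invented here) and a
hypothetical helper that takes the maximum.  The aligned attribute is a
GNU extension; _Alignof needs C11.

```c
#include <assert.h>
#include <stddef.h>

typedef int T16 __attribute__((aligned (16)));
typedef int T32 __attribute__((aligned (32)));

/* "Strictest" means the largest alignment among those requested. */
static size_t strictest_alignment(const size_t *requested, size_t n)
{
    size_t best = 1;
    for (size_t i = 0; i < n; i++)
        if (requested[i] > best)
            best = requested[i];
    return best;
}

void demo_strictest(void)
{
    /* Alignments of plain int and of the two attributed typedefs. */
    size_t requested[] = { _Alignof(int), _Alignof(T16), _Alignof(T32) };

    assert(_Alignof(T16) == 16);
    assert(_Alignof(T32) == 32);
    assert(strictest_alignment(requested, 3) == 32);
}
```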



Re: [RFA][PATCH][tree-optimization/64058] Add new coalescing tie breaker heuristic V2

2016-03-29 Thread Jeff Law

On 03/29/2016 11:34 AM, Peter Bergner wrote:

On Wed, 2016-03-23 at 01:49 -0600, Jeff Law wrote:


+/* This represents a conflict graph.  Implemented as an array of bitmaps.
+   A full matrix is used for conflicts rather than just upper triangular form.
+   this make sit much simpler and faster to perform conflict merges.  */


s/make sit/makes it/

Thanks.  Fixed in the obvious way.

jeff


Re: [PATCH] Fix PR tree-optimization/59124 (bogus -Warray-bounds warning)

2016-03-29 Thread Patrick Palka
On Tue, 29 Mar 2016, Richard Biener wrote:

> On Tue, Mar 29, 2016 at 1:23 PM, Patrick Palka  wrote:
> > On Tue, 29 Mar 2016, Richard Biener wrote:
> >
> >> On Sun, Mar 27, 2016 at 11:37 PM, Patrick Palka  
> >> wrote:
> >> > On Sun, 27 Mar 2016, Patrick Palka wrote:
> >> >
> >> >> In unrolling of the inner loop in the test case below we introduce
> >> >> unreachable code that otherwise contains out-of-bounds array accesses.
> >> >> This is because the estimation of the maximum number of iterations of
> >> >> the inner loop is too conservative: we assume 6 iterations instead of
> >> >> the actual 4.
> >> >>
> >> >> Nonetheless, VRP should be able to tell that the code is unreachable so
> >> >> that it doesn't warn about it.  The only thing holding VRP back is that
> >> >> it doesn't look through conditionals of the form
> >> >>
> >> >>if (j_10 != CST1)where j_10 = j_9 + CST2
> >> >>
> >> >> so that it could add the assertion
> >> >>
> >> >>j_9 != (CST1 - CST2)
> >> >>
> >> >> This patch teaches VRP to detect such conditionals and to add such
> >> >> assertions, so that it could remove instead of warn about the
> >> >> unreachable code created during loop unrolling.
> >> >>
> >> >> What this addition does with the test case below is something like this:
> >> >>
> >> >> ASSERT_EXPR (i <= 5);
> >> >> for (i = 1; i < 6; i++)
> >> >>   {
> >> >> j = i - 1;
> >> >> if (j == 0)
> >> >>   break;
> >> >> // ASSERT_EXPR (i != 1)
> >> >> bar[j] = baz[j];
> >> >>
> >> >> j = i - 2
> >> >> if (j == 0)
> >> >>   break;
> >> >> // ASSERT_EXPR (i != 2)
> >> >> bar[j] = baz[j];
> >> >>
> >> >> j = i - 3
> >> >> if (j == 0)
> >> >>   break;
> >> >> // ASSERT_EXPR (i != 3)
> >> >> bar[j] = baz[j];
> >> >>
> >> >> j = i - 4
> >> >> if (j == 0)
> >> >>   break;
> >> >> // ASSERT_EXPR (i != 4)
> >> >> bar[j] = baz[j];
> >> >>
> >> >> j = i - 5
> >> >> if (j == 0)
> >> >>   break;
> >> >> // ASSERT_EXPR (i != 5)
> >> >> bar[j] = baz[j];
> >> >>
> >> >> j = i - 6
> >> >> if (j == 0)
> >> >>   break;
> >> >> // ASSERT_EXPR (i != 6)
> >> >> bar[j] = baz[j]; // unreachable because (i != 6 && i <= 5) is 
> >> >> always false
> >> >>   }
> >> >>
> >> >> (I think the patch I sent a year ago that improved the
> >> >>  register_edge_assert stuff would have fixed this too.  I'll try to
> >> >>  post it again during next stage 1.
> >> >>  https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00908.html)
> >> >>
> >> >> Bootstrap + regtest in progress on x86_64-pc-linux-gnu, does this look
> >> >> OK to commit after testing?
> >> >>
> >> >> gcc/ChangeLog:
> >> >>
> >> >>   PR tree-optimization/59124
> >> >>   * tree-vrp.c (register_edge_assert_for): For NAME != CST1
> >> >>   where NAME = A + CST2 add the assertion A != (CST1 - CST2).
> >> >>
> >> >> gcc/testsuite/ChangeLog:
> >> >>
> >> >>   PR tree-optimization/59124
> >> >>   * gcc.dg/Warray-bounds-19.c: New test.
> >> >> ---
> >> >>  gcc/testsuite/gcc.dg/Warray-bounds-19.c | 17 +
> >> >>  gcc/tree-vrp.c  | 22 ++
> >> >>  2 files changed, 39 insertions(+)
> >> >>  create mode 100644 gcc/testsuite/gcc.dg/Warray-bounds-19.c
> >> >>
> >> >> diff --git a/gcc/testsuite/gcc.dg/Warray-bounds-19.c 
> >> >> b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
> >> >> new file mode 100644
> >> >> index 0000000..e2f9661
> >> >> --- /dev/null
> >> >> +++ b/gcc/testsuite/gcc.dg/Warray-bounds-19.c
> >> >> @@ -0,0 +1,17 @@
> >> >> +/* PR tree-optimization/59124 */
> >> >> +/* { dg-options "-O3 -Warray-bounds" } */
> >> >> +
> >> >> +unsigned baz[6];
> >> >> +
> >> >> +void foo(unsigned *bar, unsigned n)
> >> >> +{
> >> >> +  unsigned i, j;
> >> >> +
> >> >> +  if (n > 6)
> >> >> +n = 6;
> >> >> +
> >> >> +  for (i = 1; i < n; i++)
> >> >> +for (j = i - 1; j > 0; j--)
> >> >> +  bar[j - 1] = baz[j - 1];
> >> >> +}
> >> >> +
> >> >> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> >> >> index b5654c5..31bd575 100644
> >> >> --- a/gcc/tree-vrp.c
> >> >> +++ b/gcc/tree-vrp.c
> >> >> @@ -5820,6 +5820,28 @@ register_edge_assert_for (tree name, edge e, 
> >> >> gimple_stmt_iterator si,
> >> >>   }
> >> >>  }
> >> >>
> >> >> +  /* In the case of NAME != CST1 where NAME = A + CST2 we can
> >> >> + assert that NAME != (CST1 - CST2).  */
> >> >
> >> > This should say A != (...) not NAME != (...)
> >> >
> >> >> +  if ((comp_code == EQ_EXPR || comp_code == NE_EXPR)
> >> >> +  && TREE_CODE (val) == INTEGER_CST)
> >> >> +{
> >> >> +  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
> >> >> +
> >> >> +  if (is_gimple_assign (def_stmt)
> >> >> +   && gimple_assign_rhs_code (def_stmt) == PLUS_EXPR)
> >> >> + {
> >> >> +   tree op0 = gimple_assign_rhs1 (def_stmt);
> >> >> +   tree op1 = gimple_assign_rhs2 (def_stmt);
> >> >> +   if (TREE_CODE (op0) == SSA_NAME
> >> >> + 
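
The inference described in the quoted patch (from NAME != CST1 with
NAME = A + CST2, conclude A != CST1 - CST2) can be sanity-checked
exhaustively in a small wrapping type.  The 8-bit width and the function
names are assumptions made for this sketch; VRP itself works on GIMPLE
operands of whatever type the code uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The condition as written in the source: j10 != cst1, j10 = j9 + cst2. */
static bool cond_on_sum(uint8_t j9, uint8_t cst1, uint8_t cst2)
{
    uint8_t j10 = (uint8_t) (j9 + cst2);  /* wraps modulo 256 */
    return j10 != cst1;
}

/* The derived assertion on the operand: j9 != cst1 - cst2. */
static bool cond_on_operand(uint8_t j9, uint8_t cst1, uint8_t cst2)
{
    return j9 != (uint8_t) (cst1 - cst2);  /* wraps modulo 256 */
}

void demo_equivalence(void)
{
    for (unsigned j9 = 0; j9 < 256; j9++)
        for (unsigned c1 = 0; c1 < 256; c1++)
            for (unsigned c2 = 0; c2 < 256; c2++)
                assert(cond_on_sum((uint8_t) j9, (uint8_t) c1, (uint8_t) c2)
                       == cond_on_operand((uint8_t) j9, (uint8_t) c1,
                                          (uint8_t) c2));
}
```

With cst1 = 0 and cst2 = 255 (that is, -1 modulo 256) this is the first
step of the unrolled loop quoted above: j = i - 1 and j != 0 give i != 1.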

[PATCH][PR target/63890] Turn on ACCUMULATE_OUTGOING_ARGS when profiling on darwin

2016-03-29 Thread Jeff Law


As discussed in the BZ, 32-bit Darwin will create a misaligned stack
when profiling is enabled.


As noted in c#9, the 32-bit x86 port used to enable A_O_A
(ACCUMULATE_OUTGOING_ARGS) when producing unwind info, which is why darwin
was functional.  After Jan's patch from 2013, we stopped turning on A_O_A
in that situation and broke profiling for 32-bit darwin as collateral damage.


Mike's fix was pretty simple: when profiling on darwin, turn on A_O_A.
Jan expressed some concerns, but I find myself in agreement with Richard
& Bernd that Mike's patch is a reasonable fix.


I bootstrapped and regression tested the x86_64 darwin port with the 
patch.  I also tried to bootstrap the x86 darwin port with -p enabled by 
default.  Not surprisingly, it failed during stage2 configuration due to 
the bug in 63890.  With the patch applied, stage2 configures just fine 
with profiling enabled and bootstrapping proceeds normally.


I'm installing this on the trunk momentarily.

Jeff
commit e088bc4d87d83993a2d0fcea35c77b1b174e3a35
Author: Jeff Law 
Date:   Tue Mar 29 21:56:51 2016 -0600

PR target/63890
* config/i386/i386.h (ACCUMULATE_OUTGOING_ARGS): Use when profiling
and TARGET_MACHO.

* tree-vrp.c (register_edge_assert_for_2): For NAME != CST1

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index af1b6c2..40fddc4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2016-03-30  Mike Stump  
+
+   PR target/63890
+   * config/i386/i386.h (ACCUMULATE_OUTGOING_ARGS): Use when profiling
+   and TARGET_MACHO.
+
 2016-03-30  Patrick Palka  
 
PR tree-optimization/59124
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 8d39b5d..d0b418b 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1638,7 +1638,8 @@ enum reg_class
 
 #define ACCUMULATE_OUTGOING_ARGS \
   ((TARGET_ACCUMULATE_OUTGOING_ARGS && optimize_function_for_speed_p (cfun)) \
-   || TARGET_STACK_PROBE || TARGET_64BIT_MS_ABI)
+   || TARGET_STACK_PROBE || TARGET_64BIT_MS_ABI \
+   || (TARGET_MACHO && crtl->profile))
 
 /* If defined, a C expression whose value is nonzero when we want to use PUSH
instructions to pass outgoing arguments.  */


Re: [PATCH] c++/67376 Comparison with pointer to past-the-end, of array fails inside constant expression

2016-03-29 Thread Martin Sebor

On 03/29/2016 12:54 PM, Jason Merrill wrote:
> On 03/28/2016 06:04 PM, Martin Sebor wrote:
>> +   && compare_tree_int (arg1, 0) == 0)
>
> This can be integer_zerop.

Sure.

>> +case GE_EXPR:
>> +case EQ_EXPR:
>> +case LE_EXPR:
>> +  return boolean_false_node;
>> +case GT_EXPR:
>> +case LT_EXPR:
>> +case NE_EXPR:
>> +  return boolean_true_node;
>
> EQ and NE make sense, but I would expect both > and >= to be true, < and
> <= to be false.

I was convinced I had a reason for this but it doesn't seem
to affect regression test results so I must have been wrong.

Relational expressions involving object and null pointers are
undefined in C and I thought unspecified in C++, but given that
GCC evaluates (0 < p) to true it looks like you're right and C++
does seem to require all the others to evaluate as you said.

With the decision to remove the nullptr changes I tried to keep
the amount of testing of null pointers to the minimum necessary
to exercise the fix for comment #10 on 67376.  In light of your
expectation I've added a test to better exercise the relational
expressions involving pointers to struct data members.
Interestingly, stepping through it revealed that the problem
cases you pointed out above are actually handled by
generic_simplify() and never end up in fold_comparison().

Attached is an updated patch.


> Are we confident that arr[0] won't make it here as POINTER_PLUS_EXPR or
> some such?

I'm as confident as I can be given that this is my first time
working in this area.  Which piece of code or what assumption
in particular are you concerned about?

Martin
PR c++/67376 - [5/6 regression] Comparison with pointer to past-the-end
	of array fails inside constant expression
PR c++/70170 - [6 regression] bogus not a constant expression error comparing
	pointer to array to null
PR c++/70172 - incorrect reinterpret_cast from integer to pointer error
	on invalid constexpr initialization
PR c++/70228 - insufficient detail in diagnostics for a constexpr out of bounds
	array subscript

gcc/testsuite/ChangeLog:
2016-03-29  Martin Sebor  

	PR c++/67376
	PR c++/70170
	PR c++/70172
	PR c++/70228
	* g++.dg/cpp0x/constexpr-array-ptr10.C: New test.
	* g++.dg/cpp0x/constexpr-array-ptr9.C: New test.
	* g++.dg/cpp0x/constexpr-nullptr-1.C: New test.
	* g++.dg/cpp0x/constexpr-array5.C: Adjust text of expected diagnostic.
	* g++.dg/cpp0x/constexpr-string.C: Same.
	* g++.dg/cpp0x/constexpr-wstring2.C: Same.
	* g++.dg/cpp0x/pr65398.C: Same.
	* g++.dg/ext/constexpr-vla1.C: Same.
	* g++.dg/ext/constexpr-vla2.C: Same.
	* g++.dg/ext/constexpr-vla3.C: Same.
	* g++.dg/ubsan/pr63956.C: Same.

gcc/cp/ChangeLog:
2016-03-29  Martin Sebor  

	PR c++/67376
	PR c++/70170
	PR c++/70172
	PR c++/70228
	* constexpr.c (diag_array_subscript): New function.
	(cxx_eval_array_reference): Detect out of bounds array indices.

gcc/ChangeLog:
2016-03-29  Martin Sebor  

	PR c++/67376
	* fold-const.c (maybe_nonzero_address): New function.
	(fold_comparison): Call it.  Fold equality and relational
	expressions involving null pointers.
	(tree_single_nonzero_warnv_p): Call maybe_nonzero_address.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 8ea7111..2415094 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1837,6 +1837,30 @@ find_array_ctor_elt (tree ary, tree dindex, bool insert = false)
   return -1;
 }
 
+/* Under the control of CTX, issue a detailed diagnostic for
+   an out-of-bounds subscript INDEX into the expression ARRAY.  */
+
+static void
+diag_array_subscript (const constexpr_ctx *ctx, tree array, tree index)
+{
+  if (!ctx->quiet)
+{
+  tree arraytype = TREE_TYPE (array);
+
+  /* Convert the unsigned array subscript to a signed integer to avoid
+	 printing huge numbers for small negative values.  */
+  tree sidx = fold_convert (ssizetype, index);
+  if (DECL_P (array))
+	{
+	  error ("array subscript value %qE is outside the bounds "
+		 "of array %qD of type %qT", sidx, array, arraytype);
+	  inform (DECL_SOURCE_LOCATION (array), "declared here");
+	}
+  else
+	error ("array subscript value %qE is outside the bounds "
+	   "of array type %qT", sidx, arraytype);
+}
+}
 
 /* Subroutine of cxx_eval_constant_expression.
Attempt to reduce a reference to an array slot.  */
@@ -1861,6 +1885,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
 	false,
 	non_constant_p, overflow_p);
   VERIFY_CONSTANT (index);
+
   if (lval && ary == oldary && index == oldidx)
 return t;
   else if (lval)
@@ -1885,8 +1910,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
   if (!tree_fits_shwi_p (index)
   || (i = tree_to_shwi (index)) < 0)
 {
-  if (!ctx->quiet)
-	error ("negative array subscript");
+  diag_array_subscript (ctx, ary, index);
   *non_constant_p = true;
   return t;
 }
@@ -1898,8 +1922,7 @@ cxx_eval_array_reference (const constexpr_ctx *ctx, tree t,
   VERIFY_CONSTANT (nelts);
   i

Re: [PATCH] Fix num_imm_uses (PR tree-optimization/70405)

2016-03-29 Thread Richard Biener
On March 29, 2016 7:54:16 PM GMT+02:00, Jakub Jelinek  wrote:
>On Tue, Mar 29, 2016 at 07:46:32PM +0200, Bernd Schmidt wrote:
>> On 03/29/2016 07:28 PM, Jeff Law wrote:
>> >On 03/29/2016 11:23 AM, Jakub Jelinek wrote:
>> >>Hi!
>> >>
>> >>The recent change to num_imm_uses (to add support for NULL USE_STMT)
>> >>broke it totally, fortunately we have just one user of this function
>> >>right now.  I've filed a PR for GCC 7 so that we get a warning on this.
>> >>
>> >>Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
>> >>ok for trunk?
>> >>
>> >>2016-03-29  Jakub Jelinek  
>> >>
>> >>PR tree-optimization/70405
>> >>* ssa-iterators.h (num_imm_uses): Add missing braces.
>> >>
>> >>* gcc.dg/pr70405.c: New test.
>> >Not caught by -Wmisleading-indentation?  Seems like it'd be worth a bug
>> >report for that.
>> 
>> Actually this looks like the dangling-else regression I've complained about
>> previously. When I added that warning, I intentionally made it catch
>> 
>> if (foo)
>>   for (..)
>> if (bar)
>> ...
>> else
>>   
>> 
>> but at some point the code was changed so as to no longer warn for this
>> case.
>
>Indeed, GCC 3.4 warns about this:
>pr70405-3.c:7: warning: suggest explicit braces to avoid ambiguous `else'
>That warning is still in there under -Wparentheses, but doesn't trigger
>anymore.

Sounds like poor testsuite coverage then...  Did both FEs warn?

Richard.

>   Jakub




Re: [PATCH] Fix num_imm_uses (PR tree-optimization/70405)

2016-03-29 Thread Jakub Jelinek
On Wed, Mar 30, 2016 at 07:45:23AM +0200, Richard Biener wrote:
> >Indeed, GCC 3.4 warns about this:
> >pr70405-3.c:7: warning: suggest explicit braces to avoid ambiguous `else'
> >That warning is still in there under -Wparentheses, but doesn't trigger
> >anymore.
> 
> Sounds like poor testsuite coverage then...  Did both FEs warn?

Dunno.  GCC 2.96-RH, 3.2 and 3.4 certainly warned only in C, not C++;
4.0+ warns in neither C nor C++.  I don't have anything older around.

Jakub