[PATCH] Fix PR66396
Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-06-09  Richard Biener

	PR tree-optimization/66396
	* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl):
	Rename virtual operands.

Index: gcc/graphite-isl-ast-to-gimple.c
===================================================================
--- gcc/graphite-isl-ast-to-gimple.c	(revision 224212)
+++ gcc/graphite-isl-ast-to-gimple.c	(working copy)
@@ -1069,6 +1069,10 @@ graphite_regenerate_ast_isl (scop_p scop
   translate_isl_ast (context_loop, root_node, if_region->true_region->entry,
 		     ip);
+
+  mark_virtual_operands_for_renaming (cfun);
+  update_ssa (TODO_update_ssa);
+
   graphite_verify ();
   scev_reset ();
   recompute_all_dominators ();
Re: [Patch, C++, PR65882] Check tf_warning flag in build_new_op_1
This is not my patch, but I'd like to ping it anyway as it also fixes PR66467 (you might want to add a testcase from this PR as well). On Sun, Apr 26, 2015 at 02:11:14PM +0300, Mikhail Maltsev wrote: > The attached patch was bootstrapped and regtested on > x86_64-unknown-linux-gnu. > > -- > Regards, > Mikhail Maltsev > gcc/cp/ChangeLog: > > 2015-04-26 Mikhail Maltsev > > * call.c (build_new_op_1): Check tf_warning flag in all cases > > gcc/testsuite/ChangeLog: > > 2015-04-26 Mikhail Maltsev > > * g++.dg/diagnostic/inhibit-warn.C: New test. > > > diff --git a/gcc/cp/call.c b/gcc/cp/call.c > index 7bdf236..689d542 100644 > --- a/gcc/cp/call.c > +++ b/gcc/cp/call.c > @@ -5677,8 +5677,9 @@ build_new_op_1 (location_t loc, enum tree_code code, > int flags, tree arg1, > case TRUTH_ORIF_EXPR: > case TRUTH_AND_EXPR: > case TRUTH_OR_EXPR: > - warn_logical_operator (loc, code, boolean_type_node, > - code_orig_arg1, arg1, code_orig_arg2, arg2); > + if (complain & tf_warning) > + warn_logical_operator (loc, code, boolean_type_node, > +code_orig_arg1, arg1, code_orig_arg2, arg2); >/* Fall through. */ > case GT_EXPR: > case LT_EXPR: > @@ -5686,8 +5687,9 @@ build_new_op_1 (location_t loc, enum tree_code code, > int flags, tree arg1, > case LE_EXPR: > case EQ_EXPR: > case NE_EXPR: > - if ((code_orig_arg1 == BOOLEAN_TYPE) > - ^ (code_orig_arg2 == BOOLEAN_TYPE)) > + if ((complain & tf_warning) > + && ((code_orig_arg1 == BOOLEAN_TYPE) > + ^ (code_orig_arg2 == BOOLEAN_TYPE))) > maybe_warn_bool_compare (loc, code, arg1, arg2); >/* Fall through. */ > case PLUS_EXPR: > diff --git a/gcc/testsuite/g++.dg/diagnostic/inhibit-warn.C > b/gcc/testsuite/g++.dg/diagnostic/inhibit-warn.C > new file mode 100644 > index 000..5655eb4 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/diagnostic/inhibit-warn.C > @@ -0,0 +1,32 @@ > +// PR c++/65882 > +// { dg-do compile { target c++11 } } > +// { dg-options "-Wbool-compare" } > + > +// Check that we don't ICE because of reentering error reporting routines > while > +// evaluating template parameters > + > +template > +struct type_function { > + static constexpr bool value = false; > +}; > + > +template > +struct dependent_type { > + typedef int type; > +}; > + > +template > +typename dependent_type<(5 > type_function::value)>::type > +bar(); > + > +template > +typename dependent_type<(5 > type_function::value)>::type > +foo() > +{ > + return bar(); > +} > + > +int main() > +{ > + foo(); > +} Marek
Re: [PATCH] Optimize (CST1 << A) == CST2 (PR tree-optimization/66299)
On Mon, Jun 8, 2015 at 7:03 PM, Marc Glisse wrote: > On Mon, 8 Jun 2015, Marek Polacek wrote: > >> PR tree-optimization/66299 >> * match.pd ((CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1) >> ((CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1)): New > > > You are braver than I am, I would have abbreviated ctz (CST2) - ctz (CST1) > to CST3 in the ChangeLog ;-) > >> +/* (CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1) >> + (CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1) >> + if CST2 != 0. */ >> +(for cmp (ne eq) >> + (simplify >> + (cmp (lshift INTEGER_CST@0 @1) INTEGER_CST@2) >> + (with { >> + unsigned int cand = wi::ctz (@2) - wi::ctz (@0); } >> + (if (!integer_zerop (@2) > > > You can probably use directly wi::ne_p (@2, 0) here. Shouldn't this be > indented one space more? Yes, one space more. I suppose using integer_zerop might in theory allow for handling vector shifts at some point ...? >> + && wi::eq_p (wi::lshift (@0, cand), @2)) >> + (cmp @1 { build_int_cst (TREE_TYPE (@1), cand); }) > > > Making 'cand' signed, you could return 0 when cand<0, like (2< could also return 0 when the candidate turns out not to work: (3< Tweaking it so that (6<=31 for TYPE_OVERFLOW_WRAPS and > false for TYPE_OVERFLOW_UNDEFINED is probably more controversial. Hm, yes. I think signed overflow != shift amount overflow, so testing the overflow macros for this isn't valid. Otherwise the patch looks ok to me as well - mind doing the improvement above? Thanks, Richard. > FWIW, the patch looks good to me, thanks. > > -- > Marc Glisse
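To make the simplification under discussion concrete, here is a small source-level illustration (a sketch only; it is not taken from the patch or its testcases):

/* Illustration of the match.pd rule discussed above:
   (CST1 << A) == CST2 can be rewritten as A == CST3 when a suitable
   shift count exists, and folded to a constant when it does not.  */

int
f1 (int a)
{
  /* 2 << a == 16 holds exactly for a == 3, so this comparison can
     become a == 3.  */
  return (2 << a) == 16;
}

int
f2 (int a)
{
  /* No shift count makes 3 << a equal to 2, so this can fold to 0
     (one of the follow-up suggestions in the thread).  */
  return (3 << a) == 2;
}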
Re: [patch] Adjust gcc-plugin.h
On Mon, Jun 8, 2015 at 7:52 PM, Andrew MacLeod wrote: > On 06/08/2015 09:32 AM, Richard Biener wrote: >> >> On Mon, Jun 8, 2015 at 2:07 PM, Andrew MacLeod >> wrote: >>> >>> During the original flattening process I decided to use gcc-plugin.h as >>> the >>> kitchen sink for all includes that plugins might need. I think this has >>> worked well for plugins, drastically reducing their dependency on gcc >>> internal header file structure. >>> >>> What I didn't realize was that gcc's internal header (plugin.h) also >>> includes gcc-plugin.h. This means that every file which may need to do >>> something for plugins ends up indirectly including the gcc world again >>> :-P >>> >>> Easy fix. (ha). This patch leaves all the #includes in gcc-plugin.h >>> making >>> the change transparent to plugins. All the remaining declarations and >>> such >>> are moved into a new file gcc-plugin-common.h. Both gcc-plugin.h and >>> gcc's >>> internal header plugin.h now include this common file. >>> >>> The effect is that gcc's source files no longer get anything but the >>> required plugin info. Great.. Except there were a few files which were >>> apparently picking up some required headers from gcc-plugins.h :-P >>> This >>> patch also adds the required headers to those source files. >>> >>> Compiles on x86_64-unknown-linux-gnu with no new regressions. Also >>> compiles >>> across all targets in config-list.mk. OK for trunk? >> >> Err - why not simply remove the gcc-plugin.h include from plugin.h and >> instead >> include plugin.h from gcc-plugin.h? >> >> > > the gcc source files need to see the internal bits in plugin.h, as well as > the common decls in gcc-plugin.h. So we could change the includes as you > suggest, but we'd have to copy all the decls from gcc-inlcude.h to plugin.h > so the gcc functions can see them. And then the plugins would be exposed to > all the internal APIs and decls present in plugins.h plugins are exposed to all internals of GCC anyway. gcc-plugin.h should really just be a #include kitchen-sink. > Adding the 3rd file which contains all the common decls between both sides > is the only way to isolate both. If you were OK with exposing the > internal parts of plugin.h to plugin clients I could do that. Yes. > Im presuming > we didnt want to do that and thats why there were 2 files to start with. No, gcc-plugin.h was introduced to make the set of includes required for plugins "stable". Richard. > I > hijacked the external interface in gcc-plugin.h file to provide all the > includes when instead the right thing would have been to probably create a > new in the first place. > > Andrew > > >
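For background on why gcc-plugin.h is deliberately a kitchen-sink header, a minimal (hypothetical) out-of-tree plugin looks roughly like this; the point of the discussion is that plugins should keep getting everything from this one include, while GCC's own sources should not pay for that through plugin.h:

/* Minimal plugin skeleton, for illustration only.  */
#include "gcc-plugin.h"
#include "plugin-version.h"

int plugin_is_GPL_compatible;

int
plugin_init (struct plugin_name_args *plugin_info,
	     struct plugin_gcc_version *version)
{
  /* Refuse to load into a GCC we were not built against.  */
  if (!plugin_default_version_check (version, &gcc_version))
    return 1;

  /* Callbacks would be registered here with register_callback ().  */
  return 0;
}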
Re: debug-early branch merged into mainline
On Mon, Jun 8, 2015 at 10:25 PM, Aldy Hernandez wrote: > On 06/08/2015 02:59 PM, Richard Biener wrote: >> >> On June 8, 2015 7:14:19 PM GMT+02:00, Aldy Hernandez >> wrote: >>> >>> On 06/08/2015 09:30 AM, Richard Biener wrote: On Mon, Jun 8, 2015 at 2:05 PM, Aldy Hernandez >>> >>> wrote: > > On 06/08/2015 04:26 AM, Richard Biener wrote: >> >> >> On Mon, Jun 8, 2015 at 3:23 AM, Aldy Hernandez > > >>> What about if the comparison routine gets a named section and an >>> unnamed >>> section? How to compare? That's why I was giving priority to one over >>> >>> the other originally, but I didn't know about problematic qsort >>> implementations. >> >> >> Obviously unnamed and a named section can be sorted like you did in the >> original patch. > > > Obviously I'm not understanding :). > > How about this? Ok with adding v.create (object_block_htab->elements ()); and using v.quick_push () (avoids re-allocations) and with adding a v.release (); at the end of the function. And re-writing + return f1 < f2 ? -1 : (f1 > f2 ? 1 : 0); to if (f1 == f2) return 0; return f1 < f2 ? -1 : 1; Thanks, Richard. > Tested on x86-64 and ppc64le. > > Aldy
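For readers skimming the thread, the comparator shape being agreed on here is the usual three-way qsort idiom; a standalone sketch follows (names invented for illustration, not the committed code, which appears later in this digest):

#include <stdlib.h>

/* Three-way comparator of the shape discussed above: return 0 on
   equality and otherwise a strict -1/1 ordering, so the sort result is
   deterministic across qsort implementations.  */
static int
compare_flags (const void *x, const void *y)
{
  unsigned f1 = *(const unsigned *) x;
  unsigned f2 = *(const unsigned *) y;
  if (f1 == f2)
    return 0;
  return f1 < f2 ? -1 : 1;
}

int
main (void)
{
  unsigned v[] = { 4, 1, 3 };
  qsort (v, 3, sizeof (unsigned), compare_flags);
  return (v[0] == 1 && v[1] == 3 && v[2] == 4) ? 0 : 1;
}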
[PATCH] Fix assert in get_loop_body_in_bfs_order
Noticed when it crashed rather than asserted with broken loop structure.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-06-09  Richard Biener

	* cfgloop.c (get_loop_body_in_bfs_order): Fix assert.

Index: gcc/cfgloop.c
===================================================================
--- gcc/cfgloop.c	(revision 224266)
+++ gcc/cfgloop.c	(working copy)
@@ -954,7 +954,7 @@ get_loop_body_in_bfs_order (const struct
 	    }
 	}
 
-      gcc_assert (i >= vc);
+      gcc_assert (i > vc);
       bb = blocks[vc++];
     }
Re: [GCC, ARM] armv8 linux toolchain asan testcase fail due to stl missing conditional code
On 05/06/15 14:14, Kyrill Tkachov wrote: On 05/06/15 14:11, Richard Earnshaw wrote: On 05/06/15 14:08, Kyrill Tkachov wrote: Hi Shiva, On 05/06/15 10:42, Shiva Chen wrote: Hi, Kyrill I add the testcase as stl-cond.c. Could you help to check the testcase ? If it's OK, Could you help me to apply the patch ? This looks ok to me. One nit on the testcase: diff --git a/gcc/testsuite/gcc.target/arm/stl-cond.c b/gcc/testsuite/gcc.target/arm/stl-cond.c new file mode 100755 index 000..44c6249 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/stl-cond.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_arch_v8a_ok } */ +/* { dg-options "-O2" } */ This should also have -marm as the problem exhibited itself in arm state. I'll commit this patch with this change in 24 hours on your behalf if no one objects. Explicit use of -marm will break multi-lib testing. I've forgotten the correct hook, but there's most-likely something that will give you the right behaviour, even if it means that thumb-only multi-lib testing skips this test. So I think what we want is: dg-require-effective-target arm_arm_ok The comment in target-supports.exp is: # Return 1 if this is an ARM target where -marm causes ARM to be # used (not Thumb) I've committed the attached patch to trunk on Shiva's behalf with r224269. It gates the test on arm_arm_ok and adds -marm, like other similar tests. The ChangeLog I used is below: 2015-06-09 Shiva Chen * sync.md (atomic_load): Add conditional code for lda/ldr (atomic_store): Likewise. 2015-06-09 Shiva Chen * gcc.target/arm/stl-cond.c: New test. Thanks, Kyrill Kyrill R. Ramana, Richard, we need to backport it to GCC 5 as well, right? Thanks, Kyrill Thanks, Shiva 2015-06-05 16:34 GMT+08:00 Kyrill Tkachov : Hi Shiva, On 05/06/15 09:29, Shiva Chen wrote: Hi, Kyrill I update the patch as Richard's suggestion. - return \"str\t%1, %0\"; + return \"str%(%)\t%1, %0\"; else - return \"stl\t%1, %0\"; + return \"stl%?\t%1, %0\"; } -) + [(set_attr "predicable" "yes") + (set_attr "predicable_short_it" "no")]) + [(set_attr "predicable" "yes") + (set_attr "predicable_short_it" "no")]) Let me sum up. We add predicable attribute to allow gcc do if-conversion in ce1/ce2/ce3 not only in final phase by final_prescan_insn finite state machine. We set predicalble_short_it to "no" to restrict conditional code generation on armv8 with thumb mode. However, we could use the flags -mno-restrict-it to force generating conditional code on thumb mode. Therefore, we have to consider the assembly output format for strb with condition code on arm/thumb mode. Because arm/thumb mode use different syntax for strb, we output the assembly as str%(%) which will put the condition code in the right place according to TARGET_UNIFIED_ASM. Is there still missing something ? That's all correct, and well summarised :) The patch looks good to me, but please include the testcase (test.c from earlier) appropriately marked up for the testsuite. I think to the level of dg-assemble, just so we know everything is wired up properly. Thanks for dealing with this. Kyrill Thanks, Shiva 2015-06-04 18:00 GMT+08:00 Kyrill Tkachov : Hi Shiva, On 04/06/15 10:57, Shiva Chen wrote: Hi, Kyrill Thanks for the tips of syntax. 
It seems that correct syntax for ldrb with condition code is ldreqb ldab with condition code is ldabeq So I modified the pattern as follow { enum memmodel model = (enum memmodel) INTVAL (operands[2]); if (model == MEMMODEL_RELAXED || model == MEMMODEL_CONSUME || model == MEMMODEL_RELEASE) return \"ldr%?\\t%0, %1\"; else return \"lda%?\\t%0, %1\"; } [(set_attr "predicable" "yes") (set_attr "predicable_short_it" "no")]) It seems we don't have to worry about thumb mode, I suggest you use Richard's suggestion from: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00384.html to write this in a clean way. Because we already set "predicable" "yes" and predicable_short_it" "no" for the pattern. That's not quite true. The user may compile for armv8-a with -mno-restrict-it which will turn off this restriction for Thumb and allow the conditional execution of this. In any case, I think Richard's suggestion above should work. Thanks, Kyrill The new patch could build gcc and run gcc regression test successfully. Please correct me if I still missing something. Thanks, Shiva -Original Message- From: Richard Earnshaw [mailto:richard.earns...@foss.arm.com] Sent: Thursday, June 04, 2015 4:42 PM To: Kyrill Tkachov; Shiva Chen Cc: Ramana Radhakrishnan; GCC Patches; ni...@redhat.com; Shiva Chen Subject: Re: [GCC, ARM] armv8 linux toolchain asan testcase fail due to stl missing conditional code On 04/06/15 09:17, Kyrill Tkachov wrote: Hi Shiva, On 04/06/15 04:13, Shiva Chen wrote: Hi, Ramana Currently
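For reference, the source constructs that exercise these sync.md patterns are simply acquire loads and release stores; a minimal example (illustrative only):

/* The acquire load and release store below are what expand to the
   lda/ldr and stl/str alternatives discussed above; the bug was about
   emitting the correct assembly when the compiler conditionalizes
   these instructions on ARMv8.  */
int
load_acquire (const int *p)
{
  return __atomic_load_n (p, __ATOMIC_ACQUIRE);
}

void
store_release (int *p, int v)
{
  __atomic_store_n (p, v, __ATOMIC_RELEASE);
}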
Re: [PR64164] drop copyrename, integrate into expand
On 8 June 2015 at 10:14, Richard Biener wrote: > On Sat, Jun 6, 2015 at 3:14 AM, Alexandre Oliva wrote: >> On Apr 27, 2015, Richard Biener wrote: >> >>> This should also mention that is_gimple_reg vars do not have their >>> address taken. >> >> check >> +static tree +leader_merge (tree cur, tree next) >> >>> Ick - presumably you can't use sth better than a TREE_LIST here? >> >> The list was an experiment that never really worked, and when I tried to >> make it work after the patch, it proved to be unworkable, so I dropped >> it, and rewrote leader_merge to choose either of the params, preferring >> anonymous over ignored over named, so as to reduce the likelihood of >> misreading of debug dumps, since that's all they're used for. >> static void -expand_one_stack_var (tree var) +expand_one_stack_var_1 (tree var) { HOST_WIDE_INT size, offset; unsigned byte_align; - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var))); - byte_align = align_local_variable (SSAVAR (var)); + if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var)) +{ + size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var))); + byte_align = align_local_variable (SSAVAR (var)); +} + else >> >>> I'd go here for all TREE_CODE (var) == SSA_NAME >> >> Check >> >>> (and get rid of the SSAVAR macro?) >> >> There are remaining uses that don't seem worth dropping it for. >> +/* Return the promoted mode for name. If it is a named SSA_NAME, it + is the same as promote_decl_mode. Otherwise, it is the promoted + mode of a temp decl of same type as the SSA_NAME, if we had created + one. */ + +machine_mode +promote_ssa_mode (const_tree name, int *punsignedp) +{ + gcc_assert (TREE_CODE (name) == SSA_NAME); + + if (SSA_NAME_VAR (name)) +return promote_decl_mode (SSA_NAME_VAR (name), punsignedp); >> >>> As above I'd rather not have different paths for anonymous vs. non-anonymous >>> vars (so just delete the above two lines). >> >> Check >> @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, pmode = promote_function_mode (type, mode, &unsignedp, gimple_call_fntype (g), 2); + else if (!exp) + { + gcc_assert (code == SSA_NAME); >> >>> promote_ssa_mode should assert this. >> + pmode = promote_ssa_mode (ssa_name, &unsignedp); >> >> It does, so... check. >> >> @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree fntype) bool use_register_for_decl (const_tree decl) { + if (TREE_CODE (decl) == SSA_NAME) +{ + if (!SSA_NAME_VAR (decl)) + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl))); + + decl = SSA_NAME_VAR (decl); >> >>> See above. Please drop the SSA_NAME_VAR != NULL path. >> >> Check, then taken back, after a bootstrap failure and some debugging >> made me realize this would be wrong. Here are the nearly-added comments >> that explain why: >> >> /* We often try to use the SSA_NAME, instead of its underlying >> decl, to get type information and guide decisions, to avoid >> differences of behavior between anonymous and named >> variables, but in this one case we have to go for the actual >> variable if there is one. The main reason is that, at least >> at -O0, we want to place user variables on the stack, but we >> don't mind using pseudos for anonymous or ignored temps. >> Should we take the SSA_NAME, we'd conclude all SSA_NAMEs >> should go in pseudos, whereas their corresponding variables >> might have to go on the stack. 
So, disregarding the decl >> here would negatively impact debug info at -O0, enable >> coalescing between SSA_NAMEs that ought to get different >> stack/pseudo assignments, and get the incoming argument >> processing thoroughly confused by PARM_DECLs expected to live >> in stack slots but assigned to pseudos. */ >> >> +++ b/gcc/gimple-expr.h +/* Defined in tree-ssa-coalesce.c. */ +extern bool gimple_can_coalesce_p (tree, tree); >> >>> Err, put it to tree-ssa-coalesce.h? >> >> Check. Lots of additional headers required to be able to include >> tree-ssa-coalesce.h, though. >> >> - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var))); + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name))); >> >>> The TREE_TYPE of name and its SSA_NAME_VAR are always the same. So just >>> use TREE_TYPE (name) here. >> >> Check >> gcc_assert (!REG_P (dest_rtx) - || dest_mode == promote_decl_mode (var, &unsignedp)); + || dest_mode == promote_ssa_mode (name, &unsignedp)); if (src_mode != dest
Re: [patch] libstdc++/66030 fix codecvt exports for mingw32
On 08/06/15 09:59 -0600, Martin Sebor wrote:
> On 06/08/2015 09:12 AM, Jonathan Wakely wrote:
>> The linker script assumes that std::mbstate_t has the name __mbstate_t
>> for linkage purposes, but that's not necessarily true.  For mingw32 it's
>> just a typedef for int, so the patterns don't match.  This adds a new
>> mingw32-specific pattern for codecvt_byname's constructors and
>> destructors, and relaxes the patterns for codecvt so they match
>> __mbstate_t or int.
>
> As a data point, in case other targets have a similar issue, mbstate_t
> is a typedef for char* on AIX, and (based on my old notes) typedef
> struct mbstate_t on HP-UX. (It is a typedef struct __mbstate_t on
> Darwin and Solaris.)

That is useful, thanks. AIX and HP-UX don't use that linker script, so
it's not a problem (for now) that it assumes __mbstate_t.
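To see why the exported names differ between targets, note that whatever type std::mbstate_t resolves to is baked into the mangled names of the codecvt instantiations; a small illustration (the printed name is target- and ABI-dependent):

#include <cwchar>
#include <iostream>
#include <locale>
#include <typeinfo>

int
main ()
{
  // On GCC the name printed here is the mangled type, which embeds the
  // target's definition of std::mbstate_t (__mbstate_t, int, ...) --
  // the same spelling the version-script patterns have to match.
  std::cout << typeid (std::codecvt<char, char, std::mbstate_t>).name ()
	    << '\n';
  return 0;
}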
Re: debug-early branch merged into mainline
On 06/09/2015 04:00 AM, Richard Biener wrote: On Mon, Jun 8, 2015 at 10:25 PM, Aldy Hernandez wrote: On 06/08/2015 02:59 PM, Richard Biener wrote: On June 8, 2015 7:14:19 PM GMT+02:00, Aldy Hernandez wrote: On 06/08/2015 09:30 AM, Richard Biener wrote: On Mon, Jun 8, 2015 at 2:05 PM, Aldy Hernandez wrote: On 06/08/2015 04:26 AM, Richard Biener wrote: On Mon, Jun 8, 2015 at 3:23 AM, Aldy Hernandez What about if the comparison routine gets a named section and an unnamed section? How to compare? That's why I was giving priority to one over the other originally, but I didn't know about problematic qsort implementations. Obviously unnamed and a named section can be sorted like you did in the original patch. Obviously I'm not understanding :). How about this? Ok with adding I've committed the attached patch. v.create (object_block_htab->elements ()); and using v.quick_push () (avoids re-allocations) and with adding a v.release (); For some reason I thought the new C++ vector world released stuff on its own if allocated on the heap. Oh well... Thanks. Aldy commit d2f1a9cd6e91c400ab4fa569565eece1ae08b64d Author: Aldy Hernandez Date: Tue Jun 9 05:39:34 2015 -0400 * varasm.c (output_object_block_htab): Remove. (output_object_block_compare): New. (output_object_blocks): Sort named object_blocks before outputting them. diff --git a/gcc/varasm.c b/gcc/varasm.c index 18f3eac..86a70db 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -7420,14 +7420,31 @@ output_object_block (struct object_block *block) } } -/* A htab_traverse callback used to call output_object_block for - each member of object_block_htab. */ +/* A callback for qsort to compare object_blocks. */ -int -output_object_block_htab (object_block **slot, void *) +static int +output_object_block_compare (const void *x, const void *y) { - output_object_block (*slot); - return 1; + object_block *p1 = *(object_block * const*)x; + object_block *p2 = *(object_block * const*)y; + + if (p1->sect->common.flags & SECTION_NAMED + && !(p2->sect->common.flags & SECTION_NAMED)) +return 1; + + if (!(p1->sect->common.flags & SECTION_NAMED) + && p2->sect->common.flags & SECTION_NAMED) +return -1; + + if (p1->sect->common.flags & SECTION_NAMED + && p2->sect->common.flags & SECTION_NAMED) +return strcmp (p1->sect->named.name, p2->sect->named.name); + + unsigned f1 = p1->sect->common.flags; + unsigned f2 = p2->sect->common.flags; + if (f1 == f2) +return 0; + return f1 < f2 ? -1 : 1; } /* Output the definitions of all object_blocks. */ @@ -7435,7 +7452,23 @@ output_object_block_htab (object_block **slot, void *) void output_object_blocks (void) { - object_block_htab->traverse (NULL); + vec v; + v.create (object_block_htab->elements ()); + object_block *obj; + hash_table::iterator hi; + + FOR_EACH_HASH_TABLE_ELEMENT (*object_block_htab, obj, object_block *, hi) +v.quick_push (obj); + + /* Sort them in order to output them in a deterministic manner, + otherwise we may get .rodata sections in different orders with + and without -g. */ + v.qsort (output_object_block_compare); + unsigned i; + FOR_EACH_VEC_ELT (v, i, obj) +output_object_block (obj); + + v.release (); } /* This function provides a possible implementation of the
Re: Fix more of C/fortran canonical type issues
On Mon, 8 Jun 2015, Jan Hubicka wrote:
> > I think we should instead work towards eliminating the get_alias_set
> > langhook first.  The LTO langhook variant contains the same handling, btw,
> > so just inline that into get_alias_set and see what remains?
>
> I see, i completely missed existence of gimple_get_alias_set. It makes more
> sense now.
>
> Is moving everyting to alias.c realy a desirable thing? If non-C languages do
> not have this rule, why we want to reduce the code quality when compiling
> those?

Well, for consistency and for getting rid of one langhook ;)

Richard.

> Honza

--
Richard Biener
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu,
Graham Norton, HRB 21284 (AG Nuernberg)
Re: [Patch libstdc++] Rewrite cpu/generic/atomic_word.h
On 22/05/15 17:56, Torvald Riegel wrote: On Fri, 2015-05-22 at 12:37 +0100, Ramana Radhakrishnan wrote: Hi, While writing atomic_word.h for the ARM backend to fix PR target/66200 I thought it would make more sense to write it all up with atomic primitives instead of providing various fragile bits of inline asssembler. Thus this patch came about. I intend to ask for a specialized version of this to be applied for the ARM backend in any case after testing completes. If this goes in as a cleanup I expect all the ports to be deleting their atomic_word.h implementation in the various ports directories and I'll follow up with patches asking port maintainers to test and apply for each of the individual cases if this is deemed to be good enough. Could you use C++11 atomics right away instead of adapting the wrappers? I spent some time trying to massage guard.cc into using C++11 atomics but it was just easier to write it with the builtins - I consider this to be a step improvement compared to where we are currently. Rewritten to use the builtins in guard.cc instead of std::atomic as that appears to be a bigger project than moving forward compared to where we are. I prefer small steps rather than big steps in these areas. Further I do not believe you can use the C++11 language features in the headers as they were not necessarily part of the standard for tr1 etc. and thus it's better to remain with the builtins, after all I am also continuing with existing practice in the headers. I think the more non-C++11 atomic operations/wrappers we can remove the better. diff --git a/libstdc++-v3/config/cpu/generic/atomic_word.h b/libstdc ++-v3/config/cpu/generic/atomic_word.h index 19038bb..bedce31 100644 --- a/libstdc++-v3/config/cpu/generic/atomic_word.h +++ b/libstdc++-v3/config/cpu/generic/atomic_word.h @@ -29,19 +29,46 @@ #ifndef _GLIBCXX_ATOMIC_WORD_H #define _GLIBCXX_ATOMIC_WORD_H 1 +#include + typedef int _Atomic_word; -// Define these two macros using the appropriate memory barrier for the target. -// The commented out versions below are the defaults. -// See ia64/atomic_word.h for an alternative approach. + +namespace __gnu_cxx _GLIBCXX_VISIBILITY(default) +{ + // Test the first byte of __g and ensure that no loads are hoisted across + // the test. That comment is not quite correct. I'd just say that this is a memory_order_acquire load and a comparison. Done. + inline bool + __test_and_acquire (__cxxabiv1::__guard *__g) + { +unsigned char __c; +unsigned char *__p = reinterpret_cast(__g); +__atomic_load (__p, &__c, __ATOMIC_ACQUIRE); +return __c != 0; + } + + // Set the first byte of __g to 1 and ensure that no stores are sunk + // across the store. Likewise; I'd just say this is a memory_order_release store with 1 as value. Ok. I've now realized that this is superfluous and better to fix the one definition in guard.cc to do the right thing instead of something like this. + inline void + __set_and_release (__cxxabiv1::__guard *__g) + { +unsigned char *__p = reinterpret_cast(__g); +unsigned char val = 1; +__atomic_store (__p, &val, __ATOMIC_RELEASE); + } + +#define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __gnu_cxx::__test_and_acquire (G) +#define _GLIBCXX_GUARD_SET_AND_RELEASE(G) __gnu_cxx::__set_and_release (G) // This one prevents loads from being hoisted across the barrier; // in other words, this is a Load-Load acquire barrier. Likewise; I'd just say that this is an mo_acquire fence. -// This is necessary iff TARGET_RELAXED_ORDERING is defined in tm.h. 
-// #define _GLIBCXX_READ_MEM_BARRIER __asm __volatile ("":::"memory") +// This is necessary iff TARGET_RELAXED_ORDERING is defined in tm.h. +#define _GLIBCXX_READ_MEM_BARRIER __atomic_thread_fence (__ATOMIC_ACQUIRE) // This one prevents stores from being sunk across the barrier; in other // words, a Store-Store release barrier. Likewise; mo_release fence. -// #define _GLIBCXX_WRITE_MEM_BARRIER __asm __volatile ("":::"memory") +#define _GLIBCXX_WRITE_MEM_BARRIER __atomic_thread_fence (__ATOMIC_RELEASE) + +} I don't think we can remove _GLIBCXX_READ_MEM_BARRIER and _GLIBCXX_WRITE_MEM_BARRIER from atomic_word.h even though they are superseded by the atomics as it is published in the documentation as available macros. It was obvious that alpha, powerpc, aix, ia64 could just fall back to the standard implementations. The cris and sparc implementations have different types for _Atomic_word from generic and the rest - my guess is sparc can remove the _GLIBCXX_READ_MEM_BARRIER and _GLIBCXX_WRITE_MEM_BARRIER but I have no way of testing this is sane. I'll leave that to the respective target maintainers for sparc and cris to deal as appropriate. 2015-06-09 Ramana Radhakrishnan PR c++/66192 PR target/66200 * config/cpu/generic/atomic_word.h (_GLIBCXX_READ_MEM_BARRIER): Define (_GLIBCXX_WRITE_MEM_BARRIER): Like
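As a reader aid, the acquire/release pairing above corresponds to the usual one-time-initialization guard idiom; a deliberately simplified sketch follows (hypothetical, not the actual libsupc++/guard.cc code, which also handles contention and exceptions):

static unsigned char guard;
static int value;

static int
get_value ()
{
  unsigned char c;
  __atomic_load (&guard, &c, __ATOMIC_ACQUIRE);	/* test-and-acquire */
  if (c == 0)
    {
      /* The real guard code serializes concurrent initializers here;
	 omitted for brevity.  */
      value = 42;				/* guarded initialization */
      unsigned char one = 1;
      __atomic_store (&guard, &one, __ATOMIC_RELEASE);	/* set-and-release */
    }
  return value;
}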
Re: [PATCH, ARM] attribute target (thumb,arm) [6/6] respin (5th)
Hi Christian, On 18/05/15 09:41, Christian Bruel wrote: On 05/06/2015 04:29 PM, Christian Bruel wrote: >Implement the -mflip-thump option. Undocumented for internal testing >only. This option artificially inserts alternative attribute thumb/modes >on functions. > >This close the patch set. Thanks for your review, > >Christian > to close the series. redo patch rebased. static bool diff '--exclude=.svn' -ruN gnu_trunk.p5/gcc/gcc/config/arm/arm.opt gnu_trunk.p6/gcc/gcc/config/arm/arm.opt --- gnu_trunk.p5/gcc/gcc/config/arm/arm.opt 2015-05-18 10:20:26.545835080 +0200 +++ gnu_trunk.p6/gcc/gcc/config/arm/arm.opt 2015-05-13 13:13:11.014686529 +0200 @@ -122,6 +122,10 @@ EnumValue Enum(float_abi_type) String(hard) Value(ARM_FLOAT_ABI_HARD) +mflip-thumb +Target Report Var(TARGET_FLIP_THUMB) +Switch ARM/Thumb modes on alternating functions for compiler testing How about adding 'Undocumented' to the properties here? Can we also get a test or two just to sanity check the option? Otherwise the patch looks ok to me. Thanks, Kyrill
[PATCH] Add extensions to dwarf2.def
Hello, I'm currently working on migrating debugging information for Ada from GNAT encodings to standard DWARF. At the moment, I have worked on two topics that I believe are not (completely) supported in standard DWARF: - fixed point types with arbitrary scale factors; - scalar types with biased representations. My goal is to submit an issue on dwarfstd.org in an attempt to introduce these extensions to the next DWARF standard. Before that, though, I would like to make sure that these extensions actually fit the need by having them supported both in GCC and GDB. The two attached patches make these extensions "public" so that no other vendor-specific tags/attributes conflict with them in the future. I cannot submit the patches that actually use these right now because I need first to port them from the 4.9 branch onto mainline (I hope I will be able to do this on early July). May I commit them? I also attached two documents that describe how to use these extensions. I guess this should go to the wiki just like for DW_AT_GNAT_descriptive_type (https://gcc.gnu.org/wiki/DW_AT_GNAT_descriptive_type). I will do this if the patches are integrated. Thank you in advance! include/ * dwarf2.def (DW_TAG_GNU_rational_constant): New tag. (DW_AT_GNU_numerator, DW_AT_GNU_denominator): New attributes. include/ * dwarf2.def (DW_AT_GNU_bias): New attribute. -- Pierre-Marie de Rodat == Introduction == Ada makes it possible to define fixed point types. In order to specify how these will be represented at runtime, the compiler selects a size in bytes and a "scale factor", which is a strictly positive rational number. Assuming type T is associated with the S scale factor, then values V will be represented at runtime by R so that V = S * R. == Fixed points in DWARF today == Since DWARFv3, there are several ways to describe fixed point types, all of these require to use DW_TAG_base_type DIEs. The encoding must be DW_ATE_signed_fixed or DW_ATE_unsigned_fixed and the scale factor must be encoded using one of the following attributes: * DW_AT_decimal_scale, which makes it possible to represent scale factors of the form 10**N (with N any signed integer). * DW_AT_binary_scale, which makes it possible to represent scale factors of the form 2**N (with N any signed integer). * DW_AT_small, which is a reference to a DW_TAG_constant entry providing the scale factor itself. == Limitations == While DW_AT_small enables one to describe any scale factor the two other attributes could, there are still some scale factors that are not encodable as of today. In Ada, it is possible to define fixed point types with scale factors such as 1/3. For instance: type T is delta 0.1 range 0.0 .. 1.0 with Small => 1.0/3.0; Trying to represent 1/3 with a DW_TAG_constant leads to a chicken an egg problem today: the constant is encoded in the DW_AT_const_value attribute, which must be "the actual constant value [...] represented as it would be on the target architecture". 1/3 can be represented on the target architecture, but only with a fixed point type... whose scalar factor is 1/3, or 1/30, etc. == Proposed extension == One new tag and two new attributes are proposed to solve this problem: * DW_TAG_GNU_rational_constant, whose purpose is to describe any rational constant. It must have exactly two attributes: * DW_AT_GNU_numerator and DW_AT_GNU_denominator, whose forms are any constant form (DW_FORM_*data*) and which represent an integer (unsigned, unless the form represents a signed one). 
The rational constant represented by DW_TAG_GNU_rational_constant is N/D where: * N is the integer represented by the DW_AT_GNU_numerator attribute; * D is the integer represented by the DW_AT_GNU_denominator attribute; With this new tag, it is possible to represent arbitrary rational scale factors for fixed point types, enabling one to describe all possible Ada fixed point types. == Introduction == In Ada, it is possible for a scalar type to have a biased representation in some contexts. For instance: type Small_Type is range 50 .. 57; type Record_Type is record A, B : Small_Type; end record; for Record_Type use record A at 0 range 0 .. 2; B at 0 range 3 .. 5; end record; In Record_Type, the A and B fields will be represented on 3 bits each. This is possible because they have a biased representation: * the 000 bit pattern represents 50; * the 001 one represents 51; * etc. So in this example, the representation for A and B have a bias of 50. The general principle is that given some type T, a bias B and a value V, the runtime representation for V is: V - B There is no way to describe this representation in DWARF today. == Proposed extension == One new attribute is proposed to lift this limitation: DW_AT_GNU_bias, whose form is any constant form and which represents an integer (unsigned, unless the form represents a sign
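To make the biased-representation proposal concrete for non-Ada readers, here is a C-style analogue of the runtime encoding described above (this only illustrates what DW_AT_GNU_bias describes; it says nothing about the DWARF encoding itself):

/* Values in [50, 57] are stored in 3 bits as value - 50; a debugger
   needs the bias (50) to turn the raw bits back into user-level
   values, which is exactly what DW_AT_GNU_bias would tell it.  */
struct record_type
{
  unsigned a : 3;	/* raw bits, bias 50 */
  unsigned b : 3;	/* raw bits, bias 50 */
};

static unsigned
encode (unsigned value)
{
  return value - 50;
}

static unsigned
decode (unsigned raw)
{
  return raw + 50;
}

int
main ()
{
  struct record_type r;
  r.a = encode (53);
  r.b = encode (57);
  return (decode (r.a) == 53 && decode (r.b) == 57) ? 0 : 1;
}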
Re: [patch] Adjust hash-table.h and it's pre-requisite includes.
On 06/05/2015 05:24 PM, Andrew MacLeod wrote: > There is a horrible morass of include dependencies between hash-map.h, > mem-stats.h and hash-table.h. There are even includes in both directions > (mem-stats.h and hash-map.h include each other, as do hash-map.h and > hash-table.h.. blech). Some of those files need parts of the other file to > compile, and those whole mess is quite awful. They also manage to include > vec.h into their little party 3 times as well, and it also has some icky > #ifdefs. > > So I spent some time sorting out the situation, and reduced it down to a > straight dependency list, rooted by hash-table.h. There are no double > direction includes, and no header is included more than once. Once sorted > out, I moved the root of this tree into coretypes.h since pretty much every > file requires everything in the dependency chain. This chain consists of > statistics.h, ggc.h, vec.h, hashtab.h, inchash.h, mem-stats-traits.h, > hash-map-traits.h, mem-stats.h, hash-map.h and hash-table.h. Hello Andrew. Thank you for solving issue related to the aforementioned cycle. Few weeks ago, I implemented memory statistics support for hash-{set|table|map}, which internally uses a hash-map and that's the reason why it was a bit awful. Anyway, nice work! Martin > > With hash-table.h at the root of the dependency list, I wondered how many > files actually need just that. So I flattened a source tree such that > coretypes.h included the other required include files, but each .c file > included hash-table.h. Then I tried removing the includes. It turned out > that virtually every file needs hash-table.h. Part of that is due to how > tightly integrated with mem-stats.h it is (they still need each other), and > that is used throughout the compiler. So I think it makes sense to put that > in coretypes.h. > > I also noticed that hash-set.h is included in a lot of places as well. > Wondering how much it was actually needed, I preformed the same flattening > exercise and found that only about 10% of the files in gcc core didn't need > it to compile... the rest all needed it due to hash_set being in a > prototype parameter list or in a structure declared in a commonly used header > file (function.h, gimple-walk.h, tree-core.h, tree.h,...) . It would be a > lot of work to remove this dependency (if its even possible), so I added > hash-set.h to coretypes.h as well. rtl.h needed hash-table.h added to the > GENERATOR list, but not hash-set.. I guess the generators don't use it much > :-) > > The only other thing of note is the change to vec.h. It had an ugly set of > checks to see whether it was being used in a generator file, and if not > whether GC was available, then included it if it wasn't or provided 3 > prototypes if it wasn't suppose to be included. These allows it to compile > when GC isn't available (those routines referencing the GC functions would > never be referred to when GC isnt available).With my other changes, most > of those checks weren't necessary. I also figured it was best to simply > include those 3 prototypes for ggc_free, ggc_round_alloc_size, and > ggc_realloc all the time. When there isn't ggc.h, things remain as they are > now. If there is a ggc.h included, it will provide static confirmation that > those prototypes are up-to-date and in sync with how ggc.h defines them. > > The first patch contains all of those changes. The second one is fully > automated and removes all these headers from every other .c and .h file in > the compiler. 
This also included changes to many of the gen*.c routines. I > adjusted the #include list for all the *generated* .c files to also be up to > date with this patchset as well at the previous one which moved wide-int and > friends into coretypes.h > > This bootstraps with all languages enabled on x86_64-unknown-linux-gnu with > no new regressions. It also causes no failures for all the targets in > config-list.mk. > > OK for trunk? > > Andrew
Re: [PING^2] [PATCH][5a/5] Postpone expanding va_arg until pass_stdarg
Hmmm. One side effect of this is that the line number information available in the target hook gimplify_va_arg_expr, is now just the name of the containing function, rather than the specific use of va_arg. Is there some way to get this more precise location (e.g. gimple_location(stmt) in expand_ifn_va_arg_1, the only caller of said hook)? I don't really want to have to add an extra parameter to the target hook... --Alan Richard Biener wrote: On Thu, 16 Apr 2015, Tom de Vries wrote: [stage1 ping^2] On 10-03-15 16:30, Tom de Vries wrote: [stage1 ping] On 22-02-15 14:13, Tom de Vries wrote: On 19-02-15 14:03, Richard Biener wrote: On Thu, 19 Feb 2015, Tom de Vries wrote: On 19-02-15 11:29, Tom de Vries wrote: Hi, I'm posting this patch series for stage1: - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch - 0002-Add-gimple_find_sub_bbs.patch - 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch - 0004-Handle-internal_fn-in-operand_equal_p.patch - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch The patch series - based on Michael's initial patch - postpones expanding va_arg until pass_stdarg, which makes pass_stdarg more robust. Bootstrapped and reg-tested on x86_64 using all languages, with unix/ and unix/-m32 testing. I'll post the patches in reply to this email. This patch postpones expanding va_arg until pass_stdarg. We add a new internal function IFN_VA_ARG. During gimplification, we map VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified in to a gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into actual code. There are a few implementation details worth mentioning: - passing the type beyond gimplification is done by adding a NULL pointer- to-type to IFN_VA_ARG. - there is special handling for IFN_VA_ARG that would be most suited to be placed in gimplify_va_arg_expr. However, that function lacks the scope for the special handling, so it's placed awkwardly in gimplify_modify_expr. - there's special handling in case the va_arg type is variable-sized. gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR IFN_VA_ARG for variable-sized types. However, this is gimplified into a gimple_call which does not have the possibility to wrap it's result in a WITH_SIZE_EXPR. So we're adding the size argument of the WITH_SIZE_EXPR as argument to IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the subsequent gimplify_assign will generate a memcpy if necessary. - when gimplifying the va_arg argument ap, it may not be addressable. So gimplification will generate a copy ap.1 = ap, and use &ap.1 as argument. This means that we have to copy back the ap.1 value to ap after IFN_VA_ARG. The copy is classified by the va_list_gpr/fpr_size optimization as an escape, so it inhibits optimization. The tree-ssa/stdarg-2.c f15 update is because of that. OK for stage1? Looks mostly good, though it looks like with -O0 this doesn't delay lowering of va-arg and thus won't "fix" offloading. Can you instead introduce a PROP_gimple_lva, provide it by the stdarg pass and add a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run that runs of !PROP_gimple_lva (and also provides it), and require PROP_gimple_lva by pass_expand? (just look for PROP_gimple_lcx for the complex stuff to get an idea what needs to be touched) Updated according to comments. Furthermore (having updated the patch series to recent trunk), I'm dropping the ACCEL_COMPILER bit in pass_stdarg::gate. 
AFAIU the comment there relates to this patch. Retested as before. OK for stage1? Ping. Ping again. Patch originally posted at: https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01332.html . Ok. Thanks, Richard. Thanks, - Tom Btw, I'm wondering if as run-time optimization we can tentatively set PROP_gimple_lva at the start of the gimple pass, and unset it in gimplify_va_arg_expr. That way we would avoid the loop in expand_ifn_va_arg_1 (over all bbs and gimples) in functions without va_arg. Taken care of in follow-up patch 5b.
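For readers less familiar with the stdarg machinery, the affected source construct is any use of va_arg; a small example, with comments summarizing the new lowering described in this thread:

#include <stdarg.h>

/* With the patch series, the va_arg below is gimplified into an
   internal IFN_VA_ARG call (carrying the requested type as an extra
   pointer argument) and only expanded into target-specific code later,
   in pass_stdarg / pass_lower_vaarg, rather than during
   gimplification.  */
int
sum (int count, ...)
{
  va_list ap;
  int total = 0;
  va_start (ap, count);
  for (int i = 0; i < count; i++)
    total += va_arg (ap, int);
  va_end (ap);
  return total;
}

int
main ()
{
  return sum (3, 1, 2, 3) == 6 ? 0 : 1;
}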
Re: [PING^2] [PATCH][5a/5] Postpone expanding va_arg until pass_stdarg
On Tue, 9 Jun 2015, Alan Lawrence wrote: > Hmmm. One side effect of this is that the line number information available in > the target hook gimplify_va_arg_expr, is now just the name of the containing > function, rather than the specific use of va_arg. Is there some way to get > this more precise location (e.g. gimple_location(stmt) in expand_ifn_va_arg_1, > the only caller of said hook)? I don't really want to have to add an extra > parameter to the target hook... The x86 variant doesn't use any locations but if then the caller of the target hook (expand_ifn_va_arg_1) should assign the IFNs location to all statements expanded from it (it could set input_location to that during the target hook call...) Richard. > --Alan > > Richard Biener wrote: > > On Thu, 16 Apr 2015, Tom de Vries wrote: > > > > > [stage1 ping^2] > > > On 10-03-15 16:30, Tom de Vries wrote: > > > > [stage1 ping] > > > > On 22-02-15 14:13, Tom de Vries wrote: > > > > > On 19-02-15 14:03, Richard Biener wrote: > > > > > > On Thu, 19 Feb 2015, Tom de Vries wrote: > > > > > > > > > > > > > On 19-02-15 11:29, Tom de Vries wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > I'm posting this patch series for stage1: > > > > > > > > - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch > > > > > > > > - 0002-Add-gimple_find_sub_bbs.patch > > > > > > > > - > > > > > > > > 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch > > > > > > > > - 0004-Handle-internal_fn-in-operand_equal_p.patch > > > > > > > > - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch > > > > > > > > > > > > > > > > The patch series - based on Michael's initial patch - postpones > > > > > > > > expanding > > > > > > > > va_arg > > > > > > > > until pass_stdarg, which makes pass_stdarg more robust. > > > > > > > > > > > > > > > > Bootstrapped and reg-tested on x86_64 using all languages, with > > > > > > > > unix/ and > > > > > > > > unix/-m32 testing. > > > > > > > > > > > > > > > > I'll post the patches in reply to this email. > > > > > > > > > > > > > > > This patch postpones expanding va_arg until pass_stdarg. > > > > > > > > > > > > > > We add a new internal function IFN_VA_ARG. During gimplification, > > > > > > > we > > > > > > > map > > > > > > > VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then > > > > > > > gimplified > > > > > > > in to a > > > > > > > gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call > > > > > > > into > > > > > > > actual > > > > > > > code. > > > > > > > > > > > > > > There are a few implementation details worth mentioning: > > > > > > > - passing the type beyond gimplification is done by adding a NULL > > > > > > > pointer- > > > > > > >to-type to IFN_VA_ARG. > > > > > > > - there is special handling for IFN_VA_ARG that would be most > > > > > > > suited > > > > > > > to be > > > > > > >placed in gimplify_va_arg_expr. However, that function lacks > > > > > > > the > > > > > > > scope for > > > > > > >the special handling, so it's placed awkwardly in > > > > > > > gimplify_modify_expr. > > > > > > > - there's special handling in case the va_arg type is > > > > > > > variable-sized. > > > > > > >gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR > > > > > > > IFN_VA_ARG for > > > > > > >variable-sized types. However, this is gimplified into a > > > > > > > gimple_call which > > > > > > >does not have the possibility to wrap it's result in a > > > > > > > WITH_SIZE_EXPR. 
So > > > > > > >we're adding the size argument of the WITH_SIZE_EXPR as > > > > > > > argument to > > > > > > >IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of > > > > > > > the > > > > > > >gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the > > > > > > > subsequent > > > > > > >gimplify_assign will generate a memcpy if necessary. > > > > > > > - when gimplifying the va_arg argument ap, it may not be > > > > > > > addressable. > > > > > > > So > > > > > > >gimplification will generate a copy ap.1 = ap, and use &ap.1 as > > > > > > > argument. > > > > > > >This means that we have to copy back the ap.1 value to ap after > > > > > > > IFN_VA_ARG. > > > > > > >The copy is classified by the va_list_gpr/fpr_size optimization > > > > > > > as > > > > > > > an > > > > > > >escape, so it inhibits optimization. The tree-ssa/stdarg-2.c > > > > > > > f15 > > > > > > > update is > > > > > > >because of that. > > > > > > > > > > > > > > OK for stage1? > > > > > > Looks mostly good, though it looks like with -O0 this doesn't delay > > > > > > lowering of va-arg and thus won't "fix" offloading. Can you instead > > > > > > introduce a PROP_gimple_lva, provide it by the stdarg pass and add > > > > > > a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run > > > > > > that runs of !PROP_gimple_lva (and also provides it), and require > > > > > > PROP_gimple_lva by pass_expand? (just look for PROP_gimple_lcx for > > > >
Re: [PING^2][PATCH][3/3][PR65460] Mark offloaded functions as parallelized
On Mon, 8 Jun 2015, Tom de Vries wrote: > On 17/04/15 12:08, Tom de Vries wrote: > > On 20-03-15 12:38, Tom de Vries wrote: > > > On 19-03-15 12:05, Tom de Vries wrote: > > > > On 18-03-15 18:22, Tom de Vries wrote: > > > > > Hi, > > > > > > > > > > this patch fixes PR65460. > > > > > > > > > > The patch marks offloaded functions as parallelized, which means the > > > > > parloops > > > > > pass no longer attempts to modify that function. > > > > > > > > Updated patch to postpone mark_parallelized_function until the > > > > corresponding > > > > cgraph_node is available, to ensure it works with the updated > > > > mark_parallelized_function from patch 2/3. > > > > > > > > > > Updated to eliminate mark_parallelized_function. > > > > > > Bootstrapped and reg-tested on x86_64. > > > > > > OK for stage4? > > > > > > > ping. > > ping^2. Original post at > https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01063.html . Ok, but shouldn't it be set before calling add_new_function as add_new_function might run passes that wouldn't identify the function as parallelized? Richard. > > > > OK for stage1? > > > > Thanks, > - Tom > > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH] Optimize (CST1 << A) == CST2 (PR tree-optimization/66299)
On Tue, 9 Jun 2015, Richard Biener wrote:

>> Tweaking it so that (6<<X)==0 is turned into X>=31 for TYPE_OVERFLOW_WRAPS
>> and false for TYPE_OVERFLOW_UNDEFINED is probably more controversial.
>
> Hm, yes.  I think signed overflow != shift amount overflow, so testing the
> overflow macros for this isn't valid.

Would it be ok to always turn it to X>=31 then? (the value 31 is
conveniently already computed in 'cand')

--
Marc Glisse
Re: conditional lim
On Fri, May 29, 2015 at 3:14 PM, Evgeniya Maenkova wrote: > Hi Richard, > > Here is some explanation. I hope you let me know if I need to clarify > something. > > Also, you asked me about concrete example, to make sure you don’t miss > my answer here is the link: > https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02417.html. > > Also, I doubt whether it’s convenient for you to create a build with > my patch or not. May be to clarify things you could send me some > examples/concrete cases, then I’ll compile them with > –fdump-tree-loopinit-details and –fdump-tree-lim-details and send you > these dumps. May be these dumps will be useful. (I’ll only disable > cleanup_cfg TODO after lim to let you know the exact picture after > lim). > > What do you think? > > 1. invariantness _dom_walker – > > 1.1 for each GIMPLE_COND in given bb calls handle_cond_stmt to call > for true and false edges handle_branch_edge, which calls SET_TARGET_OF > for all bb ‘predicated’ by given GIMPLE_COND. > > SET_TARGET_OF sets in basic_blocks aux 2 facts: > > a) this is true or false edge; > > b) link to cond stmt; > > Handle_branch_edge works this way: > > If (cond1) > > { > > bb1; > > if (cond2} > >{ > >bb2; > > } > >Being called for cond1, it sets cond1 as condition for both bb1 and > bb2 (the whole branch for cond1, ie also for bb containing cond2), > then this method will be called (as there is dominance order) for > cond2 to correct things (ie to set cond2 as condition for bb2). Hmm, why not track the current condition as state during the DOM walk and thus avoid processing more than one basic-block in handle_branch_edge? Thus get rid of handle_branch_edge and instead do everything in handle_cond_stmt plus the dom-walkers BB visitor? I see you don't handle BBs with multiple predecessors - that's ok, but are you sure you don't run into correctness issues when not marking such BBs as predicated? This misses handling of, say if (a || b) bb; which is a pity (but can be fixed later if desired). I note that collecting predicates has similarities to what if-conversion does in tree-ifcvt.c (even if its implementation is completely different, of course). > 1.2 As 1.1 goes we identify whether some bb is predicated by some > condition or not. > > bb->aux->type will be [TRUE/FALSE]_TARGET_OF and > bb->aux->cond_stmt=cond stmt (the nearest condition). > > If bb is always executed bb->aux->type = ALWAYS_EXECUTED_IN, > bb->loop->loop (this info was available in the clean build). > > 1.3 As this walker is called in dominance order, information about > condition is available when invariantness_dom_walker is called for > given bb. So we can make some computations based on bb->aux > structure. This is done in check_conditionally_executed. The main goal > of this method is to check that the root condition is always executed > in the loop. I did so to avoid situation like this > > Loop: > >Jmp somewhere; > > If (cond1) > > If (cond2) > > Etc > > By ‘root condition’ I mean cond1 in this case (the first cond in > dominance order). Ok, but this can't happen (an uncontional jump to somehwere). It will always be a conditional jump (a loop exit) and thus should be handled already. Did you have a testcase that was mishandled? > If ‘root condition’ is not always executed we disable (free) all the > info in bb->aux, ie all the blocks becomes neither conditionally nor > always executed according to bb->aux. > > There is some optimization to don’t go each time trough the conditions > chain (cond2->cond1), let me skip such details for now. 
> > > > 1.4 Then we calculate tgt_level (which is used in move_computations later) > > The patch is the same for phi and regular stmt (calculate_tgt_level) > > 1) If stmt is conditionally executed we compute max movement > (determine_max_movement) starting from > get_lim_data(condition)->max_loop. > > 2) If stmt is not cond executed as start level for > determine_max_movement computations we choose ALWAYS_EXECUTED_IN. > > To clarify why, see the comment > > /* The reason why we don't set start_level for MOVE_POSSIBLE > > statements to more outer level is: this statement could depend on > > conditional statements and even probably is not defined without this > > condition. So either we should analyze these ones and move > > under condition somehow or keep more strong start_level . */ > > As you noted in your review there was some refactoring. Of course, I > had no goal to refactor existing code, I intended to remove some > duplication which I caused by my changes. I hope we discuss in > details later if you don’t like some refactoring. Refactoring makes a big patch harder to review because it ends up containing no-op changes. It also makes it appear bigger than it really is ;) A reasonable solution is to factor out the refactoring to a separate patch. > 2. store_
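Since the concrete example is only referenced by link above, here is a small self-contained function of the shape this work targets (purely illustrative):

/* The division below is loop invariant but only executed under a
   condition, so plain LIM cannot hoist it without possibly introducing
   a trap; the approach described above records the guarding condition
   (in bb->aux) so the computation can be moved together with, or
   under, that condition.  */
int
f (const int *a, int n, int x, int y)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    {
      if (x != 0)	/* condition recorded for the guarded block */
	s += y / x;	/* invariant, but only safe under x != 0 */
      s += a[i];
    }
  return s;
}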
Re: [PATCH] Optimize (CST1 << A) == CST2 (PR tree-optimization/66299)
On Tue, 9 Jun 2015, Marc Glisse wrote:

> On Tue, 9 Jun 2015, Richard Biener wrote:
>
> > > Tweaking it so that (6<<X)==0 is turned into X>=31 for TYPE_OVERFLOW_WRAPS
> > > and false for TYPE_OVERFLOW_UNDEFINED is probably more controversial.
> >
> > Hm, yes.  I think signed overflow != shift amount overflow, so testing the
> > overflow macros for this isn't valid.
>
> Would it be ok to always turn it to X>=31 then? (the value 31 is conveniently
> already computed in 'cand')

I think so.

Richard.
Re: [PATCH] Optimize (CST1 << A) == CST2 (PR tree-optimization/66299)
On Tue, 9 Jun 2015, Richard Biener wrote:

> On Tue, 9 Jun 2015, Marc Glisse wrote:
>
> > On Tue, 9 Jun 2015, Richard Biener wrote:
> >
> > > > Tweaking it so that (6<<X)==0 is turned into X>=31 for TYPE_OVERFLOW_WRAPS
> > > > and false for TYPE_OVERFLOW_UNDEFINED is probably more controversial.
> > >
> > > Hm, yes.  I think signed overflow != shift amount overflow, so testing the
> > > overflow macros for this isn't valid.
> >
> > Would it be ok to always turn it to X>=31 then? (the value 31 is
> > conveniently already computed in 'cand')
>
> I think so.

Or even ((unsigned)X - 31) < 1 (I probably got that wrong) to properly
say X>=29 && X<32, that is, preserve the implicit upper bound on X we
have because it is used in a shift.

Richard.

> Richard.

--
Richard Biener
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu,
Graham Norton, HRB 21284 (AG Nuernberg)
[PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
Hi! As mentioned in the PR, when trying to split: (insn 7 15 13 2 (set (reg:DI 0 ax [92]) (mem:DI (plus:SI (plus:SI (mult:SI (reg/v:SI 1 dx [orig:89 b ] [89]) (const_int 8 [0x8])) (unspec:SI [ (const_int 0 [0]) ] UNSPEC_TP)) (reg:SI 0 ax [91])) [1 a S8 A64])) rh1212265.i:2 85 {*movdi_internal} (nil)) which has collisions == 2 (both ax and dx used on lhs and both ax and dx used in the memory address), we generate invalid insn - lea with %gs: or %fs: in it. This patch fixes it by using normal lea instead (so remove the unspec UNSPEC_TP from the address for lea) and duplicating the unspec UNSPEC_TP to all the memory loads. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/5/4.9/4.8? 2015-06-09 Jakub Jelinek PR target/66470 * config/i386/i386.c (ix86_split_long_move): For collisions involving direct tls segment refs, move the UNSPEC_TP out of the address for lea, to each of the memory loads. * gcc.dg/tls/pr66470.c: New test. --- gcc/config/i386/i386.c.jj 2015-06-08 15:41:19.0 +0200 +++ gcc/config/i386/i386.c 2015-06-09 11:50:29.960859723 +0200 @@ -22866,7 +22866,7 @@ ix86_split_long_move (rtx operands[]) Do an lea to the last part and use only one colliding move. */ else if (collisions > 1) { - rtx base; + rtx base, addr, tls_base = NULL_RTX; collisions = 1; @@ -22877,10 +22877,45 @@ ix86_split_long_move (rtx operands[]) if (GET_MODE (base) != Pmode) base = gen_rtx_REG (Pmode, REGNO (base)); - emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0))); + addr = XEXP (part[1][0], 0); + if (TARGET_TLS_DIRECT_SEG_REFS) + { + struct ix86_address parts; + int ok = ix86_decompose_address (addr, &parts); + gcc_assert (ok); + if (parts.seg == DEFAULT_TLS_SEG_REG) + { + /* It is not valid to use %gs: or %fs: in +lea though, so we need to remove it from the +address used for lea and add it to each individual +memory loads instead. */ + addr = copy_rtx (addr); + rtx *x = &addr; + while (GET_CODE (*x) == PLUS) + { + for (i = 0; i < 2; i++) + if (GET_CODE (XEXP (*x, i)) == UNSPEC + && XINT (XEXP (*x, i), 1) == UNSPEC_TP) + { + tls_base = XEXP (*x, i); + *x = XEXP (*x, 1 - i); + break; + } + if (tls_base) + break; + x = &XEXP (*x, 0); + } + gcc_assert (tls_base); + } + } + emit_insn (gen_rtx_SET (base, addr)); + if (tls_base) + base = gen_rtx_PLUS (GET_MODE (base), base, tls_base); part[1][0] = replace_equiv_address (part[1][0], base); for (i = 1; i < nparts; i++) { + if (tls_base) + base = copy_rtx (base); tmp = plus_constant (Pmode, base, UNITS_PER_WORD * i); part[1][i] = replace_equiv_address (part[1][i], tmp); } --- gcc/testsuite/gcc.dg/tls/pr66470.c.jj 2015-06-09 11:59:05.543954781 +0200 +++ gcc/testsuite/gcc.dg/tls/pr66470.c 2015-06-09 11:58:43.0 +0200 @@ -0,0 +1,29 @@ +/* PR target/66470 */ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-require-effective-target tls } */ + +extern __thread unsigned long long a[10]; +extern __thread struct S { int a, b; } b[10]; + +unsigned long long +foo (long x) +{ + return a[x]; +} + +struct S +bar (long x) +{ + return b[x]; +} + +#ifdef __SIZEOF_INT128__ +extern __thread unsigned __int128 c[10]; + +unsigned __int128 +baz (long x) +{ + return c[x]; +} +#endif Jakub
Re: [PATCH] Optimize (CST1 << A) == CST2 (PR tree-optimization/66299)
On Tue, 9 Jun 2015, Richard Biener wrote: On Tue, 9 Jun 2015, Richard Biener wrote: On Tue, 9 Jun 2015, Marc Glisse wrote: On Tue, 9 Jun 2015, Richard Biener wrote: Tweaking it so that (6<<X)==0 becomes X>=31 for TYPE_OVERFLOW_WRAPS and false for TYPE_OVERFLOW_UNDEFINED is probably more controversial. Hm, yes. I think signed overflow != shift amount overflow, so testing the overflow macros for this isn't valid. Would it be ok to always turn it to X>=31 then? (the value 31 is conveniently already computed in 'cand') I think so. Or even ((unsigned)X - 31) < 1 (I probably got that wrong) to properly say X>=29 && X<32, that is, preserve the implicit upper bound on X we have because it is used in a shift. I don't understand in what sense this preserves the upper bound. I would understand storing a range for X (when it is an SSA_NAME, and it would require a lot of care not to propagate backwards too far), or more simply introducing if(X>=32) __builtin_unreachable();. But you seem to be talking about generating more complicated code so that if someone checks (6<<123)==0 it returns false? -- Marc Glisse
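A small sketch of the alternative Marc mentions, i.e. recording the known bound on a shift count as a range hint instead of rewriting the comparison (illustrative only; the function and constants are made up):

unsigned
foo (unsigned x)
{
  /* Tell the optimizers x is a valid shift count for a 32-bit type;
     value-range propagation can then rely on x < 32 downstream.  */
  if (x >= 32)
    __builtin_unreachable ();
  return (6u << x) == 0 ? 1u : 0u;
}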
Re: [PATCH] Simple optimization for MASK_STORE.
On Wed, May 20, 2015 at 4:00 PM, Yuri Rumyantsev wrote: > Hi All, > > Here is updated patch to optimize mask stores. The main goal of it is > to avoid execution of mask store if its mask is zero vector since > loads that follow it can be blocked. > The following changes were done: > 1. A test on sink legality was added - it simply prohibits to cross > statements with non-null vdef or vuse. > 2. New phi node is created in join block for moved MASK_STORE statements. > 3. Test was changed to check that 2 MASK_STORE statements are not > moved to the same block. > 4. New field was added to loop_vec_info structure to mark loops > having MASK_STORE's. > > Any comments will be appreciated. > Yuri. I still don't understand why you need the new target hook. If we have a masked load/store then the mask is computed by an assignment with a VEC_COND_EXPR (in your example) and thus a test for a zero mask is expressible as just if (vect__ifc__41.17_167 == { 0, 0, 0, 0... }) or am I missing something? As we dont' want this transform on all targets (or always) we can control it by either a --param or a new flag which default targets can adjust. [Is the hazard still present with Skylake or other AVX512 implementations for example?] +/* Helper for optimize_mask_stores: returns true if STMT sinking to end + of BB is valid and false otherwise. */ + +static bool +is_valid_sink (gimple stmt) +{ so STMT is either a masked store or a masked load? You shouldn't walk all stmts here but instead rely on the factored use-def def-use chains of virtual operands. +void +optimize_mask_stores (struct loop *loop) +{ + basic_block bb = loop->header; + gimple_stmt_iterator gsi; + gimple stmt; + auto_vec worklist; + + if (loop->dont_vectorize + || loop->num_nodes > 2) +return; looks like no longer needed given the place this is called from now (or does it guard against outer loop vectorization as well?) Also put it into tree-vect-loop.c and make it static. + /* Loop was not vectorized if mask does not have vector type. */ + if (!VECTOR_TYPE_P (TREE_TYPE (mask))) + return; this should be always false + store_bb->frequency = bb->frequency / 2; + efalse->probability = REG_BR_PROB_BASE / 2; I think the == 0 case should end up as unlikely. + if (dom_info_available_p (CDI_POST_DOMINATORS)) + set_immediate_dominator (CDI_POST_DOMINATORS, bb, store_bb); post dominators are not available in the vectorizer. + /* Put definition statement of stored value in STORE_BB +if possible. */ + arg3 = gimple_call_arg (last, 3); + if (TREE_CODE (arg3) == SSA_NAME && has_single_use (arg3)) + { + def_stmt = SSA_NAME_DEF_STMT (arg3); + /* Move def_stmt to STORE_BB if it is in the same bb and +it is legal. */ + if (gimple_bb (def_stmt) == bb + && is_valid_sink (def_stmt)) ok, so you move arbitrary statements. You can always move non-memory statements and if you keep track of the last store you moved you can check if gimple_vuse () is the same as on that last store and be done with that, or if you sink another store (under the same condition) then just update the last store. Otherwise this looks better now. Thus - get rid of the target hook and of is_valid_sink. Thanks, Richard. > 2015-05-20 Yuri Rumyantsev > > * config/i386/i386.c: Include files stringpool.h and tree-ssanames.h. > (ix86_vectorize_is_zero_vector): New function. > (TARGET_VECTORIZE_IS_ZERO_VECTOR): New target macro > * doc/tm.texi.in: Add @hook TARGET_VECTORIZE_IS_ZERO_VECTOR. > * doc/tm.texi: Updated. > * target.def (is_zero_vector): New DEFHOOK. 
> * tree-vect-stmts.c : Include tree-into-ssa.h. > (vectorizable_mask_load_store): Initialize has_mask_store field. > (is_valid_sink): New function. > (optimize_mask_stores): New function. > * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for > loops having masked stores. > * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and > correspondent macros. > (optimize_mask_stores): Update prototype. > > gcc/testsuite/ChangeLog: > * gcc.target/i386/avx2-vect-mask-store-move1.c: New test.
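For reference, a source-level sketch of the kind of loop this work targets (illustrative only, not the new testcase): after if-conversion and vectorization the guarded store becomes a MASK_STORE, and the proposed transform skips it - together with the loads that can be blocked behind it - whenever the whole mask is zero.

void
foo (int *restrict p, const int *restrict q, int n)
{
  for (int i = 0; i < n; i++)
    if (q[i] > 0)        /* becomes the vector comparison mask */
      p[i] = q[i] + 1;   /* becomes a MASK_STORE guarded by mask != {0,...} */
}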
Re: [PATCH] Optimize (CST1 << A) == CST2 (PR tree-optimization/66299)
On Tue, 9 Jun 2015, Marc Glisse wrote: > On Tue, 9 Jun 2015, Richard Biener wrote: > > > On Tue, 9 Jun 2015, Richard Biener wrote: > > > > > On Tue, 9 Jun 2015, Marc Glisse wrote: > > > > > > > On Tue, 9 Jun 2015, Richard Biener wrote: > > > > > > > > > > Tweaking it so that (6<<X)==0 becomes X>=31 for TYPE_OVERFLOW_WRAPS > > > > > > and > > > > > > false for TYPE_OVERFLOW_UNDEFINED is probably more controversial. > > > > > > > > > > Hm, yes. I think signed overflow != shift amount overflow, so testing > > > > > the > > > > > overflow macros for this isn't valid. > > > > > > > > Would it be ok to always turn it to X>=31 then? (the value 31 is > > > > conveniently > > > > already computed in 'cand') > > > > > > I think so. > > > > Or even ((unsigned)X - 31) < 1 (I probably got that wrong) to properly > > say X>=29 && X<32, that is, preserve the implicit upper bound on X > > we have because it is used in a shift. > > I don't understand in what sense this preserves the upper bound. I would > understand storing a range for X (when it is an SSA_NAME, and it would require > a lot of care not to propagate backwards too far), or more simply introducing > if(X>=32) __builtin_unreachable();. But you seem to be talking about > generating more complicated code so that if someone checks (6<<123)==0 it > returns false? Well, I'm mixing simplifying the computation and preserving extra info we got from the complex computation. So yes, you are right. Richard.
[PATCH] Fix PR66423
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2015-06-09 Richard Biener PR middle-end/66423 * match.pd: Handle A % (unsigned)(1 << B). * gcc.dg/fold-modpow2.c: New testcase. Index: gcc/match.pd === *** gcc/match.pd(revision 224271) --- gcc/match.pd(working copy) *** (define_operator_list swapped_tcc_compar *** 248,258 (lshift INTEGER_CST@1 @2)) (for mod (trunc_mod floor_mod) (simplify ! (mod @0 (power_of_two_cand@1 @2)) (if ((TYPE_UNSIGNED (type) || tree_expr_nonnegative_p (@0)) && integer_pow2p (@2) && tree_int_cst_sgn (@2) > 0) !(bit_and @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }) /* X % Y is smaller than Y. */ (for cmp (lt ge) --- 248,259 (lshift INTEGER_CST@1 @2)) (for mod (trunc_mod floor_mod) (simplify ! (mod @0 (convert?@3 (power_of_two_cand@1 @2))) (if ((TYPE_UNSIGNED (type) || tree_expr_nonnegative_p (@0)) + && tree_nop_conversion_p (type, TREE_TYPE (@3)) && integer_pow2p (@2) && tree_int_cst_sgn (@2) > 0) !(bit_and @0 (convert (minus @1 { build_int_cst (TREE_TYPE (@1), 1); })) /* X % Y is smaller than Y. */ (for cmp (lt ge) Index: gcc/testsuite/gcc.dg/fold-modpow2.c === *** gcc/testsuite/gcc.dg/fold-modpow2.c (revision 0) --- gcc/testsuite/gcc.dg/fold-modpow2.c (working copy) *** *** 0 --- 1,11 + /* { dg-do compile } */ + /* { dg-options "-fdump-tree-original" } */ + + unsigned int + my_mod (unsigned int a, unsigned int b) + { + return a % (1 << b); + } + + /* The above should be simplified to (unsigned int) ((1 << b) + -1) & a */ + /* { dg-final { scan-tree-dump "& a;" "original" } } */
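A quick standalone check of the identity behind the fold (plain C, illustrative; it assumes an unsigned left operand and a power-of-two divisor, as the pattern requires):

#include <assert.h>

int
main (void)
{
  for (unsigned b = 0; b < 16; b++)
    for (unsigned a = 0; a < 4096; a++)
      assert (a % (1u << b) == (a & ((1u << b) - 1)));
  return 0;
}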
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 9, 2015 at 1:57 PM, Jakub Jelinek wrote: > Hi! > > As mentioned in the PR, when trying to split: > (insn 7 15 13 2 (set (reg:DI 0 ax [92]) > (mem:DI (plus:SI (plus:SI (mult:SI (reg/v:SI 1 dx [orig:89 b ] [89]) > (const_int 8 [0x8])) > (unspec:SI [ > (const_int 0 [0]) > ] UNSPEC_TP)) > (reg:SI 0 ax [91])) [1 a S8 A64])) rh1212265.i:2 85 > {*movdi_internal} > (nil)) > which has collisions == 2 (both ax and dx used on lhs and both > ax and dx used in the memory address), we generate invalid insn > - lea with %gs: or %fs: in it. This patch fixes it by using > normal lea instead (so remove the unspec UNSPEC_TP from the address > for lea) and duplicating the unspec UNSPEC_TP to all the memory loads. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for > trunk/5/4.9/4.8? > > 2015-06-09 Jakub Jelinek > > PR target/66470 > * config/i386/i386.c (ix86_split_long_move): For collisions > involving direct tls segment refs, move the UNSPEC_TP out of > the address for lea, to each of the memory loads. > > * gcc.dg/tls/pr66470.c: New test. > > --- gcc/config/i386/i386.c.jj 2015-06-08 15:41:19.0 +0200 > +++ gcc/config/i386/i386.c 2015-06-09 11:50:29.960859723 +0200 > @@ -22866,7 +22866,7 @@ ix86_split_long_move (rtx operands[]) > Do an lea to the last part and use only one colliding move. */ >else if (collisions > 1) > { > - rtx base; > + rtx base, addr, tls_base = NULL_RTX; > > collisions = 1; > > @@ -22877,10 +22877,45 @@ ix86_split_long_move (rtx operands[]) > if (GET_MODE (base) != Pmode) > base = gen_rtx_REG (Pmode, REGNO (base)); > > - emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0))); > + addr = XEXP (part[1][0], 0); > + if (TARGET_TLS_DIRECT_SEG_REFS) > + { > + struct ix86_address parts; > + int ok = ix86_decompose_address (addr, &parts); > + gcc_assert (ok); > + if (parts.seg == DEFAULT_TLS_SEG_REG) > + { > + /* It is not valid to use %gs: or %fs: in > +lea though, so we need to remove it from the > +address used for lea and add it to each individual > +memory loads instead. */ > + addr = copy_rtx (addr); > + rtx *x = &addr; > + while (GET_CODE (*x) == PLUS) Why not use RTX iterators here? IMO, it would be much more readable. Uros. > + { > + for (i = 0; i < 2; i++) > + if (GET_CODE (XEXP (*x, i)) == UNSPEC > + && XINT (XEXP (*x, i), 1) == UNSPEC_TP) > + { > + tls_base = XEXP (*x, i); > + *x = XEXP (*x, 1 - i); > + break; > + } > + if (tls_base) > + break; > + x = &XEXP (*x, 0); > + } > + gcc_assert (tls_base); > + } > + } > + emit_insn (gen_rtx_SET (base, addr)); > + if (tls_base) > + base = gen_rtx_PLUS (GET_MODE (base), base, tls_base); > part[1][0] = replace_equiv_address (part[1][0], base); > for (i = 1; i < nparts; i++) > { > + if (tls_base) > + base = copy_rtx (base); > tmp = plus_constant (Pmode, base, UNITS_PER_WORD * i); > part[1][i] = replace_equiv_address (part[1][i], tmp); > } > --- gcc/testsuite/gcc.dg/tls/pr66470.c.jj 2015-06-09 11:59:05.543954781 > +0200 > +++ gcc/testsuite/gcc.dg/tls/pr66470.c 2015-06-09 11:58:43.0 +0200 > @@ -0,0 +1,29 @@ > +/* PR target/66470 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > +/* { dg-require-effective-target tls } */ > + > +extern __thread unsigned long long a[10]; > +extern __thread struct S { int a, b; } b[10]; > + > +unsigned long long > +foo (long x) > +{ > + return a[x]; > +} > + > +struct S > +bar (long x) > +{ > + return b[x]; > +} > + > +#ifdef __SIZEOF_INT128__ > +extern __thread unsigned __int128 c[10]; > + > +unsigned __int128 > +baz (long x) > +{ > + return c[x]; > +} > +#endif > > Jakub
Re: [patch] Adjust gcc-plugin.h
On 06/09/2015 03:56 AM, Richard Biener wrote: On Mon, Jun 8, 2015 at 7:52 PM, Andrew MacLeod wrote: the gcc source files need to see the internal bits in plugin.h, as well as the common decls in gcc-plugin.h. So we could change the includes as you suggest, but we'd have to copy all the decls from gcc-inlcude.h to plugin.h so the gcc functions can see them. And then the plugins would be exposed to all the internal APIs and decls present in plugins.h plugins are exposed to all internals of GCC anyway. gcc-plugin.h should really just be a #include kitchen-sink. Im presuming we didnt want to do that and thats why there were 2 files to start with. No, gcc-plugin.h was introduced to make the set of includes required for plugins "stable". Richard. I didn't actually realize that.. at least its serving its true purpose now :-) So hows this. Bootstrapping and testruns proceeding now, but it should have the same results as the other patch. Assuming no regressions, OK? Andrew * gcc-plugin.h: Move decls to plugin.h and include it. * plugin.h: Relocate decls from gcc-plugin.h * ggc-page.c: Include required header files. * passes.c: Likewise. * cgraphunit.c: Likewise. Index: gcc-plugin.h === *** gcc-plugin.h (revision 224250) --- gcc-plugin.h (working copy) *** along with GCC; see the file COPYING3. *** 27,33 #include "config.h" #include "system.h" #include "coretypes.h" - #include "highlev-plugin-common.h" #include "tm.h" #include "hard-reg-set.h" #include "input.h" --- 27,32 *** along with GCC; see the file COPYING3. *** 49,186 #include "tree-core.h" #include "fold-const.h" #include "tree-check.h" ! ! /* Event names. */ ! enum plugin_event ! { ! # define DEFEVENT(NAME) NAME, ! # include "plugin.def" ! # undef DEFEVENT ! PLUGIN_EVENT_FIRST_DYNAMIC ! }; ! ! /* All globals declared here have C linkage to reduce link compatibility !issues with implementation language choice and mangling. */ ! #ifdef __cplusplus ! extern "C" { ! #endif ! ! extern const char **plugin_event_name; ! ! struct plugin_argument ! { ! char *key;/* key of the argument. */ ! char *value; /* value is optional and can be NULL. */ ! }; ! ! /* Additional information about the plugin. Used by --help and --version. */ ! ! struct plugin_info ! { ! const char *version; ! const char *help; ! }; ! ! /* Represents the gcc version. Used to avoid using an incompatible plugin. */ ! ! struct plugin_gcc_version ! { ! const char *basever; ! const char *datestamp; ! const char *devphase; ! const char *revision; ! const char *configuration_arguments; ! }; ! ! /* Object that keeps track of the plugin name and its arguments. */ ! struct plugin_name_args ! { ! char *base_name; /* Short name of the plugin (filename without !.so suffix). */ ! const char *full_name;/* Path to the plugin as specified with !-fplugin=. */ ! int argc; /* Number of arguments specified with !-fplugin-arg-... */ ! struct plugin_argument *argv; /* Array of ARGC key-value pairs. */ ! const char *version; /* Version string provided by plugin. */ ! const char *help; /* Help string provided by plugin. */ ! }; ! ! /* The default version check. Compares every field in VERSION. */ ! ! extern bool plugin_default_version_check (struct plugin_gcc_version *, ! struct plugin_gcc_version *); ! ! /* Function type for the plugin initialization routine. Each plugin module !should define this as an externally-visible function with name !"plugin_init." ! !PLUGIN_INFO - plugin invocation information. !VERSION - the plugin_gcc_version symbol of GCC. ! !Returns 0 if initialization finishes successfully. 
*/ ! ! typedef int (*plugin_init_func) (struct plugin_name_args *plugin_info, ! struct plugin_gcc_version *version); ! ! /* Declaration for "plugin_init" function so that it doesn't need to be !duplicated in every plugin. */ ! extern int plugin_init (struct plugin_name_args *plugin_info, ! struct plugin_gcc_version *version); ! ! /* Function type for a plugin callback routine. ! !GCC_DATA - event-specific data provided by GCC !USER_DATA - plugin-specific data provided by the plugin */ ! ! typedef void (*plugin_callback_func) (void *gcc_data, void *user_data); ! ! /* Called from the plugin's initialization code. Register a single callback. !This function can be called multiple times. ! !PLUGIN_NAME - display name for this plugin !EVENT - which event the callback is for !CALLBACK- the callback to be called at the event !USER_DATA - plugin-provided data. ! */ ! ! /* Number of event ids / names registered
Re: [patch] Adjust gcc-plugin.h
On Tue, Jun 9, 2015 at 2:32 PM, Andrew MacLeod wrote: > On 06/09/2015 03:56 AM, Richard Biener wrote: >> >> On Mon, Jun 8, 2015 at 7:52 PM, Andrew MacLeod >> wrote: >>> >>> >>> the gcc source files need to see the internal bits in plugin.h, as well >>> as >>> the common decls in gcc-plugin.h. So we could change the includes as you >>> suggest, but we'd have to copy all the decls from gcc-inlcude.h to >>> plugin.h >>> so the gcc functions can see them. And then the plugins would be exposed >>> to >>> all the internal APIs and decls present in plugins.h >> >> plugins are exposed to all internals of GCC anyway. gcc-plugin.h should >> really >> just be a #include kitchen-sink. >> >> >>> Im presuming >>> we didnt want to do that and thats why there were 2 files to start with. >> >> No, gcc-plugin.h was introduced to make the set of includes required >> for plugins "stable". >> >> Richard. >> >> > I didn't actually realize that.. at least its serving its true purpose now > :-) > > So hows this. Bootstrapping and testruns proceeding now, but it should have > the same results as the other patch. > Assuming no regressions, OK? Ok. Richard. > Andrew > > >
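For context, the usual minimal plugin skeleton that relies on gcc-plugin.h being the single stable include (a sketch only; callbacks and error handling omitted):

#include "gcc-plugin.h"
#include "plugin-version.h"

int plugin_is_GPL_compatible;

int
plugin_init (struct plugin_name_args *plugin_info,
             struct plugin_gcc_version *version)
{
  /* Refuse to load into a compiler we were not built against.  */
  if (!plugin_default_version_check (version, &gcc_version))
    return 1;
  return 0;
}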
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 09, 2015 at 02:32:07PM +0200, Uros Bizjak wrote: > > - emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0))); > > + addr = XEXP (part[1][0], 0); > > + if (TARGET_TLS_DIRECT_SEG_REFS) > > + { > > + struct ix86_address parts; > > + int ok = ix86_decompose_address (addr, &parts); > > + gcc_assert (ok); > > + if (parts.seg == DEFAULT_TLS_SEG_REG) > > + { > > + /* It is not valid to use %gs: or %fs: in > > +lea though, so we need to remove it from the > > +address used for lea and add it to each individual > > +memory loads instead. */ > > + addr = copy_rtx (addr); > > + rtx *x = &addr; > > + while (GET_CODE (*x) == PLUS) > > Why not use RTX iterators here? IMO, it would be much more readable. Do you mean something like this? It is larger and don't see readability advantages there at all (plus the 4.8/4.9 backports can't use that anyway). But if you prefer it, I can retest it. I still need to look for a PLUS and scan the individual operands of it, because the desired change is that the PLUS is replaced with its operand (one that isn't UNSPEC_TP). And getting rid of the ix86_decompose_address would mean either unconditionally performing (wasteful) copy_rtx, or two RTX iterators cycles. The PLUS it is looking for has to be a toplevel PLUS or PLUS in XEXP (x, 0) of that (recursively), otherwise ix86_decompose_address wouldn't recognize it. 2015-06-09 Jakub Jelinek PR target/66470 * config/i386/i386.c (ix86_split_long_move): For collisions involving direct tls segment refs, move the UNSPEC_TP out of the address for lea, to each of the memory loads. * gcc.dg/tls/pr66470.c: New test. --- gcc/config/i386/i386.c.jj 2015-06-08 15:41:19.0 +0200 +++ gcc/config/i386/i386.c 2015-06-09 14:42:18.357849227 +0200 @@ -22866,7 +22866,7 @@ ix86_split_long_move (rtx operands[]) Do an lea to the last part and use only one colliding move. */ else if (collisions > 1) { - rtx base; + rtx base, addr, tls_base = NULL_RTX; collisions = 1; @@ -22877,10 +22877,48 @@ ix86_split_long_move (rtx operands[]) if (GET_MODE (base) != Pmode) base = gen_rtx_REG (Pmode, REGNO (base)); - emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0))); + addr = XEXP (part[1][0], 0); + if (TARGET_TLS_DIRECT_SEG_REFS) + { + struct ix86_address parts; + int ok = ix86_decompose_address (addr, &parts); + gcc_assert (ok); + if (parts.seg == DEFAULT_TLS_SEG_REG) + { + /* It is not valid to use %gs: or %fs: in +lea though, so we need to remove it from the +address used for lea and add it to each individual +memory loads instead. 
*/ + addr = copy_rtx (addr); + subrtx_ptr_iterator::array_type array; + FOR_EACH_SUBRTX_PTR (iter, array, &addr, NONCONST) + { + rtx *x = *iter; + if (GET_CODE (*x) == PLUS) + { + for (i = 0; i < 2; i++) + if (GET_CODE (XEXP (*x, i)) == UNSPEC + && XINT (XEXP (*x, i), 1) == UNSPEC_TP) + { + tls_base = XEXP (*x, i); + *x = XEXP (*x, 1 - i); + break; + } + if (tls_base) + break; + } + } + gcc_assert (tls_base); + } + } + emit_insn (gen_rtx_SET (base, addr)); + if (tls_base) + base = gen_rtx_PLUS (GET_MODE (base), base, tls_base); part[1][0] = replace_equiv_address (part[1][0], base); for (i = 1; i < nparts; i++) { + if (tls_base) + base = copy_rtx (base); tmp = plus_constant (Pmode, base, UNITS_PER_WORD * i); part[1][i] = replace_equiv_address (part[1][i], tmp); } --- gcc/testsuite/gcc.dg/tls/pr66470.c.jj 2015-06-09 11:59:05.543954781 +0200 +++ gcc/testsuite/gcc.dg/tls/pr66470.c 2015-06-09 11:58:43.0 +0200 @@ -0,0 +1,29 @@ +/* PR target/66470 */ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-require-effective-target tls } */ + +extern __thread unsigned long long a[10]; +extern __thread struct S { int a, b; } b[10]; + +unsigned long long +foo (long x) +{ + return a[x]; +} + +struct S +bar (long x) +{ + return b[x]; +} + +#ifdef __SIZEOF_INT128__ +extern __thread unsigned __int128 c
Re: [PING^2] [PATCH][5a/5] Postpone expanding va_arg until pass_stdarg
On 09/06/15 13:03, Richard Biener wrote: On Tue, 9 Jun 2015, Alan Lawrence wrote: Hmmm. One side effect of this is that the line number information available in the target hook gimplify_va_arg_expr, is now just the name of the containing function, rather than the specific use of va_arg. Is there some way to get this more precise location (e.g. gimple_location(stmt) in expand_ifn_va_arg_1, the only caller of said hook)? I don't really want to have to add an extra parameter to the target hook... The x86 variant doesn't use any locations but if then the caller of the target hook (expand_ifn_va_arg_1) should assign the IFNs location to all statements expanded from it (it could set input_location to that during the target hook call...) That seems to work. The scan-assembler-not test in the testcase in attached patch: - fails without the expand_ifn_va_arg_1 patch hunk, and - passes with that hunk. I'll put it through bootstrap and reg-test on x86_64. OK for trunk if that goes well? Thanks, - Tom Handle location in expand_ifn_va_arg_1 2015-06-09 Tom de Vries * tree-stdarg.c (expand_ifn_va_arg_1): Handle location. * gcc.target/i386/vararg-loc.c: New test. --- gcc/testsuite/gcc.target/i386/vararg-loc.c | 27 +++ gcc/tree-stdarg.c | 4 2 files changed, 31 insertions(+) create mode 100644 gcc/testsuite/gcc.target/i386/vararg-loc.c diff --git a/gcc/testsuite/gcc.target/i386/vararg-loc.c b/gcc/testsuite/gcc.target/i386/vararg-loc.c new file mode 100644 index 000..f236fe3 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/vararg-loc.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-g -O0" } */ + +#include + +int/* 6. */ +/* 7. */ +f (int a, ...) /* 8. */ +/* 9. */ +{ + + int sum = a; + + va_list ap; + + va_start (ap, a); + + sum += va_arg (ap, int); /* 18. */ + + sum += va_arg (ap, int); /* 20. */ + + return sum; +} + +/* { dg-final { scan-assembler-not "\\.loc 1 \[6789\] 0" } } */ +/* { dg-final { scan-assembler-times "\\.loc 1 18 0" 1 } } */ +/* { dg-final { scan-assembler-times "\\.loc 1 20 0" 1 } } */ diff --git a/gcc/tree-stdarg.c b/gcc/tree-stdarg.c index 08d10b5..65fe9f9 100644 --- a/gcc/tree-stdarg.c +++ b/gcc/tree-stdarg.c @@ -1031,6 +1031,7 @@ expand_ifn_va_arg_1 (function *fun) bool modified = false; basic_block bb; gimple_stmt_iterator i; + location_t saved_location; FOR_EACH_BB_FN (bb, fun) for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i)) @@ -1051,6 +1052,8 @@ expand_ifn_va_arg_1 (function *fun) ap = build_fold_indirect_ref (ap); push_gimplify_context (false); + saved_location = input_location; + input_location = gimple_location (stmt); /* Make it easier for the backends by protecting the valist argument from multiple evaluations. */ @@ -1081,6 +1084,7 @@ expand_ifn_va_arg_1 (function *fun) else gimplify_expr (&expr, &pre, &post, is_gimple_lvalue, fb_lvalue); + input_location = saved_location; pop_gimplify_context (NULL); gimple_seq_add_seq (&pre, post); -- 1.9.1
Re: [PING^2] [PATCH][5a/5] Postpone expanding va_arg until pass_stdarg
On Tue, 9 Jun 2015, Tom de Vries wrote: > On 09/06/15 13:03, Richard Biener wrote: > > On Tue, 9 Jun 2015, Alan Lawrence wrote: > > > > > Hmmm. One side effect of this is that the line number information > > > available in > > > the target hook gimplify_va_arg_expr, is now just the name of the > > > containing > > > function, rather than the specific use of va_arg. Is there some way to get > > > this more precise location (e.g. gimple_location(stmt) in > > > expand_ifn_va_arg_1, > > > the only caller of said hook)? I don't really want to have to add an extra > > > parameter to the target hook... > > > > The x86 variant doesn't use any locations but if then the caller of > > the target hook (expand_ifn_va_arg_1) should assign the IFNs location > > to all statements expanded from it (it could set input_location to > > that during the target hook call...) > > > > That seems to work. > > The scan-assembler-not test in the testcase in attached patch: > - fails without the expand_ifn_va_arg_1 patch hunk, and > - passes with that hunk. > > I'll put it through bootstrap and reg-test on x86_64. > > OK for trunk if that goes well? Ok. Thanks, Richard. > Thanks, > - Tom > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
[Patch testsuite obvious] g++.dg/ext/pr57735.C should not run if the testsuite is explicitly passing -mfloat-abi=hard
Hi, g++.dg/ext/pr57735.C is failing for test runs which explicitly pass -mfloat-abi=hard. Looking at the test, it seems the best fix would be to check before adding -mfloat-abi=soft that we are not testing some other float-abi. We also fail to check that it is OK to add -march=armv5te and -marm. Fixed using the same mechanisms we use elsewhere in the gcc.target/arm/ tests with the attached, applied as obvious as revision 224280. Thanks, James --- gcc/testsuite/ 2015-06-09 James Greenhalgh * g++.dg/ext/pr57735.C: Do not override -mfloat-abi directives passed by the testsuite driver. diff --git a/gcc/testsuite/g++.dg/ext/pr57735.C b/gcc/testsuite/g++.dg/ext/pr57735.C index 0eb9500..a8f7d05 100644 --- a/gcc/testsuite/g++.dg/ext/pr57735.C +++ b/gcc/testsuite/g++.dg/ext/pr57735.C @@ -1,4 +1,7 @@ /* { dg-do compile { target arm*-*-* } } */ +/* { dg-require-effective-target arm_arch_v5te_ok } */ +/* { dg-require-effective-target arm_arm_ok } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } {"-mfloat-abi=soft" } } */ /* { dg-options "-march=armv5te -marm -mtune=xscale -mfloat-abi=soft -O1" } */ typedef unsigned int size_t;
[PATCH] Fix PR bootstrap/66471
Hello. Following patch renames an enum values so that they do not clash with a MinGW reserved keyword. Patch can bootstrap on MinGW and there's no regression seen on x86_64-linux-unknown-gnu. Ready for trunk? Thanks, Martin >From 740811e0af76af3e1e2f39f149c35bd49ef5fed3 Mon Sep 17 00:00:00 2001 From: mliska Date: Tue, 9 Jun 2015 13:27:53 +0200 Subject: [PATCH] Fix BITMAP identifier clash. gcc/ChangeLog: 2015-06-09 Martin Liska PR bootstrap/66471 * mem-stats-traits.h (enum mem_alloc_origin): Add _ORIGIN suffix for all enum values in mem_alloc_origin. * alloc-pool.c (dump_alloc_pool_statistics): Use newly changed enum name. * alloc-pool.h (pool_allocator::pool_allocator): Likewise. * bitmap.c (bitmap_register): Likewise. (dump_bitmap_statistics): Likewise. * ggc-common.c (dump_ggc_loc_statistics): Likewise. (ggc_record_overhead): Likewise. * hash-map.h: Likewise. * hash-set.h: Likewise. * hash-table.c (void dump_hash_table_loc_statistics): Likewise. * hash-table.h: Likewise. * vec.c (vec_prefix::register_overhead): Likewise. (vec_prefix::release_overhead): Likewise. (dump_vec_loc_statistics): Likewise. --- gcc/alloc-pool.c | 2 +- gcc/alloc-pool.h | 2 +- gcc/bitmap.c | 5 +++-- gcc/ggc-common.c | 4 ++-- gcc/hash-map.h | 2 +- gcc/hash-set.h | 2 +- gcc/hash-table.c | 2 +- gcc/hash-table.h | 4 ++-- gcc/mem-stats-traits.h | 14 +++--- gcc/vec.c | 8 +--- 10 files changed, 24 insertions(+), 21 deletions(-) diff --git a/gcc/alloc-pool.c b/gcc/alloc-pool.c index bdf184a..7e25915 100644 --- a/gcc/alloc-pool.c +++ b/gcc/alloc-pool.c @@ -33,5 +33,5 @@ dump_alloc_pool_statistics (void) if (! GATHER_STATISTICS) return; - pool_allocator_usage.dump (ALLOC_POOL); + pool_allocator_usage.dump (ALLOC_POOL_ORIGIN); } diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h index af3adde..1785df5 100644 --- a/gcc/alloc-pool.h +++ b/gcc/alloc-pool.h @@ -232,7 +232,7 @@ pool_allocator::pool_allocator (const char *name, size_t num, m_elts_free (0), m_blocks_allocated (0), m_block_list (NULL), m_block_size (0), m_ignore_type_size (ignore_type_size), m_extra_size (extra_size), m_initialized (false), - m_location (ALLOC_POOL, false PASS_MEM_STAT) {} + m_location (ALLOC_POOL_ORIGIN, false PASS_MEM_STAT) {} /* Initialize a pool allocator. */ diff --git a/gcc/bitmap.c b/gcc/bitmap.c index d600ace..733c767 100644 --- a/gcc/bitmap.c +++ b/gcc/bitmap.c @@ -30,7 +30,8 @@ mem_alloc_description bitmap_mem_desc; void bitmap_register (bitmap b MEM_STAT_DECL) { - bitmap_mem_desc.register_descriptor (b, BITMAP, false FINAL_PASS_MEM_STAT); + bitmap_mem_desc.register_descriptor (b, BITMAP_ORIGIN, false + FINAL_PASS_MEM_STAT); } /* Account the overhead. */ @@ -2076,7 +2077,7 @@ dump_bitmap_statistics (void) if (! GATHER_STATISTICS) return; - bitmap_mem_desc.dump (BITMAP); + bitmap_mem_desc.dump (BITMAP_ORIGIN); } DEBUG_FUNCTION void diff --git a/gcc/ggc-common.c b/gcc/ggc-common.c index 6021c2a..60d427f 100644 --- a/gcc/ggc-common.c +++ b/gcc/ggc-common.c @@ -977,7 +977,7 @@ dump_ggc_loc_statistics (bool final) ggc_force_collect = true; ggc_collect (); - ggc_mem_desc.dump (GGC, final ? ggc_usage::compare_final : NULL); + ggc_mem_desc.dump (GGC_ORIGIN, final ? 
ggc_usage::compare_final : NULL); ggc_force_collect = false; } @@ -986,7 +986,7 @@ dump_ggc_loc_statistics (bool final) void ggc_record_overhead (size_t allocated, size_t overhead, void *ptr MEM_STAT_DECL) { - ggc_usage *usage = ggc_mem_desc.register_descriptor (ptr, GGC, false + ggc_usage *usage = ggc_mem_desc.register_descriptor (ptr, GGC_ORIGIN, false FINAL_PASS_MEM_STAT); ggc_mem_desc.register_object_overhead (usage, allocated + overhead, ptr); diff --git a/gcc/hash-map.h b/gcc/hash-map.h index f28f74f..2f1bca4 100644 --- a/gcc/hash-map.h +++ b/gcc/hash-map.h @@ -107,7 +107,7 @@ class GTY((user)) hash_map public: explicit hash_map (size_t n = 13, bool ggc = false, bool gather_mem_stats = true CXX_MEM_STAT_INFO) -: m_table (n, ggc, gather_mem_stats, HASH_MAP PASS_MEM_STAT) {} +: m_table (n, ggc, gather_mem_stats, HASH_MAP_ORIGIN PASS_MEM_STAT) {} /* Create a hash_map in ggc memory. */ static hash_map *create_ggc (size_t size, bool gather_mem_stats = true diff --git a/gcc/hash-set.h b/gcc/hash-set.h index af91e31..3ec0b15 100644 --- a/gcc/hash-set.h +++ b/gcc/hash-set.h @@ -179,7 +179,7 @@ class hash_set public: explicit hash_set (size_t n = 13, bool ggc = false CXX_MEM_STAT_INFO) -: m_table (n, ggc, true, HASH_SET PASS_MEM_STAT) {} +: m_table (n, ggc, true, HASH_SET_ORIGIN PASS_MEM_STAT) {} /* Create a hash_set in gc memory with space for at least n elements. */ diff --git a/gcc/hash-table.c b/gcc/hash-table.c index 012b241..a42b884 100644 --- a/gcc/hash-table.c +++ b/gcc/hash-table.c @@ -103,7 +103,7 @@ mem_alloc_descrip
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 9, 2015 at 2:57 PM, Jakub Jelinek wrote: > On Tue, Jun 09, 2015 at 02:32:07PM +0200, Uros Bizjak wrote: >> > - emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0))); >> > + addr = XEXP (part[1][0], 0); >> > + if (TARGET_TLS_DIRECT_SEG_REFS) >> > + { >> > + struct ix86_address parts; >> > + int ok = ix86_decompose_address (addr, &parts); >> > + gcc_assert (ok); >> > + if (parts.seg == DEFAULT_TLS_SEG_REG) >> > + { >> > + /* It is not valid to use %gs: or %fs: in >> > +lea though, so we need to remove it from the >> > +address used for lea and add it to each individual >> > +memory loads instead. */ >> > + addr = copy_rtx (addr); >> > + rtx *x = &addr; >> > + while (GET_CODE (*x) == PLUS) >> >> Why not use RTX iterators here? IMO, it would be much more readable. > > Do you mean something like this? It is larger and don't see readability > advantages there at all (plus the 4.8/4.9 backports can't use that anyway). > But if you prefer it, I can retest it. I'm afraid that simple scan loop won't work correctly on x32. There are some issues with UNSPEC_TP for this target, so we have to generate zero_extend of SImode UNSPEC, e.g.: (plus:DI (zero_extend:DI (unspec:SI [...] UNSPEC_TP) (reg:DI ...)) as can be seen in get_thread_pointer to construct the address. It looks that your loop won't find the UNSPEC_TP tag in the above case. Uros.
Re: [PATCH] Yet another simple fix to enhance outer-loop vectorization.
On Mon, Jun 8, 2015 at 12:27 PM, Yuri Rumyantsev wrote: > Hi All, > > Here is a simple fix which allows duplication of outer loops to > perform peeling for number of iterations if outer loop is marked with > pragma omp simd. > > Bootstrap and regression testing did not show any new failures. > Is it OK for trunk? Hmm, I don't remember needing to adjust rename_variables_in_bb when teaching loop distibution to call slpeel_tree_duplicate_to_edge_cfg on non-innermost loops... (I just copied, I never called slpeel_can_duplicate_loop_p though). So - you should just remove the loop->inner condition from slpeel_can_duplicate_loop_p as it is used by non-vectorizer code as well (yeah, I never merged the nested loop support for loop distribution...). Index: tree-vect-loop.c === --- tree-vect-loop.c(revision 224100) +++ tree-vect-loop.c(working copy) @@ -1879,6 +1879,10 @@ return false; } + /* Peeling for alignment is not supported for outer-loop vectorization. */ + if (LOOP_VINFO_LOOP (loop_vinfo)->inner) +LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) = 0; I think you can't simply do this - if vect_enhance_data_refs_alignment decided to peel for alignment then it has adjusted the DRs alignment info already. So instead of the above simply disallow peeling for alignment in vect_enhance_data_refs_alignment? Thus add || ->inner to /* Check if we can possibly peel the loop. */ if (!vect_can_advance_ivs_p (loop_vinfo) || !slpeel_can_duplicate_loop_p (loop, single_exit (loop))) do_peeling = false; ? I also can't see why the improvement has to be gated on force_vect, it surely looks profitable to enable more outer loop vectorization in general, no? How do the cost model calculations end up with peeling the outer loop for niter? On targets which don't support unaligned accesses we're left with versioning for alignment. Isn't peeling for alignment better there? Thus only disallow peeling for alignment if there is no unhandled alignment? Thanks, Richard. > ChangeLog: > > 2015-06-08 Yuri Rumyantsev > > * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument > to allow renaming of PHI arguments on edges incoming from outer > loop header, add corresponding check before start PHI iterator. > (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool > variable DUPLICATE_OUTER_LOOP and set it to true for outer loops > with true force_vectorize. Set-up dominator for outer loop too. > Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb. > (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it > was marked with force_vectorize and has restricted cfg. > * tre-vect-loop.c (vect_analyze_loop_2): Prohibit alignment peeling > for outer loops. > > gcc/testsuite/ChangeLog: > * gcc.dg/vect/vect-outer-simd-2.c: New test.
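An illustrative shape of the loop nests affected (a made-up example, not the new vect-outer-simd-2.c testcase): the simd pragma sits on the outer loop, so force_vectorize is set there, and peeling for the number of iterations has to duplicate the outer loop rather than the inner one.

void
foo (float *restrict a, const float *restrict b, int n, int m)
{
#pragma omp simd
  for (int i = 0; i < n; i++)
    {
      float s = 0.f;
      for (int j = 0; j < m; j++)   /* inner loop stays scalar per lane */
        s += b[i * m + j];
      a[i] = s;
    }
}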
Re: [PATCH] Fix PR bootstrap/66471
On Tue, Jun 9, 2015 at 3:11 PM, Martin Liška wrote: > Hello. > > Following patch renames an enum values so that they do not clash with a MinGW > reserved keyword. Reserved keyword as in a #define? Thus simply #undef it in system.h? > Patch can bootstrap on MinGW and there's no regression seen on > x86_64-linux-unknown-gnu. Does c++04 have scoped enums? Thus can you use mem_alloc_origin::BITMAP? Thanks, Richard. > Ready for trunk? > Thanks, > Martin
Re: [PING^2] [PATCH][5a/5] Postpone expanding va_arg until pass_stdarg
Tom de Vries wrote: On 09/06/15 13:03, Richard Biener wrote: On Tue, 9 Jun 2015, Alan Lawrence wrote: Hmmm. One side effect of this is that the line number information available in the target hook gimplify_va_arg_expr, is now just the name of the containing function, rather than the specific use of va_arg. Is there some way to get this more precise location (e.g. gimple_location(stmt) in expand_ifn_va_arg_1, the only caller of said hook)? I don't really want to have to add an extra parameter to the target hook... The x86 variant doesn't use any locations but if then the caller of the target hook (expand_ifn_va_arg_1) should assign the IFNs location to all statements expanded from it (it could set input_location to that during the target hook call...) That seems to work. The scan-assembler-not test in the testcase in attached patch: - fails without the expand_ifn_va_arg_1 patch hunk, and - passes with that hunk. I'll put it through bootstrap and reg-test on x86_64. OK for trunk if that goes well? Thanks, - Tom That fixes the issue for me (I'm working with on a patch which uses the line info in error output) - thank you very much! Cheers, Alan
Re: [PATCH] Fix PR bootstrap/66471
Hi, On Tue, Jun 9, 2015 at 4:28 PM, Richard Biener wrote: > On Tue, Jun 9, 2015 at 3:11 PM, Martin Liška wrote: >> Hello. >> >> Following patch renames an enum values so that they do not clash with a >> MinGW reserved keyword. > > Reserved keyword as in a #define? Thus simply #undef it in system.h? No, wingdi.h has the following typedef struct tagBITMAP { LONG bmType; LONG bmWidth; LONG bmHeight; LONG bmWidthBytes; WORD bmPlanes; WORD bmBitsPixel; LPVOID bmBits; } BITMAP,*PBITMAP,*NPBITMAP,*LPBITMAP;
Re: [PATCH] Fix PR bootstrap/66471
On Tue, Jun 9, 2015 at 3:30 PM, İsmail Dönmez wrote: > Hi, > > On Tue, Jun 9, 2015 at 4:28 PM, Richard Biener > wrote: >> On Tue, Jun 9, 2015 at 3:11 PM, Martin Liška wrote: >>> Hello. >>> >>> Following patch renames an enum values so that they do not clash with a >>> MinGW reserved keyword. >> >> Reserved keyword as in a #define? Thus simply #undef it in system.h? > > No, wingdi.h has the following > > typedef struct tagBITMAP { > LONG bmType; > LONG bmWidth; > LONG bmHeight; > LONG bmWidthBytes; > WORD bmPlanes; > WORD bmBitsPixel; > LPVOID bmBits; > } BITMAP,*PBITMAP,*NPBITMAP,*LPBITMAP; Hmm. I see. Patch is ok. Thanks, Richard.
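A reduced illustration of the clash (hypothetical, with the real headers stripped down): once wingdi.h's typedef is visible, a plain enumerator of the same name is rejected, which is exactly what the _ORIGIN suffix sidesteps.

typedef struct tagBITMAP { long bmType; } BITMAP;     /* as in wingdi.h */

/* enum mem_alloc_origin { BITMAP, GGC };      error: 'BITMAP' redeclared */
enum mem_alloc_origin { BITMAP_ORIGIN, GGC_ORIGIN };  /* builds fine */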
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 09, 2015 at 03:21:55PM +0200, Uros Bizjak wrote: > I'm afraid that simple scan loop won't work correctly on x32. There > are some issues with UNSPEC_TP for this target, so we have to generate > zero_extend of SImode UNSPEC, e.g.: > > (plus:DI (zero_extend:DI (unspec:SI [...] UNSPEC_TP) (reg:DI ...)) > > as can be seen in get_thread_pointer to construct the address. It > looks that your loop won't find the UNSPEC_TP tag in the above case. You're right, for -m32 it would need to start with rtx *x = &addr; + while (GET_CODE (*x) == ZERO_EXTEND +|| GET_CODE (*x) == AND +|| GET_CODE (*x) == SUBREG) +x = &XEXP (*x, 0); to get at the PLUS. Now, with either the original patch with the above ammendment, or with the iterators, the question is what to do with the UNSPEC_TP for -m32. Is it ok to just use addr32 on the lea and use normal Pmode (== DImode) addressing for the memory reads (if I read the code well, that is what it does right now for the non-TLS ones: if (GET_MODE (base) != Pmode) base = gen_rtx_REG (Pmode, REGNO (base)); )? Then we'd need to change tls_base mode to Pmode. Jakub
Re: [PATCH] Optimize (CST1 << A) == CST2 (PR tree-optimization/66299)
On Tue, Jun 09, 2015 at 09:53:21AM +0200, Richard Biener wrote: > >> +/* (CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1) > >> + (CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1) > >> + if CST2 != 0. */ > >> +(for cmp (ne eq) > >> + (simplify > >> + (cmp (lshift INTEGER_CST@0 @1) INTEGER_CST@2) > >> + (with { > >> + unsigned int cand = wi::ctz (@2) - wi::ctz (@0); } > >> + (if (!integer_zerop (@2) > > > > > > You can probably use directly wi::ne_p (@2, 0) here. Shouldn't this be > > indented one space more? > > Yes, one space more. I suppose using integer_zerop might in theory > allow for handling vector shifts at some point ...? Fixed the formatting. But I kept the integer_zerop. > >> + && wi::eq_p (wi::lshift (@0, cand), @2)) > >> + (cmp @1 { build_int_cst (TREE_TYPE (@1), cand); }) > > > > > > Making 'cand' signed, you could return 0 when cand<0, like (2< > could also return 0 when the candidate turns out not to work: (3< > Sounds like a good improvement. Yeah, it makes sense to do that while I'm on this. Done, and a new test added. > Otherwise the patch looks ok to me as well - mind doing the improvement above? Thank you both. How does this look now? Bootstrapped/regtested on x86_64-linux, ok for trunk? 2015-06-09 Marek Polacek PR tree-optimization/66299 * match.pd ((CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1) ((CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1)): New patterns. * gcc.dg/pr66299-1.c: New test. * gcc.dg/pr66299-2.c: New test. * gcc.dg/pr66299-3.c: New test. diff --git gcc/match.pd gcc/match.pd index abd7851..560b218 100644 --- gcc/match.pd +++ gcc/match.pd @@ -676,6 +676,21 @@ along with GCC; see the file COPYING3. If not see (cmp (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) (icmp @0 { build_zero_cst (TREE_TYPE (@0)); }))) +/* (CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1) + (CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1) + if CST2 != 0. */ +(for cmp (ne eq) + (simplify + (cmp (lshift INTEGER_CST@0 @1) INTEGER_CST@2) + (with { int cand = wi::ctz (@2) - wi::ctz (@0); } + (if (cand < 0 + || (!integer_zerop (@2) + && wi::ne_p (wi::lshift (@0, cand), @2))) +{ constant_boolean_node (cmp == NE_EXPR, type); }) + (if (!integer_zerop (@2) + && wi::eq_p (wi::lshift (@0, cand), @2)) +(cmp @1 { build_int_cst (TREE_TYPE (@1), cand); }) + /* Simplifications of conversions. */ /* Basic strip-useless-type-conversions / strip_nops. 
*/ diff --git gcc/testsuite/gcc.dg/pr66299-1.c gcc/testsuite/gcc.dg/pr66299-1.c index e69de29..e75146b 100644 --- gcc/testsuite/gcc.dg/pr66299-1.c +++ gcc/testsuite/gcc.dg/pr66299-1.c @@ -0,0 +1,92 @@ +/* PR tree-optimization/66299 */ +/* { dg-do run } */ +/* { dg-options "-fdump-tree-original" } */ + +void +test1 (int x) +{ + if ((0 << x) != 0 + || (1 << x) != 2 + || (2 << x) != 4 + || (3 << x) != 6 + || (4 << x) != 8 + || (5 << x) != 10 + || (6 << x) != 12 + || (7 << x) != 14 + || (8 << x) != 16 + || (9 << x) != 18 + || (10 << x) != 20) +__builtin_abort (); +} + +void +test2 (int x) +{ + if (!((0 << x) == 0 + && (1 << x) == 4 + && (2 << x) == 8 + && (3 << x) == 12 + && (4 << x) == 16 + && (5 << x) == 20 + && (6 << x) == 24 + && (7 << x) == 28 + && (8 << x) == 32 + && (9 << x) == 36 + && (10 << x) == 40)) +__builtin_abort (); +} + +void +test3 (unsigned int x) +{ + if ((0U << x) != 0U + || (1U << x) != 16U + || (2U << x) != 32U + || (3U << x) != 48U + || (4U << x) != 64U + || (5U << x) != 80U + || (6U << x) != 96U + || (7U << x) != 112U + || (8U << x) != 128U + || (9U << x) != 144U + || (10U << x) != 160U) +__builtin_abort (); +} + +void +test4 (unsigned int x) +{ + if (!((0U << x) == 0U + || (1U << x) == 8U + || (2U << x) == 16U + || (3U << x) == 24U + || (4U << x) == 32U + || (5U << x) == 40U + || (6U << x) == 48U + || (7U << x) == 56U + || (8U << x) == 64U + || (9U << x) == 72U + || (10U << x) == 80U)) +__builtin_abort (); +} + +void +test5 (int x) +{ + if ((0 << x) == 1 + || (0 << x) != 0 + || (0x8001U << x) != 0x2U) +__builtin_abort (); +} + +int +main (void) +{ + test1 (1); + test2 (2); + test3 (4U); + test4 (3U); + test5 (17); +} + +/* { dg-final { scan-tree-dump-not "<<" "original" } } */ diff --git gcc/testsuite/gcc.dg/pr66299-2.c gcc/testsuite/gcc.dg/pr66299-2.c index e69de29..45e9218 100644 --- gcc/testsuite/gcc.dg/pr66299-2.c +++ gcc/testsuite/gcc.dg/pr66299-2.c @@ -0,0 +1,33 @@ +/* PR tree-optimization/66299 */ +/* { dg-do run } */ +/* { dg-options "-fdump-tree-optimized -O" } */ + +void +test1 (int x, unsigned u) +{ + if ((1U << x) != 64 + || (2 << x) != u + || (x << x) != 384 + || (3 << x) ==
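A standalone sanity check of the two constant-fold cases added on top of the original pattern (plain C, illustrative): a negative candidate such as (2 << x) == 1, and a candidate that fails to reproduce the constant such as (3 << x) == 2, can never hold for a valid shift count and now fold to false directly.

#include <assert.h>

int
main (void)
{
  for (unsigned x = 0; x < 32; x++)
    {
      assert ((2u << x) != 1u);   /* ctz (1) - ctz (2) < 0 */
      assert ((3u << x) != 2u);   /* 3 << 1 is 6, not 2 */
    }
  return 0;
}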
Re: [C++/58583] ICE instantiating NSDMIs
On 06/08/15 13:47, Jason Merrill wrote: How about using a DECL_LANG_FLAG instead of creating a garbage DEFAULT_ARG? This patch uses DECL_LANG_FLAG_2 on the FIELD_DECL to mark that its NSDMI is being instantiated. I also discovered I'd flubbed the markup on the testcase, not sure why I didn't notice that in the test results. I've double checked this time, and all looks good. built & tested on x86_64-linux-gnu nathan 2015-06-09 Nathan Sidwell cp/ PR c++/58583 * cp-tree.h (DECL_INSTANTIATING_NSDMI_P): New. * init.c (get_nsdmi): Check for DEFAULT_ARG in template case and protect it from recursive instantiation. testsuite/ PR c++/58583 * g++.dg/cpp0x/nsdmi-template14.C: New test. Index: cp/cp-tree.h === --- cp/cp-tree.h (revision 224198) +++ cp/cp-tree.h (working copy) @@ -160,6 +160,7 @@ c-common.h, not after. LABEL_DECL_CONTINUE (in LABEL_DECL) 2: DECL_THIS_EXTERN (in VAR_DECL or FUNCTION_DECL). DECL_IMPLICIT_TYPEDEF_P (in a TYPE_DECL) + DECL_INSTANTIATING_NSDMI_P (in a FIELD_DECL) 3: DECL_IN_AGGR_P. 4: DECL_C_BIT_FIELD (in a FIELD_DECL) DECL_ANON_UNION_VAR_P (in a VAR_DECL) @@ -3785,6 +3786,11 @@ more_aggr_init_expr_args_p (const aggr_i #define DECL_ARRAY_PARAMETER_P(NODE) \ DECL_LANG_FLAG_1 (PARM_DECL_CHECK (NODE)) +/* Nonzero for a FIELD_DECL who's NSMDI is currently being + instantiated. */ +#define DECL_INSTANTIATING_NSDMI_P(NODE) \ + DECL_LANG_FLAG_2 (FIELD_DECL_CHECK (NODE)) + /* Nonzero for FIELD_DECL node means that this field is a base class of the parent object, as opposed to a member field. */ #define DECL_FIELD_IS_BASE(NODE) \ Index: cp/init.c === --- cp/init.c (revision 224198) +++ cp/init.c (working copy) @@ -544,6 +544,7 @@ get_nsdmi (tree member, bool in_ctor) tree init; tree save_ccp = current_class_ptr; tree save_ccr = current_class_ref; + if (!in_ctor) { /* Use a PLACEHOLDER_EXPR when we don't have a 'this' parameter to @@ -551,22 +552,40 @@ get_nsdmi (tree member, bool in_ctor) current_class_ref = build0 (PLACEHOLDER_EXPR, DECL_CONTEXT (member)); current_class_ptr = build_address (current_class_ref); } + if (DECL_LANG_SPECIFIC (member) && DECL_TEMPLATE_INFO (member)) { - /* Do deferred instantiation of the NSDMI. */ - init = (tsubst_copy_and_build - (DECL_INITIAL (DECL_TI_TEMPLATE (member)), - DECL_TI_ARGS (member), - tf_warning_or_error, member, /*function_p=*/false, - /*integral_constant_expression_p=*/false)); + init = DECL_INITIAL (DECL_TI_TEMPLATE (member)); + if (TREE_CODE (init) == DEFAULT_ARG) + goto unparsed; - init = digest_nsdmi_init (member, init); + /* Check recursive instantiation. */ + if (DECL_INSTANTIATING_NSDMI_P (member)) + { + error ("recursive instantiation of non-static data member " + "initializer for %qD", member); + init = error_mark_node; + } + else + { + DECL_INSTANTIATING_NSDMI_P (member) = 1; + + /* Do deferred instantiation of the NSDMI. 
*/ + init = (tsubst_copy_and_build + (init, DECL_TI_ARGS (member), + tf_warning_or_error, member, /*function_p=*/false, + /*integral_constant_expression_p=*/false)); + init = digest_nsdmi_init (member, init); + + DECL_INSTANTIATING_NSDMI_P (member) = 0; + } } else { init = DECL_INITIAL (member); if (init && TREE_CODE (init) == DEFAULT_ARG) { + unparsed: error ("constructor required before non-static data member " "for %qD has been parsed", member); DECL_INITIAL (member) = error_mark_node; Index: testsuite/g++.dg/cpp0x/nsdmi-template14.C === --- testsuite/g++.dg/cpp0x/nsdmi-template14.C (revision 0) +++ testsuite/g++.dg/cpp0x/nsdmi-template14.C (working copy) @@ -0,0 +1,22 @@ +// PR c++/58583 +// { dg-do compile { target c++11 } } + +template struct A // { dg-error "has been parsed" } +{ + int i = (A<0>(), 0); // { dg-error "has been parsed" } +}; + +template struct B +{ + B* p = new B; +}; + +B<1> x; // { dg-error "recursive instantiation of non-static data" } + +struct C +{ + template struct D + { +D* p = new D<0>; + }; +};
Re: [C++ Patch] PR 65815
OK, thanks. Jason
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 9, 2015 at 3:39 PM, Jakub Jelinek wrote: > On Tue, Jun 09, 2015 at 03:21:55PM +0200, Uros Bizjak wrote: >> I'm afraid that simple scan loop won't work correctly on x32. There >> are some issues with UNSPEC_TP for this target, so we have to generate >> zero_extend of SImode UNSPEC, e.g.: >> >> (plus:DI (zero_extend:DI (unspec:SI [...] UNSPEC_TP) (reg:DI ...)) >> >> as can be seen in get_thread_pointer to construct the address. It >> looks that your loop won't find the UNSPEC_TP tag in the above case. > > You're right, for -m32 it would need to start with >rtx *x = &addr; > + while (GET_CODE (*x) == ZERO_EXTEND > +|| GET_CODE (*x) == AND > +|| GET_CODE (*x) == SUBREG) > +x = &XEXP (*x, 0); > to get at the PLUS. Now, with either the original patch with the above > ammendment, or with the iterators, the question is what to do > with the UNSPEC_TP for -m32. Is it ok to just use addr32 on the > lea and use normal Pmode (== DImode) addressing for the memory reads > (if I read the code well, that is what it does right now for the non-TLS > ones: > if (GET_MODE (base) != Pmode) > base = gen_rtx_REG (Pmode, REGNO (base)); > )? Then we'd need to change tls_base mode to Pmode. (I assume you referred to -mx32 above. Please also note that -mx32 uses Pmode == SImode by default). Yes, please emit zero-extended address load to a DImode register, and use this register with UNSPEC_TP to form final DImode address. The LEA that will be generated from address load will use DImode inputs due to %H operand modifier and will output to (implicitly zero-extended) SImode register. FYI, you just hit two special cases here: - addresses that mention %fs and %gs are always in DImode - LEA input operands are also always in DImode, but compiler already takes care of it. Uros.
Re: [PATCH] Optimize (CST1 << A) == CST2 (PR tree-optimization/66299)
On Tue, 9 Jun 2015, Marek Polacek wrote: > On Tue, Jun 09, 2015 at 09:53:21AM +0200, Richard Biener wrote: > > >> +/* (CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1) > > >> + (CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1) > > >> + if CST2 != 0. */ > > >> +(for cmp (ne eq) > > >> + (simplify > > >> + (cmp (lshift INTEGER_CST@0 @1) INTEGER_CST@2) > > >> + (with { > > >> + unsigned int cand = wi::ctz (@2) - wi::ctz (@0); } > > >> + (if (!integer_zerop (@2) > > > > > > > > > You can probably use directly wi::ne_p (@2, 0) here. Shouldn't this be > > > indented one space more? > > > > Yes, one space more. I suppose using integer_zerop might in theory > > allow for handling vector shifts at some point ...? > > Fixed the formatting. But I kept the integer_zerop. > > > >> + && wi::eq_p (wi::lshift (@0, cand), @2)) > > >> + (cmp @1 { build_int_cst (TREE_TYPE (@1), cand); }) > > > > > > > > > Making 'cand' signed, you could return 0 when cand<0, like (2< > > could also return 0 when the candidate turns out not to work: (3< > > > Sounds like a good improvement. > > Yeah, it makes sense to do that while I'm on this. Done, and a new > test added. > > > Otherwise the patch looks ok to me as well - mind doing the improvement > > above? > > Thank you both. How does this look now? > > Bootstrapped/regtested on x86_64-linux, ok for trunk? Ok. Thanks, Richard. > 2015-06-09 Marek Polacek > > PR tree-optimization/66299 > * match.pd ((CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1) > ((CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1)): New > patterns. > > * gcc.dg/pr66299-1.c: New test. > * gcc.dg/pr66299-2.c: New test. > * gcc.dg/pr66299-3.c: New test. > > diff --git gcc/match.pd gcc/match.pd > index abd7851..560b218 100644 > --- gcc/match.pd > +++ gcc/match.pd > @@ -676,6 +676,21 @@ along with GCC; see the file COPYING3. If not see >(cmp (bit_and (lshift integer_onep @0) integer_onep) integer_zerop) >(icmp @0 { build_zero_cst (TREE_TYPE (@0)); }))) > > +/* (CST1 << A) == CST2 -> A == ctz (CST2) - ctz (CST1) > + (CST1 << A) != CST2 -> A != ctz (CST2) - ctz (CST1) > + if CST2 != 0. */ > +(for cmp (ne eq) > + (simplify > + (cmp (lshift INTEGER_CST@0 @1) INTEGER_CST@2) > + (with { int cand = wi::ctz (@2) - wi::ctz (@0); } > + (if (cand < 0 > + || (!integer_zerop (@2) > + && wi::ne_p (wi::lshift (@0, cand), @2))) > +{ constant_boolean_node (cmp == NE_EXPR, type); }) > + (if (!integer_zerop (@2) > + && wi::eq_p (wi::lshift (@0, cand), @2)) > +(cmp @1 { build_int_cst (TREE_TYPE (@1), cand); }) > + > /* Simplifications of conversions. */ > > /* Basic strip-useless-type-conversions / strip_nops. 
*/ > diff --git gcc/testsuite/gcc.dg/pr66299-1.c gcc/testsuite/gcc.dg/pr66299-1.c > index e69de29..e75146b 100644 > --- gcc/testsuite/gcc.dg/pr66299-1.c > +++ gcc/testsuite/gcc.dg/pr66299-1.c > @@ -0,0 +1,92 @@ > +/* PR tree-optimization/66299 */ > +/* { dg-do run } */ > +/* { dg-options "-fdump-tree-original" } */ > + > +void > +test1 (int x) > +{ > + if ((0 << x) != 0 > + || (1 << x) != 2 > + || (2 << x) != 4 > + || (3 << x) != 6 > + || (4 << x) != 8 > + || (5 << x) != 10 > + || (6 << x) != 12 > + || (7 << x) != 14 > + || (8 << x) != 16 > + || (9 << x) != 18 > + || (10 << x) != 20) > +__builtin_abort (); > +} > + > +void > +test2 (int x) > +{ > + if (!((0 << x) == 0 > + && (1 << x) == 4 > + && (2 << x) == 8 > + && (3 << x) == 12 > + && (4 << x) == 16 > + && (5 << x) == 20 > + && (6 << x) == 24 > + && (7 << x) == 28 > + && (8 << x) == 32 > + && (9 << x) == 36 > + && (10 << x) == 40)) > +__builtin_abort (); > +} > + > +void > +test3 (unsigned int x) > +{ > + if ((0U << x) != 0U > + || (1U << x) != 16U > + || (2U << x) != 32U > + || (3U << x) != 48U > + || (4U << x) != 64U > + || (5U << x) != 80U > + || (6U << x) != 96U > + || (7U << x) != 112U > + || (8U << x) != 128U > + || (9U << x) != 144U > + || (10U << x) != 160U) > +__builtin_abort (); > +} > + > +void > +test4 (unsigned int x) > +{ > + if (!((0U << x) == 0U > + || (1U << x) == 8U > + || (2U << x) == 16U > + || (3U << x) == 24U > + || (4U << x) == 32U > + || (5U << x) == 40U > + || (6U << x) == 48U > + || (7U << x) == 56U > + || (8U << x) == 64U > + || (9U << x) == 72U > + || (10U << x) == 80U)) > +__builtin_abort (); > +} > + > +void > +test5 (int x) > +{ > + if ((0 << x) == 1 > + || (0 << x) != 0 > + || (0x8001U << x) != 0x2U) > +__builtin_abort (); > +} > + > +int > +main (void) > +{ > + test1 (1); > + test2 (2); > + test3 (4U); > + test4 (3U); > + test5 (17); > +} > + > +/* { dg-final { scan-tree-dump-not "<<" "original" } } */ > diff --git gcc/testsuite/gcc.dg/pr66299-2.c gcc/testsuite/gcc.dg/pr66299-2.c > i
[PATCH] Remove restrictions on group gaps in SLP
The following patch does $subject by fixing the bogus removal of load-permutations. This means gaps are now handled by the permutation support (well, if the required permutation is supported). Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2015-06-09 Richard Biener * tree-vect-slp.c (vect_build_slp_tree_1): Remove bailout on gaps. (vect_analyze_slp_instance): Instead do not falsely drop load permutations. Index: gcc/tree-vect-slp.c === *** gcc/tree-vect-slp.c (revision 224271) --- gcc/tree-vect-slp.c (working copy) *** vect_build_slp_tree_1 (loop_vec_info loo *** 762,794 else { /* Load. */ - unsigned unrolling_factor - = least_common_multiple - (*max_nunits, group_size) / group_size; - /* FORNOW: Check that there is no gap between the loads -and no gap between the groups when we need to load -multiple groups at once. */ - if (unrolling_factor > 1 - && ((GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) == stmt - && GROUP_GAP (vinfo_for_stmt (stmt)) != 0) - /* If the group is split up then GROUP_GAP -isn't correct here, nor is GROUP_FIRST_ELEMENT. */ - || GROUP_SIZE (vinfo_for_stmt (stmt)) > group_size)) - { - if (dump_enabled_p ()) - { - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "Build SLP failed: grouped " - "loads have gaps "); - dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, - stmt, 0); - dump_printf (MSG_MISSED_OPTIMIZATION, "\n"); - } - /* Fatal mismatch. */ - matches[0] = false; - return false; - } - /* Check that the size of interleaved loads group is not greater than the SLP group size. */ unsigned ncopies --- 762,767 *** vect_analyze_slp_instance (loop_vec_info *** 1846,1852 this_load_permuted = true; load_permutation.safe_push (load_place); } ! if (!this_load_permuted) { load_permutation.release (); continue; --- 1834,1846 this_load_permuted = true; load_permutation.safe_push (load_place); } ! if (!this_load_permuted ! /* The load requires permutation when unrolling exposes !a gap either because the group is larger than the SLP !group-size or because there is a gap between the groups. */ ! && (unrolling_factor == 1 ! || (group_size == GROUP_SIZE (vinfo_for_stmt (first_stmt)) ! && GROUP_GAP (vinfo_for_stmt (first_stmt)) == 0))) { load_permutation.release (); continue;
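An illustrative example of a load group with a gap affected by this (made up, not from the testsuite): only two of the three structure fields are loaded, so the interleaved load group has a gap that is now expressed through the load permutation instead of rejecting SLP outright.

struct s { int a, b, c; };

void
foo (int *restrict q, const struct s *restrict p, int n)
{
  for (int i = 0; i < n; i++)
    {
      q[2 * i]     = p[i].a;
      q[2 * i + 1] = p[i].c;   /* p[i].b is never loaded - the gap */
    }
}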
Re: [PING^2][PATCH][3/3][PR65460] Mark offloaded functions as parallelized
On 09/06/15 13:07, Richard Biener wrote: On Mon, 8 Jun 2015, Tom de Vries wrote: On 17/04/15 12:08, Tom de Vries wrote: On 20-03-15 12:38, Tom de Vries wrote: On 19-03-15 12:05, Tom de Vries wrote: On 18-03-15 18:22, Tom de Vries wrote: Hi, this patch fixes PR65460. The patch marks offloaded functions as parallelized, which means the parloops pass no longer attempts to modify that function. Updated patch to postpone mark_parallelized_function until the corresponding cgraph_node is available, to ensure it works with the updated mark_parallelized_function from patch 2/3. Updated to eliminate mark_parallelized_function. Bootstrapped and reg-tested on x86_64. OK for stage4? ping. ping^2. Original post at https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01063.html . Ok, but shouldn't it be set before calling add_new_function as add_new_function might run passes that wouldn't identify the function as parallelized? Hm, indeed sometimes add_new_function executes some passes itself, besides queueing the function for further processing. I suppose the existing settings of parallelized_function should be modified in a similar way. I'll bootstrap and reg-test attached two patches on x86_64, and commit unless objections. Thanks, - Tom Mark function parallelized_function before add_new_function 2015-06-09 Tom de Vries * omp-low.c (finalize_task_copyfn, expand_omp_taskreg): Mark function parallelized_function before add_new_function. --- gcc/omp-low.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/gcc/omp-low.c b/gcc/omp-low.c index f322416..2045e48 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -1552,8 +1552,9 @@ finalize_task_copyfn (gomp_task *task_stmt) pop_cfun (); /* Inform the callgraph about the new function. */ + cgraph_node *node = cgraph_node::get_create (child_fn); + node->parallelized_function = 1; cgraph_node::add_new_function (child_fn, false); - cgraph_node::get (child_fn)->parallelized_function = 1; } /* Destroy a omp_context data structures. Called through the splay tree @@ -5589,8 +5590,9 @@ expand_omp_taskreg (struct omp_region *region) /* Inform the callgraph about the new function. */ DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties; + cgraph_node *node = cgraph_node::get_create (child_fn); + node->parallelized_function = 1; cgraph_node::add_new_function (child_fn, true); - cgraph_node::get (child_fn)->parallelized_function = 1; /* Fix the callgraph edges for child_cfun. Those for cfun will be fixed in a following pass. */ -- 1.9.1 Mark offloaded functions as parallelized 2015-06-09 Tom de Vries PR tree-optimization/65460 * omp-low.c (expand_omp_target): Set parallelized_function on cgraph_node for child_fn. --- gcc/omp-low.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 2045e48..77716bf6 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -8959,6 +8959,8 @@ expand_omp_target (struct omp_region *region) /* Inform the callgraph about the new function. */ DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties; + cgraph_node *node = cgraph_node::get_create (child_fn); + node->parallelized_function = 1; cgraph_node::add_new_function (child_fn, true); #ifdef ENABLE_OFFLOADING -- 1.9.1
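For context, a rough sketch (mine, not the PR65460 testcase; the name saxpy is invented) of the situation the parallelized_function bit guards against: compiled with -fopenmp -ftree-parallelize-loops=2, the body of the target region below is split off into a child function by expand_omp_target, and marking that child's cgraph_node as parallelized keeps the parloops pass from trying to auto-parallelize the offloaded function a second time.

void
saxpy (float *x, float *y, float a, int n)
{
  #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
  for (int i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}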
C++ PATCH for c++/66383 (aggregate NSDMI issue)
In this testcase we end up with the initializer for the A subobject referring to the address of the B PLACEHOLDER_EXPR, which wasn't getting replaced properly because when we see the A TARGET_EXPR we stopped looking for more PLACEHOLDER_EXPRs. Instead, we need to keep looking, and learn how to handle placeholders for enclosing objects as well as the innermost one. Tested x86_64-pc-linux-gnu, applying to trunk and 5. commit 26b958fac8f8c42ab0158583c6a362caf4a9ab73 Author: Jason Merrill Date: Fri Jun 5 17:07:23 2015 -0400 PR c++/66383 * tree.c (replace_placeholders_r): Handle placeholders for an outer object. * typeck2.c (store_init_value): Only replace_placeholders for objects of class type. diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c index 3c4b527..3553d7c 100644 --- a/gcc/cp/tree.c +++ b/gcc/cp/tree.c @@ -2552,15 +2552,15 @@ replace_placeholders_r (tree* t, int* walk_subtrees, void* data_) switch (TREE_CODE (*t)) { case PLACEHOLDER_EXPR: - gcc_assert (same_type_ignoring_top_level_qualifiers_p - (TREE_TYPE (*t), TREE_TYPE (obj))); - *t = obj; - *walk_subtrees = false; - break; - -case TARGET_EXPR: - /* Don't mess with placeholders in an unrelated object. */ - *walk_subtrees = false; + { + tree x = obj; + for (; !(same_type_ignoring_top_level_qualifiers_p + (TREE_TYPE (*t), TREE_TYPE (x))); + x = TREE_OPERAND (x, 0)) + gcc_assert (TREE_CODE (x) == COMPONENT_REF); + *t = x; + *walk_subtrees = false; + } break; case CONSTRUCTOR: diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c index affa265..22a5580 100644 --- a/gcc/cp/typeck2.c +++ b/gcc/cp/typeck2.c @@ -836,7 +836,7 @@ store_init_value (tree decl, tree init, vec** cleanups, int flags) TREE_CONSTANT (decl) = const_init && decl_maybe_constant_var_p (decl); } - if (cxx_dialect >= cxx14) + if (cxx_dialect >= cxx14 && CLASS_TYPE_P (strip_array_types (type))) /* Handle aggregate NSDMI in non-constant initializers, too. */ value = replace_placeholders (value, decl); diff --git a/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr3.C b/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr3.C new file mode 100644 index 000..185ea10 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr3.C @@ -0,0 +1,43 @@ +// PR c++/66383 +// { dg-do compile { target c++11 } } + +namespace N1 { + struct B; + + struct A + { +B* b; +A(B* b); + }; + + struct B + { +A a{ this }; + }; + + A::A(B* b): b{ b } {} + + void foo() + { +auto b = B{}; + } +} + +namespace N2 { + struct B; + + struct A + { +B* b; + }; + + struct B + { +A a{ this }; + }; + + void foo() + { +auto b = B{}; + } +}
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 9, 2015 at 3:39 PM, Jakub Jelinek wrote: > On Tue, Jun 09, 2015 at 03:21:55PM +0200, Uros Bizjak wrote: >> I'm afraid that simple scan loop won't work correctly on x32. There >> are some issues with UNSPEC_TP for this target, so we have to generate >> zero_extend of SImode UNSPEC, e.g.: >> >> (plus:DI (zero_extend:DI (unspec:SI [...] UNSPEC_TP) (reg:DI ...)) >> >> as can be seen in get_thread_pointer to construct the address. It >> looks that your loop won't find the UNSPEC_TP tag in the above case. > > You're right, for -m32 it would need to start with >rtx *x = &addr; > + while (GET_CODE (*x) == ZERO_EXTEND > +|| GET_CODE (*x) == AND > +|| GET_CODE (*x) == SUBREG) > +x = &XEXP (*x, 0); Oh, you can use SImode_address_operand predicate here. Uros.
[PATCH] Remove last permutation restriction for SLP vectorizing
This removes the restriction in place on reductions. To not regress gfortran.dg/vect/fast-math-pr37021.f90 the patch also implements SLP permutations for strided group loads. With those two pieces we can finally SLP vectorize the complex multiplication (which in the gfortran.dg/vect/fast-math-pr37021.f90 testcase uses variable stride). Code generated on x86_64 suffers from PR56766, combine not being able to use sse3_addsubv2df3 because the RTL doesn't use the expected vec_merge but (vec_select (vec_concat )) instead. Previously we vectorized this with unrolling and interleaving. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2015-06-09 Richard Biener * tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Split out from ... (vect_supported_load_permutation_p): ... here. Handle supportable permutations in reductions. * tree-vect-stmts.c (vectorizable_load): Handle SLP permutations for vectorizing strided group loads. Index: gcc/tree-vect-slp.c === *** gcc/tree-vect-slp.c (revision 224271) --- gcc/tree-vect-slp.c (working copy) *** vect_slp_rearrange_stmts (slp_tree node, *** 1326,1331 --- 1270,1336 } + /* Attempt to reorder stmts in a reduction chain so that we don't +require any load permutation. Return true if that was possible, +otherwise return false. */ + + static bool + vect_attempt_slp_rearrange_stmts (slp_instance slp_instn) + { + unsigned int group_size = SLP_INSTANCE_GROUP_SIZE (slp_instn); + unsigned int i, j; + sbitmap load_index; + unsigned int lidx; + slp_tree node, load; + + /* Compare all the permutation sequences to the first one. We know + that at least one load is permuted. */ + node = SLP_INSTANCE_LOADS (slp_instn)[0]; + if (!node->load_permutation.exists ()) + return false; + for (i = 1; SLP_INSTANCE_LOADS (slp_instn).iterate (i, &load); ++i) + { + if (!load->load_permutation.exists ()) + return false; + FOR_EACH_VEC_ELT (load->load_permutation, j, lidx) + if (lidx != node->load_permutation[j]) + return false; + } + + /* Check that the loads in the first sequence are different and there + are no gaps between them. */ + load_index = sbitmap_alloc (group_size); + bitmap_clear (load_index); + FOR_EACH_VEC_ELT (node->load_permutation, i, lidx) + { + if (bitmap_bit_p (load_index, lidx)) + { + sbitmap_free (load_index); + return false; + } + bitmap_set_bit (load_index, lidx); + } + for (i = 0; i < group_size; i++) + if (!bitmap_bit_p (load_index, i)) + { + sbitmap_free (load_index); + return false; + } + sbitmap_free (load_index); + + /* This permutation is valid for reduction. Since the order of the + statements in the nodes is not important unless they are memory + accesses, we can rearrange the statements in all the nodes + according to the order of the loads. */ + vect_slp_rearrange_stmts (SLP_INSTANCE_TREE (slp_instn), group_size, + node->load_permutation); + + /* We are done, no actual permutations need to be generated. */ + FOR_EACH_VEC_ELT (SLP_INSTANCE_LOADS (slp_instn), i, node) + SLP_TREE_LOAD_PERMUTATION (node).release (); + return true; + } + /* Check if the required load permutations in the SLP instance SLP_INSTN are supported. 
*/ *** vect_supported_load_permutation_p (slp_i *** 1334,1340 { unsigned int group_size = SLP_INSTANCE_GROUP_SIZE (slp_instn); unsigned int i, j, k, next; - sbitmap load_index; slp_tree node; gimple stmt, load, next_load, first_load; struct data_reference *dr; --- 1339,1344 *** vect_supported_load_permutation_p (slp_i *** 1369,1427 stmt = SLP_TREE_SCALAR_STMTS (node)[0]; /* Reduction (there are no data-refs in the root). ! In reduction chain the order of the loads is important. */ if (!STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt)) && !GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt))) { ! slp_tree load; ! unsigned int lidx; ! ! /* Compare all the permutation sequences to the first one. We know ! that at least one load is permuted. */ ! node = SLP_INSTANCE_LOADS (slp_instn)[0]; ! if (!node->load_permutation.exists ()) ! return false; ! for (i = 1; SLP_INSTANCE_LOADS (slp_instn).iterate (i, &load); ++i) ! { ! if (!load->load_permutation.exists ()) ! return false; ! FOR_EACH_VEC_ELT (load->load_permutation, j, lidx) ! if (lidx != node->load_permutation[j]) ! return false; ! } ! /* Check that the loads in the first sequence are different and there !are no gaps between them. */ ! load_index = sbitmap_alloc (group_size); !
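Since the Fortran testcase is not quoted above, here is a rough C analogue (my own sketch, not fast-math-pr37021.f90 itself; the name cmul is invented) of complex multiplication with a variable stride: interleaved {re, im} loads whose group is strided, which is exactly where the new SLP load permutations come in -- and where the addsub pattern of PR56766 would eventually help on x86_64.

void
cmul (double *a, double *b, double *c, int n, int stride)
{
  for (int i = 0; i < n; i++)
    {
      /* Each iteration loads an interleaved real/imaginary pair with a
         runtime stride, so the grouped loads need a permutation.  */
      double ar = a[2 * i * stride], ai = a[2 * i * stride + 1];
      double br = b[2 * i * stride], bi = b[2 * i * stride + 1];
      c[2 * i]     = ar * br - ai * bi;
      c[2 * i + 1] = ar * bi + ai * br;
    }
}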
[PATCH] top-level for libvtv: use normal (not raw_cxx) target exports
Hi build machinery maintainers, since we always build the C++ compiler now, I fail to see the need to still use RAW_CXX_TARGET_EXPORTS for libvtv. The situation to expose the problem is: * Use a multilib-enabled x86_64-linux box. * Use a 64-bit (multilib-disabled) bootstrap compiler (binary image). $ configure --enable-multilib --with-system-zlib $ make bootstrap When it comes to build the 32-bit libvtv, it breaks because of using "CC=/build/prev-gcc/xgcc -m32" "CXX=g++ -m32", while it should use "CC=/build/prev-gcc/xgcc -m32" "CXX=/build/prev-gcc/xg++ -m32" instead. However, I'm not sure about the general question behind: Should it work to bootstrap the multilib-compiler using a non-multilib one? This also needs above configure flags to work around two more but minor issues, which I'm unsure about whether I can/should fix at all: * --enable-multilib: Without this, the "user friendly check" is breaking, since https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=205975 * --with-system-zlib: Without this, --enable-multilib tries to build a 32-bit zlib with "CC=/build/32/./prev-gcc/xgcc" Thanks! /haubi/ 2015-06-09 Michael Haubenwallner * Makefile.def (libvtv): Drop raw_cxx=true. * Makefile.in: Regenerate. Index: Makefile.def === --- Makefile.def (revision 224279) +++ Makefile.def (working copy) @@ -135,8 +135,7 @@ lib_path=.libs; }; target_modules = { module= libvtv; bootstrap=true; - lib_path=.libs; - raw_cxx=true; }; + lib_path=.libs; }; target_modules = { module= libcilkrts; lib_path=.libs; }; target_modules = { module= liboffloadmic; Index: Makefile.in === --- Makefile.in (revision 224279) +++ Makefile.in (working copy) @@ -35290,7 +35290,7 @@ fi; \ test ! -f $(TARGET_SUBDIR)/libvtv/Makefile || exit 0; \ $(SHELL) $(srcdir)/mkinstalldirs $(TARGET_SUBDIR)/libvtv; \ - $(RAW_CXX_TARGET_EXPORTS) \ + $(NORMAL_TARGET_EXPORTS) \ echo Configuring in $(TARGET_SUBDIR)/libvtv; \ cd "$(TARGET_SUBDIR)/libvtv" || exit 1; \ case $(srcdir) in \ @@ -35333,7 +35333,7 @@ mv $(TARGET_SUBDIR)/libvtv/multilib.tmp $(TARGET_SUBDIR)/libvtv/multilib.out; \ fi; \ test ! -f $(TARGET_SUBDIR)/libvtv/Makefile || exit 0; \ - $(RAW_CXX_TARGET_EXPORTS) \ + $(NORMAL_TARGET_EXPORTS) \ CFLAGS="$(CFLAGS_FOR_TARGET)"; export CFLAGS; \ CXXFLAGS="$(CXXFLAGS_FOR_TARGET)"; export CXXFLAGS; \ LIBCFLAGS="$(LIBCFLAGS_FOR_TARGET)"; export LIBCFLAGS; \ @@ -35377,7 +35377,7 @@ mv $(TARGET_SUBDIR)/libvtv/multilib.tmp $(TARGET_SUBDIR)/libvtv/multilib.out; \ fi; \ test ! -f $(TARGET_SUBDIR)/libvtv/Makefile || exit 0; \ - $(RAW_CXX_TARGET_EXPORTS) \ + $(NORMAL_TARGET_EXPORTS) \ \ CFLAGS="$(CFLAGS_FOR_TARGET)"; export CFLAGS; \ CXXFLAGS="$(CXXFLAGS_FOR_TARGET)"; export CXXFLAGS; \ @@ -35422,7 +35422,7 @@ mv $(TARGET_SUBDIR)/libvtv/multilib.tmp $(TARGET_SUBDIR)/libvtv/multilib.out; \ fi; \ test ! -f $(TARGET_SUBDIR)/libvtv/Makefile || exit 0; \ - $(RAW_CXX_TARGET_EXPORTS) \ + $(NORMAL_TARGET_EXPORTS) \ \ CFLAGS="$(CFLAGS_FOR_TARGET)"; export CFLAGS; \ CXXFLAGS="$(CXXFLAGS_FOR_TARGET)"; export CXXFLAGS; \ @@ -35467,7 +35467,7 @@ mv $(TARGET_SUBDIR)/libvtv/multilib.tmp $(TARGET_SUBDIR)/libvtv/multilib.out; \ fi; \ test ! -f $(TARGET_SUBDIR)/libvtv/Makefile || exit 0; \ - $(RAW_CXX_TARGET_EXPORTS) \ + $(NORMAL_TARGET_EXPORTS) \ \ CFLAGS="$(CFLAGS_FOR_TARGET)"; export CFLAGS; \ CXXFLAGS="$(CXXFLAGS_FOR_TARGET)"; export CXXFLAGS; \ @@ -35512,7 +35512,7 @@ mv $(TARGET_SUBDIR)/libvtv/multilib.tmp $(TARGET_SUBDIR)/libvtv/multilib.out; \ fi; \ test ! 
-f $(TARGET_SUBDIR)/libvtv/Makefile || exit 0; \ - $(RAW_CXX_TARGET_EXPORTS) \ + $(NORMAL_TARGET_EXPORTS) \ \ CFLAGS="$(CFLAGS_FOR_TARGET)"; export CFLAGS; \ CXXFLAGS="$(CXXFLAGS_FOR_TARGET)"; export CXXFLAGS; \ @@ -35557,7 +35557,7 @@ mv $(TARGET_SUBDIR)/libvtv/multilib.tmp $(TARGET_SUBDIR)/libvtv/multilib.out; \ fi; \ test ! -f $(TARGET_SUBDIR)/libvtv/Makefile || exit 0; \ - $(RAW_CXX_TARGET_EXPORTS) \ + $(NORMAL_TARGET_EXPORTS) \ \ CFLAGS="$(CFLAGS_FOR_TARGET)"; export CFLAGS; \ CXXFLAGS="$(CXXFLAGS_FOR_TARGET)"; export CXXFLAGS; \ @@ -35594,9 +35594,9 @@ all-target-libvtv: configure-target-libvtv @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ - $(RAW_CXX_TARGET_EXPORTS) \ + $(NORMAL_TARGET_EXPORTS) \ (cd $(TARGET_SUBDIR)/libvtv && \ - $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_TARGET_FLAGS) 'CXX=$$(RAW_CXX_FOR_TARGET)' 'CXX_FOR_TARGET=$$(RAW_CXX_FOR_TARGET)' \ + $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_TARGET_FLAGS) \ $(TARGET-target-libvtv)) @endif target-libvtv @@ -35615,7 +35615,7 @@ @r=`${PWD_COMMAND}`; export r; \ s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \ TFLAGS="$(STAGE1_TFLAGS)"; \ - $(RAW_CXX_TARGET_EXPORTS) \ + $(NORMAL_TARGET_EXPORTS) \ cd $(TARGET_SUBDIR)/libvtv &&
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 09, 2015 at 04:12:33PM +0200, Uros Bizjak wrote: > On Tue, Jun 9, 2015 at 3:39 PM, Jakub Jelinek wrote: > > On Tue, Jun 09, 2015 at 03:21:55PM +0200, Uros Bizjak wrote: > >> I'm afraid that simple scan loop won't work correctly on x32. There > >> are some issues with UNSPEC_TP for this target, so we have to generate > >> zero_extend of SImode UNSPEC, e.g.: > >> > >> (plus:DI (zero_extend:DI (unspec:SI [...] UNSPEC_TP) (reg:DI ...)) > >> > >> as can be seen in get_thread_pointer to construct the address. It > >> looks that your loop won't find the UNSPEC_TP tag in the above case. > > > > You're right, for -m32 it would need to start with Yeah, I meant -mx32 (which I have no experience with nor spare time for). > >rtx *x = &addr; > > + while (GET_CODE (*x) == ZERO_EXTEND > > +|| GET_CODE (*x) == AND > > +|| GET_CODE (*x) == SUBREG) > > +x = &XEXP (*x, 0); > > Oh, you can use SImode_address_operand predicate here. Do I need to loop, or can there be just one SImode_address_operand code? Do you want to use the iterators (as in the second patch) or not (then is if (SImode_address_operand (addr, VOIDmode)) x = &XEXP (addr, 0); ok)? Is Pmode always SImode for -mx32, or depending on some switch or something? Would it be acceptable to just guard the changes in the patch with !TARGET_X32 and let H.J. deal with that target? I'm afraid I'm lost when to ZERO_EXTEND addr (if needed at all), etc. Jakub
[hsa] Integrate into the existing accelerator framework, libgomp plugin
Hi, the big patch below brings HSA substantially closer to nice co-existence with the existing accelerator framework in trunk. Above all, it implements a libgomp plugin, and with this patch applied, the branch uses it to run the target constructs of OpenMP 4. While there may be mistakes, I hope that the plugin already implements all that is necessary and should be very close to the version I will propose for inclusion to trunk late in the summer. The changes in libgomp itself are not big, I assume the only notable thing is that it I made it skip any and all re-mapping and transport of data (as opposed to functions) if the plugin has the GOMP_OFFLOAD_CAP_SHARED_MEM capability. The automake/autoconf scripts have been changed so that the plugin is built only when hsa is listed in --enable-offload-targets during configuration of gcc. There is also a new option --with-hsa-runtime (and more specific options --with-hsa-runtime-include and --with-hsa-runtime-lib) to provide the path to the HSA run-time header files and library, which you are now likely to have to use to play with the branch. The file hsa.c and the hsa headers in the libgomp proper have not gone away yet, so that the prototype entry point can still be used for a bit longer. But it will be removed in the big next step. The majority of changes to the compiler itself are changes to autoconf scripts to recognize hsa in the list --enable-offload-targets and define ENABLE_HSA if it si present but also to *not* define ENABLE_OFFLOADING when it is the only accelerator listed so that gcc does not create the offloading LTO elf sections. Similarly, I have changed the driver not to attempt to find an external compiler for HSA, and the common option handling so that it is usable to disable hsa offloading even when it has been configured. The fact that HSA could be enabled even though ENABLE_OFFLOADING is not, however, is slightly confusing, especially since we happy use the offload flag of cgraph_node. Would the rest of the community be opposed to changing the name of the macro to something that would imply that stuff is being streamed to disk? I have tested the patch passes our small HSA testsuite, wrote myself a simple multi-threaded shared-library loading and unloading test running an assignment in a target OppenMP construct and have also verified that the Intel accelerator works and tests on the branch as well as on corresponding trunk revision. I'm looking forward to any comments and suggestions, meanwhile I have committed the patch to the branch as r224284. Thanks, Martin 2015-06-09 Martin Jambor libgomp/ * Makefile.in: Regenerated. * config.h.in: Likewise. * configure: Likewise. * libgomp-plugin.h (offload_target_type): Added OFFLOAD_TARGET_TYPE_HSA. * plugin/Makefrag.am: Conditionally build HSA plugin. * plugin/configfrag.ac: New options for providing path to HSA run-time. Test that it is available if building the plugin. * plugin/plugin-hsa.c: New file. * target.c (GOMP_target): Do not re-map arguments when the device is capable of sharing memory. (GOMP_target_data): Likewise. * testsuite/Makefile.in (): Regenerated. gcc/ * builtin-types.def (BT_FN_VOID_PTR_INT_PTR): New. * common.opt (flag_disable_hsa): Likewise. * config.in: Regenrated. * configure : Likewise. * configure.ac (accel_dir_suffix): Set ENABLE_HSA when hsa is listed as accelerator, do not set ENABLE_OFFLOADING if it is the only one. * hsa-brig.c (hsa_dtor_statements): New variable. (hsa_output_kernel_mapping): Generate libgomp registration and unregistration calls. 
* hsa-gen.c (generate_hsa): New parameter kernel, use it rather than parameters. (pass_gen_hsail::gate): Make conditional on ENABLE_HSA. (pass_gen_hsail::execute): Detect kernels. * lto-wrapper.c (compile_images_for_offload_targets): Do not attempt to invoke an external hsa accelerator compiler. * omp-builtins.def (BUILT_IN_GOMP_OFFLOAD_REGISTER): New. (BUILT_IN_GOMP_OFFLOAD_UNREGISTER): Likewise. * opts.c (common_handle_option): Disable HSA if requested on the command line. gcc/fortran/ * types.def (BT_FN_VOID_PTR_INT_PTR): New. diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def index 0e34531..f462d8a 100644 --- a/gcc/builtin-types.def +++ b/gcc/builtin-types.def @@ -450,6 +450,7 @@ DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_ULONG_ULONG_ULONGPTR, BT_BOOL, BT_ULONG, BT_ULONG, BT_PTR_ULONG) DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_ULONGLONG_ULONGLONG_ULONGLONGPTR, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG, BT_PTR_ULONGLONG) +DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_INT_PTR, BT_VOID, BT_PTR, BT_INT, BT_PTR) DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR, BT_SIZE, BT_CONST_PTR, BT
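As an illustration (my own minimal sketch, not part of the patch) of what now goes through the plugin: an OpenMP 4 target region like the one below is handed to the GOMP_OFFLOAD_* entry points of plugin-hsa.c, and because the HSA device advertises GOMP_OFFLOAD_CAP_SHARED_MEM the map clause causes no copying -- libgomp simply passes the host pointers through.

int
main (void)
{
  int v[1024];

  /* Runs on the HSA device when offloading is configured; with shared
     memory the mapping below does not transport any data.  */
  #pragma omp target map(tofrom: v[0:1024])
  for (int i = 0; i < 1024; i++)
    v[i] = i;

  return v[17] == 17 ? 0 : 1;
}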
PATCH to check_global_declaration for bootstrap/66448
On targets without alias support (Darwin) bootstrap was failing with a warning about one of the constructor clones being unused. We shouldn't warn about that, since clones are artificial; we should only warn if the abstract constructor is unused. Tested x86_64-pc-linux-gnu, applying to trunk. commit 16232674086f330d05f7279681978eb683990ed5 Author: Jason Merrill Date: Tue Jun 9 10:12:47 2015 -0400 PR bootstrap/66448 * toplev.c (check_global_declaration): Don't warn about a clone. diff --git a/gcc/testsuite/g++.dg/warn/Wunused-function1.C b/gcc/testsuite/g++.dg/warn/Wunused-function1.C new file mode 100644 index 000..86b319a --- /dev/null +++ b/gcc/testsuite/g++.dg/warn/Wunused-function1.C @@ -0,0 +1,12 @@ +// PR bootstrap/66448 +// { dg-options "-Wunused-function" } + +struct A { A(); }; +namespace { + struct B: virtual A { B(); }; + B::B() { } + B b; + + struct C: virtual A { C(); }; + C::C() { } // { dg-warning "defined but not used" } +} diff --git a/gcc/toplev.c b/gcc/toplev.c index 306d008..b1ccc18 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -529,6 +529,7 @@ check_global_declaration (tree decl) && ! (DECL_NAME (decl) && TREE_USED (DECL_NAME (decl))) && ! DECL_EXTERNAL (decl) && ! DECL_ARTIFICIAL (decl) + && ! DECL_ABSTRACT_ORIGIN (decl) && ! TREE_PUBLIC (decl) /* A volatile variable might be used in some non-obvious way. */ && ! TREE_THIS_VOLATILE (decl)
[PATCH] xtensa: implement trap pattern
gcc/ * config/xtensa/xtensa.h (TARGET_DEBUG): New definition. * config/xtensa/xtensa.md (define_attr "type"): New type "trap". (define_insn "trap"): New definition. --- gcc/config/xtensa/xtensa.h | 1 + gcc/config/xtensa/xtensa.md | 15 ++- 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/gcc/config/xtensa/xtensa.h b/gcc/config/xtensa/xtensa.h index 011411c..584080b 100644 --- a/gcc/config/xtensa/xtensa.h +++ b/gcc/config/xtensa/xtensa.h @@ -67,6 +67,7 @@ extern unsigned xtensa_current_frame_size; #define TARGET_THREADPTR XCHAL_HAVE_THREADPTR #define TARGET_LOOPS XCHAL_HAVE_LOOPS #define TARGET_WINDOWED_ABI(XSHAL_ABI == XTHAL_ABI_WINDOWED) +#define TARGET_DEBUG XCHAL_HAVE_DEBUG #define TARGET_DEFAULT \ ((XCHAL_HAVE_L32R? 0 : MASK_CONST16) | \ diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md index 6d84384..fed7586 100644 --- a/gcc/config/xtensa/xtensa.md +++ b/gcc/config/xtensa/xtensa.md @@ -86,7 +86,7 @@ ;; Attributes. (define_attr "type" - "unknown,jump,call,load,store,move,arith,multi,nop,farith,fmadd,fconv,fload,fstore,mul16,mul32,div32,mac16,rsr,wsr,entry" + "unknown,jump,call,load,store,move,arith,multi,nop,farith,fmadd,fconv,fload,fstore,mul16,mul32,div32,mac16,rsr,wsr,entry,trap" (const_string "unknown")) (define_attr "mode" @@ -1764,6 +1764,19 @@ [(set_attr "length" "0") (set_attr "type" "nop")]) +(define_insn "trap" + [(trap_if (const_int 1) (const_int 0))] + "" +{ + if (TARGET_DEBUG) +return (TARGET_DENSITY ? "break.n\t0" : "break\t0, 0"); + else +return (TARGET_DENSITY ? "ill.n" : "ill"); +} + [(set_attr "type""trap") + (set_attr "mode""none") + (set_attr "length" "3")]) + ;; Setting up a frame pointer is tricky for Xtensa because GCC doesn't ;; know if a frame pointer is required until the reload pass, and ;; because there may be an incoming argument value in the hard frame -- 1.8.1.4
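For reference, a tiny example (a sketch, not a test from the patch; the name check is invented) of source that should now expand through the new pattern: __builtin_trap is emitted via the named "trap" insn, so on xtensa it becomes break.n / break 0, 0 when the debug option is configured and ill.n / ill otherwise.

void
check (int *p)
{
  if (!p)
    __builtin_trap ();   /* expands to the new "trap" insn */
  *p = 1;
}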
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 9, 2015 at 4:26 PM, Jakub Jelinek wrote: >> >> I'm afraid that simple scan loop won't work correctly on x32. There >> >> are some issues with UNSPEC_TP for this target, so we have to generate >> >> zero_extend of SImode UNSPEC, e.g.: >> >> >> >> (plus:DI (zero_extend:DI (unspec:SI [...] UNSPEC_TP) (reg:DI ...)) >> >> >> >> as can be seen in get_thread_pointer to construct the address. It >> >> looks that your loop won't find the UNSPEC_TP tag in the above case. >> > >> > You're right, for -m32 it would need to start with > > Yeah, I meant -mx32 (which I have no experience with nor spare time for). > >> >rtx *x = &addr; >> > + while (GET_CODE (*x) == ZERO_EXTEND >> > +|| GET_CODE (*x) == AND >> > +|| GET_CODE (*x) == SUBREG) >> > +x = &XEXP (*x, 0); >> >> Oh, you can use SImode_address_operand predicate here. > > Do I need to loop, or can there be just one SImode_address_operand IIRC, apart from the whole address, only UNSPEC_TP can be zero_extended. It is a hardware "feature" (== HW bug) that addr32 doesn't apply to segment registers. > code? Do you want to use the iterators (as in the second patch) or not > (then is > if (SImode_address_operand (addr, VOIDmode)) > x = &XEXP (addr, 0); > ok)? Is Pmode always SImode for -mx32, or depending on some switch or Nope, it depends on -maddress-mode switch, and can be SImode or DImode. > something? Would it be acceptable to just guard the changes in the patch > with !TARGET_X32 and let H.J. deal with that target? I'm afraid I'm lost > when to ZERO_EXTEND addr (if needed at all), etc. If you wish, I can take your patch and take if further. -mx32 is a delicate beast... Uros. > > Jakub
[patch] [jit] Install headers using INSTALL_DATA
Install the gcc/jit header files without the x bit set. Ok for the trunk and the 5 branch? Matthias gcc/jit/ 2015-06-09 Matthias Klose * Make-lang.in (jit.install-common): Install headers using INSTALL_DATA. Index: gcc/jit/Make-lang.in === --- gcc/jit/Make-lang.in(revision 224282) +++ gcc/jit/Make-lang.in(working copy) @@ -268,9 +268,9 @@ ln -sf \ $(LIBGCCJIT_SONAME_SYMLINK)\ $(DESTDIR)/$(libdir)/$(LIBGCCJIT_LINKER_NAME_SYMLINK) - $(INSTALL_PROGRAM) $(srcdir)/jit/libgccjit.h \ + $(INSTALL_DATA) $(srcdir)/jit/libgccjit.h \ $(DESTDIR)/$(includedir)/libgccjit.h - $(INSTALL_PROGRAM) $(srcdir)/jit/libgccjit++.h \ + $(INSTALL_DATA) $(srcdir)/jit/libgccjit++.h \ $(DESTDIR)/$(includedir)/libgccjit++.h jit.install-man:
Re: [PR64164] drop copyrename, integrate into expand
This also broke bootstrap on PPC64 LE Linux with the same error. - David
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 09, 2015 at 06:16:32PM +0200, Uros Bizjak wrote: > > something? Would it be acceptable to just guard the changes in the patch > > with !TARGET_X32 and let H.J. deal with that target? I'm afraid I'm lost > > when to ZERO_EXTEND addr (if needed at all), etc. > > If you wish, I can take your patch and take if further. -mx32 is a > delicate beast... If you could, it would be appreciated, I'm quite busy with OpenMP 4.1 stuff now. Note that for -m64/-mx32 it will be much harder to create a reproducer, because to trigger the bug one has to convince the register allocator to allocate the lhs of the load in certain registers (not that hard), but also the index register (to be scaled, also not that hard) and also the register holding the tls symbol immediate. Wonder if one has to keep all but the two registers live across the load or something similar. Jakub
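Purely as a sketch of the shape of code involved (not a reproducer -- per the above, getting the register allocator to pick the colliding registers is the hard part; the name tls_arr is invented): a multi-word load, e.g. long long with -m32, from a TLS array with a scaled index is where ix86_split_long_move may have to rewrite the %gs:-based address into an lea.

__thread unsigned long long tls_arr[64];

unsigned long long
get (long i, long j)
{
  /* Two multi-word TLS loads with scaled indices; with -m32 each one is
     split into two SImode moves, and the split takes the lea path when the
     destination overlaps the address registers.  */
  return tls_arr[i] + tls_arr[j];
}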
Re: C++ PATCH for c++/51747 (list-initialization from same type)
On 05/07/2015 12:45 PM, Jason Merrill wrote: We also need to adjust digest_init_r. This was only needed because we weren't calling reshape_init. Now that Paolo has fixed that, this can become an assert. commit 8d2793c42fc3de4a0b665f4c2ff2a2946ae0beda Author: Jason Merrill Date: Tue Jun 9 09:51:31 2015 -0400 DR 1467 PR c++/51747 * typeck2.c (digest_init_r): Replace previous change with gcc_unreachable. diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c index b077f02..709875c 100644 --- a/gcc/cp/typeck2.c +++ b/gcc/cp/typeck2.c @@ -1089,6 +1089,7 @@ digest_init_r (tree type, tree init, bool nested, int flags, || TREE_CODE (type) == UNION_TYPE || TREE_CODE (type) == COMPLEX_TYPE); +#ifdef ENABLE_CHECKING /* "If T is a class type and the initializer list has a single element of type cv U, where U is T or a class derived from T, the object is initialized from that element." */ @@ -1099,8 +1100,10 @@ digest_init_r (tree type, tree init, bool nested, int flags, { tree elt = CONSTRUCTOR_ELT (init, 0)->value; if (reference_related_p (type, TREE_TYPE (elt))) - init = elt; + /* We should have fixed this in reshape_init. */ + gcc_unreachable (); } +#endif if (BRACE_ENCLOSED_INITIALIZER_P (init) && !TYPE_NON_AGGREGATE_CLASS (type))
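A small illustration (mine, not from the patch; types T and D are invented) of the rule the removed digest_init_r code was duplicating and that reshape_init now handles: when a braced list contains a single element whose type is T itself (or a class derived from T), the object is initialized from that element instead of treating the list as aggregate initialization of T's members.

struct T { int i; };
struct D : T { };

T a{1};    // plain aggregate initialization
T b{a};    // DR 1467: b is initialized from 'a' itself, not from a.i
D d{};
T c{d};    // likewise: initialized from 'd' via T's copy constructor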
Re: Fix more of C/fortran canonical type issues
> On Mon, 8 Jun 2015, Jan Hubicka wrote: > > > > > > > I think we should instead work towards eliminating the get_alias_set > > > langhook first. The LTO langhook variant contains the same handling, btw, > > > so just inline that into get_alias_set and see what remains? > > > > I see, i completely missed existence of gimple_get_alias_set. It makes more > > sense now. > > > > Is moving everyting to alias.c realy a desirable thing? If non-C languages > > do > > not have this rule, why we want to reduce the code quality when compiling > > those? > > Well, for consistency and for getting rid of one langhook ;) :) In a way this particular langhook makes sense to me - TBAA rules are language specific. We also may with explicit streaming of the TBAA dag, like LLVM does. Anyway, this is the updated patch fixing the Fortran's interoperability with size_t and signed char. I will send separate patch for the extra lto-symtab warnings shortly. I will be happy looking into the TYPE_CANONICAL (int) to be different from TYPE_CANONICAL (unsigned int) if that seems desirable. There are two things that needs to be solved - hash_canonical_type/gimple_canonical_types_compatible_p can't use TYPE_CNAONICAL of subtypes in all cases (that is easy) and we will need some way to recognize the conflict in lto-symtab other thanjust comparing TYPE_CANONICAL to not warn when a variable is declared signed in Fortran unit and unsigned in C. Bootstrapped/regtested ppc64le-linux. * lto/lto.c (hash_canonical_type): Do not hash TYPE_UNSIGNED of INTEGER_TYPE. * tree.c (gimple_canonical_types_compatible_p): Do not compare TYPE_UNSIGNED of INTEGER_TYPE. * gimple-expr.c (useless_type_conversion_p): Move INTEGER type handling ahead the canonical type lookup. * gfortran.dg/lto/bind_c-2_0.f90: New testcase * gfortran.dg/lto/bind_c-2_1.c: New testcase * gfortran.dg/lto/bind_c-3_0.f90: New testcase * gfortran.dg/lto/bind_c-3_1.c: New testcase * gfortran.dg/lto/bind_c-4_0.f90: New testcase * gfortran.dg/lto/bind_c-4_1.c: New testcase Index: lto/lto.c === --- lto/lto.c (revision 224252) +++ lto/lto.c (working copy) @@ -298,6 +298,7 @@ hash_canonical_type (tree type) { inchash::hash hstate; + enum tree_code code; /* We compute alias sets only for types that needs them. Be sure we do not recurse to something else as we can not hash incomplete @@ -309,7 +310,8 @@ smaller sets; when searching for existing matching types to merge, only existing types having the same features as the new type will be checked. */ - hstate.add_int (tree_code_for_canonical_type_merging (TREE_CODE (type))); + code = tree_code_for_canonical_type_merging (TREE_CODE (type)); + hstate.add_int (code); hstate.add_int (TYPE_MODE (type)); /* Incorporate common features of numerical types. */ @@ -319,7 +321,14 @@ || TREE_CODE (type) == OFFSET_TYPE || POINTER_TYPE_P (type)) { - hstate.add_int (TYPE_UNSIGNED (type)); + /* Some Fortran integer types are interoperable with C types regardless +their signedness. We need to ignore sign on these to make it +possible for structure containing unsigned type to interoperate +with structure containing signed type which is also required by +the standard. It is thus not enough to keep alias set of signed +type same with alias set of unsigned type. 
*/ + if (code != INTEGER_TYPE) +hstate.add_int (TYPE_UNSIGNED (type)); hstate.add_int (TYPE_PRECISION (type)); } Index: tree.c === --- tree.c (revision 224252) +++ tree.c (working copy) @@ -12879,6 +12879,7 @@ gimple_canonical_types_compatible_p (const_tree t1, const_tree t2, bool trust_type_canonical) { + enum tree_code code; /* Type variants should be same as the main variant. When not doing sanity checking to verify this fact, go to main variants and save some work. */ if (trust_type_canonical) @@ -12918,9 +12919,9 @@ && trust_type_canonical) return TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2); + code = tree_code_for_canonical_type_merging (TREE_CODE (t1)); /* Can't be the same type if the types don't have the same code. */ - if (tree_code_for_canonical_type_merging (TREE_CODE (t1)) - != tree_code_for_canonical_type_merging (TREE_CODE (t2))) + if (code != tree_code_for_canonical_type_merging (TREE_CODE (t2))) return false; /* Qualifiers do not matter for canonical type comparison purposes. */ @@ -12945,7 +12946,14 @@ { /* Can't be the same type if they have different sign or precision. */ if (TYPE_PRECISION (t1) != TYPE_PRECISION (t2) - || TYPE_UNSIGNED (t1) != TYPE_UNSIGNED (t2)) + /* Some
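The new bind_c-* testcases are only listed above, so purely as an illustration (my sketch, not the committed tests; the names struct s, consume and fill are invented), this is the C half of the kind of mixed-language LTO scenario involved: Fortran's integer(c_size_t) is a signed integer type, yet it must interoperate with C's unsigned size_t, so canonical type merging -- and hence hash_canonical_type -- cannot distinguish the two by TYPE_UNSIGNED.

#include <stddef.h>

/* The matching Fortran unit would declare an interoperable bind(c) derived
   type with an integer(c_size_t) component and a bind(c) procedure taking
   it; only the C side is sketched here.  */
struct s
{
  size_t n;
};

extern void consume (struct s *p);  /* assumed to be implemented in Fortran */

void
fill (struct s *p, size_t n)
{
  p->n = n;
  consume (p);
}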
[PATCH] PowerPC atomic memmodel cleanup
The infrastructure for atomic operations in the rs6000 port has not utilized the cleaner GCC internal APIs for manipulating memmodel. The appended patch converts the direct casts and use of memmodel_from_int to consistently use memmodel_base. This also allows the SYNC variants to be removed from switch case statements. This should not change any semantics or behavior of atomics in PowerPC code generation. Comments? Bootstrapped on powerpc-ibm-aix7.1.0.0 * config/rs6000/rs6000.c (rs6000_pre_atomic_barrier): Remove SYNC variants cases from switch. (rs6000_post_atomic_barrier): Same. (rs6000_expand_atomic_compare_and_swap): Use memmodel_base. (rs6000_expand_atomic_exchange): Same. (rs6000_expand_atomic_op): Same. * config/rs6000/sync.md (mem_thread_fence): Use memodel_base. Remove SYNC variants cases from switch. (atomic_load): Same. (atomic_store): Same. Thanks, David * config/rs6000/rs6000.c (rs6000_pre_atomic_barrier): Remove SYNC variants cases from switch. (rs6000_post_atomic_barrier): Same. (rs6000_expand_atomic_compare_and_swap): Use memmodel_base. (rs6000_expand_atomic_exchange): Same. (rs6000_expand_atomic_op): Same. * config/rs6000/sync.md (mem_thread_fence): Use memodel_base. Remove SYNC variants cases from switch. (atomic_load): Same. (atomic_store): Same. Index: rs6000.c === --- rs6000.c(revision 224288) +++ rs6000.c(working copy) @@ -20558,15 +20558,12 @@ rs6000_pre_atomic_barrier (rtx mem, enum memmodel case MEMMODEL_RELAXED: case MEMMODEL_CONSUME: case MEMMODEL_ACQUIRE: -case MEMMODEL_SYNC_ACQUIRE: break; case MEMMODEL_RELEASE: -case MEMMODEL_SYNC_RELEASE: case MEMMODEL_ACQ_REL: emit_insn (gen_lwsync ()); break; case MEMMODEL_SEQ_CST: -case MEMMODEL_SYNC_SEQ_CST: emit_insn (gen_hwsync ()); break; default: @@ -20583,13 +20580,10 @@ rs6000_post_atomic_barrier (enum memmodel model) case MEMMODEL_RELAXED: case MEMMODEL_CONSUME: case MEMMODEL_RELEASE: -case MEMMODEL_SYNC_RELEASE: break; case MEMMODEL_ACQUIRE: -case MEMMODEL_SYNC_ACQUIRE: case MEMMODEL_ACQ_REL: case MEMMODEL_SEQ_CST: -case MEMMODEL_SYNC_SEQ_CST: emit_insn (gen_isync ()); break; default: @@ -20690,8 +20684,8 @@ rs6000_expand_atomic_compare_and_swap (rtx operand oldval = operands[3]; newval = operands[4]; is_weak = (INTVAL (operands[5]) != 0); - mod_s = memmodel_from_int (INTVAL (operands[6])); - mod_f = memmodel_from_int (INTVAL (operands[7])); + mod_s = memmodel_base (INTVAL (operands[6])); + mod_f = memmodel_base (INTVAL (operands[7])); orig_mode = mode = GET_MODE (mem); mask = shift = NULL_RTX; @@ -20810,7 +20804,7 @@ rs6000_expand_atomic_exchange (rtx operands[]) retval = operands[0]; mem = operands[1]; val = operands[2]; - model = (enum memmodel) INTVAL (operands[3]); + model = memmodel_base (INTVAL (operands[3])); mode = GET_MODE (mem); mask = shift = NULL_RTX; @@ -20861,7 +20855,7 @@ void rs6000_expand_atomic_op (enum rtx_code code, rtx mem, rtx val, rtx orig_before, rtx orig_after, rtx model_rtx) { - enum memmodel model = (enum memmodel) INTVAL (model_rtx); + enum memmodel model = memmodel_base (INTVAL (model_rtx)); machine_mode mode = GET_MODE (mem); machine_mode store_mode = mode; rtx label, x, cond, mask, shift; Index: sync.md === --- sync.md (revision 224288) +++ sync.md (working copy) @@ -41,7 +41,7 @@ [(match_operand:SI 0 "const_int_operand" "")];; model "" { - enum memmodel model = memmodel_from_int (INTVAL (operands[0])); + enum memmodel model = memmodel_base (INTVAL (operands[0])); switch (model) { case MEMMODEL_RELAXED: @@ -48,14 +48,11 @@ break; case MEMMODEL_CONSUME: case MEMMODEL_ACQUIRE: -case 
MEMMODEL_SYNC_ACQUIRE: case MEMMODEL_RELEASE: -case MEMMODEL_SYNC_RELEASE: case MEMMODEL_ACQ_REL: emit_insn (gen_lwsync ()); break; case MEMMODEL_SEQ_CST: -case MEMMODEL_SYNC_SEQ_CST: emit_insn (gen_hwsync ()); break; default: @@ -147,7 +144,7 @@ if (mode == TImode && !TARGET_SYNC_TI) FAIL; - enum memmodel model = memmodel_from_int (INTVAL (operands[2])); + enum memmodel model = memmodel_base (INTVAL (operands[2])); if (is_mm_seq_cst (model)) emit_insn (gen_hwsync ()); @@ -185,9 +182,7 @@ break; case MEMMODEL_CONSUME: case MEMMODEL_ACQUIRE: -case MEMMODEL_SYNC_ACQUIRE: case MEMMODEL_SEQ_CST: -case MEMMODEL_SYNC_SEQ_CST: emit_insn (gen_loadsync_ (operands[0])); break; default: @@ -214,17 +209,15 @@ if (mode =
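A user-level view of the two memmodel flavours this touches (a sketch, not a test from the patch; the name fetch_add_both is invented): the __atomic_* builtins use the plain C11 models, while the legacy __sync_* builtins are expanded with the internal MEMMODEL_SYNC_* variants. memmodel_base strips the SYNC bit, so both end up in the same switch cases in the rs6000 expanders and code generation stays the same.

long
fetch_add_both (long *p)
{
  /* Expanded with MEMMODEL_SEQ_CST.  */
  long a = __atomic_fetch_add (p, 1, __ATOMIC_SEQ_CST);
  /* Expanded with MEMMODEL_SYNC_SEQ_CST, which memmodel_base reduces to
     MEMMODEL_SEQ_CST before the switch.  */
  long b = __sync_fetch_and_add (p, 1);
  return a + b;
}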
[PATCH 3/3] Improve -Wmissing-indentation heuristics
This patch improves the heuristics of the warning in a number of ways. The improvements are hopefully adequately documented in the code comments. The additions to the test case also highlight the improvements. I tested an earlier version of this patch on more than a dozen C code bases. I only found one class of bogus warnings yet emitted, in the libpng and bdwgc projects. These projects have a coding style which indents code inside #ifdefs as if this code was guarded by an if(), e.g. if (foo != 0) x = 10; else // GUARD y = 100; // BODY #ifdef BAR blah (); // NEXT #endif These bogus warnings are pre-existing, however (i.e. not caused by this patch). gcc/c-family/ChangeLog: * c-indentation.c (should_warn_for_misleading_indentation): Improve heuristics. gcc/testsuite/ChangeLog: * c-c++-common/Wmisleading-indentation.c: Add more tests. --- gcc/c-family/c-indentation.c | 167 ++--- .../c-c++-common/Wmisleading-indentation.c | 166 2 files changed, 313 insertions(+), 20 deletions(-) diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c index f43aee6..8de3bc5 100644 --- a/gcc/c-family/c-indentation.c +++ b/gcc/c-family/c-indentation.c @@ -185,6 +185,7 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo, location_t body_loc = body_tinfo.location; location_t next_stmt_loc = next_tinfo.location; + enum cpp_ttype body_type = body_tinfo.type; enum cpp_ttype next_tok_type = next_tinfo.type; /* Don't attempt to compare the indentation of BODY_LOC and NEXT_STMT_LOC @@ -230,12 +231,33 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo, || next_tinfo.keyword == RID_ELSE) return false; + /* Likewise, if the body of the guard is a compound statement then control + flow is quite visually explicit regardless of the code's possibly poor + indentation, e.g. + + while (foo) <- GUARD + { <- BODY + bar (); + } + baz ();<- NEXT + +Things only get muddy when the body of the guard does not have +braces, e.g. + +if (foo) <- GUARD + bar (); <- BODY + baz (); <- NEXT + */ + if (body_type == CPP_OPEN_BRACE) +return false; + /* Don't warn here about spurious semicolons. */ if (next_tok_type == CPP_SEMICOLON) return false; expanded_location body_exploc = expand_location (body_loc); expanded_location next_stmt_exploc = expand_location (next_stmt_loc); + expanded_location guard_exploc = expand_location (guard_loc); /* They must be in the same file. */ if (next_stmt_exploc.file != body_exploc.file) @@ -253,13 +275,20 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo, if (flag) foo (); bar (); ^ WARN HERE + + if (flag) ; { + ^ WARN HERE + + if (flag) +; { + ^ WARN HERE + Cases where we don't want to issue a warning: various_code (); if (flag) foo (); bar (); more_code (); ^ DON'T WARN HERE. */ if (next_stmt_exploc.line == body_exploc.line) { - expanded_location guard_exploc = expand_location (guard_loc); if (guard_exploc.file != body_exploc.file) return true; if (guard_exploc.line < body_exploc.line) @@ -307,14 +336,10 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo, bar (); ^ DON'T WARN HERE -if (flag) { - foo (); -} else -{ - bar (); -} -baz (); -^ DON'T WARN HERE + if (flag) + ; + foo (); + ^ DON'T WARN HERE */ if (next_stmt_exploc.line > body_exploc.line) { @@ -322,29 +347,84 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo, "visual column"... 
*/ unsigned int next_stmt_vis_column; unsigned int body_vis_column; + unsigned int body_line_first_nws; /* If we can't determine it, don't issue a warning. This is sometimes the case for input files containing #line directives, and these are often for autogenerated sources (e.g. from .md files), where it's not clear that it's meaningful to look at indentation. */ if (!get_visual_column (next_stmt_exploc, &next_stmt_vis_column)) return false; - if (!get_visual_column (body_exploc, &body_vis_column)) + if (!get_visual_column (body_exploc, + &body_vis_column, + &body_line_first_nws)) return false; - if (next_stmt_vis_column == body_vis_column) + if ((body_type != CPP_SEMICOLON + && next_stmt_vis_column == body_vis_column) + /* As a special case handle the case where the body is a semicolo
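To make the new heuristics concrete, a small sketch (mine, distilled from the comments added above rather than copied from the new tests; f, bar and baz are invented names): a null-statement body followed by an open brace on the same line now warns, while a brace-enclosed body suppresses the warning no matter how the following statement is indented.

void bar (void);
void baz (void);

void
f (int flag)
{
  if (flag) ; {        /* warns: the '{' looks guarded, but is not */
    bar ();
  }

  if (flag)
    {                  /* no warning: the body is a compound statement */
      bar ();
    }
    baz ();            /* ... even though this line is indented oddly */
}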
[PATCH 1/3] Refactor entry point to -Wmisleading-indentation
This patch refactors the entry point of -Wmisleading-indentation from: void warn_for_misleading_indentation (location_t guard_loc, location_t body_loc, location_t next_stmt_loc, enum cpp_ttype next_tok_type, const char *guard_kind); to struct token_indent_info { location_t location; cpp_ttype type; rid keyword; }; void warn_for_misleading_indentation (const token_indent_info &guard_tinfo, const token_indent_info &body_tinfo, const token_indent_info &next_tinfo); The purpose of this refactoring is to expose more information to the -Wmisleading-indentation implementation to allow for more advanced heuristics and for better coverage. (I decided to keep the usage of const references because nobody seems to mind. Also I added a new header file, c-indentation.h.) gcc/c-family/ChangeLog: * c-indentation.h (struct token_indent_info): Define. (get_token_indent_info): Define. (warn_for_misleading_information): Declare. * c-common.h (warn_for_misleading_information): Remove. * c-identation.c (warn_for_misleading_indentation): Change declaration to take three token_indent_infos. Adjust accordingly. * c-identation.c (should_warn_for_misleading_indentation): Likewise. Bail out early if the body is a compound statement. (guard_tinfo_to_string): Define. gcc/c/ChangeLog: * c-parser.c (c_parser_if_body): Take token_indent_info argument. Call warn_for_misleading_indentation even when the body is a semicolon. Extract token_indent_infos corresponding to the guard, body and next tokens. Adjust call to warn_for_misleading_indentation accordingly. (c_parser_else_body): Likewise. (c_parser_if_statement): Likewise. (c_parser_while_statement): Likewise. (c_parser_for_statement): Likewise. gcc/cp/ChangeLog: * parser.c (cp_parser_selection_statement): Move handling of semicolon body to ... (cp_parser_implicitly_scoped_statement): .. here. Call warn_for_misleading_indentation even when the body is a semicolon. Extract token_indent_infos corresponding to the guard, body and next tokens. Adjust call to warn_for_misleading_indentation accordingly. Take token_indent_info argument. (cp_parser_already_scoped_statement): Likewise. (cp_parser_selection_statement, cp_parser_iteration_statement): Extract a token_indent_info corresponding to the guard token. --- gcc/c-family/c-common.h | 7 --- gcc/c-family/c-indentation.c | 79 ++ gcc/c-family/c-indentation.h | 52 ++ gcc/c/c-parser.c | 81 +++ gcc/cp/parser.c | 100 +++ 5 files changed, 201 insertions(+), 118 deletions(-) create mode 100644 gcc/c-family/c-indentation.h diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index 7ba81c4..a78a13c 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -1429,12 +1429,5 @@ extern bool contains_cilk_spawn_stmt (tree); extern tree cilk_for_number_of_iterations (tree); extern bool check_no_cilk (tree, const char *, const char *, location_t loc = UNKNOWN_LOCATION); -/* In c-indentation.c. */ -extern void -warn_for_misleading_indentation (location_t guard_loc, -location_t body_loc, -location_t next_stmt_loc, -enum cpp_ttype next_tok_type, -const char *guard_kind); #endif /* ! GCC_C_COMMON_H */ diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c index f8c28cf..9043f8c 100644 --- a/gcc/c-family/c-indentation.c +++ b/gcc/c-family/c-indentation.c @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. 
If not see #include "stor-layout.h" #include "input.h" #include "c-common.h" +#include "c-indentation.h" extern cpp_options *cpp_opts; @@ -187,11 +188,16 @@ detect_preprocessor_logic (expanded_location body_exploc, description of that function below. */ static bool -should_warn_for_misleading_indentation (location_t guard_loc, - location_t body_loc, - location_t next_stmt_loc, - enum cpp_ttype next_tok_type) +should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo, + const token_indent_info &body_tinfo, + const token_indent_info &next_tinfo) { + location_t guard_loc = guard_tinfo.locati
[PATCH 2/3] Remove is_first_nonwhitespace_on_line(), instead improve get_visual_column()
This patch removes the function is_first_nonwhitespace_on_line() in favor of augmenting the function get_visual_column() to optionally return the visual column corresponding to the first non-whitespace character on the line. Existing usage of is_first_nonwhitespace_on_line() can be trivially replaced by calling get_visual_column() and comparing *out with *first_nws. The rationale for this change is that in many cases it is better to use the visual column of the first non-whitespace character rather than the visual column of the token. Consider: if (p) { foo (1); } else // GUARD if (q) // BODY foo (2); foo (3); // NEXT Here, with current heuristics, we do not emit a warning because we notice that the visual columns of each token line up ("suggesting" autogenerated code). Yet it is obvious that we should warn here because it misleadingly looks like the foo (3); statement is guarded by the else. If we instead consider the visual column of the first non-whitespace character on the guard line, the columns will not line up thus we will emit the warning. This will be done in the next patch. gcc/c-family/ChangeLog: * c-indentation.c (get_visual_column): Add parameter first_nws, use it. Update comment documenting the function. (is_first_nonwhitespace_on_line): Remove. (should_warn_for_misleading_indentation): Replace usage of of is_first_nonwhitespace_on_line with get_visual_column. --- gcc/c-family/c-indentation.c | 53 1 file changed, 24 insertions(+), 29 deletions(-) diff --git a/gcc/c-family/c-indentation.c b/gcc/c-family/c-indentation.c index 9043f8c..f43aee6 100644 --- a/gcc/c-family/c-indentation.c +++ b/gcc/c-family/c-indentation.c @@ -36,11 +36,16 @@ extern cpp_options *cpp_opts; /* Convert libcpp's notion of a column (a 1-based char count) to the "visual column" (0-based column, respecting tabs), by reading the relevant line. + Returns true if a conversion was possible, writing the result to OUT, - otherwise returns false. */ + otherwise returns false. If FIRST_NWS is not NULL, then write to it + the visual column corresponding to the first non-whitespace character + on the line. */ static bool -get_visual_column (expanded_location exploc, unsigned int *out) +get_visual_column (expanded_location exploc, + unsigned int *out, + unsigned int *first_nws = NULL) { int line_len; const char *line = location_get_source_line (exploc, &line_len); @@ -50,6 +55,13 @@ get_visual_column (expanded_location exploc, unsigned int *out) for (int i = 1; i < exploc.column; i++) { unsigned char ch = line[i - 1]; + + if (first_nws != NULL && !ISSPACE (ch)) + { + *first_nws = vis_column; + first_nws = NULL; + } + if (ch == '\t') { /* Round up to nearest tab stop. */ @@ -60,36 +72,13 @@ get_visual_column (expanded_location exploc, unsigned int *out) vis_column++; } + if (first_nws != NULL) +*first_nws = vis_column; + *out = vis_column; return true; } -/* Is the token at LOC the first non-whitespace on its line? - Helper function for should_warn_for_misleading_indentation. */ - -static bool -is_first_nonwhitespace_on_line (expanded_location exploc) -{ - int line_len; - const char *line = location_get_source_line (exploc, &line_len); - - /* If we can't determine it, return false so that we don't issue a - warning. This is sometimes the case for input files - containing #line directives, and these are often for autogenerated - sources (e.g. from .md files), where it's not clear that it's - meaningful to look at indentation. 
*/ - if (!line) -return false; - - for (int i = 1; i < exploc.column; i++) -{ - unsigned char ch = line[i - 1]; - if (!ISSPACE (ch)) - return false; -} - return true; -} - /* Does the given source line appear to contain a #if directive? (or #ifdef/#ifndef). Ignore the possibility of it being inside a comment, for simplicity. @@ -282,9 +271,15 @@ should_warn_for_misleading_indentation (const token_indent_info &guard_tinfo, /* They're all on the same line. */ gcc_assert (guard_exploc.file == next_stmt_exploc.file); gcc_assert (guard_exploc.line == next_stmt_exploc.line); + unsigned int guard_vis_column; + unsigned int guard_line_first_nws; + if (!get_visual_column (guard_exploc, + &guard_vis_column, + &guard_line_first_nws)) + return false; /* Heuristic: only warn if the guard is the first thing on its line. */ - if (is_first_nonwhitespace_on_line (guard_exploc)) + if (guard_vis_column == guard_line_first_nws) return true; } } -- 2.4.3.368.g7974889.dirty
[PATCH] PR 66474, Document using %x for VSX registers on PowerPC
A user pointed out that we never documented the requirement to use %x in the ouptut template when VSX registers are used. This patch adds the necessary documentation. Is it ok to install in the trunk and the open release branches? 2015-06-09 Michael Meissner PR target/66474 * doc/md.texi (Machine Constraints): Document that on the PowerPC if you use a constraint that targets a VSX register, you must use %x in the template. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797 Index: gcc/doc/md.texi === --- gcc/doc/md.texi (revision 224165) +++ gcc/doc/md.texi (working copy) @@ -3070,6 +3070,26 @@ Altivec vector register @item wa Any VSX register if the -mvsx option was used or NO_REGS. +When using any of the register constraints (@code{wa}, @code{wd}, +@code{wf}, @code{wg}, @code{wh}, @code{wi}, @code{wj}, @code{wk}, +@code{wl}, @code{wm}, @code{ws}, @code{wt}, @code{wu}, @code{wv}, +@code{ww}, or @code{wy}) that take VSX registers, you must use +@code{%x} in the template so that the correct register is used. +Otherwise, the register number will be incorrect if an Altivec +register is used in a place where a VSX register is expected. + +@smallexample +asm ("xvadddp %x0,%x1,%x2" "=wa" (v1) : "wa" (v2), "wa" (v3)); +@end smallexample + +is correct, but: + +@smallexample +asm ("xvadddp %0,%1,%2" "=wa" (v1) : "wa" (v2), "wa" (v3)); +@end smallexample + +is not correct. + @item wd VSX vector register to hold vector double data or NO_REGS.
Re: [PR64164] drop copyrename, integrate into expand
On Jun 9, 2015, David Edelsohn wrote: > This also broke bootstrap on PPC64 LE Linux with the same error. Thanks for your reports. I'm looking into the problem. I'd appreciate a preprocessed testcase from either of you to confirm the fix, if not to help debug it. Thanks in advance, -- Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist | Red Hat Brasil GNU Toolchain Engineer
Re: [gomp4.1] Initial support for some OpenMP 4.1 construct parsing
On Wed, Apr 29, 2015 at 14:06:44 +0200, Jakub Jelinek wrote: > [...] The draft requires only alloc or to > (or always, variants) for enter data and only from or delete (or always, > variants) for exit data, so in theory it is possible to figure that from > the call without extra args, but not so for update - enter data is supposed > to increment reference counts, exit data decrement. [...] TR3.pdf also mentions a 'release' map-type for exit data, but it is not described in the document. > [...] And, we'll need new > arguments for async (pass through info that it is async (nowait) or sync, > and the depend clause address(es)). Given "If a depend clause is present, then it is treated as if it had appeared on the implicit task construct that encloses the target construct", I don't quite understand whether #pragma omp target depend(inout: x) is equivalent to #pragma omp task depend(inout: x) #pragma omp target or not. In other words, can't we just generate GOMP_task (...) with GOMP_target (...) inside, without any new arguments? Thanks, -- Ilya
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 9, 2015 at 6:21 PM, Jakub Jelinek wrote: > On Tue, Jun 09, 2015 at 06:16:32PM +0200, Uros Bizjak wrote: >> > something? Would it be acceptable to just guard the changes in the patch >> > with !TARGET_X32 and let H.J. deal with that target? I'm afraid I'm lost >> > when to ZERO_EXTEND addr (if needed at all), etc. >> >> If you wish, I can take your patch and take if further. -mx32 is a >> delicate beast... > > If you could, it would be appreciated, I'm quite busy with OpenMP 4.1 stuff > now. > Note that for -m64/-mx32 it will be much harder to create a reproducer, > because to trigger the bug one has to convince the register allocator > to allocate the lhs of the load in certain registers (not that hard), > but also the index register (to be scaled, also not that hard) and > also the register holding the tls symbol immediate. Wonder if one has to > keep all but the two registers live across the load or something similar. Please find attach a patch that takes your idea slightly further. We find perhaps zero-extended UNSPEC_TP, and copy it for further use. At its place, we simply slap const0_rtx. We know that address to multi-word values has to be offsettable, which in case of x32 means that it is NOT zero-extended address. Uros. Index: config/i386/i386.c === --- config/i386/i386.c (revision 224292) +++ config/i386/i386.c (working copy) @@ -22858,7 +22858,7 @@ ix86_split_long_move (rtx operands[]) Do an lea to the last part and use only one colliding move. */ else if (collisions > 1) { - rtx base; + rtx base, addr, tls_base = NULL_RTX; collisions = 1; @@ -22869,10 +22869,52 @@ ix86_split_long_move (rtx operands[]) if (GET_MODE (base) != Pmode) base = gen_rtx_REG (Pmode, REGNO (base)); - emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0))); + addr = XEXP (part[1][0], 0); + if (TARGET_TLS_DIRECT_SEG_REFS) + { + struct ix86_address parts; + int ok = ix86_decompose_address (addr, &parts); + gcc_assert (ok); + if (parts.seg != SEG_DEFAULT) + { + /* It is not valid to use %gs: or %fs: in +lea though, so we need to remove it from the +address used for lea and add it to each individual +memory loads instead. */ + rtx *x = &addr; + while (GET_CODE (*x) == PLUS) +{ + for (i = 0; i < 2; i++) + { + rtx op = XEXP (*x, i); + if ((GET_CODE (op) == UNSPEC +&& XINT (op, 1) == UNSPEC_TP) + || (GET_CODE (op) == ZERO_EXTEND + && GET_CODE (XEXP (op, 0)) == UNSPEC + && (XINT (XEXP (op, 0), 1) + == UNSPEC_TP))) + { + tls_base = XEXP (*x, i); + XEXP (*x, i) = const0_rtx; + break; + } + } + + if (tls_base) + break; + x = &XEXP (*x, 0); + } + gcc_assert (tls_base); + } + } + emit_insn (gen_rtx_SET (base, addr)); + if (tls_base) + base = gen_rtx_PLUS (GET_MODE (base), base, tls_base); part[1][0] = replace_equiv_address (part[1][0], base); for (i = 1; i < nparts; i++) { + if (tls_base) + base = copy_rtx (base); tmp = plus_constant (Pmode, base, UNITS_PER_WORD * i); part[1][i] = replace_equiv_address (part[1][i], tmp); }
Re: [PATCH] PR 66474, Document using %x for VSX registers on PowerPC
On Tue, Jun 09, 2015 at 02:00:48PM -0400, Michael Meissner wrote: > +asm ("xvadddp %x0,%x1,%x2" "=wa" (v1) : "wa" (v2), "wa" (v3)); A colon went missing? ^^^ Segher
Re: [PATCH] PR 66474, Document using %x for VSX registers on PowerPC
On Tue, Jun 09, 2015 at 02:17:19PM -0500, Segher Boessenkool wrote: > On Tue, Jun 09, 2015 at 02:00:48PM -0400, Michael Meissner wrote: > > +asm ("xvadddp %x0,%x1,%x2" "=wa" (v1) : "wa" (v2), "wa" (v3)); > > A colon went missing? ^^^ Yes, I will correct it when I check it in. Thanks. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
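For reference, a self-contained form of the corrected example (a sketch assuming vector double operands and a VSX-enabled PowerPC target):

#include <altivec.h>

vector double
vsx_add (vector double v2, vector double v3)
{
  vector double v1;
  /* %x prints the full VSX register number for operands using the
     "wa" (any VSX register) constraint.  */
  __asm__ ("xvadddp %x0,%x1,%x2" : "=wa" (v1) : "wa" (v2), "wa" (v3));
  return v1;
}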
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 09, 2015 at 08:09:28PM +0200, Uros Bizjak wrote: > Please find attach a patch that takes your idea slightly further. We > find perhaps zero-extended UNSPEC_TP, and copy it for further use. At > its place, we simply slap const0_rtx. We know that address to Is that safe? I mean, the address, even if offsetable, can have some immediate already (seems e.g. the offsettable_memref_p predicate just checks you can plus_constant some small integer and be recognized again) and if you turn the %gs: into a const0_rtx, it would fail next decompose. And when you already have the PLUS which has UNSPEC_TP as one of its arguments, replacing that PLUS with the other argument is IMHO very easy. Perhaps you are right that there is no need to copy_rtx, supposedly the rtx shouldn't be shared with anything and thus can be modified in place. If -mx32 is a non-issue here, then perhaps my initial patch is good enough? > Index: config/i386/i386.c > === > --- config/i386/i386.c(revision 224292) > +++ config/i386/i386.c(working copy) > @@ -22858,7 +22858,7 @@ ix86_split_long_move (rtx operands[]) >Do an lea to the last part and use only one colliding move. */ >else if (collisions > 1) > { > - rtx base; > + rtx base, addr, tls_base = NULL_RTX; > > collisions = 1; > > @@ -22869,10 +22869,52 @@ ix86_split_long_move (rtx operands[]) > if (GET_MODE (base) != Pmode) > base = gen_rtx_REG (Pmode, REGNO (base)); > > - emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0))); > + addr = XEXP (part[1][0], 0); > + if (TARGET_TLS_DIRECT_SEG_REFS) > + { > + struct ix86_address parts; > + int ok = ix86_decompose_address (addr, &parts); > + gcc_assert (ok); > + if (parts.seg != SEG_DEFAULT) > + { > + /* It is not valid to use %gs: or %fs: in > + lea though, so we need to remove it from the > + address used for lea and add it to each individual > + memory loads instead. */ > + rtx *x = &addr; > + while (GET_CODE (*x) == PLUS) > +{ > + for (i = 0; i < 2; i++) > + { > + rtx op = XEXP (*x, i); > + if ((GET_CODE (op) == UNSPEC > + && XINT (op, 1) == UNSPEC_TP) > + || (GET_CODE (op) == ZERO_EXTEND > + && GET_CODE (XEXP (op, 0)) == UNSPEC > + && (XINT (XEXP (op, 0), 1) > + == UNSPEC_TP))) > + { > + tls_base = XEXP (*x, i); > + XEXP (*x, i) = const0_rtx; > + break; > + } > + } > + > + if (tls_base) > + break; > + x = &XEXP (*x, 0); > + } > + gcc_assert (tls_base); > + } > + } > + emit_insn (gen_rtx_SET (base, addr)); > + if (tls_base) > + base = gen_rtx_PLUS (GET_MODE (base), base, tls_base); > part[1][0] = replace_equiv_address (part[1][0], base); > for (i = 1; i < nparts; i++) > { > + if (tls_base) > + base = copy_rtx (base); > tmp = plus_constant (Pmode, base, UNITS_PER_WORD * i); > part[1][i] = replace_equiv_address (part[1][i], tmp); > } Jakub
Re: conditional lim
On Tue, Jun 9, 2015 at 3:46 PM, Richard Biener wrote: > On Fri, May 29, 2015 at 3:14 PM, Evgeniya Maenkova > wrote: >> Hi Richard, >> >> Here is some explanation. I hope you let me know if I need to clarify >> something. >> >> Also, you asked me about concrete example, to make sure you don’t miss >> my answer here is the link: >> https://gcc.gnu.org/ml/gcc-patches/2015-05/msg02417.html. >> >> Also, I doubt whether it’s convenient for you to create a build with >> my patch or not. May be to clarify things you could send me some >> examples/concrete cases, then I’ll compile them with >> –fdump-tree-loopinit-details and –fdump-tree-lim-details and send you >> these dumps. May be these dumps will be useful. (I’ll only disable >> cleanup_cfg TODO after lim to let you know the exact picture after >> lim). >> >> What do you think? >> >> 1. invariantness _dom_walker – >> >> 1.1 for each GIMPLE_COND in given bb calls handle_cond_stmt to call >> for true and false edges handle_branch_edge, which calls SET_TARGET_OF >> for all bb ‘predicated’ by given GIMPLE_COND. >> >> SET_TARGET_OF sets in basic_blocks aux 2 facts: >> >> a) this is true or false edge; >> >> b) link to cond stmt; >> >> Handle_branch_edge works this way: >> >> If (cond1) >> >> { >> >> bb1; >> >> if (cond2} >> >>{ >> >>bb2; >> >> } >> >>Being called for cond1, it sets cond1 as condition for both bb1 and >> bb2 (the whole branch for cond1, ie also for bb containing cond2), >> then this method will be called (as there is dominance order) for >> cond2 to correct things (ie to set cond2 as condition for bb2). > > Hmm, why not track the current condition as state during the DOM walk > and thus avoid processing more than one basic-block in handle_branch_edge? > Thus get rid of handle_branch_edge and instead do everything in > handle_cond_stmt > plus the dom-walkers BB visitor? > I need to look more carefully how to implement it, but I think I understand what you mean and this optimization of course looks reasonable to me. Will do. > I see you don't handle BBs with multiple predecessors - that's ok, but > are you sure you don't run into correctness issues when not marking such > BBs as predicated? This misses handling of, say > > if (a || b) >bb; > > which is a pity (but can be fixed later if desired). > I had some test (in gcc testsuite or bootstrap build) which worked incorrectly because of multiple predecessors. As far as I remember the situation was (further, will make some notes about such tests to clarify this better), I mean with previous version of my code which handled bb with 2 predecessors: if (a) tmpvar=something; while() if (a || b) basic_block {do something with tmpvar;} // I mean basic block predicated by bb with a and bb with b So, if a is false, I mean we didn't do tmpvar=something (outside loop), BUT we are in basick_block (we went by bb with b), we got Segmentation falt in basic_block {do something with tmpvar;}. I think we can reproduce all the details of this test if I remove not handling bb with 2 predecessors. So I wouldn't move bb with 2 predecessors (this is not always executed bb in any loop, not conditionally, they will not be moved at all). This is my more detail explanation on this point. Perhaps, I didn't understand your question about correctness. Could you repeat it in other words (based on this new clarification). So I think according to current code it will not be moved. What incorrectness do you mean? 
> I note that collecting predicates has similarities to what if-conversion > does in tree-ifcvt.c (even if its implementation is completely different, > of course). > Ok, I'll look at this. But could you please clarify your point? (Should I just take this into account with low priority and look at this later or you want some refactoring?) >> 1.2 As 1.1 goes we identify whether some bb is predicated by some >> condition or not. >> >> bb->aux->type will be [TRUE/FALSE]_TARGET_OF and >> bb->aux->cond_stmt=cond stmt (the nearest condition). >> >> If bb is always executed bb->aux->type = ALWAYS_EXECUTED_IN, >> bb->loop->loop (this info was available in the clean build). >> >> 1.3 As this walker is called in dominance order, information about >> condition is available when invariantness_dom_walker is called for >> given bb. So we can make some computations based on bb->aux >> structure. This is done in check_conditionally_executed. The main goal >> of this method is to check that the root condition is always executed >> in the loop. I did so to avoid situation like this >> >> Loop: >> >>Jmp somewhere; >> >> If (cond1) >> >> If (cond2) >> >> Etc >> >> By ‘root condition’ I mean cond1 in this case (the first cond in >> dominance order). > > Ok, but this can't happen (an uncontional jump to somehwere). It > will always be a conditional jump (a loop exit) and thus should > be handled already. > > Did you have a testcase
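A minimal C sketch of the problematic shape described above (all identifiers hypothetical): the block inside the loop has two predecessors, so it must not be treated as predicated by a alone.

extern int *get_buf (void);
extern void use (int);

void
f (int a, int b, int n)
{
  int *tmpvar = 0;

  if (a)
    tmpvar = get_buf ();   /* hoisted computation, guarded by a only */

  for (int i = 0; i < n; i++)
    if (a || b)            /* block with two predecessors */
      use (tmpvar[i]);     /* reached via the b edge even when a is false,
                              so tmpvar is still null and this faults */
}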
Re: [PR64164] drop copyrename, integrate into expand
On Jun 9, 2015, Alexandre Oliva wrote: > On Jun 9, 2015, David Edelsohn wrote: >> This also broke bootstrap on PPC64 LE Linux with the same error. > Thanks for your reports. I'm looking into the problem. > I'd appreciate a preprocessed testcase from either of you to confirm the > fix, if not to help debug it. The first potential source for this problem that jumped at me would be silenced with this change: diff --git a/gcc/function.c b/gcc/function.c index 8bcc352..9201ed9 100644 --- a/gcc/function.c +++ b/gcc/function.c @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all *all, stack_parm = copy_rtx (stack_parm); if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) PUT_MODE (stack_parm, GET_MODE (entry_parm)); - set_mem_attributes (stack_parm, parm, 1); + if (GET_CODE (stack_parm) == MEM) + set_mem_attributes (stack_parm, parm, 1); } /* If a BLKmode arrives in registers, copy it to a stack slot. Handle but I suspect there might be other similar issues lurking in function.c after my attempt to turn parm assignment upside down ;-) (namely, it used to assume it could pick stack slots and pseudos in a whim, but after this change it must give way to out-of-SSA's partition assignments.) I'll look into cross-building some embedded targets and see if any further issues surface. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [gomp4.1] Initial support for some OpenMP 4.1 construct parsing
On Tue, Jun 09, 2015 at 09:36:08PM +0300, Ilya Verbin wrote: > On Wed, Apr 29, 2015 at 14:06:44 +0200, Jakub Jelinek wrote: > > [...] The draft requires only alloc or to > > (or always, variants) for enter data and only from or delete (or always, > > variants) for exit data, so in theory it is possible to figure that from > > the call without extra args, but not so for update - enter data is supposed > > to increment reference counts, exit data decrement. [...] > > TR3.pdf also says about 'release' map-type for exit data, but it is not > described in the document. Seems that is the case even in the latest draft. Can you file a ticket for this or discuss on omp-lang? Or should I? > > [...] And, we'll need new > > arguments for async (pass through info that it is async (nowait) or sync, > > and the depend clause address(es)). > > I don't quite understand from "If a depend clause is present, then it is > treated > as if it had appeared on the implicit task construct that encloses the target > construct", is > > #pragma omp target depend(inout: x) > > equivalent to > > #pragma omp task depend(inout: x) > #pragma omp target > > or not? > > In other words, can't we just generate GOMP_task (...) with GOMP_target (...) > inside, without any new arguments? No, that would be an explicit task. Furthermore, the implicit task isn't on the host side, but on the offloading device side. The implicit task is what is executed when you enter the #pragma omp target. Ignoring the teams construct which is there mainly for NVidia GPGPUs, when you enter the #pragma omp target construct, there is an implicit parallel with num_threads(1) (like there is an implicit parallel with num_threads(1) when you enter main () of a host program), and that implicit parallel has a single implicit task, which executes the statements inside of #pragma omp target body, until you encounter #pragma omp teams or #pragma omp parallel. And the above statement simply says that no statements from the #pragma omp target body are executed until the depend dependency is satisfied. Whether these dependencies are host addresses, or offloading device addresses, is something that really needs to be figured out, I admit I haven't read the whole async offloading text carefully yet, nor participated in the telecons about it. Jakub
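To make the distinction concrete, a sketch of the two forms being contrasted (OpenMP 4.1 draft syntax; the map clauses are added here only to make the example self-contained):

int x;

void
example (void)
{
  /* The depend clause applies to the implicit task that executes the
     target region: nothing in the region runs until the dependence on
     x is satisfied, but no explicit host task is created.  */
  #pragma omp target map(tofrom: x) depend(inout: x)
  x++;

  /* An explicit host task wrapping the target region; as explained
     above, this is not equivalent to the form above.  */
  #pragma omp task depend(inout: x)
  {
    #pragma omp target map(tofrom: x)
    x++;
  }
}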
Re: [PR64164] drop copyrename, integrate into expand
On Tue, Jun 09, 2015 at 05:11:45PM -0300, Alexandre Oliva wrote: > On Jun 9, 2015, Alexandre Oliva wrote: > > > On Jun 9, 2015, David Edelsohn wrote: > >> This also broke bootstrap on PPC64 LE Linux with the same error. > > > Thanks for your reports. I'm looking into the problem. > > > I'd appreciate a preprocessed testcase from either of you to confirm the > > fix, if not to help debug it. > > The first potential source for this problem that jumped at me would be > silenced with this change: > > diff --git a/gcc/function.c b/gcc/function.c > index 8bcc352..9201ed9 100644 > --- a/gcc/function.c > +++ b/gcc/function.c > @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all > *all, > stack_parm = copy_rtx (stack_parm); >if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) > PUT_MODE (stack_parm, GET_MODE (entry_parm)); > - set_mem_attributes (stack_parm, parm, 1); > + if (GET_CODE (stack_parm) == MEM) FYI, this is preferrably if (MEM_P (stack_parm)) these days. > + set_mem_attributes (stack_parm, parm, 1); > } Jakub
Re: [patch] [jit] Install headers using INSTALL_DATA
On Tue, 2015-06-09 at 18:19 +0200, Matthias Klose wrote: > Install the gcc/jit header files without the x bit set. Ok for the trunk and > the > 5 branch? OK.
Re: [PR64164] drop copyrename, integrate into expand
> I'll look into cross-building some embedded targets and see if any > further issues surface. SPARC is also broken, see my message and the testcase under the PR. -- Eric Botcazou
Re: [PR64164] drop copyrename, integrate into expand
Alex, I sent you a pre-processed file off-list. You could try bootstrapping on PPC64 on the GCC Compile Farm. The SPARC failure reports a different error than PPC and ARM. The PPC and ARM failures are the same message, but seem to be on different files. After the breakage from Aldy's patch all weekend, this failure is very frustrating. If this is not fixed within 24 hours, your patch must be reverted. This patch clearly should have been tested on more architectures than x86 before being approved and merged. Thanks, David On Tue, Jun 9, 2015 at 5:27 PM, Eric Botcazou wrote: >> I'll look into cross-building some embedded targets and see if any >> further issues surface. > > SPARC is also broken, see my message and the tescase under the PR. > > -- > Eric Botcazou
[PATCH][PR debug/65549] Restore DW_AT_abstract_origin for cross-unit call sites
Hello, With the recent work for PR debug/65549, we observed a regression in the generated debugging information. Take the attached reproducer, build it and look at the output DWARF: $ gcc -c -O1 -g foo.c $ objdump --dwarf=info foo.o [...] <2><4e>: Abbrev Number: 3 (DW_TAG_GNU_call_site) <4f> DW_AT_low_pc : 0x9 <2><57>: Abbrev Number: 0 There is no DW_AT_abstract_origin attribute anymore, whereas there used to be one. On PowerPC with -mlongcall, for instance, call instructions are indirect and we (at AdaCore) rely on this attribute to determine what function is actually called (it's easier this way than interpreting machine code...). The DWARF proposal for call sites also say debuggers should be able to use this attribute (to build virtual call stacks in the presence of tail calls), but I could not observe this in practice with GDB. The attached patch is an attempt to restore this attribute. The logic behind it is: this is what we used to do previously for contexts that are alien translation unit (through the call to force_decl_die), so this should not harm. Bootstrapped and regtested on x86_64-linux. Ok to commit? Thank you in advance! gcc/ChangeLog * dwarf2out.c (lookup_context_die): Return the DIE for the current compilation unit when the input context is a TRANSLATION_UNIT_DECL. -- Pierre-Marie de Rodat extern int bar(int a); int foo(int a) { a += 1; return a; } int main(void) { bar(1); } >From b2ededa4a5eac77fc0d5d95011de935a1dff7eba Mon Sep 17 00:00:00 2001 From: Pierre-Marie de Rodat Date: Mon, 8 Jun 2015 16:27:04 +0200 Subject: [PATCH] DWARF: restore DW_AT_abstract_origin for cross-unit call sites gcc/ChangeLog * dwarf2out.c (lookup_context_die): Return the DIE for the current compilation unit when the input context is a TRANSLATION_UNIT_DECL. --- gcc/dwarf2out.c | 25 - 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index ee2bcb1..4e204df 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -21053,21 +21053,20 @@ is_naming_typedef_decl (const_tree decl) static inline dw_die_ref lookup_context_die (tree context) { - if (context) + if (context == NULL_TREE || TREE_CODE (context) == TRANSLATION_UNIT_DECL) +return comp_unit_die (); + + /* Find die that represents this context. */ + if (TYPE_P (context)) { - /* Find die that represents this context. */ - if (TYPE_P (context)) - { - context = TYPE_MAIN_VARIANT (context); - dw_die_ref ctx = lookup_type_die (context); - if (!ctx) - return NULL; - return strip_naming_typedef (context, ctx); - } - else - return lookup_decl_die (context); + context = TYPE_MAIN_VARIANT (context); + dw_die_ref ctx = lookup_type_die (context); + if (!ctx) + return NULL; + return strip_naming_typedef (context, ctx); } - return comp_unit_die (); + else +return lookup_decl_die (context); } /* Returns the DIE for a context. */ -- 1.7.10.4
Add extra C/Fortran interoperability testcase
Hi, I constructed this testcase as I anticipated a bogus warning. It does not happen because C_FUNPTR is function pointer and useless_type_conversion does not look into function signature, but it is still useful testcase to have. Once we start to do TBAA for pointers, we will end up in need to glob C_FUNPTR with void * to make it universal. regtested ppc64le-linux, will commit it shortly. Honza * gfortran.dg/lto/bind_c-5_0.f90: New testcase. * gfortran.dg/lto/bind_c-5_1.c: New testcase. Index: bind_c-5_0.f90 === --- bind_c-5_0.f90 (revision 0) +++ bind_c-5_0.f90 (working copy) @@ -0,0 +1,17 @@ +! { dg-lto-do run } +! { dg-lto-options {{ -O3 -flto }} } +! This testcase will abort if C_FUNPTR is not interoperable with both int * +! and float * +module lto_type_merge_test + use, intrinsic :: iso_c_binding + implicit none + + type(c_funptr), bind(c, name="myVar") :: myVar + type(c_funptr), bind(c, name="myVar2") :: myVar2 + +contains + subroutine types_test() bind(c) +myVar = myVar2 + end subroutine types_test +end module lto_type_merge_test + Index: bind_c-5_1.c === --- bind_c-5_1.c(revision 0) +++ bind_c-5_1.c(working copy) @@ -0,0 +1,31 @@ +#include +/* declared in the fortran module */ +extern int (*myVar) (int); +extern float (*myVar2) (float); +void types_test(void); + + +extern void abort(void); + +int main(int argc, char **argv) +{ + int (**myptr) (int); + float (**myptr2) (float); + asm("":"=r"(myptr):"0"(&myVar)); + asm("":"=r"(myptr2):"0"(&myVar2)); + *myptr = (int (*) (int)) (size_t) (void *)1; + *myptr2 = (float (*) (float)) (size_t) (void *)2; + types_test(); + if (*myptr != (int (*) (int)) (size_t) (void *)2) + abort (); + if (*myptr2 != (float (*) (float)) (size_t) (void *)2) + abort (); + *myptr2 = (float (*) (float)) (size_t) (void *)3; + types_test(); + if (*myptr != (int (*) (int)) (size_t) (void *)3) + abort (); + if (*myptr2 != (float (*) (float)) (size_t) (void *)3) + abort (); + return 0; +} +
[PATCH v2] xtensa: implement trap pattern
gcc/ * config/xtensa/xtensa.h (TARGET_DEBUG): New definition. * config/xtensa/xtensa.md (define_attr "type"): New type "trap". (define_insn "trap"): New definition. --- Changes v1->v2: - drop break.n, replace break 0, 0 with break 1, 15, coded breakpoint that transfers control to debugger if present. gcc/config/xtensa/xtensa.h | 1 + gcc/config/xtensa/xtensa.md | 15 ++- 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/gcc/config/xtensa/xtensa.h b/gcc/config/xtensa/xtensa.h index 011411c..584080b 100644 --- a/gcc/config/xtensa/xtensa.h +++ b/gcc/config/xtensa/xtensa.h @@ -67,6 +67,7 @@ extern unsigned xtensa_current_frame_size; #define TARGET_THREADPTR XCHAL_HAVE_THREADPTR #define TARGET_LOOPS XCHAL_HAVE_LOOPS #define TARGET_WINDOWED_ABI(XSHAL_ABI == XTHAL_ABI_WINDOWED) +#define TARGET_DEBUG XCHAL_HAVE_DEBUG #define TARGET_DEFAULT \ ((XCHAL_HAVE_L32R? 0 : MASK_CONST16) | \ diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md index 6d84384..a577aa3 100644 --- a/gcc/config/xtensa/xtensa.md +++ b/gcc/config/xtensa/xtensa.md @@ -86,7 +86,7 @@ ;; Attributes. (define_attr "type" - "unknown,jump,call,load,store,move,arith,multi,nop,farith,fmadd,fconv,fload,fstore,mul16,mul32,div32,mac16,rsr,wsr,entry" + "unknown,jump,call,load,store,move,arith,multi,nop,farith,fmadd,fconv,fload,fstore,mul16,mul32,div32,mac16,rsr,wsr,entry,trap" (const_string "unknown")) (define_attr "mode" @@ -1764,6 +1764,19 @@ [(set_attr "length" "0") (set_attr "type" "nop")]) +(define_insn "trap" + [(trap_if (const_int 1) (const_int 0))] + "" +{ + if (TARGET_DEBUG) +return "break\t1, 15"; + else +return (TARGET_DENSITY ? "ill.n" : "ill"); +} + [(set_attr "type""trap") + (set_attr "mode""none") + (set_attr "length" "3")]) + ;; Setting up a frame pointer is tricky for Xtensa because GCC doesn't ;; know if a frame pointer is required until the reload pass, and ;; because there may be an incoming argument value in the hard frame -- 1.8.1.4
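For illustration, a minimal C function that exercises the new pattern; whether break 1, 15 or ill/ill.n is emitted depends on the configured core options, as above.

/* __builtin_trap expands through the "trap" insn pattern when the
   target provides one.  */
void
force_trap (void)
{
  __builtin_trap ();
}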
Re: [PING^2][PATCH][3/3][PR65460] Mark offloaded functions as parallelized
Hi Tom! On Tue, 9 Jun 2015 16:12:12 +0200, Tom de Vries wrote: > On 09/06/15 13:07, Richard Biener wrote: > > On Mon, 8 Jun 2015, Tom de Vries wrote: > > > >> On 17/04/15 12:08, Tom de Vries wrote: > >>> On 20-03-15 12:38, Tom de Vries wrote: > On 19-03-15 12:05, Tom de Vries wrote: > > On 18-03-15 18:22, Tom de Vries wrote: > >> this patch fixes PR65460. > >> > >> The patch marks offloaded functions as parallelized, which means the > >> parloops > >> pass no longer attempts to modify that function. > > Ok, but shouldn't it be set before calling add_new_function as > > add_new_function might run passes that wouldn't identify the > > function as parallelized? > > Hm, indeed sometimes add_new_function executes some passes itself, > besides queueing the function for further processing. I suppose the > existing settings of parallelized_function should be modified in a > similar way. (I took note of this small change of the trunk patch vs. the version present on gomp-4_0-branch, and will adapt the latter as required as part of the next merge.) > I'll bootstrap and reg-test attached two patches on x86_64, and commit > unless objections. > Mark offloaded functions as parallelized > > 2015-06-09 Tom de Vries > > PR tree-optimization/65460 > * omp-low.c (expand_omp_target): Set parallelized_function on > cgraph_node for child_fn. (Committed to trunk in r224303.) > --- a/gcc/omp-low.c > +++ b/gcc/omp-low.c > @@ -8959,6 +8959,8 @@ expand_omp_target (struct omp_region *region) > >/* Inform the callgraph about the new function. */ >DECL_STRUCT_FUNCTION (child_fn)->curr_properties = > cfun->curr_properties; > + cgraph_node *node = cgraph_node::get_create (child_fn); Would you please committ a fix: in offloading configurations (ENABLE_OFFLOADING), this new node variable will clash with an existing definition a little later: > + node->parallelized_function = 1; >cgraph_node::add_new_function (child_fn, true); > > #ifdef ENABLE_OFFLOADING |/* Add the new function to the offload table. */ |vec_safe_push (offload_funcs, child_fn); | #endif | |/* Fix the callgraph edges for child_cfun. Those for cfun will be |fixed in a following pass. */ |push_cfun (child_cfun); |cgraph_edge::rebuild_edges (); | | #ifdef ENABLE_OFFLOADING |/* Prevent IPA from removing child_fn as unreachable, since there are no |refs from the parent function to child_fn in offload LTO mode. */ |struct cgraph_node *node = cgraph_node::get (child_fn); |node->mark_force_output (); | #endif Grüße, Thomas signature.asc Description: PGP signature
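A sketch of one way to resolve the clash (not necessarily the fix that will be committed): obtain the node once and reuse it for both the parallelized_function flag and the later mark_force_output call.

  /* Inform the callgraph about the new function.  */
  DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
  cgraph_node *node = cgraph_node::get_create (child_fn);
  node->parallelized_function = 1;
  cgraph_node::add_new_function (child_fn, true);

#ifdef ENABLE_OFFLOADING
  /* Add the new function to the offload table.  */
  vec_safe_push (offload_funcs, child_fn);
#endif

  /* ... push_cfun / rebuild_edges unchanged ... */

#ifdef ENABLE_OFFLOADING
  /* Prevent IPA from removing child_fn as unreachable; reuse NODE
     instead of redeclaring it here.  */
  node->mark_force_output ();
#endif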
[PATCH, Google] Notify df framework when removing an insn in simplify-got
Hi I forgot to notify df framework when I removed an insn, it caused df verification failure described in google bug b/16155462. The following patch passed regression test on arm qemu in both thumb and arm modes. OK for google 4.9 branch? Index: simplify-got.c === --- simplify-got.c (revision 224174) +++ simplify-got.c (working copy) @@ -169,7 +169,10 @@ /* Since there is no usage of pic_reg now, we can remove it. */ if (use) -remove_insn (use); +{ + df_insn_delete (use); + remove_insn (use); +} targetm.got_access.clear_pic_reg (); free (got_accesses); htab_delete (var_table);
Re: [PATCH, Google] Notify df framework when removing an insn in simplify-got
ok. David On Tue, Jun 9, 2015 at 4:46 PM, Carrot Wei wrote: > Hi > > I forgot to notify df framework when I removed an insn, it caused df > verification failure described in google bug b/16155462. > > The following patch passed regression test on arm qemu in both thumb > and arm modes. > OK for google 4.9 branch? > > > Index: simplify-got.c > === > --- simplify-got.c (revision 224174) > +++ simplify-got.c (working copy) > @@ -169,7 +169,10 @@ > >/* Since there is no usage of pic_reg now, we can remove it. */ >if (use) > -remove_insn (use); > +{ > + df_insn_delete (use); > + remove_insn (use); > +} >targetm.got_access.clear_pic_reg (); >free (got_accesses); >htab_delete (var_table);
Re: [PR64164] drop copyrename, integrate into expand
On Jun 5, 2015, Alexandre Oliva wrote: > On Apr 27, 2015, Richard Biener wrote: >>> +/* Return the promoted mode for name. If it is a named SSA_NAME, it >>> + is the same as promote_decl_mode. Otherwise, it is the promoted >>> + mode of a temp decl of same type as the SSA_NAME, if we had created >>> + one. */ >>> + >>> +machine_mode >>> +promote_ssa_mode (const_tree name, int *punsignedp) >>> +{ >>> + gcc_assert (TREE_CODE (name) == SSA_NAME); >>> + >>> + if (SSA_NAME_VAR (name)) >>> +return promote_decl_mode (SSA_NAME_VAR (name), punsignedp); >> As above I'd rather not have different paths for anonymous vs. non-anonymous >> vars (so just delete the above two lines). > Check This caused the sparc regression reported by Eric in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37 We need to match the mode of the rtl created for the partition and the promoted mode expected for the parm. I recall working to make parm and result decls the partition leaders, so that promote_ssa_mode would DTRT, but this escaped my mind when revisiting the patch after some time on another project. So we either restore promote_ssa_mode's check for an underlying decl, at least for PARM_ and RESULT_DECLs, or further massage function.c to deal with the mode difference. Any preference? I'm reverting the patch for now, so that we don't have to rush to a fix on this, and I can have more time to test and fix other arches. It was a terrible mistake to not do so before submitting the final version of the patch, or at least before installing it. I apologize for that. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
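To make the first option concrete, a sketch (untested) of restoring the underlying-decl check in promote_ssa_mode only for parameters and results:

  /* Keep the decl-based promotion for SSA names backed by PARM_DECLs and
     RESULT_DECLs, so the partition's mode matches what function.c expects
     for incoming arguments and return values.  */
  if (SSA_NAME_VAR (name)
      && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
	  || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);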
Re: [PATCH] PR 66474, Document using %x for VSX registers on PowerPC
On Tue, Jun 9, 2015 at 3:26 PM, Michael Meissner wrote: > On Tue, Jun 09, 2015 at 02:17:19PM -0500, Segher Boessenkool wrote: >> On Tue, Jun 09, 2015 at 02:00:48PM -0400, Michael Meissner wrote: >> > +asm ("xvadddp %x0,%x1,%x2" "=wa" (v1) : "wa" (v2), "wa" (v3)); >> >> A colon went missing? ^^^ > > Yes, I will correct it when I check it in. Thanks. Mike, VSX registers are a superset of Altivec registers, so the statement about an Altivec register used where a VSX register is expected is a little confusing. How about: Otherwise the register number output in the assembly file will be incorrect if an Altivec register is an operand of a VSX instruction that expects VSX register numbering. Thanks, David
Re: [PATCH] Emit -Waddress warnings for comparing address of reference against NULL
On Sun, May 3, 2015 at 5:29 PM, Patrick Palka wrote: > On Sun, Apr 26, 2015 at 8:56 PM, Patrick Palka wrote: >> Here is an updated version of the patch with, hopefully, all your >> suggestions made. I decided to add calls to STRIP_NOPS before emitting >> the warning so that we properly warn for cases where there's a cast in >> between the whole thing, e.g. >> >> if (!&(int &)a) >> >> I also added guards to emit the warnings only when the stripped operand >> is actually a decl so that we don't pass a non-decl argument to >> warning_at() which can happen in a case like >> >> if (!&(int &)*(int *)0) >> >> Does this look OK now after testing? >> >> gcc/c-family/ChangeLog: >> >> PR c++/65168 >> * c-common.c (c_common_truthvalue_conversion): Warn when >> converting an address of a reference to a truth value. >> >> gcc/cp/ChangeLog: >> >> PR c++/65168 >> * typeck.c (cp_build_binary_op): Warn when comparing an address >> of a reference against NULL. >> >> gcc/testsuite/ChangeLog: >> >> PR c++/65168 >> g++.dg/warn/Walways-true-3.C: New test. >> --- >> gcc/c-family/c-common.c| 14 ++ >> gcc/cp/typeck.c| 34 ++ >> gcc/testsuite/g++.dg/warn/Walways-true-3.C | 45 >> ++ >> 3 files changed, 93 insertions(+) >> create mode 100644 gcc/testsuite/g++.dg/warn/Walways-true-3.C >> >> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c >> index 9797e17..c353c72 100644 >> --- a/gcc/c-family/c-common.c >> +++ b/gcc/c-family/c-common.c >> @@ -4806,6 +4806,20 @@ c_common_truthvalue_conversion (location_t location, >> tree expr) >> tree totype = TREE_TYPE (expr); >> tree fromtype = TREE_TYPE (TREE_OPERAND (expr, 0)); >> >> + if (POINTER_TYPE_P (totype) >> + && TREE_CODE (fromtype) == REFERENCE_TYPE) >> + { >> + tree inner = expr; >> + STRIP_NOPS (inner); >> + >> + if (DECL_P (inner)) >> + warning_at (location, >> + OPT_Waddress, >> + "the compiler can assume that the address of " >> + "%qD will always evaluate to %", >> + inner); >> + } >> + >> /* Don't cancel the effect of a CONVERT_EXPR from a REFERENCE_TYPE, >>since that affects how `default_conversion' will behave. 
*/ >> if (TREE_CODE (totype) == REFERENCE_TYPE >> diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c >> index 91db32a..13fb401 100644 >> --- a/gcc/cp/typeck.c >> +++ b/gcc/cp/typeck.c >> @@ -4423,6 +4423,23 @@ cp_build_binary_op (location_t location, >> warning (OPT_Waddress, "the address of %qD will never be >> NULL", >> TREE_OPERAND (op0, 0)); >> } >> + >> + if (CONVERT_EXPR_P (op0) >> + && TREE_CODE (TREE_TYPE (TREE_OPERAND (op0, 0))) >> +== REFERENCE_TYPE) >> + { >> + tree inner_op0 = op0; >> + STRIP_NOPS (inner_op0); >> + >> + if ((complain & tf_warning) >> + && c_inhibit_evaluation_warnings == 0 >> + && !TREE_NO_WARNING (op0) >> + && DECL_P (inner_op0)) >> + warning (OPT_Waddress, >> +"the compiler can assume that the address of " >> +"%qD will never be NULL", >> +inner_op0); >> + } >> } >>else if (((code1 == POINTER_TYPE || TYPE_PTRDATAMEM_P (type1)) >> && null_ptr_cst_p (op0)) >> @@ -4445,6 +4462,23 @@ cp_build_binary_op (location_t location, >> warning (OPT_Waddress, "the address of %qD will never be >> NULL", >> TREE_OPERAND (op1, 0)); >> } >> + >> + if (CONVERT_EXPR_P (op1) >> + && TREE_CODE (TREE_TYPE (TREE_OPERAND (op1, 0))) >> +== REFERENCE_TYPE) >> + { >> + tree inner_op1 = op1; >> + STRIP_NOPS (inner_op1); >> + >> + if ((complain & tf_warning) >> + && c_inhibit_evaluation_warnings == 0 >> + && !TREE_NO_WARNING (op1) >> + && DECL_P (inner_op1)) >> + warning (OPT_Waddress, >> +"the compiler can assume that the address of " >> +"%qD will never be NULL", >> +inner_op1); >> + } >> } >>else if ((code0 == POINTER_TYPE && code1 == POINTER_TYPE) >>|| (TYPE_PTRDATAMEM_P (type0) && TYPE_PTRDATAMEM_P (type1))) >> diff --git a/gcc/testsuite/g++.dg/warn/Walways-true-3.C >> b/gcc/testsuite/g++.dg/warn/Walways-true-3.C >> new file mode 100644 >> index 000..0d13d3f >> --- /dev/null >> +++ b/gcc/testsuite/g++.dg/warn/Walways-true-3.C >> @@ -0,0 +1,45 @@ >> +//
RE: backport the fixes of PR target/64011 and /61749 to 4.9 gcc
Updated patches were attached. Rebased on the latest 4.9 branch. Tested on aarch64-linux (big-endian and little-endian) with qemu. OK for 4.9? The first patch: Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 223867) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,11 @@ +2015-06-09 Xiangyu Wei + + Backport from mainline r219717: + 2015-01-15 Jiong Wang + PR rtl-optimization/64011 + * expmed.c (store_bit_field_using_insv): Warn and truncate bitsize when + there is partial overflow. + Index: gcc/expmed.c === --- gcc/expmed.c(revision 223867) +++ gcc/expmed.c(working copy) @@ -539,7 +539,21 @@ store_bit_field_using_insv (const extraction_insn xop0 = tem; copy_back = true; } - + + /* There are similar overflow check at the start of store_bit_field_1, + but that only check the situation where the field lies completely + outside the register, while there do have situation where the field + lies partialy in the register, we need to adjust bitsize for this + partial overflow situation. Without this fix, pr48335-2.c on big-endian + will broken on those arch support bit insert instruction, like arm, aarch64 + etc. */ + if (bitsize + bitnum > unit && bitnum < unit) +{ + warning (OPT_Wextra, "write of "HOST_WIDE_INT_PRINT_UNSIGNED"bit data " + "outside the bound of destination object, data truncated into " + HOST_WIDE_INT_PRINT_UNSIGNED"bit", bitsize, unit - bitnum); + bitsize = unit - bitnum; +} /* If BITS_BIG_ENDIAN is zero on a BYTES_BIG_ENDIAN machine, we count "backwards" from the size of the unit we are inserting into. Otherwise, we count bits from the most significant on a And the second one: Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 223867) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,28 @@ +2015-06-09 Xiangyu Wei + + Backport from mainline r215046: + 2014-09-09 Kyrylo Tkachov + PR target/61749 + * config/aarch64/aarch64-builtins.c (aarch64_types_quadop_qualifiers): + Use qualifier_immediate for last operand. Rename to... + (aarch64_types_ternop_lane_qualifiers): ... This. + (TYPES_QUADOP): Rename to... + (TYPES_TERNOP_LANE): ... This. + (aarch64_simd_expand_args): Return const0_rtx when encountering user + error. Change return of 0 to return of NULL_RTX. + (aarch64_crc32_expand_builtin): Likewise. + (aarch64_expand_builtin): Return NULL_RTX instead of 0. + ICE when expanding unknown builtin. + * config/aarch64/aarch64-simd-builtins.def (sqdmlal_lane): Use + TERNOP_LANE qualifiers. + (sqdmlsl_lane): Likewise. + (sqdmlal_laneq): Likewise. + (sqdmlsl_laneq): Likewise. + (sqdmlal2_lane): Likewise. + (sqdmlsl2_lane): Likewise. + (sqdmlal2_laneq): Likewise. + (sqdmlsl2_laneq): Likewise. + * gcc.target/aarch64/vqdml_lane_intrinsics-bad_1.c: New test. 
Index: gcc/config/aarch64/aarch64-builtins.c === --- gcc/config/aarch64/aarch64-builtins.c (revision 223867) +++ gcc/config/aarch64/aarch64-builtins.c (working copy) @@ -172,10 +172,10 @@ aarch64_types_ternopu_qualifiers[SIMD_MAX_BUILTIN_ #define TYPES_TERNOPU (aarch64_types_ternopu_qualifiers) static enum aarch64_type_qualifiers -aarch64_types_quadop_qualifiers[SIMD_MAX_BUILTIN_ARGS] +aarch64_types_ternop_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_none, qualifier_none, qualifier_none, - qualifier_none, qualifier_none }; -#define TYPES_QUADOP (aarch64_types_quadop_qualifiers) + qualifier_none, qualifier_immediate }; +#define TYPES_TERNOP_LANE (aarch64_types_ternop_lane_qualifiers) static enum aarch64_type_qualifiers aarch64_types_getlane_qualifiers[SIMD_MAX_BUILTIN_ARGS] @@ -818,8 +818,11 @@ aarch64_simd_expand_args (rtx target, int icode, i case SIMD_ARG_CONSTANT: if (!(*insn_data[icode].operand[argc + have_retval].predicate) (op[argc], mode[argc])) - error_at (EXPR_LOCATION (exp), "incompatible type for argument %d, " + { + error_at (EXPR_LOCATION (exp), "incompatible type for argument %d, " "expected %", argc + 1); + return const0_rtx; + } break; case SIMD_ARG_STOP: @@ -886,7 +889,7 @@ aarch64_simd_expand_args (rtx target, int icode, i } if (!pat) -return 0; +return NULL_RTX; > -Original Message- > From: James Greenhalgh [mailto:james.greenha...@arm.com] > Sent: Thursday, May 28, 2015 9:58 PM > To: weixiangyu > Cc: gcc-patches@
Refine lto-symtab warnings
Hi, this patch makes warn_type_compatibility_p to use TYPE_CANONICAL rather than types_compatible_p to decide about warnings on type mismatches. types_compatible_p is way too precise. It insist on oprations among the types to be compatible. This is not necessary here as accesses are wrapped in MEM_REF that does the proper cast - we only want types to be compatible by their representaiton and by TBAA rules. Comparing TYPE_CANONICAL is still not completely fitting the bill, as it won't work for incomplete types (we can hit these in array declarations and return values of structures) and if we get TYPE_CANONICAl for pointers to not be fully globbed (I hope so), we will need to add exception these same way as get_alias_set does some exception. An option would be to simply compare alias set values. This however is bit too coarse as it won't get size mismatches nor complete/incomplete type incompatibilities, so I think it makes sense to simply build from TYPE_CANONICAL and implement whatever globbing is needed here. Even if we may need to implement similar globbing in get_alias_set and here, it is not necessarily the same: the definition of interoperability is not precisely following TBAA compatibility. Bootstrapped/regtested ppc64le-linux, OK? The testcase here is based on mixing signedness between C and Fortran units. It won't work unless the earlier patch is applied. But even if we get more discussion in the TYPE_UNSIGNED globbing, I think this patch makes sense by itself, possibly w/o a testcase. Honza * lto-symtab.c (warn_type_compatibility_p): Use TYPE_CANONICAL to compare memory representation equality. * gfortran.dg/lto/bind_c-6_0.f90: New testcase. * gfortran.dg/lto/bind_c-6_1.c: New testcase. Index: lto/lto-symtab.c === --- lto/lto-symtab.c(revision 224250) +++ lto/lto-symtab.c(working copy) @@ -199,13 +199,19 @@ /* Return non-zero if we want to output waring about T1 and T2. Return value is a bitmask of reasons of violation: - Bit 0 indicates that types are not compatible of memory layout. - Bot 1 indicates that types are not compatible because of C++ ODR rule. */ + Bit 0 indicates that types are not compatible in memory representation + or by TBAA. + Bit 1 indicates that types are not compatible because of C++ ODR rule. + We may have false positives for bit 0. We may also output more strict + warnings in mismatches between types originating from same compilation + units. For example, we may spot qualification differences for C. */ + static int warn_type_compatibility_p (tree prevailing_type, tree type) { int lev = 0; + /* C++ provide a robust way to check for type compatibility via the ODR rule. */ if (odr_or_derived_type_p (prevailing_type) && odr_or_derived_type_p (type) @@ -245,54 +251,48 @@ incomplete pointed-to types more aggressively here, ignoring mismatches in both field and tag names. It's difficult though to guarantee that this does not have side-effects on merging - more compatible types from other translation units though. */ + more compatible types from other translation units though. + For C programs doing so is however valid by WG14 + as response to DR#314 */ - /* We can tolerate differences in type qualification, the - qualification of the prevailing definition will prevail. - ??? In principle we might want to only warn for structurally - incompatible types here, but unless we have protective measures - for TBAA in place that would hide useful information. 
*/ prevailing_type = TYPE_MAIN_VARIANT (prevailing_type); type = TYPE_MAIN_VARIANT (type); - if (!types_compatible_p (prevailing_type, type)) + /* Check type compatibility by the canonical type machinery. This only works + for complete types. */ + if (COMPLETE_TYPE_P (type) && COMPLETE_TYPE_P (prevailing_type)) { - if (TREE_CODE (prevailing_type) == FUNCTION_TYPE - || TREE_CODE (type) == METHOD_TYPE) + /* For mixed C/non-C programs we only can warn if the types are incompatible +in their representation. +For example it is possible to declare the same variable as C_PTR in +Fortran unit and as float * in C unit, because C_PTR (represented as +void *) is defined to be interoperable. +It is however definitely possible to check more here than what we +do currently. */ + gcc_assert (TYPE_CANONICAL (prevailing_type) && TYPE_CANONICAL (type)); + if (TYPE_CANONICAL (prevailing_type) != TYPE_CANONICAL (type)) return 1 | lev; - if (COMPLETE_TYPE_P (type) && COMPLETE_TYPE_P (prevailing_type)) - return 1 | lev; + return lev; +} - /* If type is incomplete then avoid warnings in the cases -that TBAA handles just fine. */ + /* Incomplete structure is compatible with every other structure.
[PATCH] Move gen_* stubs from defaults.h to genflags
Hi, all. I noticed that defaults.h file contains stub generator functions which simply call gcc_unreachable. FWIW, Trevor added them to remove some conditional compilation which depends on HAVE_ macros (I mean something like r223624). Because we still have ~80 more such conditions in GCC, and probably some of them will be fixed in the same way, I propose a patch, which allows to generate required stubs in genflags. Bootstrapped and regtested on x86_64-unknown-linux-gnu, testing for full target list in progress. OK for trunk? -- Regards, Mikhail Maltsev gcc/ChangeLog: 2015-06-10 Mikhail Maltsev * defaults.h (gen_simple_return, gen_return, gen_mem_thread_fence, gen_memory_barrier, gen_mem_signal_fence, gen_load_multiple, gen_store_multiple, gen_tablejump): Remove. Generate by genflags. (HAVE_*): Likewise. * genflags.c (struct stub_info_t): Define. (stubs): Define variable. (struct gflg_string_hasher): Define. (stubs_map): Define variable. (mark_as_done): New function. (gen_dummy): New function (move code out of gen_proto, generalize). (gen_proto): Use gen_dummy. (gen_insn): Call mark_as_done. (gen_stub): New function. (main): Initialize stubs_map, call gen_stubs for missing stubs. diff --git a/gcc/defaults.h b/gcc/defaults.h index 057b646..5beddea 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -1426,96 +1426,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #define TARGET_VTABLE_USES_DESCRIPTORS 0 #endif -#ifndef HAVE_simple_return -#define HAVE_simple_return 0 -static inline rtx -gen_simple_return () -{ - gcc_unreachable (); - return NULL; -} -#endif - -#ifndef HAVE_return -#define HAVE_return 0 -static inline rtx -gen_return () -{ - gcc_unreachable (); - return NULL; -} -#endif - -#ifndef HAVE_epilogue -#define HAVE_epilogue 0 -static inline rtx -gen_epilogue () -{ - gcc_unreachable (); - return NULL; -} -#endif - -#ifndef HAVE_mem_thread_fence -#define HAVE_mem_thread_fence 0 -static inline rtx -gen_mem_thread_fence (rtx) -{ - gcc_unreachable (); - return NULL; -} -#endif - -#ifndef HAVE_memory_barrier -#define HAVE_memory_barrier 0 -static inline rtx -gen_memory_barrier () -{ - gcc_unreachable (); - return NULL; -} -#endif - -#ifndef HAVE_mem_signal_fence -#define HAVE_mem_signal_fence 0 -static inline rtx -gen_mem_signal_fence (rtx) -{ - gcc_unreachable (); - return NULL; -} -#endif - -#ifndef HAVE_load_multiple -#define HAVE_load_multiple 0 -static inline rtx -gen_load_multiple (rtx, rtx, rtx) -{ - gcc_unreachable (); - return NULL; -} -#endif - -#ifndef HAVE_store_multiple -#define HAVE_store_multiple 0 -static inline rtx -gen_store_multiple (rtx, rtx, rtx) -{ - gcc_unreachable (); - return NULL; -} -#endif - -#ifndef HAVE_tablejump -#define HAVE_tablejump 0 -static inline rtx -gen_tablejump (rtx, rtx) -{ - gcc_unreachable (); - return NULL; -} -#endif - #endif /* GCC_INSN_FLAGS_H */ #endif /* ! GCC_DEFAULTS_H */ diff --git a/gcc/genflags.c b/gcc/genflags.c index 0da15f1..2a70b56 100644 --- a/gcc/genflags.c +++ b/gcc/genflags.c @@ -26,10 +26,65 @@ along with GCC; see the file COPYING3. If not see #include "tm.h" #include "rtl.h" #include "obstack.h" +#include "hash-map.h" #include "errors.h" #include "read-md.h" #include "gensupport.h" +/* Structure which holds data, required for generating stub gen_* function. */ +struct stub_info_t +{ + /* Instruction name. */ + const char *name; + /* Number of arguments (i.e., instruction operands). */ + int opno; + /* Set to true when generator is output, so no stub is needed. 
*/ + bool done; +}; + +/* These instructions require default stub function. Stubs are never called. + They allow to reduce the amount of conditionally-compiled code: if a stub is + defined there is no need to guard the call by #ifdef (plain if statement can + be used instead). */ + +#define DEF_STUB(name, opno) { name, opno, false } + +static stub_info_t stubs[] = { +DEF_STUB ("simple_return", 0), +DEF_STUB ("return", 0), +DEF_STUB ("epilogue", 0), +DEF_STUB ("mem_thread_fence",1), +DEF_STUB ("memory_barrier", 0), +DEF_STUB ("mem_signal_fence",1), +DEF_STUB ("load_multiple", 3), +DEF_STUB ("store_multiple", 3), +DEF_STUB ("tablejump", 2) +}; + +#undef DEF_STUB + +/* Helper traits for using null-terminated strings as keys in hash map. + FIXME: Unify various "string hashers" and move them to hash-map-traits.h. */ +struct gflg_string_hasher : default_hashmap_traits +{ + typedef const char *value_type; + typedef const char *compare_type; + + static inline hashval_t hash (const char *s) + { +return htab_hash_string (s); + } + + static inline bool equal_keys (const char *p1, const char *p2) + { +return strcmp (p1, p2) == 0; + } +}; + +/* Mapping from insn nam
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
On Tue, Jun 9, 2015 at 9:30 PM, Jakub Jelinek wrote: > On Tue, Jun 09, 2015 at 08:09:28PM +0200, Uros Bizjak wrote: >> Please find attach a patch that takes your idea slightly further. We >> find perhaps zero-extended UNSPEC_TP, and copy it for further use. At >> its place, we simply slap const0_rtx. We know that address to > > Is that safe? I mean, the address, even if offsetable, can have some > immediate already (seems e.g. the offsettable_memref_p predicate just checks > you can plus_constant some small integer and be recognized again) and if you > turn the %gs: into a const0_rtx, it would fail next decompose. > And when you already have the PLUS which has UNSPEC_TP as one of its > arguments, replacing that PLUS with the other argument is IMHO very easy. > Perhaps you are right that there is no need to copy_rtx, supposedly > the rtx shouldn't be shared with anything and thus can be modified in place. Hm, you are right. I was under the impression that decompose_address can handle multiple CONST_INT addends, which is unfortunately not the case. > If -mx32 is a non-issue here, then perhaps my initial patch is good enough? It looks to me that if you detect and record the zero-extended UNSPEC_TP, your original patch would also handle -mx32. Can you please repost your original patch with the above addition? Thanks, Uros.
Re: [PATCH] Fix ix86_split_long_move collision handling with TLS (PR target/66470)
Uros Bizjak writes: > On Tue, Jun 9, 2015 at 9:30 PM, Jakub Jelinek wrote: >> On Tue, Jun 09, 2015 at 08:09:28PM +0200, Uros Bizjak wrote: >>> Please find attach a patch that takes your idea slightly further. We >>> find perhaps zero-extended UNSPEC_TP, and copy it for further use. At >>> its place, we simply slap const0_rtx. We know that address to >> >> Is that safe? I mean, the address, even if offsetable, can have some >> immediate already (seems e.g. the offsettable_memref_p predicate just checks >> you can plus_constant some small integer and be recognized again) and if you >> turn the %gs: into a const0_rtx, it would fail next decompose. >> And when you already have the PLUS which has UNSPEC_TP as one of its >> arguments, replacing that PLUS with the other argument is IMHO very easy. >> Perhaps you are right that there is no need to copy_rtx, supposedly >> the rtx shouldn't be shared with anything and thus can be modified in place. > > Hm, you are right. I was under impression that decompose_address can > handle multiple CONST_INT addends, which is unfortunatelly not the > case. That's in some ways a feature though. I don't think we want to support multiple offsets, since that implies having more than one representation for the same address. Thanks, Richard
Re: [PATCH, Google] Notify df framework when removing an insn in simplify-got
Carrot Wei writes: > Index: simplify-got.c > === > --- simplify-got.c (revision 224174) > +++ simplify-got.c (working copy) > @@ -169,7 +169,10 @@ > >/* Since there is no usage of pic_reg now, we can remove it. */ >if (use) > -remove_insn (use); > +{ > + df_insn_delete (use); > + remove_insn (use); > +} >targetm.got_access.clear_pic_reg (); >free (got_accesses); >htab_delete (var_table); Why not just use delete_insn ()? Thanks, Richard
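A sketch of the suggested alternative, assuming delete_insn's usual behaviour of handling the df bookkeeping itself:

  /* Since there is no usage of pic_reg now, we can remove it.  */
  if (use)
    delete_insn (use);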
Re: [PATCH][PR debug/65549] Restore DW_AT_abstract_origin for cross-unit call sites
On Tue, 9 Jun 2015, Pierre-Marie de Rodat wrote: > Hello, > > With the recent work for PR debug/65549, we observed a regression in the > generated debugging information. Take the attached reproducer, build it and > look at the output DWARF: > > $ gcc -c -O1 -g foo.c > $ objdump --dwarf=info foo.o > [...] > <2><4e>: Abbrev Number: 3 (DW_TAG_GNU_call_site) > <4f> DW_AT_low_pc : 0x9 > <2><57>: Abbrev Number: 0 > > There is no DW_AT_abstract_origin attribute anymore, whereas there used to be > one. > > On PowerPC with -mlongcall, for instance, call instructions are indirect and > we (at AdaCore) rely on this attribute to determine what function is actually > called (it's easier this way than interpreting machine code...). The DWARF > proposal for call sites also say debuggers should be able to use this > attribute (to build virtual call stacks in the presence of tail calls), but I > could not observe this in practice with GDB. > > The attached patch is an attempt to restore this attribute. The logic behind > it is: this is what we used to do previously for contexts that are alien > translation unit (through the call to force_decl_die), so this should not > harm. > > Bootstrapped and regtested on x86_64-linux. Ok to commit? Hmm, so the underlying issue is that we don't associate comp_unit_die () with any TRANSLATION_UNIT_DECL. For the LTO early debug work I did the following at the very beginning of dwarf2out_early_finish: /* Pick the first TRANSLATION_UNIT_DECL we didn't create a DIE for and equate it with our default CU DIE. LTO output needs to be able to lookup DIEs for translation unit decls. */ unsigned i; tree decl; FOR_EACH_VEC_SAFE_ELT (all_translation_units, i, decl) if (!lookup_decl_die (decl)) equate_decl_number_to_die (decl, comp_unit_die ()); We create some DIEs for builtin types too early before frontends got a chance to build their TRANSLATION_UNIT_DECL, so comp_unit_die gets created before there is any TRANSLATION_UNIT_DECL. Another approach would of course be that the Frontends register their main TRANSLATION_UNIT_DECL with dwarf2out via a debug hook. But - does the above work for you and fix the regression? If so adding a debug hook would probably be the cleaner solution still. Thanks, Richard. > Thank you in advance! > > gcc/ChangeLog > * dwarf2out.c (lookup_context_die): Return the DIE for the > current compilation unit when the input context is a > TRANSLATION_UNIT_DECL. > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)