ping*3: [PATCH, x86] [PR target/64835] Add TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook

2015-04-27 Thread Christian Bruel
Hi,

I'd like to re-ping the following patch for GCC 5.2. It fixes the
__attribute__ ((__optimize__ (...))) on x86. Testcase is in the patch.

thanks

Christian

On 04/13/2015 04:24 PM, Christian Bruel wrote:
> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00243.html
> 
> thanks,
> 
> Christian
> 


Re: [PATCH][PR65823] Fix va_arg ap_copy nop detection

2015-04-27 Thread Tom de Vries

On 22-04-15 15:50, Richard Biener wrote:

On Wed, Apr 22, 2015 at 3:38 PM, Tom de Vries  wrote:

On 22-04-15 10:06, Richard Biener wrote:


On Wed, Apr 22, 2015 at 9:41 AM, Tom de Vries 
wrote:


Hi,

this patch fixes PR65823.

The patch fixes the problem by using operand_equal_p to do the equality
test.


Bootstrapped and reg-tested on x86_64.

Did minimal non-bootstrap build on arm and reg-tested.

OK for trunk?



Hmm, ok for now.



Committed.


But I wonder if we can't fix things to not require that
odd extra copy.



Agreed, that would be good.


In fact that we introduce ap.1 looks completely bogus to me



AFAICT, it's introduced by gimplify_arg ('&argp') because argp (a PARM_DECL)
is not addressable.


(and we don't in this case for arm).  Note that the pointer compare
obviously
fails because we unshare the expression.

So ... what breaks if we simply remove this odd "fixup"?



[ Originally mentioned at https://gcc.gnu.org/ml/gcc/2015-02/msg00011.html .
]

I've committed gcc.target/x86_64/abi/callabi/vaarg-6.c specifically as a
minimal version of this problem.

If we remove the ap_copy fixup, at original we have:
...
;; Function do_cpy2 (null)
{
   char * e;

 char * e;
   e = VA_ARG_EXPR ;
   e = VA_ARG_EXPR ;
   if (e != b)
 {
   abort ();
 }
}
...

and after gimplify we have:
...
do_cpy2 (char * argp)
{
   char * argp.1;
   char * argp.2;
   char * b.3;
   char * e;

   argp.1 = argp;
   e = VA_ARG (&argp.1, 0B);
   argp.2 = argp;
   e = VA_ARG (&argp.2, 0B);
   b.3 = b;
   if (e != b.3) goto ; else goto ;
   :
   abort ();
   :
}
...

The second VA_ARG uses &argp.2, which is a copy of argp, which is unmodified
by the first VA_ARG.


Using the attached _proof-of-concept_ patch, I get callabi.exp working without
the ap_copy, still at the cost of one 'argp.1 = argp' copy though:
...
do_cpy2 (char * argp)
{
   char * argp.1;
   char * b.2;
   char * e;

   argp.1 = argp;
   e = VA_ARG (&argp.1, 0B);
   e = VA_ARG (&argp.1, 0B);
   b.2 = b;
   if (e != b.2) goto ; else goto ;
   :
   abort ();
   :
}
...

But perhaps there's an easier way?


Hum, simply

Index: gcc/gimplify.c
===
--- gcc/gimplify.c  (revision 222320)
+++ gcc/gimplify.c  (working copy)
@@ -9419,6 +9419,7 @@ gimplify_va_arg_expr (tree *expr_p, gimp
  }

/* Transform a VA_ARG_EXPR into an VA_ARG internal function.  */
+  mark_addressable (valist);
ap = build_fold_addr_expr_loc (loc, valist);
tag = build_int_cst (build_pointer_type (type), 0);
*expr_p = build_call_expr_internal_loc (loc, IFN_VA_ARG, type, 2, ap, tag);



That sort of works, but causes other problems. Filed PR65887 - 'remove va_arg ap 
copies' to track this issue.


Thanks,
- Tom



Re: [PATCH] Fix VRP update_value_range and caller (PR tree-optimization/65875)

2015-04-27 Thread Richard Biener
On Fri, 24 Apr 2015, Jakub Jelinek wrote:

> Hi!
> 
> In vrp_visit_assignment_or_call we try to return SSA_PROP_VARYING
> if update_value_range returned true and the new range is VR_VARYING,
> but vrp_visit_phi_node fails to do that.
> Another thing is that if update_value_range decides to
> set_value_range_to_varying (old_vr);
> it doesn't update new_vr, so the caller doesn't know the new vr
> is varying (and calling get_value_range again sounds unnecessary).
> 
> The following patch fixes it, bootstrapped/regtested on x86_64-linux and
> i686-linux, ok for trunk and 5.2?
> 
> 2015-04-24  Jakub Jelinek  
> 
>   PR tree-optimization/65875
>   * tree-vrp.c (update_value_range): If in is_new case setting
>   old_vr to VR_VARYING, also set new_vr to it.
>   (vrp_visit_phi_node): Return SSA_PROP_VARYING instead of
>   SSA_PROP_INTERESTING if update_value_range returned true,
>   but new range is VR_VARYING.
> 
>   * gcc.c-torture/compile/pr65875.c: New test.
> 
> --- gcc/tree-vrp.c.jj 2015-04-20 14:35:39.0 +0200
> +++ gcc/tree-vrp.c2015-04-24 18:10:41.321367440 +0200
> @@ -892,7 +892,12 @@ update_value_range (const_tree var, valu
>UNDEFINED or from VARYING.  */
>if (new_vr->type == VR_UNDEFINED
> || old_vr->type == VR_VARYING)
> - set_value_range_to_varying (old_vr);
> + {
> +   BITMAP_FREE (new_vr->equiv);
> +   set_value_range_to_varying (old_vr);
> +   set_value_range_to_varying (new_vr);
> +   return true;

Actually we didn't change anything here (old_vr->type is VARYING already,
so we shouldn't even have visited the stmt defining the SSA name again...
eventually your fix below fixes that).

So I'd prefer to simply drop the hunk... (you might want to do some
digging which rev. introduced it and if it came with a testcase...).

Otherwise ok.

Thanks,
Richard.

> + }
>else
>   set_value_range (old_vr, new_vr->type, new_vr->min, new_vr->max,
>new_vr->equiv);
> @@ -8941,6 +8946,9 @@ update_range:
> fprintf (dump_file, "\n");
>   }
>  
> +  if (vr_result.type == VR_VARYING)
> + return SSA_PROP_VARYING;
> +
>return SSA_PROP_INTERESTING;
>  }
>  
> --- gcc/testsuite/gcc.c-torture/compile/pr65875.c.jj  2015-04-24 
> 18:20:47.650595581 +0200
> +++ gcc/testsuite/gcc.c-torture/compile/pr65875.c 2015-04-24 
> 18:20:30.0 +0200
> @@ -0,0 +1,24 @@
> +/* PR tree-optimization/65875 */
> +
> +int a, b, c, d, e, f, g;
> +
> +void
> +foo (void)
> +{
> +  long h = 0, i;
> +  if (g < 0)
> +i = -g;
> +  for (; b;)
> +for (; c;)
> +  if (e)
> + h = 1;
> +  for (; f;)
> +if (a)
> +  break;
> +  if (h > i)
> +while (h > i)
> +  {
> + d = 0;
> + h--;
> +  }
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH] Fix VRP update_value_range and caller (PR tree-optimization/65875)

2015-04-27 Thread Jakub Jelinek
On Mon, Apr 27, 2015 at 10:05:21AM +0200, Richard Biener wrote:
> > --- gcc/tree-vrp.c.jj   2015-04-20 14:35:39.0 +0200
> > +++ gcc/tree-vrp.c  2015-04-24 18:10:41.321367440 +0200
> > @@ -892,7 +892,12 @@ update_value_range (const_tree var, valu
> >  UNDEFINED or from VARYING.  */
> >if (new_vr->type == VR_UNDEFINED
> >   || old_vr->type == VR_VARYING)
> > -   set_value_range_to_varying (old_vr);
> > +   {
> > + BITMAP_FREE (new_vr->equiv);
> > + set_value_range_to_varying (old_vr);
> > + set_value_range_to_varying (new_vr);
> > + return true;
> 
> Actually we didn't change anything here (old_vr->type is VARYING already,
> so we shouldn't even have visited the stmt defining the SSA name again...
> eventually your fix below fixes that).

On the testcase, old_vr wasn't actually VARYING, but new_vr was UNDEFINED
(a result of intersecting disjoint ranges).  Though for old_vr->type ==
VR_VARYING, I agree we shouldn't have been called in that case.

And turning a VR_RANGE into VR_UNDEFINED is going in the wrong direction
in the lattice.

Jakub


Re: [PATCH][PR65818][bootstrap,hppa] Return side-effect free result in gimplify_va_arg_internal

2015-04-27 Thread Richard Biener
On Sat, Apr 25, 2015 at 2:35 PM, Tom de Vries  wrote:
> Hi,
>
> this patch fixes PR65818, and hppa bootstrap.
>
> When compiling gcc/libiberty/vprintf-support.c, the following va_arg is
> compiled:
> ...
>  (void) __builtin_va_arg(ap, double);
> ...
>
> This results in the following ifn_va_arg at gimple level, with a NULL_TREE
> lhs:
> ...
>   VA_ARG (&ap, 0B);
> ...
>
> We start expanding the ifn_va_arg in expand_ifn_va_arg_1 by calling
> gimplify_va_arg_internal:
> ...
> expr = gimplify_va_arg_internal (ap, type, gimple_location (stmt),
>  &pre, &post);
> ...
>
> Subsequently, because the lhs is NULL_TREE, we skip generating the assign to
> the lhs:
> ...
> lhs = gimple_call_lhs (stmt);
> if (lhs != NULL_TREE)
>   {
> gcc_assert (useless_type_conversion_p (TREE_TYPE (lhs), type));
>
> if (gimple_call_num_args (stmt) == 3)
>   {
> /* We've transported the size of with WITH_SIZE_EXPR here as
>the 3rd argument of the internal fn call.  Now reinstate
>it.  */
> tree size = gimple_call_arg (stmt, 2);
> expr = build2 (WITH_SIZE_EXPR, TREE_TYPE (expr), expr,
> size);
>   }
>
> /* We use gimplify_assign here, rather than gimple_build_assign,
>because gimple_assign knows how to deal with variable-sized
>types.  */
> gimplify_assign (lhs, expr, &pre);
>   }
> ...
>
> We assume here that any side-effects related to updating ap have been
> generated into pre/post by gimplify_va_arg_internal, and that it's safe to
> ignore expr.
>
> Turns out, that's not the case for hppa. The target hook
> hppa_gimplify_va_arg_expr (called from gimplify_va_arg_internal) returns an
> expression that still contains a side-effect:
> ...
> *(double *) (ap = ap + 4294967288 & 4294967288B)
> ...
>
> This patch fixes that by gimplifying the address expression of the mem-ref
> returned by the target hook (borrowing code from gimplify_expr, case
> MEM_REF).
>
> Bootstrapped and reg-tested on x86_64.
>
> Bootstrapped and reg-tested on hppa2.0w-hp-hpux11.11.
>
> OK for trunk?

Hmm, that assert looks suspicious...

Can't you simply always do

  gimplify_expr (expr, pre_p, post_p, is_gimple_lvalue, fb_lvalue);

?

Richard.

> Thanks,
> - Tom
>


Re: [PATCH] Fix VRP update_value_range and caller (PR tree-optimization/65875)

2015-04-27 Thread Richard Biener
On Mon, 27 Apr 2015, Jakub Jelinek wrote:

> On Mon, Apr 27, 2015 at 10:05:21AM +0200, Richard Biener wrote:
> > > --- gcc/tree-vrp.c.jj 2015-04-20 14:35:39.0 +0200
> > > +++ gcc/tree-vrp.c2015-04-24 18:10:41.321367440 +0200
> > > @@ -892,7 +892,12 @@ update_value_range (const_tree var, valu
> > >UNDEFINED or from VARYING.  */
> > >if (new_vr->type == VR_UNDEFINED
> > > || old_vr->type == VR_VARYING)
> > > - set_value_range_to_varying (old_vr);
> > > + {
> > > +   BITMAP_FREE (new_vr->equiv);
> > > +   set_value_range_to_varying (old_vr);
> > > +   set_value_range_to_varying (new_vr);
> > > +   return true;
> > 
> > Actually we didn't change anything here (old_vr->type is VARYING already,
> > so we shouldn't even have visited the stmt defining the SSA name again...
> > eventually your fix below fixes that).
> 
> On the testcase, old_vr wasn't actually VARYING, but new_vr was UNDEFINED
> (a result of intersecting disjoint ranges).  While for old_vr->type ==
> VR_VARYING I agree we shouldn't have been called on this.
> 
> And turning a VR_RANGE into VR_UNDEFINED is going in the wrong direction
> in the lattice.

Ah, I misread the || for a &&.  The patch is ok with just dropping
the == VR_VARYING check.

Thanks,
Richard.


Fix library name of __builtin_alloca_with_align

2015-04-27 Thread Jan Hubicka
Hi,
build_common_builtin_nodes declares both __builtin_alloca and
__builtin_alloca_with_align to have the library name "alloca". This actually
triggers a warning in an updated ODR violation detector about "alloca" being
declared twice.

__builtin_alloca_with_align IMO does not have a library equivalent, and I think
this is a pasto (__builtin_alloca_with_align is not documented in extend.texi).
It is not clear to me whether there was some intention behind this oddity
though.

I have bootstrapped/regtested x86_64 with the following. OK?

Honza

* tree.c (build_common_builtin_nodes): Do not build
__builtin_alloca_with_align as equivalent of library alloca.
Index: tree.c
===
--- tree.c  (revision 222391)
+++ tree.c  (working copy)
@@ -10088,7 +10098,8 @@ build_common_builtin_nodes (void)
   ftype = build_function_type_list (ptr_type_node, size_type_node,
size_type_node, NULL_TREE);
   local_define_builtin ("__builtin_alloca_with_align", ftype,
-   BUILT_IN_ALLOCA_WITH_ALIGN, "alloca",
+   BUILT_IN_ALLOCA_WITH_ALIGN,
+   "__builtin_alloca_with_align",
ECF_MALLOC | ECF_NOTHROW | ECF_LEAF);
 
   /* If we're checking the stack, `alloca' can throw.  */


Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Richard Biener
On Fri, 24 Apr 2015, Jeff Law wrote:

> On 02/17/2015 07:58 AM, Richard Biener wrote:
> [ Restarting a old thread... ]
> 
> > On a closer look the record_const_or_copy_1 hunk is redundant
> > (record_equality is really a bit obfuscated...).
> Agreed.  I'm not entirely sure how it got to this point.
> 
> > And record_equality is where the SSA_NAME_VALUEs might end up
> > forming chains?  At least because we might record a SSA_NAME_VALUE
> > for sth that might be the target of a SSA_NAME_VALUE, as in
> > 
> >   a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
> >   if (b_2 == i_3(D))
> > ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
> > 
> > thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
> > SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
> > can't easily avoid that because we don't keep track of a
> > reverse mapping (nor would we want to update that).
> Right.  In general, the fact that we're in SSA form means that we ought not
> get loops in the SSA_NAME_VALUE chain since there's a single static assignment
> and we'll visit it once.
> 
> That nice model breaks down in two ways.  First we derive equivalences from
> equality conditions like you've shown above.  Second, when threading we can
> thread through a loop latch and start reprocessing a block.  The interaction
> between those two can result in loops in the value chain.
> 
> What I'm hoping to do in GCC6 is eliminate all this stuff with a threader that
> is independent of the dominator walk (and optimizer). Instead of walking
> forward through the dominator tree recording key nuggets, then removing those
> nuggets as we pop back up the tree, we instead we start with the conditional
> and walk up the use-def chains, simplifying as we go, with the goal of
> simplifying to a constant, of course.
> 
> You can see the beginnings of that with the FSM bits from Sebastian.
> 
> Obviously until those bits are in place, we should try to clean up any warts
> in the current implementation.
> 
> > 
> > Btw, in record_equality, the == part of the <= check for the loop
> > depth will just randomly swap x and y while we should prefer
> > IL canonical order.  If we can't rely on the input being in IL
> > canonical order we should prepend the whole function with
> > 
> >   if (tree_swap_operands_p (x, y, false))
> > std::swap (x, y);
> > 
> > and change <= to < for the loop-depth check.
> Agreed.  Makes perfect sense.

I'm now re-bootstrapping and testing the following.

Richard.

2015-04-27  Richard Biener  

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_equivalences_from_stmt): Valueize rhs.
(record_equality): Canonicalize x and y order via
tree_swap_operands_p.  Do not swap operands for same loop depth.

Index: gcc/tree-ssa-dom.c
===
*** gcc/tree-ssa-dom.c  (revision 222360)
--- gcc/tree-ssa-dom.c  (working copy)
*** record_equivalences_from_phis (basic_blo
*** 1519,1524 
--- 1519,1531 
  if (lhs == t)
continue;
  
+ /* Valueize t.  */
+ if (TREE_CODE (t) == SSA_NAME)
+   {
+ tree tmp = SSA_NAME_VALUE (t);
+ t = tmp ? tmp : t;
+   }
+ 
  /* If we have not processed an alternative yet, then set
 RHS to this alternative.  */
  if (rhs == NULL)
*** record_equality (tree x, tree y)
*** 1752,1757 
--- 1759,1767 
  {
tree prev_x = NULL, prev_y = NULL;
  
+   if (tree_swap_operands_p (x, y, false))
+ std::swap (x, y);
+ 
if (TREE_CODE (x) == SSA_NAME)
  prev_x = SSA_NAME_VALUE (x);
if (TREE_CODE (y) == SSA_NAME)
*** record_equality (tree x, tree y)
*** 1766,1772 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x) <= loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x && is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
--- 1776,1782 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x) < loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x && is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
*** record_equivalences_from_stmt (gimple st
*** 2128,2145 
if (may_optimize_p
  && (TREE_CODE (rhs) == SSA_NAME
  || is_gimple_min_invariant (rhs)))
!   {
!   if (dump_file && (dump_flags & TDF_DETAILS))
! {
!   fprintf (dump_file, " ASGN ");
!   print_generic_expr (dump_file, lhs, 0);
!   fprintf (dump_file, " = ");
!  

Improve LTO type checking during symtab merging

2015-04-27 Thread Jan Hubicka
Hi,
this patch strengthens ODR checking by requiring a match on declarations
(if both of them are defined in a C++ translation unit).  Currently
ipa-symtab.c will warn only if declarations are incompatible in the
useless_type_conversion sense.  Now we warn more often, e.g. in the following
example:

t.C:
char a;

t2.C:
#include 
extern int8_t a;
void *b=&a;

Will lead to warning:
t2.C:2:15: warning: type of ‘a’ does not match original declaration
 extern int8_t a;
   ^
: note: type name ‘char’ should match type name ‘signed char’
/usr/include/stdint.h:37:22: note: the incompatible type is defined here
 typedef signed char  int8_t;
  ^
/aux/hubicka/t.C:1:6: note: previously declared here
 char a;
  ^

Funnily enough, we do not get a warning when t2.C just uses "signed char"
or "unsigned char", because that is a builtin type and thus non-ODR.
Something I need to deal with incrementally.

I also added a call to warn_types_mismatch that outputs an extra note about
what is wrong with the type: in C++ the mismatch is often carried over
several levels of declarations, and it is hard to work out the reason
without the extra info.

Richard, according to the comments, it seems the code was tailored to not
output a warning when we can avoid wrong-code issues on mismatches, based on
the fact that the linker would also be silent. I am particularly entertained
by the following hunk, which I killed:

-  if (!types_compatible_p (TREE_TYPE (prevailing_decl),
-  TREE_TYPE (decl)))
-   /* If we don't have a merged type yet...sigh.  The linker
-  wouldn't complain if the types were mismatched, so we
-  probably shouldn't either.  Just use the type from
-  whichever decl appears to be associated with the
-  definition.  If for some odd reason neither decl is, the
-  older one wins.  */
-   (void) 0;

It is fun that we go through the trouble of comparing types and then do
nothing ;)  Type merging is also complete at this time, so at least the
comment looks out of date.

I think, as a QOI issue, we do want to warn in cases where the code is
inconsistent (it is pretty much surely a bug), but perhaps we may want to have
these warnings under -Wodr only, and use a slightly different warning and a
different -W switch for warnings that unavoidably lead to wrong-code
surprises?  This should be easy to implement by adding two levels to the new
warn_type_compatibility_p predicate. I am personally also OK, however, with
warning always or making all the warnings -Wodr.

What do you think?

Incrementally I am heading towards a proper definition of decl compatibility
that I can plug into symtab merging, to avoid merging incompatible decls (so
FORTIFY_SOURCE works).

lto-symtab and ipa-icf both have some knowledge of the problem; I want to get
both right and factor out the common logic.

Another improvement is to preserve the ODR type info when a non-ODR variant
prevails, so one can diagnose clashes between C++ units even in mixed-language
links.

Bootstrapped/regtested x86_64-linux with no new ODR warnings hit by -Werror.

Honza

* ipa-devirt.c (register_odr_type): Be ready for non-odr main variant
of odr type.
* ipa-utils.h (warn_types_mismatch): New.
* lto/lto-symtab.c (warn_type_compatibility_p): Break out from ...
(lto_symtab_merge): ... here.
(lto_symtab_merge_decls_2): Improve warnings.

Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 222391)
+++ ipa-devirt.c(working copy)
@@ -2029,10 +2030,14 @@ register_odr_type (tree type)
   if (in_lto_p)
 odr_vtable_hash = new odr_vtable_hash_type (23);
 }
-  /* Arrange things to be nicer and insert main variants first.  */
-  if (odr_type_p (TYPE_MAIN_VARIANT (type)))
+  /* Arrange things to be nicer and insert main variants first.
+ ??? fundamental prerecorded types do not have mangled names; this
+ makes it possible that non-ODR type is main_odr_variant of ODR type.
+ Things may get smoother if LTO FE set mangled name of those types same
+ way as C++ FE does.  */
+  if (odr_type_p (main_odr_variant (TYPE_MAIN_VARIANT (type
 get_odr_type (TYPE_MAIN_VARIANT (type), true);
-  if (TYPE_MAIN_VARIANT (type) != type)
+  if (TYPE_MAIN_VARIANT (type) != type && odr_type_p (main_odr_variant (type)))
 get_odr_type (type, true);
 }
 
Index: ipa-utils.h
===
--- ipa-utils.h (revision 222391)
+++ ipa-utils.h (working copy)
@@ -84,6 +84,7 @@ bool types_must_be_same_for_odr (tree, t
 bool types_odr_comparable (tree, tree, bool strict = false);
 cgraph_node *try_speculative_devirtualization (tree, HOST_WIDE_INT,
   ipa_polymorphic_call_context);
+void warn_types_mismatch (tree t1, tree t2);
 
 /* Return vector containing possible targets of polymorphic call E.
If COMPLETEP is non-N

Re: Fix library name of __builtin_alloca_with_align

2015-04-27 Thread Richard Biener
On Mon, 27 Apr 2015, Jan Hubicka wrote:

> Hi,
> build_common_builtin_nodes declares both __builtin_alloca and
> __builtin_alloca_with_align to have the library name "alloca". This actually
> triggers a warning in an updated ODR violation detector about "alloca" being
> declared twice.
> 
> __builtin_alloca_with_align IMO does not have a library equivalent, and I
> think this is a pasto (__builtin_alloca_with_align is not documented in
> extend.texi). It is not clear to me whether there was some intention behind
> this oddity though.
> 
> I have bootstrapped/regtested x86_64 with the following. OK?

Yes.

Thanks,
Richard.

> Honza
> 
>   * tree.c (build_common_builtin_nodes): Do not build
>   __builtin_alloca_with_align as equivalent of library alloca.
> Index: tree.c
> ===
> --- tree.c(revision 222391)
> +++ tree.c(working copy)
> @@ -10088,7 +10098,8 @@ build_common_builtin_nodes (void)
>ftype = build_function_type_list (ptr_type_node, size_type_node,
>   size_type_node, NULL_TREE);
>local_define_builtin ("__builtin_alloca_with_align", ftype,
> - BUILT_IN_ALLOCA_WITH_ALIGN, "alloca",
> + BUILT_IN_ALLOCA_WITH_ALIGN,
> + "__builtin_alloca_with_align",
>   ECF_MALLOC | ECF_NOTHROW | ECF_LEAF);
>  
>/* If we're checking the stack, `alloca' can throw.  */
> 


Re: Improve LTO type checking during symtab merging

2015-04-27 Thread Richard Biener
On Mon, 27 Apr 2015, Jan Hubicka wrote:

> Hi,
> this patch strengthens ODR checking by requiring a match on declarations
> (if both of them are defined in a C++ translation unit).  Currently
> ipa-symtab.c will warn only if declarations are incompatible in the
> useless_type_conversion sense.  Now we warn more often, e.g. in the following
> example:
> 
> t.C:
> char a;
> 
> t2.C:
> #include 
> extern int8_t a;
> void *b=&a;
> 
> Will lead to warning:
> t2.C:2:15: warning: type of ‘a’ does not match original declaration
>  extern int8_t a;
>^
> : note: type name ‘char’ should match type name ‘signed char’
> /usr/include/stdint.h:37:22: note: the incompatible type is defined here
>  typedef signed char  int8_t;
>   ^
> /aux/hubicka/t.C:1:6: note: previously declared here
>  char a;
>   ^
> 
> Funnily enough, we do not get a warning when t2.C just uses "signed char"
> or "unsigned char", because that is a builtin type and thus non-ODR.
> Something I need to deal with incrementally.
> 
> I also added call to warn_types_mismatch that outputs extra note about
> what is wrong with the type: in C++ the mismatch is often carried over
> several levels on declarations and it is hard to work out the reason
> without extra info.
> 
> Richard, According to comments, it seems that the code was tailored to not
> output warning when we can avoid wrong code issues on mismatches based on fact
> that linker would be also silent. I am particularly entertained by the
> following hunk which I killed:
> 
> -  if (!types_compatible_p (TREE_TYPE (prevailing_decl),
> -TREE_TYPE (decl)))
> - /* If we don't have a merged type yet...sigh.  The linker
> -wouldn't complain if the types were mismatched, so we
> -probably shouldn't either.  Just use the type from
> -whichever decl appears to be associated with the
> -definition.  If for some odd reason neither decl is, the
> -older one wins.  */
> - (void) 0;
> 
> It is fun we go through the trouble of comparing types and then do nothing ;)
> Type merging is also completed at this time, so at least the comment looks
> out of date.
>
> I think, as a QOI issue, we do want to warn in cases where the code is 
> inconsistent (it is pretty much surely a bug), but perhaps we may want 
> to have these warnings under -Wodr only, and use a slightly different warning 
> and a different -W switch for warnings that unavoidably lead to wrong-code 
> surprises?  This should be easy to implement by adding two levels 
> to the new warn_type_compatibility_p predicate. I am personally also OK, 
> however, with warning always or making all the warnings -Wodr.
>
> What do you think?

Only emitting the warnings with -Wodr looks good to me.  I can't see
how we can decide what cases lead to wrong code surprises and what not
(other than using types_compatible_p ...).  Wrong-code can only(?) happen
if we happen to inline in a way that makes the inconsistency visible in
a single function.
 
> Incrementally I am heading towards proper definition of decl 
> compatibility that I can plug into symtab merging and avoid merging 
> incompatible decls (so FORTIFY_SOURCE works).
>
> lto-symtab and ipa-icf both have some knowledge of the problem, I want to get
> both right and factor out common logic.

Sounds good.
 
> Another improvement is to preserve the ODR type info when a non-ODR variant
> prevails, so one can diagnose clashes between C++ units even in mixed-language
> links.

Hmm, but then merging an ODR with a non-ODR variant already points to
an ODR violation?  So why do you need to retain that info?

Btw, there is that old PR41227 which has both wrong-code and diagnostic
impact...

Richard.

> Bootstrapped/regtested x86_64-linux with no new ODR warnings hit by -Werror.
> 
> Honza
> 
>   * ipa-devirt.c (register_odr_type): Be ready for non-odr main variant
>   of odr type.
>   * ipa-utils.h (warn_types_mismatch): New.
>   * lto/lto-symtab.c (warn_type_compatibility_p): Break out from ...
>   (lto_symtab_merge): ... here.
>   (lto_symtab_merge_decls_2): Improve warnings.
> 
> Index: ipa-devirt.c
> ===
> --- ipa-devirt.c  (revision 222391)
> +++ ipa-devirt.c  (working copy)
> @@ -2029,10 +2030,14 @@ register_odr_type (tree type)
>if (in_lto_p)
>  odr_vtable_hash = new odr_vtable_hash_type (23);
>  }
> -  /* Arrange things to be nicer and insert main variants first.  */
> -  if (odr_type_p (TYPE_MAIN_VARIANT (type)))
> +  /* Arrange things to be nicer and insert main variants first.
> + ??? fundamental prerecorded types do not have mangled names; this
> + makes it possible that non-ODR type is main_odr_variant of ODR type.
> + Things may get smoother if LTO FE set mangled name of those types same
> + way as C++ FE does.  */
> +  if (odr_type_p (main_odr_variant (TYPE_MAIN_VARIANT (type
> 

[PATCH][combine][obvious] Use std::swap instead of manually swapping

2015-04-27 Thread Kyrill Tkachov

Hi all,

This is another one of those std::swap patches; this time I was in the combine 
neighbourhood.

Bootstrapped on x86_64 and tested on aarch64.
Precedents suggest these changes are considered obvious.
So I'll commit this in a couple of days unless someone objects.

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  

* combine.c (simplify_if_then_else): Use std::swap instead
of manually swapping.
(known_cond): Likewise.
(simplify_comparison): Likewise.
commit e8aa8f2ec22cf548bc05f0f8e56a13a2bdd1c228
Author: Kyrylo Tkachov 
Date:   Fri Apr 24 17:47:14 2015 +0100

[combine][obvious] Use std::swap instead of manually swapping

diff --git a/gcc/combine.c b/gcc/combine.c
index 9f840cb..5f7cbc0 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -6185,7 +6185,7 @@ simplify_if_then_else (rtx x)
   if (false_code == EQ)
 	{
 	  swapped = 1, true_code = EQ, false_code = NE;
-	  temp = true_rtx, true_rtx = false_rtx, false_rtx = temp;
+	  std::swap (true_rtx, false_rtx);
 	}
 
   /* If we are comparing against zero and the expression being tested has
@@ -6250,7 +6250,7 @@ simplify_if_then_else (rtx x)
   SUBST (XEXP (x, 1), false_rtx);
   SUBST (XEXP (x, 2), true_rtx);
 
-  temp = true_rtx, true_rtx = false_rtx, false_rtx = temp;
+  std::swap (true_rtx, false_rtx);
   cond = XEXP (x, 0);
 
   /* It is possible that the conditional has been simplified out.  */
@@ -9022,7 +9022,6 @@ static rtx
 known_cond (rtx x, enum rtx_code cond, rtx reg, rtx val)
 {
   enum rtx_code code = GET_CODE (x);
-  rtx temp;
   const char *fmt;
   int i, j;
 
@@ -9062,7 +9061,7 @@ known_cond (rtx x, enum rtx_code cond, rtx reg, rtx val)
   else if (COMPARISON_P (x) || COMMUTATIVE_ARITH_P (x))
 {
   if (rtx_equal_p (XEXP (x, 0), val))
-	cond = swap_condition (cond), temp = val, val = reg, reg = temp;
+	cond = swap_condition (cond), std::swap (val, reg);
 
   if (rtx_equal_p (XEXP (x, 0), reg) && rtx_equal_p (XEXP (x, 1), val))
 	{
@@ -11454,7 +11453,7 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
  is already a constant integer.  */
   if (swap_commutative_operands_p (op0, op1))
 {
-  tem = op0, op0 = op1, op1 = tem;
+  std::swap (op0, op1);
   code = swap_condition (code);
 }
 
@@ -12341,7 +12340,7 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
   /* We may have changed the comparison operands.  Re-canonicalize.  */
   if (swap_commutative_operands_p (op0, op1))
 {
-  tem = op0, op0 = op1, op1 = tem;
+  std::swap (op0, op1);
   code = swap_condition (code);
 }
 


[PATCH][simplify-rtx][trivial] Use std::swap instead of manually swapping

2015-04-27 Thread Kyrill Tkachov

Hi all,

This patch replaces the manual swapping of values in simplify-rtx.c with 
std::swap.
Precedents suggest these are considered obvious changes.

Bootstrapped and tested on aarch64, x86_64.

Will commit as obvious in a couple of days if no one objects.

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  

* simplify-rtx.c (simplify_gen_binary): Use std::swap instead
of manually swapping.
(simplify_associative_operation): Likewise.
(simplify_binary_operation): Likewise.
(simplify_plus_minus): Likewise.
(simplify_relational_operation): Likewise.
(simplify_ternary_operation): Likewise.
commit dbbb2823461181853a5605adc3cbbcab04dd00b1
Author: Kyrylo Tkachov 
Date:   Fri Apr 24 16:54:49 2015 +0100

[simplify-rtx] Use std::swap instead of manually swapping

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 4a8b56b..2dc2b58 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -216,7 +216,7 @@ simplify_gen_binary (enum rtx_code code, machine_mode mode, rtx op0,
   /* Put complex operands first and constants second if commutative.  */
   if (GET_RTX_CLASS (code) == RTX_COMM_ARITH
   && swap_commutative_operands_p (op0, op1))
-tem = op0, op0 = op1, op1 = tem;
+std::swap (op0, op1);
 
   return gen_rtx_fmt_ee (code, mode, op0, op1);
 }
@@ -1926,9 +1926,7 @@ simplify_associative_operation (enum rtx_code code, machine_mode mode,
   if (! swap_commutative_operands_p (op1, op0))
 	return simplify_gen_binary (code, mode, op1, op0);
 
-  tem = op0;
-  op0 = op1;
-  op1 = tem;
+  std::swap (op0, op1);
 }
 
   if (GET_CODE (op0) == code)
@@ -1977,9 +1975,7 @@ simplify_binary_operation (enum rtx_code code, machine_mode mode,
   /* Make sure the constant is second.  */
   if (GET_RTX_CLASS (code) == RTX_COMM_ARITH
   && swap_commutative_operands_p (op0, op1))
-{
-  tem = op0, op0 = op1, op1 = tem;
-}
+std::swap (op0, op1);
 
   trueop0 = avoid_constant_pool_reference (op0);
   trueop1 = avoid_constant_pool_reference (op1);
@@ -4274,10 +4270,10 @@ simplify_plus_minus (enum rtx_code code, machine_mode mode, rtx op0,
 		  {
 		ncode = MINUS;
 		if (lneg)
-		  tem = lhs, lhs = rhs, rhs = tem;
+		  std::swap (lhs, rhs);
 		  }
 		else if (swap_commutative_operands_p (lhs, rhs))
-		  tem = lhs, lhs = rhs, rhs = tem;
+		  std::swap (lhs, rhs);
 
 		if ((GET_CODE (lhs) == CONST || CONST_INT_P (lhs))
 		&& (GET_CODE (rhs) == CONST || CONST_INT_P (rhs)))
@@ -4467,7 +4463,7 @@ simplify_relational_operation (enum rtx_code code, machine_mode mode,
   /* For the following tests, ensure const0_rtx is op1.  */
   if (swap_commutative_operands_p (op0, op1)
   || (op0 == const0_rtx && op1 != const0_rtx))
-tem = op0, op0 = op1, op1 = tem, code = swap_condition (code);
+std::swap (op0, op1), code = swap_condition (code);
 
   /* If op0 is a compare, extract the comparison arguments from it.  */
   if (GET_CODE (op0) == COMPARE && op1 == const0_rtx)
@@ -4828,7 +4824,7 @@ simplify_const_relational_operation (enum rtx_code code,
   /* Make sure the constant is second.  */
   if (swap_commutative_operands_p (op0, op1))
 {
-  tem = op0, op0 = op1, op1 = tem;
+  std::swap (op0, op1);
   code = swap_condition (code);
 }
 
@@ -5189,7 +5185,7 @@ simplify_ternary_operation (enum rtx_code code, machine_mode mode,
   /* Canonicalize the two multiplication operands.  */
   /* a * -b + c  =>  -b * a + c.  */
   if (swap_commutative_operands_p (op0, op1))
-	tem = op0, op0 = op1, op1 = tem, any_change = true;
+	std::swap (op0, op1), any_change = true;
 
   if (any_change)
 	return gen_rtx_FMA (mode, op0, op1, op2);


[committed] Make arguments to vec::splice const

2015-04-27 Thread Richard Sandiford
vec::splice doesn't modify its source vector, so mark the argument as const.

Tested on x86_64-linux-gnu.  Applied as obvious.

Richard


gcc/
* vec.h (vec): Make splice arguments const.  Update definitions
accordingly.

Index: gcc/vec.h
===
--- gcc/vec.h   2015-04-27 10:38:48.0 +0100
+++ gcc/vec.h   2015-04-27 10:41:57.859920289 +0100
@@ -483,8 +483,8 @@ struct GTY((user)) vec
   bool iterate (unsigned, T *) const;
   bool iterate (unsigned, T **) const;
   vec *copy (ALONE_CXX_MEM_STAT_INFO) const;
-  void splice (vec &);
-  void splice (vec *src);
+  void splice (const vec &);
+  void splice (const vec *src);
   T *quick_push (const T &);
   T &pop (void);
   void truncate (unsigned);
@@ -705,7 +705,7 @@ vec_safe_copy (vec *src
Reallocate DST, if necessary.  */
 template
 inline void
-vec_safe_splice (vec *&dst, vec *src
+vec_safe_splice (vec *&dst, const vec *src
 CXX_MEM_STAT_INFO)
 {
   unsigned src_len = vec_safe_length (src);
@@ -836,7 +836,7 @@ vec::copy (ALONE_MEM_STA
 
 template
 inline void
-vec::splice (vec &src)
+vec::splice (const vec &src)
 {
   unsigned len = src.length ();
   if (len)
@@ -849,7 +849,7 @@ vec::splice (vec
 inline void
-vec::splice (vec *src)
+vec::splice (const vec *src)
 {
   if (src)
 splice (*src);
@@ -1212,8 +1212,8 @@ struct vec
   vec copy (ALONE_CXX_MEM_STAT_INFO) const;
   bool reserve (unsigned, bool = false CXX_MEM_STAT_INFO);
   bool reserve_exact (unsigned CXX_MEM_STAT_INFO);
-  void splice (vec &);
-  void safe_splice (vec & CXX_MEM_STAT_INFO);
+  void splice (const vec &);
+  void safe_splice (const vec & CXX_MEM_STAT_INFO);
   T *quick_push (const T &);
   T *safe_push (const T &CXX_MEM_STAT_INFO);
   T &pop (void);
@@ -1489,7 +1489,7 @@ vec::release (void)
 
 template
 inline void
-vec::splice (vec &src)
+vec::splice (const vec &src)
 {
   if (src.m_vec)
 m_vec->splice (*(src.m_vec));
@@ -1503,7 +1503,7 @@ vec::splice (vec
 inline void
-vec::safe_splice (vec &src
+vec::safe_splice (const vec &src
  MEM_STAT_DECL)
 {
   if (src.length ())



[PATCH][AArch64] Add alternative 'extr' pattern, calculate rtx cost properly

2015-04-27 Thread Kyrill Tkachov

Hi all,

We currently have a pattern that will recognise a particular combination
of shifts and bitwise-or as an extr instruction. However, the order of the
shifts inside the IOR doesn't have canonicalisation rules (see the rev16
pattern for similar stuff).
This means that for code like:

unsigned long
foo (unsigned long a, unsigned long b)
{
  return (a << 16) |  (b >> 48);
}

we will recognise the extr, but for the equivalent:
unsigned long
foo (unsigned long a, unsigned long b)
{
  return (b >> 48) | (a << 16);
}

we won't, and we'll emit three instructions.
This patch adds the pattern for the alternative order of shifts and allows
us to generate for the above the code:
foo:
extr	x0, x0, x1, 48
ret

The zero-extended version is added as well, and the rtx costs function is
updated to handle all of these cases.

I've seen this pattern trigger in the gcc code itself in expmed.c, where it
eliminated a sequence of orrs and shifts into an extr instruction!

Bootstrapped and tested on aarch64-linux.

Ok for trunk?

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  

* config/aarch64/aarch64.md (*extr<mode>5_insn_alt): New pattern.
(*extrsi5_insn_uxtw_alt): Likewise.
* config/aarch64/aarch64.c (aarch64_extr_rtx_p): New function.
(aarch64_rtx_costs, IOR case): Use above to properly cost extr
operations.
commit d45e92b3b8c5837328b7b10682565cacfb566e5b
Author: Kyrylo Tkachov 
Date:   Mon Mar 2 17:26:38 2015 +

[AArch64] Add alternative 'extr' pattern, calculate rtx cost properly

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 860a1dd..ef5a1e4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5438,6 +5438,51 @@ aarch64_frint_unspec_p (unsigned int u)
 }
 }
 
+/* Return true iff X is an rtx that will match an extr instruction
+   i.e. as described in the *extr<mode>5_insn family of patterns.
+   OP0 and OP1 will be set to the operands of the shifts involved
+   on success and will be NULL_RTX otherwise.  */
+
+static bool
+aarch64_extr_rtx_p (rtx x, rtx *res_op0, rtx *res_op1)
+{
+  rtx op0, op1;
+  machine_mode mode = GET_MODE (x);
+
+  *res_op0 = NULL_RTX;
+  *res_op1 = NULL_RTX;
+
+  if (GET_CODE (x) != IOR)
+return false;
+
+  op0 = XEXP (x, 0);
+  op1 = XEXP (x, 1);
+
+  if ((GET_CODE (op0) == ASHIFT && GET_CODE (op1) == LSHIFTRT)
+  || (GET_CODE (op1) == ASHIFT && GET_CODE (op0) == LSHIFTRT))
+{
+ /* Canonicalise locally to ashift in op0, lshiftrt in op1.  */
+  if (GET_CODE (op1) == ASHIFT)
+std::swap (op0, op1);
+
+  if (!CONST_INT_P (XEXP (op0, 1)) || !CONST_INT_P (XEXP (op1, 1)))
+return false;
+
+  unsigned HOST_WIDE_INT shft_amnt_0 = UINTVAL (XEXP (op0, 1));
+  unsigned HOST_WIDE_INT shft_amnt_1 = UINTVAL (XEXP (op1, 1));
+
+  if (shft_amnt_0 < GET_MODE_BITSIZE (mode)
+  && shft_amnt_0 + shft_amnt_1 == GET_MODE_BITSIZE (mode))
+{
+  *res_op0 = XEXP (op0, 0);
+  *res_op1 = XEXP (op1, 0);
+  return true;
+}
+}
+
+  return false;
+}
+
 /* Calculate the cost of calculating (if_then_else (OP0) (OP1) (OP2)),
storing it in *COST.  Result is true if the total cost of the operation
has now been calculated.  */
@@ -5968,6 +6013,16 @@ cost_plus:
 
   return true;
 }
+
+  if (aarch64_extr_rtx_p (x, &op0, &op1))
+{
+  *cost += rtx_cost (op0, IOR, 0, speed)
+   + rtx_cost (op1, IOR, 1, speed);
+  if (speed)
+*cost += extra_cost->alu.shift;
+
+  return true;
+}
 /* Fall through.  */
 case XOR:
 case AND:
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1a7f888..17a8755 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3597,6 +3597,21 @@ (define_insn "*extr5_insn"
   [(set_attr "type" "shift_imm")]
 )
 
+;; There are no canonicalisation rules for ashift and lshiftrt inside an ior
+;; so we have to match both orderings.
+(define_insn "*extr<mode>5_insn_alt"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(ior:GPI  (lshiftrt:GPI (match_operand:GPI 2 "register_operand" "r")
+			(match_operand 4 "const_int_operand" "n"))
+		  (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
+			  (match_operand 3 "const_int_operand" "n"))))]
+  "UINTVAL (operands[3]) < GET_MODE_BITSIZE (<MODE>mode)
+   && (UINTVAL (operands[3]) + UINTVAL (operands[4])
+       == GET_MODE_BITSIZE (<MODE>mode))"
+  "extr\\t%<w>0, %<w>1, %<w>2, %4"
+  [(set_attr "type" "shift_imm")]
+)
+
 ;; zero_extend version of the above
 (define_insn "*extrsi5_insn_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
@@ -3611,6 +3626,19 @@ (define_insn "*extrsi5_insn_uxtw"
   [(set_attr "type" "shift_imm")]
 )
 
+(define_insn "*extrsi5_insn_uxtw_alt"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(zero_extend:DI
+	 (ior:SI (lshiftrt:SI (match_operand:SI 2 "register_operand" "r")
+			  

Fix xstormy16 handling of HImode predicates

2015-04-27 Thread Richard Sandiford
xstormy16 has:

(define_predicate "xs_hi_general_operand"
  (match_code "const_int,reg,subreg,mem,symbol_ref,label_ref,const")
{
  if ((GET_CODE (op) == CONST_INT)
   && ((INTVAL (op) >= 32768) || (INTVAL (op) < -32768)))
{
  error ("constant halfword load operand out of range");
  return false;
}

  return general_operand (op, mode);
})

I think this is wrong for two reasons:

- Predicates are supposed to accept things in different modes and weed out
  the invalid values.  That's why the predicates start with a check that
  GET_MODE (op) == VOIDmode || GET_MODE (op) == mode.

  IMO it doesn't make sense to handle (e.g.) SImode REGs correctly but
  raise an error for SImode CONST_INTs.  The generic predicates later
  reject CONST_INTs that aren't suitable for the mode, so invalid
  HImode CONST_INTs will fail to match rather than be silently accepted.

- If we did have a check, it should be an assert rather than a user error.

The port is effectively relying on the matching algorithm in genrecog to
ensure that the predicate is only called on SET_SRC rtxes once the mode
of the SET_DEST has been checked.

This patch removes the predicate and uses general_operand instead.

xs_hi_nonmemory_operand is similar except that it rejects symbol_ref
and label_ref constants.  The patch therefore removes the error check
but keeps the restricted code list.

Note that we have an assert in rtl.h that CONST_INTs fit into the
range appropriate for the wi:: precision.  This should catch more
cases than the predicate checks would have.  (It has flagged up
problems that were latent before.)

Tested by building xstormy16-elf and making sure that there were
no differences in assembly output for the GCC testsuite.
OK to install?

Thanks,
Richard


gcc/
* config/stormy16/predicates.md (xs_hi_general_operand): Delete.
(xs_hi_nonmemory_operand): Remove error.
* config/stormy16/stormy16.md (movhi, movhi_internal): Use
general_operand rather than xs_hi_general_operand.

Index: gcc/config/stormy16/predicates.md
===
--- gcc/config/stormy16/predicates.md   2015-04-27 10:38:48.0 +0100
+++ gcc/config/stormy16/predicates.md   2015-04-27 10:40:00.709344494 +0100
@@ -151,28 +151,8 @@ (define_predicate "xstormy16_carry_plus_
  && (INTVAL (XEXP (op, 1)) < -4 || INTVAL (XEXP (op, 1)) > 4));
 })
 
-(define_predicate "xs_hi_general_operand"
-  (match_code "const_int,reg,subreg,mem,symbol_ref,label_ref,const")
-{
-  if ((GET_CODE (op) == CONST_INT)
-   && ((INTVAL (op) >= 32768) || (INTVAL (op) < -32768)))
-{
-  error ("constant halfword load operand out of range");
-  return false;
-}
-
-  return general_operand (op, mode);
-})
-
 (define_predicate "xs_hi_nonmemory_operand"
   (match_code "const_int,reg,subreg,const")
 {
-  if ((GET_CODE (op) == CONST_INT) 
-   && ((INTVAL (op) >= 32768) || (INTVAL (op) < -32768)))
-{
-  error ("constant arithmetic operand out of range");
-  return false;
-}
-
   return nonmemory_operand (op, mode);
 })
Index: gcc/config/stormy16/stormy16.md
===
--- gcc/config/stormy16/stormy16.md 2015-04-27 10:38:48.0 +0100
+++ gcc/config/stormy16/stormy16.md 2015-04-27 10:40:00.709344494 +0100
@@ -185,7 +185,7 @@ (define_insn "pophi1"
 
 (define_expand "movhi"
   [(set (match_operand:HI 0 "nonimmediate_nonstack_operand" "")
-   (match_operand:HI 1 "xs_hi_general_operand" ""))]
+   (match_operand:HI 1 "general_operand" ""))]
   ""
   { xstormy16_expand_move (HImode, operands[0], operands[1]);
 DONE;
@@ -193,7 +193,7 @@ (define_expand "movhi"
 
 (define_insn "movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_nonstack_operand" 
"=r,m,e,e,T,r,S,W,e")
-   (match_operand:HI 1 "xs_hi_general_operand"  
"r,e,m,L,L,i,i,ie,W"))]
+   (match_operand:HI 1 "general_operand"
"r,e,m,L,L,i,i,ie,W"))]
   ""
   "@
mov %0,%1



Extend list of codes accepted by address_operand

2015-04-27 Thread Richard Sandiford
While testing the patch that I'm about to post, I found that the list
of address_operand codes in gensupport.c was incomplete.  This patch
fills in some "obvious" missing entries.

It isn't exactly clear which codes are allowed at the top level of
an address.  decompose_address is known to be general enough for all LRA
targets, but it also technically accepts any unary operator as the outer
code, whereas I think only SIGN_EXTEND and ZERO_EXTEND are used at the
top level.  The routine accepts ZERO_EXTRACT and SIGN_EXTRACT, but I
think in practice they're only used for index values and not full addresses.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* gensupport.c (std_preds): Add missing codes to address_operand entry.

Index: gcc/gensupport.c
===
--- gcc/gensupport.c2015-04-27 10:38:48.0 +0100
+++ gcc/gensupport.c2015-04-27 10:40:59.044635383 +0100
@@ -2800,7 +2800,8 @@ struct std_pred_table
 
 static const struct std_pred_table std_preds[] = {
   {"general_operand", false, true, {SUBREG, REG, MEM}},
-  {"address_operand", true, true, {SUBREG, REG, MEM, PLUS, MINUS, MULT}},
+  {"address_operand", true, true, {SUBREG, REG, MEM, PLUS, MINUS, MULT,
+  ZERO_EXTEND, SIGN_EXTEND, AND}},
   {"register_operand", false, false, {SUBREG, REG}},
   {"pmode_register_operand", true, false, {SUBREG, REG}},
   {"scratch_operand", false, false, {SCRATCH, REG}},



Add a build version of inchash.o

2015-04-27 Thread Richard Sandiford
The generator patch that I'm about to post wants to use inchash.c,
so this patch adds it to the list of files that are built for the
build system as well as the host.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* inchash.h, inchash.c: Include bconfig.h for build objects.
* Makefile.in (build/inchash.o): New rule.

Index: gcc/inchash.h
===
--- gcc/inchash.h   2015-04-27 10:38:48.0 +0100
+++ gcc/inchash.h   2015-04-27 10:42:57.783191573 +0100
@@ -20,7 +20,11 @@ Software Foundation; either version 3, o
 #ifndef INCHASH_H
 #define INCHASH_H 1
 
+#ifdef GENERATOR_FILE
+#include "bconfig.h"
+#else
 #include "config.h"
+#endif
 #include "system.h"
 #include "coretypes.h"
 #include "hashtab.h"
Index: gcc/inchash.c
===
--- gcc/inchash.c   2015-04-27 10:38:48.0 +0100
+++ gcc/inchash.c   2015-04-27 10:42:57.783191573 +0100
@@ -17,7 +17,11 @@ Software Foundation; either version 3, o
 along with GCC; see the file COPYING3.  If not see
 .  */
 
+#ifdef GENERATOR_FILE
+#include "bconfig.h"
+#else
 #include "config.h"
+#endif
 #include "system.h"
 #include "coretypes.h"
 #include "hashtab.h"
Index: gcc/Makefile.in
===
--- gcc/Makefile.in 2015-04-27 10:38:48.0 +0100
+++ gcc/Makefile.in 2015-04-27 10:42:57.783191573 +0100
@@ -2434,6 +2434,8 @@ build/vec.o : vec.c $(BCONFIG_H) $(SYSTE
$(GGC_H) toplev.h $(DIAGNOSTIC_CORE_H)
 build/hash-table.o : hash-table.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h  \
$(HASH_TABLE_H) $(GGC_H) toplev.h $(DIAGNOSTIC_CORE_H)
+build/inchash.o : inchash.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h  \
+   $(HASHTAB_H) inchash.h
 build/gencondmd.o : build/gencondmd.c $(BCONFIG_H) $(SYSTEM_H) \
   coretypes.h $(GTM_H) insn-constants.h
\
   $(filter-out insn-flags.h, $(RTL_H) $(TM_P_H) $(FUNCTION_H) $(REGS_H) \



Remove redundant tests in icf_handled_component_p

2015-04-27 Thread Eric Botcazou
handled_component_p comprises both REALPART_EXPR and IMAGPART_EXPR.

Tested on x86_64-suse-linux, applied on the mainline as obvious.


2015-04-27  Eric Botcazou  

* ipa-icf.c (icf_handled_component_p): Remove redundant tests.


-- 
Eric Botcazou

Index: ipa-icf.c
===
--- ipa-icf.c	(revision 222439)
+++ ipa-icf.c	(working copy)
@@ -1806,9 +1806,8 @@ sem_function::icf_handled_component_p (t
 {
   tree_code tc = TREE_CODE (t);
 
-  return ((handled_component_p (t))
-	  || tc == ADDR_EXPR || tc == MEM_REF || tc == REALPART_EXPR
-	  || tc == IMAGPART_EXPR || tc == OBJ_TYPE_REF);
+  return (handled_component_p (t)
+	  || tc == ADDR_EXPR || tc == MEM_REF || tc == OBJ_TYPE_REF);
 }
 
 /* Basic blocks dictionary BB_DICT returns true if SOURCE index BB

Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-04-27 Thread Kyrill Tkachov


On 22/04/15 12:51, Kyrill Tkachov wrote:

On 21/04/15 18:33, Kyrill Tkachov wrote:

On 21/04/15 15:09, Jeff Law wrote:

On 04/21/2015 02:30 AM, Kyrill Tkachov wrote:

From reading config/stormy16/stormy-abi it seems to me that we don't
pass arguments partially in stormy16, so this code would never be called
there. That leaves pa as the potential problematic target.
I don't suppose there's an easy way to test on pa? My checkout of binutils
doesn't seem to include a sim target for it.

No simulator, no machines in the testfarm, the box I had access to via
parisc-linux.org seems dead and my ancient PA overheats well before a
bootstrap could complete.  I often regret knowing about the backwards
way many things were done on the PA because it makes me think about
cases that only matter on dead architectures.

So what should be the action plan here? I can't add an assert on
positive result as a negative result is valid.

We want to catch the case where this would cause trouble on
pa, or change the patch until we're confident that it's fine
for pa.

That being said, reading the documentation of STACK_GROWS_UPWARD
and ARGS_GROW_DOWNWARD I'm having a hard time visualising a case
where this would cause trouble on pa.

Is the problem that in the function:

+/* Add SIZE to X and check whether it's greater than Y.
+   If it is, return the constant amount by which it's greater or smaller.
+   If the two are not statically comparable (for example, X and Y contain
+   different registers) return -1.  This is used in expand_push_insn to
+   figure out if reading SIZE bytes from location X will end up reading from
+   location Y.  */
+static int
+memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
+{
+  rtx tmp = plus_constant (Pmode, x, size);
+  rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);
+
+  if (!CONST_INT_P (sub))
+return -1;
+
+  return INTVAL (sub);
+}

for ARGS_GROW_DOWNWARD we would be reading 'backwards' from x,
so the function should look something like the following?

static int
memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
{
#ifdef ARGS_GROW_DOWNWARD
 rtx tmp = plus_constant (Pmode, x, -size);
#else
 rtx tmp = plus_constant (Pmode, x, size);
#endif
 rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);

 if (!CONST_INT_P (sub))
   return -1;

#ifdef ARGS_GROW_DOWNWARD
 return -INTVAL (sub);
#else
 return INTVAL (sub);
#endif
}

now, say for x == sp + 4,  y == sp + 8, size == 16:
This would be a problematic case for arm, so this code on arm
(where ARGS_GROW_DOWNWARD is *not* defined) would return
12, which is the number of bytes that overlap.

On a target where ARGS_GROW_DOWNWARD is defined this would return
-20, meaning that no overlap occurs (because we read in the descending
direction from x, IIUC).

Hi Jeff,

Here's an attempt to make this more concrete.
Only the memory_load_overlap function has changed.
This time I tried to take into account the case when
ARGS_GROW_DOWNWARD.

Take the case where x == sp, y == sp + 8, size == 16.
For arm, this would return 8 as that is the number of bytes
that overlap. On pa, since ARGS_GROW_DOWNWARD is defined it
would return -1 as we're reading down from x rather than up
towards y.

In the case when x == sp + 8, y == sp, size == 16
This would return -1 on arm since we're reading upwards from x
and therefore no overlap would happen.

On pa, this would return 8, which I think is the right thing.
But again, I don't have access to any pa means of testing.

What do you think of this approach?


Hi Dave,

Would it be possible for you to test this patch on a 64-bit hppa
or at least bootstrap it?
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01288.html

There is a concern that it may potentially affect the passing of
complex arguments partially on the stack and partially in regs
on pa because of the way the args and stack grow on that target.

Unfortunately I don't have access to any hardware or simulators.
It would help a lot with getting this patch in.

Thanks,
Kyrill




Thanks,
Kyrill

P.S. I've included the testcase from Honggyu in the patch.

2015-04-22  Kyrylo Tkachov  

  PR target/65358
  * expr.c (memory_load_overlap): New function.
  (emit_push_insn): When pushing partial args to the stack would
  clobber the register part load the overlapping part into a pseudo
  and put it into the hard reg after pushing.

2015-04-22  Honggyu Kim  

  PR target/65358
  * gcc.dg/pr65358.c: New test.



Thanks,
Kyrill


Jeff





Mostly rewrite genrecog

2015-04-27 Thread Richard Sandiford
I think it's been the case for a while that parallel builds of GCC tend
to serialise around the compilation of insn-recog.c, especially with
higher --enable-checking settings.  This patch tries to speed that
up by replacing most of genrecog with a new algorithm.

I think the main problems with the current code are:

1. Vector architectures have added lots of new instructions that have
   a similar shape and differ only in mode, code or unspec number.
   The current algorithm doesn't have any way of factoring out those
   similarities.

2. When matching a particular instruction, the current code examines
   everything about a SET_DEST before moving on to the SET_SRC.  This has
   two subproblems:

   2a. The destination of a SET isn't very distinctive.  It's usually
   just a register_operand, a memory_operand, a nonimmediate_operand
   or a flags register.  We therefore tend to backtrack to the
   SET_DEST a lot, oscillating between groups of instructions with
   the same kind of destination.

   2b. Backtracking through predicate checks is relatively expensive.
   It would be good to narrow down the "shape" of the instruction
   first and only then check the predicates.  (The backtracking is
   expensive in terms of insn-recog.o compile time too, both because
   we need to copy into argument registers and out of the result
   register, and because it adds more sites where spills are needed.)

3. The code keeps one local variable per rtx depth, so it ends up
   loading the same rtx many times over (mostly when backtracking).
   This is very expensive in rtl-checking builds because each XEXP
   includes a code check and a line-specific failure call.

   In principle the idea of having one local variable per depth
   is good.  But it was originally written that way when all optimisations
   were done at the rtl level and I imagine each local variable mapped
   to one pseudo register.  These days the statements that reload the
   value needed on backtracking lead to many more SSA names and phi
   statements than you'd get with just a single variable per position
   (loaded once, so naturally SSA).  There is still the potential benefit
   of avoiding having sibling rtxes live at once, but fixing (2) above
   reduces that problem.

Also, the code is all goto-based, which makes it rather hard to step through.

The patch deals with these as follows:

1. Detect subpatterns that differ only by mode, code and/or integer
   (e.g. unspec number) and split them out into a common routine.

2. Match the "shape" of the instruction first, in terms of codes,
   integers and vector lengths, and only then check the modes, predicates
   and dups.  When checking the shape, handle SET_SRCs before SET_DESTs.
   In practice this seems to greatly reduce the amount of backtracking.

3. Have one local variable per rtx position.  I tested the patch with
   and without the change and it helped a lot with rtl-checking builds
   without seeming to affect release builds much either way.

As far as debuggability goes, the new code avoids gotos and just
uses "natural" control flow.

The headline stat is that a stage 3 --enable-checking=yes,rtl,df
build of insn-recog.c on my box goes from 7m43s to 2m2s (using the
same stage 2 compiler).  The corresponding --enable-checking=release
change is from 49s to 24s (less impressive, as expected).

The patch seems to speed up recog.  E.g. the time taken to build
fold-const.ii goes from 6.74s before the patch to 6.69s after it;
not a big speed-up, but reproducible.

Here's a comparison of the number of lines of code in insn-recog.c
before and after the patch on one target per config/ CPU:

aarch64-linux-gnueabi   115526   38169 :  33.04%
alpha-linux-gnu          24479   10740 :  43.87%
arm-linux-gnueabi       169208   67759 :  40.04%
avr-rtems                55647   22127 :  39.76%
bfin-elf                 13928    6498 :  46.65%
c6x-elf                  29928   13324 :  44.52%
cr16-elf                  2650    1419 :  53.55%
cris-elf                 18669    7257 :  38.87%
epiphany-elf             19308    6131 :  31.75%
fr30-elf                  2204    1112 :  50.45%
frv-linux-gnu            13541    5950 :  43.94%
h8300-elf                19584    9327 :  47.63%
hppa64-hp-hpux11.23      18299    8549 :  46.72%
ia64-linux-gnu           37629   17101 :  45.45%
iq2000-elf                2752    1609 :  58.47%
lm32-elf                  1536     872 :  56.77%
m32c-elf                 10040    4145 :  41.28%
m32r-elf   

Re: Add a build version of inchash.o

2015-04-27 Thread Richard Biener
On Mon, Apr 27, 2015 at 12:09 PM, Richard Sandiford
 wrote:
> The generator patch that I'm about to post wants to use inchash.c,
> so this patch adds it to the list of files that are built for the
> build system as well as the host.
>
> Tested on x86_64-linux-gnu.  OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> * inchash.h, inchash.c: Include bconfig.h for build objects.
> * Makefile.in (build/inchash.o): New rule.
>
> Index: gcc/inchash.h
> ===
> --- gcc/inchash.h   2015-04-27 10:38:48.0 +0100
> +++ gcc/inchash.h   2015-04-27 10:42:57.783191573 +0100
> @@ -20,7 +20,11 @@ Software Foundation; either version 3, o
>  #ifndef INCHASH_H
>  #define INCHASH_H 1
>
> +#ifdef GENERATOR_FILE
> +#include "bconfig.h"
> +#else
>  #include "config.h"
> +#endif
>  #include "system.h"
>  #include "coretypes.h"
>  #include "hashtab.h"
> Index: gcc/inchash.c
> ===
> --- gcc/inchash.c   2015-04-27 10:38:48.0 +0100
> +++ gcc/inchash.c   2015-04-27 10:42:57.783191573 +0100
> @@ -17,7 +17,11 @@ Software Foundation; either version 3, o
>  along with GCC; see the file COPYING3.  If not see
>  .  */
>
> +#ifdef GENERATOR_FILE
> +#include "bconfig.h"
> +#else
>  #include "config.h"
> +#endif
>  #include "system.h"
>  #include "coretypes.h"
>  #include "hashtab.h"
> Index: gcc/Makefile.in
> ===
> --- gcc/Makefile.in 2015-04-27 10:38:48.0 +0100
> +++ gcc/Makefile.in 2015-04-27 10:42:57.783191573 +0100
> @@ -2434,6 +2434,8 @@ build/vec.o : vec.c $(BCONFIG_H) $(SYSTE
> $(GGC_H) toplev.h $(DIAGNOSTIC_CORE_H)
>  build/hash-table.o : hash-table.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h  \
> $(HASH_TABLE_H) $(GGC_H) toplev.h $(DIAGNOSTIC_CORE_H)
> +build/inchash.o : inchash.c $(BCONFIG_H) $(SYSTEM_H) coretypes.h  \
> +   $(HASHTAB_H) inchash.h
>  build/gencondmd.o : build/gencondmd.c $(BCONFIG_H) $(SYSTEM_H) \
>coretypes.h $(GTM_H) insn-constants.h  
>   \
>$(filter-out insn-flags.h, $(RTL_H) $(TM_P_H) $(FUNCTION_H) $(REGS_H) \
>


RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits

2015-04-27 Thread Thomas Preud'homme
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Saturday, April 25, 2015 3:00 AM
> Do you have a testcase where this change can result in better generated
> code?  If so, please add that testcase.  It's OK if it's ARM specific.

Hi Jeff,

Last time I tried, I couldn't reduce the code to a small testcase; if I
remember correctly, it was mostly due to the problem of finding a good
test for creduce (zero extension is not unique enough). I'll try again
with a more manual approach and get back to you.

Best regards,

Thomas





Minor tweak to size functions

2015-04-27 Thread Eric Botcazou
The size functions are used by the Ada compiler to factor out the large size 
computations in self-referential types (dynamic types whose size depends on a 
discriminant in the object).  We generate useless leaf size functions for size 
expressions involving discriminants:

_Global.Sz4_Q 
(ada__streams__stream_element_count___XDLU_0__9223372036854775807 p0)
{
  return (bitsizetype) p0 * 8;


_Global.Sz5_Q 
(ada__streams__stream_element_count___XDLU_0__9223372036854775807 p0)
{
  return (sizetype) p0;


_Global.Sz6_Q 
(ada__streams__stream_element_count___XDLU_0__9223372036854775807 p0)
{
  return (bitsizetype) p0 * 8;


_Global.Sz7_Q 
(ada__streams__stream_element_count___XDLU_0__9223372036854775807 p0)
{
  return (sizetype) p0;


The wrapped expressions are simple arithmetic operations so there is no real 
point in creating a size function for them.  That's already ensured for non-
leaf size functions (size functions calling other size functions) and it's 
immediate to extend it down to the leaves.

Tested on x86_64-suse-linux, applied on the mainline (this only affects the 
Ada compiler).  The tree-cfg.c hunk adds the missing curly bracket at the end 
in the above dumps, applied as obvious.


2015-04-27  Eric Botcazou  

* stor-layout.c (self_referential_component_ref_p): New predicate.
(copy_self_referential_tree_r): Use it.
(self_referential_size): Punt for simple operations directly involving
self-referential component references.
* tree-cfg.c (dump_function_to_file): Add missing final curly bracket.


-- 
Eric Botcazou

Index: stor-layout.c
===
--- stor-layout.c	(revision 222439)
+++ stor-layout.c	(working copy)
@@ -127,6 +127,20 @@ variable_size (tree size)
 /* An array of functions used for self-referential size computation.  */
 static GTY(()) vec *size_functions;
 
+/* Return true if T is a self-referential component reference.  */
+
+static bool
+self_referential_component_ref_p (tree t)
+{
+  if (TREE_CODE (t) != COMPONENT_REF)
+return false;
+
+  while (REFERENCE_CLASS_P (t))
+t = TREE_OPERAND (t, 0);
+
+  return (TREE_CODE (t) == PLACEHOLDER_EXPR);
+}
+
 /* Similar to copy_tree_r but do not copy component references involving
PLACEHOLDER_EXPRs.  These nodes are spotted in find_placeholder_in_expr
and substituted in substitute_in_expr.  */
@@ -154,19 +168,10 @@ copy_self_referential_tree_r (tree *tp,
 }
 
   /* Default case: the component reference.  */
-  else if (code == COMPONENT_REF)
+  else if (self_referential_component_ref_p (*tp))
 {
-  tree inner;
-  for (inner = TREE_OPERAND (*tp, 0);
-	   REFERENCE_CLASS_P (inner);
-	   inner = TREE_OPERAND (inner, 0))
-	;
-
-  if (TREE_CODE (inner) == PLACEHOLDER_EXPR)
-	{
-	  *walk_subtrees = 0;
-	  return NULL_TREE;
-	}
+  *walk_subtrees = 0;
+  return NULL_TREE;
 }
 
   /* We're not supposed to have them in self-referential size trees
@@ -199,7 +204,7 @@ self_referential_size (tree size)
 
   /* Do not factor out simple operations.  */
   t = skip_simple_constant_arithmetic (size);
-  if (TREE_CODE (t) == CALL_EXPR)
+  if (TREE_CODE (t) == CALL_EXPR || self_referential_component_ref_p (t))
 return size;
 
   /* Collect the list of self-references in the expression.  */
Index: tree-cfg.c
===
--- tree-cfg.c	(revision 222439)
+++ tree-cfg.c	(working copy)
@@ -7441,7 +7441,11 @@ dump_function_to_file (tree fndecl, FILE
   else
 	{
 	  if (!ignore_topmost_bind)
-	fprintf (file, "{\n");
+	{
+	  fprintf (file, "{\n");
+	  /* No topmost bind, pretend it's ignored for later.  */
+	  ignore_topmost_bind = true;
+	}
 	  indent = 2;
 	}
 


RE: [PATCH] [1/2] [ARM] [libgcc] Support RTABI half-precision conversion functions.

2015-04-27 Thread Hale Wang
> -Original Message-
> From: Ramana Radhakrishnan [mailto:ramana@googlemail.com]
> Sent: Wednesday, April 22, 2015 5:00 PM
> To: Hale Wang
> Cc: Ramana Radhakrishnan; Joseph Myers; GCC Patches
> Subject: Re: [PATCH] [1/2] [ARM] [libgcc] Support RTABI half-precision
> conversion functions.
> 
> On Wed, Apr 22, 2015 at 9:32 AM, Hale Wang  wrote:
> >> -Original Message-
> >> From: Ramana Radhakrishnan [mailto:ramana@googlemail.com]
> >> Sent: Wednesday, April 22, 2015 3:50 PM
> >> To: Joseph Myers
> >> Cc: Hale Wang; GCC Patches
> >> Subject: Re: [PATCH] [1/2] [ARM] [libgcc] Support RTABI
> >> half-precision conversion functions.
> >>
> >> On Mon, Apr 13, 2015 at 12:25 PM, Joseph Myers
> >>  wrote:
> >> > On Mon, 13 Apr 2015, Hale Wang wrote:
> >> >
> >> >> Yes, you are right. It was my mistake to add "only" here. Thank
> >> >> you for pointing this out.
> >> >> Besides this, is this patch OK with you?
> >> >
> >> > I don't think it's a good idea for libgcc to include large pieces
> >> > of assembly code generated by a compiler.  Just compile the code
> >> > with whatever options are needed at the time libgcc is built -
> >> > possibly with #if conditionals to allow compiling different versions of 
> >> > the
> code.
> >
> > Indeed, just compiling the code with '-mfloat-abi=soft' at the time
> > libgcc is built solves this problem.
> 
> Or why not "conditionally" use the ``pcs'' attribute on the ARM port ?
> That then means you don't need options magic on top ?

OK. I think your suggestion solves this problem more cleanly. I will
resubmit a patch later; we can discard this patch for now.
Thanks a lot.

Hale
> 
> Ramana





Re: [ARM] Fix RTX cost for vector SET

2015-04-27 Thread Kyrill Tkachov

Hi Kugan,

On 25/04/15 00:30, Kugan wrote:

Thanks for the review. I have updated the patch based on the comments
with some other minor changes. Bootstrapped and regression tested on
aarch64-none-linux-gnu with no-new regressions. Is this OK for trunk?


Thanks,
Kugan


gcc/ChangeLog:

2015-04-24  Kugan Vivekanandarajah  
Jim Wilson  

* config/arm/aarch-common-protos.h (struct mem_cost_table): Added
new  fields loadv and storev.
* config/aarch64/aarch64-cost-tables.h (thunderx_extra_costs):
Initialize loadv and storev.
* config/arm/aarch-cost-tables.h (generic_extra_costs): Likewise.
(cortexa53_extra_costs): Likewise.
(cortexa57_extra_costs): Likewise.
(xgene1_extra_costs): Likewise.
* config/aarch64/aarch64.c (aarch64_rtx_costs): Update vector
rtx_costs.


Due to the struct mem_cost_table update for aarch64, arm cost tables
also need to be updated. Please find the patch that does this.
Regression tested on arm-none-linux-gnu with no-new regressions. Is this
OK for trunk?

Thanks,
Kugan

gcc/ChangeLog:

2015-04-25  Kugan Vivekanandarajah  

* config/arm/arm.c (cortexa9_extra_costs): Initialize loadv and
 storev.
(cortexa8_extra_costs): Likewise.
(cortexa5_extra_costs): Likewise.
(cortexa7_extra_costs): Likewise.
(cortexa12_extra_costs): Likewise.
(cortexa15_extra_costs): Likewise.
(v7m_extra_costs): Likewise.


This arm part looks ok to me FWIW (if the approach in the aarch64
patch is deemed ok by the aarch64 maintainers), it just syncs the
fields of the common cost struct between arm and aarch64.

Please only commit this if the aarch64 patch gets approved and
committed at the same time, otherwise we'll get a build failure.
Having a look at the arm rtx costs and seeing if we can improve them
in the same way as the aarch64 ones would be good as a follow up at some
point too.

Thanks,
Kyrill





RE: [PATCH 1/2, combine] Try REG_EQUAL for nonzero_bits

2015-04-27 Thread Thomas Preud'homme
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Saturday, April 25, 2015 2:57 AM

> > +static rtx
> > +sign_extend_short_imm (rtx src, machine_mode mode, unsigned int prec)
> > +{
> > +  if (GET_MODE_PRECISION (mode) < prec && CONST_INT_P (src)
> > +  && INTVAL (src) > 0 && val_signbit_known_set_p (mode, INTVAL (src)))
> > +src = GEN_INT (INTVAL (src) | ~GET_MODE_MASK (mode));
> Can you go ahead and put each condition of the && on a separate line.
> It uses more vertical space, but IMHO makes this easier to read.  As I
> said, it was a nit :-)

You're perfectly right. Anything that can improve readability of source code
is a good thing.

> 
> OK with that fix.

Committed.

Best regards,

Thomas





[Patch, fortran] Simplify lbound for array subcomponents

2015-04-27 Thread Mikael Morin
Hello,

while reviewing Thomas' bound simplification patch, I noticed that the
{l,u}bound simplification code wasn't handling array subcomponents.
Fixed by the attached patch, regression tested.  OK for trunk?

Mikael





2015-04-27  Mikael Morin  

* simplify.c (simplify_bound_dim): Tighten the check for array fullness
by also checking for absence of subreference.
(simplify_bound): Don't skip simplification if the array
has subreferences.
(simplify_cobound): Same.

2015-04-27  Mikael Morin  

* gfortran.dg/bound_simplify_4.f90: New.

Index: simplify.c
===
--- simplify.c	(revision 221972)
+++ simplify.c	(working copy)
@@ -3338,7 +3338,7 @@ simplify_bound_dim (gfc_expr *array, gfc_expr *kin
   result = gfc_get_constant_expr (BT_INTEGER, k, &array->where);
 
   /* Then, we need to know the extent of the given dimension.  */
-  if (coarray || ref->u.ar.type == AR_FULL)
+  if (coarray || (ref->u.ar.type == AR_FULL && !ref->next))
 {
   l = as->lower[d-1];
   u = as->upper[d-1];
@@ -3417,11 +3417,8 @@ simplify_bound (gfc_expr *array, gfc_expr *dim, gf
 	case AR_FULL:
 	  /* We're done because 'as' has already been set in the
 		 previous iteration.  */
-	  if (!ref->next)
-		goto done;
+	  goto done;
 
-	/* Fall through.  */
-
 	case AR_UNKNOWN:
 	  return NULL;
 
@@ -3556,11 +3553,8 @@ simplify_cobound (gfc_expr *array, gfc_expr *dim,
 	case AR_FULL:
 	  /* We're done because 'as' has already been set in the
 		 previous iteration.  */
-	  if (!ref->next)
-	goto done;
+	  goto done;
 
-	/* Fall through.  */
-
 	case AR_UNKNOWN:
 	  return NULL;
 



! { dg-do run }
! { dg-additional-options "-fcoarray=single -fdump-tree-original" }
!
! Check that {L,U}{,CO}BOUND intrinsics are properly simplified.
!
  type :: t
integer :: c
  end type t

  type(t) :: d(3:8) = t(7)
  type(t) :: e[5:9,-1:*]

  if (lbound(d,  1) /= 3) call abort
  if (lbound(d(3:5), 1) /= 1) call abort
  if (lbound(d%c,1) /= 1) call abort  
  if (ubound(d,  1) /= 8) call abort  
  if (ubound(d(3:5), 1) /= 3) call abort
  if (ubound(d%c,1) /= 6) call abort  

  if (lcobound(e,   1) /=  5) call abort
  if (lcobound(e%c, 1) /=  5) call abort
  if (lcobound(e,   2) /= -1) call abort
  if (lcobound(e%c, 2) /= -1) call abort
  if (ucobound(e,   1) /=  9) call abort
  if (ucobound(e%c, 1) /=  9) call abort
  ! no simplification for ucobound(e{,%c}, dim=2)
end
! { dg-final { scan-tree-dump-not "bound" "original" } }
! { dg-final { scan-tree-dump-not "abort" "original" } }
! { dg-final { cleanup-tree-dump "original" } }



Re: Hide _S_n_primes from user code

2015-04-27 Thread Jonathan Wakely

On 22/04/15 22:11 +0200, François Dumont wrote:

+constexpr auto __n_primes
+  = sizeof(__prime_list) / sizeof(unsigned long) - 1;


Normally I'd say

   sizeof(__prime_list) / sizeof(*__prime_list) - 1

would be better, but since it's very unlikely we'll change the element
type in the array what you have should be safe.

OK for trunk.



Re: [PR64164] drop copyrename, integrate into expand

2015-04-27 Thread Richard Biener
On Fri, Apr 24, 2015 at 3:56 AM, Alexandre Oliva  wrote:
> On Apr  6, 2015, Jeff Law  wrote:
>
 So the bulk of the changes into this routine are really about picking
 a good leader, which presumably is how we're able to get the desired
 effects on debuginfo that we used to get from tree-ssa-copyrename.c?
>>>
>>> This has nothing to do with debuginfo, I'm afraid.  We just had to keep
>>> track of parm and result decls to avoid coalescing them together, and
>>> parm decl default defs to promote them to leaders, because expand copies
>>> incoming REGs to pseudos in PARM_DECL's DECL_RTL.  We should fill that
>>> in with the RTL created for the default def for the PARM_DECL.  At the
>>> end, I believe we also copy the RESULT_DECL DECL_RTL to the actual
>>> return register or rtl.  I didn't want to tackle the reworking of these
>>> expanders to avoid problems out of copying incoming parms to one pseudo
>>> and then reading it from another, as I observed before I made this
>>> change.  I'm now tackling that, so that we can refrain from touching the
>>> base vars altogether, and not refrain from coalescing parms and results.
>
>> Hmmm, so the real issue here is the expansion setup of parms and
>> results.  I hadn't pondered that aspect.  I'd encourage fixing the
>> expansion code too if you can see a path for that.
>
> That was the trickiest bit of the patch: getting assign_parms to use the
> out-of-SSA-chosen RTL for the (default def of the) param instead of
> creating a pseudo or stack slot of its own, especially when we create a
> .result_ptr decl and there is an incoming by-ref result_decl, in which
> case we ought to use the same SSA-assigned pseudo for both.  Another
> case worth mentioning is that in which a param is unused: there is no
> default def for it, but in the non-optimized case, we have to assign it
> to the same location.  I've used the DECL_RTL itself to carry the
> information in this case, at least in the non-optimized case, in which
> we know all SSA_NAMEs associated with each param *will* be assigned to
> the same partition, and use the same RTL.  If we do optimize, the param
> may get multiple locations, and DECL_RTL will be cleared.  That's fine:
> the incoming value of the param will end up being copied to a separate
> pseudo, so that there's no risk of messing with any other default def
> (there's a new testcase for this), and the copy is likely to be
> optimized out.
>
> The other tricky bit was to fix all expander bits that required
> SSA_NAMEs to have a associated decl.  I've removed all such cases, so we
> can now expand anonymous SSA decls directly, without having to create an
> ignored decl.  Doing that, we can coalesce variables and expand each
> partition without worrying about choosing a preferred partition leader.
> We just have to make sure we don't consider pairs of variables eligible
> for coalescing if they should get different promoted modes, or a
> different register-or-stack choice, and then expansion of partitions is
> streamlined: we just expand each leader, and then adjust all SSA_NAMEs
> to associate the RTL with their base variables, if any.
>
>
> In this revision of the patch, I have retained -ftree-coalesce-vars, so
> that its negated form can be used in testcases that formerly expected no
> coalescing across user variables, but that explicitly disabled VTA.
>
> As for testcases, while investigating test regressions, I found out
> various guality failures had to do with VT's lack of awareness of custom
> calling conventions.  Caller's variables saved in registers that are
> normally call-clobbered, but that are call-saved in custom conventions
> set up for a callee, would end up invalidating the entry-point location
> associations.  I've arranged for var-tracking to use custom calling
> conventions for register invalidation at call insns, and this fixed not
> only a few guality regressions due to changes in register assignment,
> but a number of other long-standing guality failures.  Yay!  This could
> be split out into a standalone patch.
>
>
> On Mar 31, 2015, Richard Biener  wrote:
>
>> Did you do any statistics on how the number of basevars changes with your 
>> patch
>> compared to trunk?
>
> In this version of the patch, we no longer touch the base vars at all.
> We just associate the piece of RTL generated for the partition with a
> list of decls, if needed.  (I've just realized that I never noticed a
> list of decls show up anywhere, and looking into this, I saw a bug in
> the leader_merge function, that causes it to fail to go from a single
> entry to a list: it creates the list, but then returns the original
> non-list entry; that's why I never saw them!  I won't delay posting the
> patch just because of this; I'm not even sure we want decl lists in REG
> or MEM attrs begin with)
>
> I have collected some statistics on the effects of the patch in
> compiling stage3-gcc/, before and after the patch, with and without
> -fno-tree-coalesce-vars. 

Re: [PATCH] Fix LTO option streaming

2015-04-27 Thread Richard Biener
On Thu, 23 Apr 2015, Richard Biener wrote:

> On Thu, 23 Apr 2015, Jan Hubicka wrote:
> 
> > > 
> > > It looks like when transitioning to using target and optimization
> > > option nodes for compile-time to link-time option streaming you
> > > didn't adjust lto-opts.c nor lto-wrapper.c.  The following fixes
> > 
> > Yep, I assumed that lto-wrapper's merging is now redundant for optimization
> > options, while it is still needed for global functions (-fPIC), which were
> > not covered by my changes, and since this was all a bit late, it seemed I
> > could just keep it for the next stage1.
> > 
> > > the target option side (for SWITCHABLE_TARGET).  Do not record
> > > any target options in the lto_opts section.
> > > 
> > > Honza - I suppose we don't have any testcase that this works, I'll
> > > try to come up with sth.  This also looks like a correctness issue
> > > to me.
> > 
> > Thanks, that would be a good idea. So with SWITCHABLE_TARGET
> > we have the problem that after switching the options are misinterpreted?

Ok, so with

t.c
---
extern int have_fma;
extern double x, y, z;
void do_fma (void);
int main()
{
  if (have_fma)
do_fma ();
  else
x = x + y * z;
}

t2.c
---
int have_fma;
volatile double x, y, z;
void do_fma (void)
{
  x = x + y * z;
}

and doing

gcc -c -Ofast -flto t.c
gcc -c -Ofast -mfma -flto t2.c
gcc t.o t2.o

I get -mfma on the lto1 command-line for the link.  But I also get for 
main()

 

while for do_fma()

 

and for some reason main () ends up _not_ having -mfma enabled.

I'm not sure what disables it or how we arrive at the above set of
options in the first place...  does the TARGET_OPTION_NODE store
the complete state of all target options?  I had the impression
it only stores "modification"?

So it seems to work (and simply ignore the global -mfma flag).  Not
sure on whether by design or by luck ;)

Richard.


> > I think Jakub made a patch to simply skip target-option-node streaming
> > here; is that how this is supposed to work?
> 
> Not all targets support target attributes and I think those that
> support them are those that are SWITCHABLE_TARGET - but maybe I am
> mistaken?
> 
> > > 
> > > We can do similar changes for optimize options, now, for all targets, no?
> > 
> > I am not quite sure about this; not all options are of a per-function
> > nature.  I need to audit all the target options for LTO safety; I did some
> > work on this for optimization options, but the target options are still
> > very wild.
> > For example, I am not quite convinced that mixing -malign-double and
> > -mno-align-double units will work as expected.
> 
> Well, but once types are laid out specifying sth else at link time
> won't help.  But yes, if we don't stream the option we have no chance
> to complain in lto-wrapper.  It's just that we seem to not enter
> the merging machinery for single-file testcases so I chose to "fix"
> lto option streaming rather than lto-wrapper ... I'll have a closer
> look to lto-wrapper.  But to complain about target option mismatches
> we'd have to call a target hook from lto-wrapper (or finally re-structure
> all this and merge options in lto1 itself).
> 
> Richard.
> 
> > Honza
> > > 
> > > Thanks,
> > > Richard.
> > > 
> > > 2015-04-23  Richard Biener  
> > > 
> > >   * lto-opts.c (lto_write_options): Do not record target options
> > >   if we are targeting a SWITCHABLE_TARGET target.  We use target
> > >   options on functions to transfer target flags from compile to
> > >   link time.
> > > 
> > > Index: gcc/lto-opts.c
> > > ===
> > > *** gcc/lto-opts.c(revision 222360)
> > > --- gcc/lto-opts.c(working copy)
> > > *** lto_write_options (void)
> > > *** 219,224 
> > > --- 219,230 
> > > && lto_stream_offload_p)
> > >  continue;
> > >   
> > > +   /* Do not store target-specific options if we target a
> > > +  SWITCHABLE_TARGET target.  */
> > > +   if ((cl_options[option->opt_index].flags & CL_TARGET)
> > > +   && SWITCHABLE_TARGET)
> > > + continue;
> > > + 
> > > /* Drop options created from the gcc driver that will be rejected
> > >when passed on to the driver again.  */
> > > if (cl_options[option->opt_index].cl_reject_driver)
> > 
> > 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: niter_base simplification

2015-04-27 Thread Jonathan Wakely

On 22/04/15 22:10 +0200, François Dumont wrote:

Hello

   I don't know if I am missing something, but I think __niter_base
could be simplified to remove the use of _Iter_base. Additionally, I
overload it to also strip the __normal_iterator layer even when it is
behind a reverse_iterator or move_iterator, which might help the
compiler optimize the code. If not, it might still enable other
algorithm optimizations in the future...


   I preferred to provide a __make_reverse_iterator to allow the
latter in C++11 and not only in C++14. Is it fine to do it this way, or
would you prefer to simply get rid of this whole part?


It's fine to add __make_reverse_iterator but see my comment below.

   * include/bits/cpp_type_traits.h (__gnu_cxx::__normal_iterator): 
Delete.


You're removing __is_normal_iterator not __normal_iterator.


   * include/bits/stl_algobase.h (std::__niter_base): Adapt.
   * include/bits/stl_iterator.h (__make_reverse_iterator): New in C++11.
   (std::__niter_base): Overloads for std::reverse_iterator,
   __gnu_cxx::__normal_iterator and std::move_iterator.

Tested under Linux x86_64. I checked that std::copy still ends up 
calling __builtin_memmove when used on vector iterators.


François




diff --git a/libstdc++-v3/include/bits/stl_algobase.h 
b/libstdc++-v3/include/bits/stl_algobase.h
index 0bcb133..73eea6b 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -270,17 +270,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  return __a;
}

-  // If _Iterator is a __normal_iterator return its base (a plain pointer,
-  // normally) otherwise return it untouched.  See copy, fill, ... 
+  // Fallback implementation of the function used to remove the
+  // __normal_iterator wrapper. See copy, fill, ...


It's a bit strange to have a function with no other overloads visible
described as a fallback. It would be good to say that the other
definition is in bits/stl_iterator.h


  template<typename _Iterator>
-struct _Niter_base
-: _Iter_base<_Iterator, __is_normal_iterator<_Iterator>::__value>
-{ };
-
-  template<typename _Iterator>
-inline typename _Niter_base<_Iterator>::iterator_type
+inline _Iterator
__niter_base(_Iterator __it)
-{ return std::_Niter_base<_Iterator>::_S_base(__it); }
+{ return __it; }

  // Likewise, for move_iterator.


This comment no longer makes sense, because you've removed the comment
on _Niter_base that it referred to. Please restore the original text
of the _Niter_base comment for _Miter_base.

(Alternatively, could the same simplification be made for
__miter_base? Do we need _Miter_base<> or just two overloads of
__miter_base()?)



  template
diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
b/libstdc++-v3/include/bits/stl_iterator.h
index 4a9189e..3aad9f3 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -390,7 +390,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{ return __y.base() - __x.base(); }
  //@}

-#if __cplusplus > 201103L
+#if __cplusplus == 201103L
+  template<typename _Iterator>
+inline reverse_iterator<_Iterator>
+__make_reverse_iterator(_Iterator __i)
+{ return reverse_iterator<_Iterator>(__i); }
+
+# define _GLIBCXX_MAKE_REVERSE_ITERATOR(_Iter) \
+  std::__make_reverse_iterator(_Iter)
+#elif __cplusplus > 201103L
#define __cpp_lib_make_reverse_iterator 201402

  // _GLIBCXX_RESOLVE_LIB_DEFECTS
@@ -400,6 +408,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
inline reverse_iterator<_Iterator>
make_reverse_iterator(_Iterator __i)
{ return reverse_iterator<_Iterator>(__i); }
+
+# define _GLIBCXX_MAKE_REVERSE_ITERATOR(_Iter) \
+  std::make_reverse_iterator(_Iter)
+#endif
+
+#if __cplusplus >= 201103L
+  template<typename _Iterator>
+auto
+__niter_base(reverse_iterator<_Iterator> __it)
+-> decltype(_GLIBCXX_MAKE_REVERSE_ITERATOR(__niter_base(__it.base())))
+{ return _GLIBCXX_MAKE_REVERSE_ITERATOR(__niter_base(__it.base())); }
#endif



It might be simpler to just add __make_reverse_iterator for >= 201103L
and then always use std::__make_reverse_iterator instead of a macro.

That's similar to what we do for std::__addressof and std::addressof.


Re: [Patch] Add regex_constants::__polynomial

2015-04-27 Thread Jonathan Wakely

On 26/04/15 17:55 -0700, Tim Shen wrote:

I didn't test with _GLIBCXX_DEBUG though. Updated the patch for
removing DFS restriction for ECMAScript.


OK for trunk.



Re: [PATCH, x86] Add TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook

2015-04-27 Thread Uros Bizjak
On Wed, Feb 4, 2015 at 2:21 PM, Christian Bruel  wrote:
> While trying to reduce the PR64835 case for ARM and x86, I noticed that the
> alignment flags are cleared for x86 when attribute optimized is used.
>
> With the attached testcases, the visible effects are twofold :
>
> 1) Functions compiled in with attribute optimize (-O2) are not aligned as if
> they were with the -O2 flag.
>
> 2) can_inline_edge_p fails because opts_for_fn (caller->decl) != opts_for_fn
> (callee->decl)) even-though they are compiled with the same optimization
> level.

2015-02-06  Christian Bruel  

PR target/64835
* config/i386/i386.c (ix86_default_align): New function.
(ix86_override_options_after_change): Call ix86_default_align.
(TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): New hook.
(ix86_override_options_after_change): New function.

2015-02-06  Christian Bruel  

PR target/64835
* gcc.dg/ipa/iinline-attr.c: New test.
* gcc.target/i386/iinline-attr-2.c: New test.

OK for mainline.

Thanks,
Uros


[PATCH, ARM] Alternatives and type attributes fixes.

2015-04-27 Thread Yvan Roux
Hi,

This is a follow-up of PR64208 where an LRA loop was due to redundancy
in insn's alternatives.  I've checked all the insns in the arm backend
to avoid latent problems and this patch fixes the issues I've spotted.

Only thumb2_movsicc_insn contained duplicated alternatives, and the
rest of the changes are related to the type attribute, which were not
accurate or used accordingly to their definition.  Notice that the
modifications have only a limited impact as in most of the pipeline
descriptions, multiple/mov_reg/alu_reg/.. types are used the same way,
only cortex a8 seems to have a real difference between alu or multiple
instruction and mov.

Bootstrapped and regtested on arm-linux-gnueabihf.  Ok for trunk ?

Thanks,
Yvan

2015-04-27  Yvan Roux  

* config/arm/arm.md (*arm_movt): Fix type attribute.
(*movsi_compare0): Likewise.
(*cmpsi_shiftsi): Likewise.
(*cmpsi_shiftsi_swp): Likewise.
(*movsicc_insn): Likewise.
(*cond_move): Likewise.
(*if_plus_move): Likewise.
(*if_move_plus): Likewise.
(*if_arith_move): Likewise.
(*if_move_arith): Likewise.
(*if_shift_move): Likewise.
(*if_move_shift): Likewise.
* config/arm/thumb2.md (*thumb2_movsi_insn): Fix type attribute.
(*thumb2_movsicc_insn): Fix alternative redundancy.
(*thumb2_addsi_short): Split register and immediate alternatives.
(thumb2_addsi3_compare0): Likewise.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 164ac13..ee23deb 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5631,7 +5631,7 @@
   [(set_attr "predicable" "yes")
(set_attr "predicable_short_it" "no")
(set_attr "length" "4")
-   (set_attr "type" "mov_imm")]
+   (set_attr "type" "alu_sreg")]
 )
 
 (define_insn "*arm_movsi_insn"
@@ -5919,7 +5919,7 @@
cmp%?\\t%0, #0
sub%.\\t%0, %1, #0"
   [(set_attr "conds" "set")
-   (set_attr "type" "alus_imm,alus_imm")]
+   (set_attr "type" "alus_sreg,alus_sreg")]
 )
 
 ;; Subroutine to store a half word from a register into memory.
@@ -6886,7 +6886,7 @@
   [(set_attr "conds" "set")
(set_attr "shift" "1")
(set_attr "arch" "32,a,a")
-   (set_attr "type" "alus_shift_imm,alu_shift_reg,alus_shift_imm")])
+   (set_attr "type" "alus_shift_imm,alus_shift_reg,alus_shift_imm")])
 
 (define_insn "*cmpsi_shiftsi_swp"
   [(set (reg:CC_SWP CC_REGNUM)
@@ -6899,7 +6899,7 @@
   [(set_attr "conds" "set")
(set_attr "shift" "1")
(set_attr "arch" "32,a,a")
-   (set_attr "type" "alus_shift_imm,alu_shift_reg,alus_shift_imm")])
+   (set_attr "type" "alus_shift_imm,alus_shift_reg,alus_shift_imm")])
 
 (define_insn "*arm_cmpsi_negshiftsi_si"
   [(set (reg:CC_Z CC_REGNUM)
@@ -7492,10 +7492,10 @@
 (const_string "mov_imm")
 (const_string "mov_reg"))
   (const_string "mvn_imm")
-  (const_string "mov_reg")
-  (const_string "mov_reg")
-  (const_string "mov_reg")
-  (const_string "mov_reg")])]
+  (const_string "multiple")
+  (const_string "multiple")
+  (const_string "multiple")
+  (const_string "multiple")])]
 )
 
 (define_insn "*movsfcc_soft_insn"
@@ -8653,7 +8653,14 @@
 return \"\";
   "
   [(set_attr "conds" "use")
-   (set_attr "type" "mov_reg,mov_reg,multiple")
+   (set_attr_alternative "type"
+ [(if_then_else (match_operand 2 "const_int_operand" "")
+(const_string "mov_imm")
+(const_string "mov_reg"))
+  (if_then_else (match_operand 1 "const_int_operand" "")
+(const_string "mov_imm")
+(const_string "mov_reg"))
+  (const_string "multiple")])
(set_attr "length" "4,4,8")]
 )
 
@@ -9449,8 +9456,8 @@
 (const_string "alu_imm" )
 (const_string "alu_sreg"))
   (const_string "alu_imm")
-  (const_string "alu_sreg")
-  (const_string "alu_sreg")])]
+  (const_string "multiple")
+  (const_string "multiple")])]
 )
 
 (define_insn "*ifcompare_move_plus"
@@ -9487,7 +9494,13 @@
sub%D4\\t%0, %2, #%n3\;mov%d4\\t%0, %1"
   [(set_attr "conds" "use")
(set_attr "length" "4,4,8,8")
-   (set_attr "type" "alu_sreg,alu_imm,multiple,multiple")]
+   (set_attr_alternative "type"
+ [(if_then_else (match_operand 3 "const_int_operand" "")
+(const_string "alu_imm" )
+(const_string "alu_sreg"))
+  (const_string "alu_imm")
+ 

[patch] Add a cross-reference link to libstdc++ manual

2015-04-27 Thread Jonathan Wakely

Committed to trunk.


commit 85b6429ce3410f9e204f7d926f67707556c6a4bf
Author: Jonathan Wakely 
Date:   Mon Apr 27 13:27:27 2015 +0100

	* doc/xml/manual/extensions.xml: Add cross-reference.
	* doc/html/manual/ext_compile_checks.html: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/extensions.xml b/libstdc++-v3/doc/xml/manual/extensions.xml
index c4120c9..41b1a80 100644
--- a/libstdc++-v3/doc/xml/manual/extensions.xml
+++ b/libstdc++-v3/doc/xml/manual/extensions.xml
@@ -82,7 +82,8 @@ extensions, be aware of two things:
   They can be enabled at configure time with
   --enable-concept-checks.
   You can enable them on a per-translation-unit basis with
-  #define _GLIBCXX_CONCEPT_CHECKS for GCC 3.4 and higher
+  #define
+  _GLIBCXX_CONCEPT_CHECKS for GCC 3.4 and higher
   (or with #define _GLIBCPP_CONCEPT_CHECKS for versions
   3.1, 3.2 and 3.3).



Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Richard Biener
On Mon, 27 Apr 2015, Richard Biener wrote:

> On Fri, 24 Apr 2015, Jeff Law wrote:
> 
> > On 02/17/2015 07:58 AM, Richard Biener wrote:
> > [ Restarting a old thread... ]
> > 
> > > On a closer look the record_const_or_copy_1 hunk is redundant
> > > (record_equality is really a bit obfuscated...).
> > Agreed.  I'm not entirely sure how it got to this point.
> > 
> > > And record_equality is where the SSA_NAME_VALUEs might end up
> > > forming chains?  At least because we might record a SSA_NAME_VALUE
> > > for sth that might be the target of a SSA_NAME_VALUE, as in
> > > 
> > >   a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
> > >   if (b_2 == i_3(D))
> > > ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
> > > 
> > > thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
> > > SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
> > > can't easily avoid that because we don't keep track of a
> > > reverse mapping (nor would we want to update that).
> > Right.  In general, the fact that we're in SSA form means that we ought not
> > get loops in the SSA_NAME_VALUE chain since there's a single static 
> > assignment
> > and we'll visit it once.
> > 
> > That nice model breaks down in two ways.  First we derive equivalences from
> > equality conditions like you've shown above.  Second, when threading we can
> > thread through a loop latch and start reprocessing a block.  The interaction
> > between those two can result in loops in the value chain.
> > 
> > What I'm hoping to do in GCC6 is eliminate all this stuff with a threader 
> > that
> > is independent of the dominator walk (and optimizer). Instead of walking
> > forward through the dominator tree recording key nuggets, then removing 
> > those
> > nuggets as we pop back up the tree, we instead we start with the conditional
> > and walk up the use-def chains, simplifying as we go, with the goal of
> > simplifying to a constant, of course.
> > 
> > You can see the beginnings of that with the FSM bits from Sebastian.
> > 
> > Obviously until those bits are in place, we should try to clean up any warts
> > in the current implementation.
> > 
> > > 
> > > Btw, in record_equality, the == part of the <= check for the loop
> > > depth will just randomly swap x and y while we should prefer
> > > IL canonical order.  If we can't rely on the input being in IL
> > > canonical order we should prepend the whole function with
> > > 
> > >   if (tree_swap_operands_p (x, y, false))
> > > std::swap (x, y);
> > > 
> > > and change <= to < for the loop-depth check.
> > Agreed.  Makes perfect sense.
> 
> I'm now re-bootstrapping and testing the following.

Committed as follows, with XFAILing and re-opening PR65217.

Richard.

2015-04-27  Richard Biener  

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_equivalences_from_stmt): Valueize rhs.
(record_equality): Canonicalize x and y order via
tree_swap_operands_p.  Do not swap operands for same loop depth.

* gcc.target/i386/pr65217.c: XFAIL.

Index: gcc/tree-ssa-dom.c
===
*** gcc/tree-ssa-dom.c  (revision 222360)
--- gcc/tree-ssa-dom.c  (working copy)
*** record_equivalences_from_phis (basic_blo
*** 1519,1524 
--- 1519,1531 
  if (lhs == t)
continue;
  
+ /* Valueize t.  */
+ if (TREE_CODE (t) == SSA_NAME)
+   {
+ tree tmp = SSA_NAME_VALUE (t);
+ t = tmp ? tmp : t;
+   }
+ 
  /* If we have not processed an alternative yet, then set
 RHS to this alternative.  */
  if (rhs == NULL)
*** record_equality (tree x, tree y)
*** 1752,1757 
--- 1759,1767 
  {
tree prev_x = NULL, prev_y = NULL;
  
+   if (tree_swap_operands_p (x, y, false))
+ std::swap (x, y);
+ 
if (TREE_CODE (x) == SSA_NAME)
  prev_x = SSA_NAME_VALUE (x);
if (TREE_CODE (y) == SSA_NAME)
*** record_equality (tree x, tree y)
*** 1766,1772 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x) <= loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x && is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
--- 1776,1782 
else if (is_gimple_min_invariant (x)
   /* ???  When threading over backedges the following is important
  for correctness.  See PR61757.  */
!  || (loop_depth_of_name (x) < loop_depth_of_name (y)))
  prev_x = x, x = y, y = prev_x, prev_x = prev_y;
else if (prev_x && is_gimple_min_invariant (prev_x))
  x = y, y = prev_x, prev_x = prev_y;
*** record_equivalences_from_stmt (gimple st
*** 2128,2145 
if (may_optimize_p
  && (TREE_CODE (rhs) == SSA_

Re: [PATCH][PR65818][bootstrap,hppa] Return side-effect free result in gimplify_va_arg_internal

2015-04-27 Thread Tom de Vries

On 27-04-15 10:17, Richard Biener wrote:

This patch fixes that by gimplifying the address expression of the mem-ref
>returned by the target hook (borrowing code from gimplify_expr, case
>MEM_REF).
>
>Bootstrapped and reg-tested on x86_64.
>
>Bootstrapped and reg-tested on hppa2.0w-hp-hpux11.11.
>
>OK for trunk?

Hmm, that assert looks suspicious...

Can't you simply always do

   gimplify_expr (expr, pre_p, post_p, is_gimple_lvalue, fb_lvalue);


It's a bit counter-intuitive for me, using is_gimple_lvalue for something (the 
result of va_arg) we use as rvalue.


But, it seems to work (with & in front of expr).

OK if bootstrap and reg-test on x86_64 succeeds?

Thanks,
- Tom

Return side-effect free result in gimplify_va_arg_internal

2015-04-27  Tom de Vries  

	PR tree-optimization/65818
	* gimplify.c (gimplify_va_arg_internal): Ensure that only side-effect
	free values are returned.
---
 gcc/gimplify.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index c68bd47..4a68c87 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -9352,7 +9352,9 @@ gimplify_va_arg_internal (tree valist, tree type, location_t loc,
   else
 gimplify_expr (&valist, pre_p, post_p, is_gimple_min_lval, fb_lvalue);
 
-  return targetm.gimplify_va_arg_expr (valist, type, pre_p, post_p);
+  tree expr = targetm.gimplify_va_arg_expr (valist, type, pre_p, post_p);
+  gimplify_expr (&expr, pre_p, post_p, is_gimple_lvalue, fb_lvalue);
+  return expr;
 }
 
 /* Gimplify __builtin_va_arg, aka VA_ARG_EXPR, which is not really a
-- 
1.9.1



Re: Fix xstormy16 handling of HImode predicates

2015-04-27 Thread Nicholas Clifton

Hi Richard,


gcc/
* config/stormy16/predicates.md (xs_hi_general_operand): Delete.
(xs_hi_nonmemory_operand): Remove error.
* config/stormy16/stormy16.md (movhi, movhi_internal): Use
general_operand rather than xs_hi_general_operand.


Approved - please apply.

Cheers
  Nick



Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-04-27 Thread John David Anglin

On 2015-04-27 6:12 AM, Kyrill Tkachov wrote:


On 22/04/15 12:51, Kyrill Tkachov wrote:

On 21/04/15 18:33, Kyrill Tkachov wrote:

On 21/04/15 15:09, Jeff Law wrote:

On 04/21/2015 02:30 AM, Kyrill Tkachov wrote:
From reading config/stormy16/stormy-abi it seems to me that we don't
pass arguments partially in stormy16, so this code would never be called
there. That leaves pa as the potential problematic target.

I don't suppose there's an easy way to test on pa? My checkout of binutils
doesn't seem to include a sim target for it.

No simulator, no machines in the testfarm, the box I had access to via
parisc-linux.org seems dead and my ancient PA overheats well before a
bootstrap could complete.  I often regret knowing about the backwards
way many things were done on the PA because it makes me think about
cases that only matter on dead architectures.

So what should be the action plan here? I can't add an assert on
positive result as a negative result is valid.

We want to catch the case where this would cause trouble on
pa, or change the patch until we're confident that it's fine
for pa.

That being said, reading the documentation of STACK_GROWS_UPWARD
and ARGS_GROW_DOWNWARD I'm having a hard time visualising a case
where this would cause trouble on pa.

Is the problem that in the function:

+/* Add SIZE to X and check whether it's greater than Y.
+   If it is, return the constant amount by which it's greater or smaller.
+   If the two are not statically comparable (for example, X and Y contain
+   different registers) return -1.  This is used in expand_push_insn to
+   figure out if reading SIZE bytes from location X will end up reading from
+   location Y.  */
+static int
+memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
+{
+  rtx tmp = plus_constant (Pmode, x, size);
+  rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);
+
+  if (!CONST_INT_P (sub))
+return -1;
+
+  return INTVAL (sub);
+}

for ARGS_GROW_DOWNWARD we would be reading 'backwards' from x,
so the function should be something like the following?

static int
memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
{
#ifdef ARGS_GROW_DOWNWARD
 rtx tmp = plus_constant (Pmode, x, -size);
#else
 rtx tmp = plus_constant (Pmode, x, size);
#endif
 rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);

 if (!CONST_INT_P (sub))
   return -1;

#ifdef ARGS_GROW_DOWNWARD
 return -INTVAL (sub);
#else
 return INTVAL (sub);
#endif
}

now, say for x == sp + 4,  y == sp + 8, size == 16:
This would be a problematic case for arm, so this code on arm
(where ARGS_GROW_DOWNWARD is *not* defined) would return
12, which is the number of bytes that overlap.

On a target where ARGS_GROW_DOWNWARD is defined this would return
-20, meaning that no overlap occurs (because we read in the descending
direction from x, IIUC).

Hi Jeff,

Here's an attempt to make this more concrete.
Only the memory_load_overlap function has changed.
This time I tried to take into account the case when
ARGS_GROW_DOWNWARD.

Take the case where x == sp, y == sp + 8, size == 16.
For arm, this would return 8 as that is the number of bytes
that overlap. On pa, since ARGS_GROW_DOWNWARD is defined it
would return -1 as we're reading down from x rather than up
towards y.

In the case when x == sp + 8, y == sp, size == 16
This would return -1 on arm since we're reading upwards from x
and therefore no overlap would happen.

On pa, this would return 8, which I think is the right thing.
But again, I don't have access to any pa means of testing.

What do you think of this approach?


Hi Dave,

Would it be possible for you to test this patch on a 64-bit hppa
or at least bootstrap it?
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01288.html
I started a build and test with your patch on hppa64-hp-hpux11.11 this morning.




There is a concern that it may potentially affect the passing of
complex arguments partially on the stack and partially in regs
on pa because of the way the args and stack grow on that target.

Unfortunately I don't have access to any hardware or simulators.
It would help a lot with getting this patch in.

If you write to linux-par...@vger.kernel.org, arrangements can be made for
an account on a Debian parisc linux machine for development testing.

Helge Deller has arranged for some new machines since we took over the
Debian buildd infrastructure  for parisc.  More info is here:
https://parisc.wiki.kernel.org/index.php/Main_Page



Thanks,
Kyrill




Thanks,
Kyrill

P.S. I've included the testcase from Honggyu in the patch.

2015-04-22  Kyrylo Tkachov  

  PR target/65358
  * expr.c (memory_load_overlap): New function.
  (emit_push_insn): When pushing partial args to the stack would
  clobber the register part load the overlapping part into a pseudo
  and put it into the hard reg after pushing.

2015-04-22  Honggyu Kim  

  PR target/65358
  * gcc.dg/pr65358.c: New test.



Thanks,
Kyrill


Jeff







Regards,
Dave

Re: [PATCH, ARM] Alternatives and type attributes fixes.

2015-04-27 Thread Kyrill Tkachov

Hi Yvan,

On 27/04/15 13:25, Yvan Roux wrote:

Hi,

This is a follow-up of PR64208 where an LRA loop was due to redundancy
in insn's alternatives.  I've checked all the insns in the arm backend
to avoid latent problems and this patch fixes the issues I've spotted.

Only thumb2_movsicc_insn contained duplicated alternatives, and the
rest of the changes are related to the type attribute, which were not
accurate or used accordingly to their definition.  Notice that the
modifications have only a limited impact as in most of the pipeline
descriptions, multiple/mov_reg/alu_reg/.. types are used the same way,
only cortex a8 seems to have a real difference between alu or multiple
instruction and mov.

Bootstrapped and regtested on arm-linux-gnueabihf.  Ok for trunk ?

Thanks,
Yvan

2015-04-27  Yvan Roux

 * config/arm/arm.md (*arm_movt): Fix type attribute.
 (*movsi_compare0): Likewise.
 (*cmpsi_shiftsi): Likewise.
 (*cmpsi_shiftsi_swp): Likewise.
 (*movsicc_insn): Likewise.
 (*cond_move): Likewise.
 (*if_plus_move): Likewise.
 (*if_move_plus): Likewise.
 (*if_arith_move): Likewise.
 (*if_move_arith): Likewise.
 (*if_shift_move): Likewise.
 (*if_move_shift): Likewise.
 * config/arm/thumb2.md (*thumb2_movsi_insn): Fix type attribute.
 (*thumb2_movsicc_insn): Fix alternative redundancy.
 (*thumb2_addsi_short): Split register and immediate alternatives.
 (thumb2_addsi3_compare0): Likewise.

insn.diff


diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 164ac13..ee23deb 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5631,7 +5631,7 @@
[(set_attr "predicable" "yes")
 (set_attr "predicable_short_it" "no")
 (set_attr "length" "4")
-   (set_attr "type" "mov_imm")]
+   (set_attr "type" "alu_sreg")]
  )


For some context, this is the *arm_movt pattern that generates
a movt instruction. The documentation in types.md says:
"This excludes MOV and MVN but includes MOVT." So using alu_sreg
is correct according to that logic.
However, don't you want to also update the arm_movtas_ze pattern
that generates movt but erroneously has the type 'mov_imm'?

  
  (define_insn "*arm_movsi_insn"

@@ -5919,7 +5919,7 @@
 cmp%?\\t%0, #0
 sub%.\\t%0, %1, #0"
[(set_attr "conds" "set")
-   (set_attr "type" "alus_imm,alus_imm")]
+   (set_attr "type" "alus_sreg,alus_sreg")]
  )


Why change that? They are instructions with immediate operands
so alus_imm should be correct?


@@ -486,12 +486,12 @@
  )
  
  (define_insn_and_split "*thumb2_movsicc_insn"

-  [(set (match_operand:SI 0 "s_register_operand" "=l,l,r,r,r,r,r,r,r,r,r")
+  [(set (match_operand:SI 0 "s_register_operand" "=l,l,r,r,r,r,r,r,r,r,r,r")
(if_then_else:SI
 (match_operator 3 "arm_comparison_operator"
  [(match_operand 4 "cc_register" "") (const_int 0)])
-(match_operand:SI 1 "arm_not_operand" "0 ,lPy,0 ,0,rI,K,rI,rI,K ,K,r")
-(match_operand:SI 2 "arm_not_operand" "lPy,0 ,rI,K,0 ,0,rI,K ,rI,K,r")))]
+(match_operand:SI 1 "arm_not_operand" "0 ,lPy,0 ,0,rI,K,I ,r,rI,K ,K,r")
+(match_operand:SI 2 "arm_not_operand" "lPy,0 ,rI,K,0 ,0,rI,I,K ,rI,K,r")))]
"TARGET_THUMB2"
"@
 it\\t%D3\;mov%D3\\t%0, %2
@@ -504,12 +504,14 @@
 #
 #
 #
+   #
 #"
 ; alt 6: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
-   ; alt 7: ite\\t%d3\;mov%d3\\t%0, %1\;mvn%D3\\t%0, #%B2
-   ; alt 8: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mov%D3\\t%0, %2
-   ; alt 9: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mvn%D3\\t%0, #%B2
-   ; alt 10: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
+   ; alt 7: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
+   ; alt 8: ite\\t%d3\;mov%d3\\t%0, %1\;mvn%D3\\t%0, #%B2
+   ; alt 9: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mov%D3\\t%0, %2
+   ; alt 10: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mvn%D3\\t%0, #%B2
+   ; alt 11: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
"&& reload_completed"
[(const_int 0)]
{
@@ -540,8 +542,8 @@
 operands[2])));
  DONE;
}
-  [(set_attr "length" "4,4,6,6,6,6,10,10,10,10,6")
-   (set_attr "enabled_for_depr_it" "yes,yes,no,no,no,no,no,no,no,no,yes")
+  [(set_attr "length" "4,4,6,6,6,6,10,8,10,10,10,6")
+   (set_attr "enabled_for_depr_it" "yes,yes,no,no,no,no,no,no,no,no,no,yes")
 (set_attr "conds" "use")
 (set_attr "type" "multiple")]
  )


So, I think here for the first 6 alternatives we can give it a more specific type,
like mov_imm or mov_reg, since you're cleaning this stuff up. The instructions in
those alternatives are just a mov instruction enclosed in an IT block, so they
always have to stick together anyway.


@@ -1161,9 +1163,9 @@
  )
  
  (define_insn "*thumb2_addsi_short"

-  [(set (match_operand:SI 0 "low_register_operand" "=l,l")
-   (plus:SI (match_operand:SI 1 "low_register_operand" "l,0")
-(match_operand:SI 2 "low_reg_or_int_operand" "lPt,Ps")))
+  [(set (match_operand:SI 0 "low_register_operand"

[PATCH][AArch64] Fix operand costing logic for MINUS

2015-04-27 Thread Kyrill Tkachov

Hi all,

This fixes a case in aarch64 costs where we forgot
to account for one of the operands in the MINUS case.

This is important to get right as the mult synthesis code
can ask for the cost of a shift+sub operation and put the
shift part in any of the MINUS operands, expecting the cost function to
break it down into a separate shift operation if the
two cannot be combined.

Bootstrapped and tested on aarch64.
Ok for trunk?

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_rtx_costs, MINUS):
Properly account for both operand costs in simple case.
commit 4859566692e4b9195b975632ed105b9c4b6ab765
Author: Kyrylo Tkachov 
Date:   Mon Mar 2 10:18:22 2015 +

[AArch64] Properly cost both operands of MINUS.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 7ffa7ee..5a3f887 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5844,7 +5844,8 @@ cost_minus:
 	return true;
 	  }
 
-	*cost += rtx_cost (new_op1, MINUS, 1, speed);
+	*cost += rtx_cost (new_op1, MINUS, 1, speed)
+	 + rtx_cost (op0, MINUS, 0, speed);
 
 	if (speed)
 	  {


Re: [PATCH, rs6000, testsuite, PR65456] Changes for unaligned vector load/store support on POWER8

2015-04-27 Thread Bill Schmidt
On Mon, 2015-04-27 at 14:23 +0800, Bin.Cheng wrote:
> On Mon, Mar 30, 2015 at 1:42 AM, Bill Schmidt
>  wrote:

> 
> > Index: gcc/testsuite/gcc.dg/vect/vect-33.c
> > ===
> > --- gcc/testsuite/gcc.dg/vect/vect-33.c (revision 221118)
> > +++ gcc/testsuite/gcc.dg/vect/vect-33.c (working copy)
> > @@ -36,9 +36,10 @@ int main (void)
> >return main1 ();
> >  }
> >
> > +/* vect_hw_misalign && { ! vect64 } */
> >
> >  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
> > -/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" "vect" { 
> > target { vect_hw_misalign && { {! vect64} || vect_multiple_sizes } } } } } 
> > */
> > +/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" "vect" { 
> > target { { { ! powerpc*-*-* } && vect_hw_misalign } && { { ! vect64 } || 
> > vect_multiple_sizes } } } } }  */
> >  /* { dg-final { scan-tree-dump "Alignment of access forced using peeling" 
> > "vect" { target { vector_alignment_reachable && { vect64 && {! 
> > vect_multiple_sizes} } } } } } */
> >  /* { dg-final { scan-tree-dump-times "Alignment of access forced using 
> > versioning" 1 "vect" { target { { {! vector_alignment_reachable} || {! 
> > vect64} } && {! vect_hw_misalign} } } } } */
> >  /* { dg-final { cleanup-tree-dump "vect" } } */
> 
> Hi Bill,
> With this change, the test case is skipped on aarch64 now.  Since it
> passed before, Is it expected to act like this on 64bit platforms?

Hi Bin,

No, that's a mistake on my part -- thanks for the report!  That first
added line was not intended to be part of the patch:

+/* vect_hw_misalign && { ! vect64 } */

Please try removing that line and verify that the patch succeeds again
for ARM.  Assuming so, I'll prepare a patch to fix this.

It looks like this mistake was introduced only in this particular test,
but please let me know if you see any other anomalies.

Thanks very much!

Bill
> 
> PASS->NA: gcc.dg/vect/vect-33.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "Vectorizing an unaligned access" 0
> PASS->NA: gcc.dg/vect/vect-33.c scan-tree-dump-times vect "Vectorizing
> an unaligned access" 0
> 
> Thanks,
> bin
> 




[PATCH][AArch64] Improve spill code - swap order in shl pattern

2015-04-27 Thread Wilco Dijkstra
Various instructions are supported as integer operations as well as SIMD on
AArch64.  When register pressure is high, lra-constraints inserts spill code
without taking the allocation class into account, and basically chooses the
first available pattern that matches.  Since this instruction has the SIMD
version first, it is usually chosen even though some of the operands are
eventually allocated to integer registers.  The result is inefficient code,
not only due to the higher latency of SIMD instructions but also due to the
extra int<->FP moves.  Placing the integer variant first in the shl pattern
generates far more optimal spill code.  A few more patterns are the wrong way
around, which I'll address in a separate patch.  I'm also looking into fixing
lra-constraints to generate the expected code by taking the allocno class
into account in the cost calculations during spilling.

2015-04-27  Wilco Dijkstra  

* gcc/config/aarch64/aarch64.md (aarch64_ashl_sisd_or_int_3):
Place integer variant first.

---
 gcc/config/aarch64/aarch64.md | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 7163025..baef56a 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3334,17 +3334,17 @@
 
 ;; Logical left shift using SISD or Integer instruction
 (define_insn "*aarch64_ashl_sisd_or_int_3"
-  [(set (match_operand:GPI 0 "register_operand" "=w,w,r")
+  [(set (match_operand:GPI 0 "register_operand" "=r,w,w")
 (ashift:GPI
-  (match_operand:GPI 1 "register_operand" "w,w,r")
-  (match_operand:QI 2 "aarch64_reg_or_shift_imm_" "Us,w,rUs")))]
+  (match_operand:GPI 1 "register_operand" "r,w,w")
+  (match_operand:QI 2 "aarch64_reg_or_shift_imm_" "rUs,Us,w")))]
   ""
   "@
+   lsl\t%0, %1, %2
shl\t%0, %1, %2
-   ushl\t%0, %1, %2
-   lsl\t%0, %1, %2"
-  [(set_attr "simd" "yes,yes,no")
-   (set_attr "type" "neon_shift_imm, neon_shift_reg,shift_reg")]
+   ushl\t%0, %1, %2"
+  [(set_attr "simd" "no,yes,yes")
+   (set_attr "type" "shift_reg,neon_shift_imm, neon_shift_reg")]
 )
 
 ;; Logical right shift using SISD or Integer instruction
-- 
1.9.1




Re: [PATCH][simplify-rtx][trivial] Use std::swap instead of manually swapping

2015-04-27 Thread Jeff Law

On 04/27/2015 03:55 AM, Kyrill Tkachov wrote:

Hi all,

This patch replaces in simplify-rtx.c the manual swapping of values with
std::swap.
Precedents suggest these are considered obvious changes.

Bootstrapped and tested on aarch64, x86_64.

Will commit as obvious in a couple of days if no one objects

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  

 * simplify-rtx.c (simplify_gen_binary): Use std::swap instead
 of manually swapping.
 (simplify_associative_operation): Likewise.
 (simplify_binary_operation): Likewise.
 (simplify_plus_minus): Likewise.
 (simplify_relational_operation): Likewise.
 (simplify_ternary_operation): Likewise.
OK.  And just to be explicit, please consider any patch which just 
converts manual swapping to std::swap as pre-approved.  Just post them 
for archival purposes.



jeff



Re: [PATCH][PR65818][bootstrap,hppa] Return side-effect free result in gimplify_va_arg_internal

2015-04-27 Thread Richard Biener
On Mon, Apr 27, 2015 at 3:06 PM, Tom de Vries  wrote:
> On 27-04-15 10:17, Richard Biener wrote:
>>>
>>> This patch fixes that by gimplifying the address expression of the
>>> mem-ref
>>> >returned by the target hook (borrowing code from gimplify_expr, case
>>> >MEM_REF).
>>> >
>>> >Bootstrapped and reg-tested on x86_64.
>>> >
>>> >Bootstrapped and reg-tested on hppa2.0w-hp-hpux11.11.
>>> >
>>> >OK for trunk?
>>
>> Hmm, that assert looks suspicious...
>>
>> Can't you simply always do
>>
>>gimplify_expr (expr, pre_p, post_p, is_gimple_lvalue, fb_lvalue);
>
>
> It's a bit counter-intuitive for me, using is_gimple_lvalue for something
> (the result of va_arg) we use as rvalue.

Yeah, choosing that was done because you could assert it's a MEM_REF
and tree-stdarg eventually builds a WITH_SIZE_EXPR around it.

It would possibly be easier to simply "inline" gimplify_va_arg_internal
at its use and only gimplify the assignment.  Though in that case the
original code probably worked - just in the lhs == NULL case it didn't,
which hints at a better place for the fix - in expand_ifn_va_arg_1 do

 if (lhs != NULL_TREE)
   {
...
   }
 else
   gimplify_expr (&expr, &pre, &post, is_gimple_val, fb_either);

So ... if you can re-try that one it's pre-approved.

Thanks,
Richard.

> But, it seems to work (with & in front of expr).
>
> OK if bootstrap and reg-test on x86_64 succeeds?
>
> Thanks,
> - Tom
>


RE: [PATCH][AArch64] Use conditional negate for abs expansion

2015-04-27 Thread Wilco Dijkstra
ping

> -Original Message-
> From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> Sent: 03 March 2015 16:19
> To: GCC Patches
> Subject: [PATCH][AArch64] Use conditional negate for abs expansion
> 
> Expand abs into a compare and conditional negate. This is the most obvious
> expansion, enables merging of the comparison into ALU instructions and is
> faster on all implementations.
> Bootstrapped & regression tested.
> 
> int f(int x) { return abs (x + 1); }
> 
> Before:
> add     w0, w0, 1
> sxtw    x0, w0
> eor     x1, x0, x0, asr 63
> sub     x1, x1, x0, asr 63
> mov     x0, x1
> ret
> 
> After:
> adds    w0, w0, 1
> csneg   w0, w0, w0, pl
> ret
> 
> ChangeLog:
> 
> 2015-03-03  Wilco Dijkstra  
> 
>   * gcc/config/aarch64/aarch64.md (absdi2): optimize abs expansion.
>   (csneg3_insn): enable expansion of pattern.
>   * gcc/testsuite/gcc.target/aarch64/abs_1.c (abs64): update test
>   for new abs expansion.  (abs64_in_dreg): likewise.
> 
> ---
>  gcc/config/aarch64/aarch64.md| 33 
> +++-
>  gcc/testsuite/gcc.target/aarch64/abs_1.c |  5 ++---
>  2 files changed, 9 insertions(+), 29 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 1f4169e..46b7a63 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -2172,35 +2172,16 @@
>[(set_attr "type" "alu_ext")]
>  )
> 
> -(define_insn_and_split "absdi2"
> -  [(set (match_operand:DI 0 "register_operand" "=&r,w")
> - (abs:DI (match_operand:DI 1 "register_operand" "r,w")))]
> +(define_expand "abs2"
> +  [(match_operand:GPI 0 "register_operand" "")
> +   (match_operand:GPI 1 "register_operand" "")]
>""
> -  "@
> -   #
> -   abs\\t%d0, %d1"
> -  "reload_completed
> -   && GP_REGNUM_P (REGNO (operands[0]))
> -   && GP_REGNUM_P (REGNO (operands[1]))"
> -  [(const_int 0)]
>{
> -emit_insn (gen_rtx_SET (VOIDmode, operands[0],
> - gen_rtx_XOR (DImode,
> -  gen_rtx_ASHIFTRT (DImode,
> -operands[1],
> -GEN_INT (63)),
> -  operands[1])));
> -emit_insn (gen_rtx_SET (VOIDmode,
> - operands[0],
> - gen_rtx_MINUS (DImode,
> -operands[0],
> -gen_rtx_ASHIFTRT (DImode,
> -  operands[1],
> -  GEN_INT (63);
> +rtx ccreg = aarch64_gen_compare_reg (LT, operands[1], const0_rtx);
> +rtx x = gen_rtx_LT (VOIDmode, ccreg, const0_rtx);
> +emit_insn (gen_csneg3_insn (operands[0], x, operands[1], 
> operands[1]));
>  DONE;
>}
> -  [(set_attr "type" "alu_sreg")
> -   (set_attr "simd" "no,yes")]
>  )
> 
>  (define_insn "neg2"
> @@ -2879,7 +2860,7 @@
>[(set_attr "type" "csel")]
>  )
> 
> -(define_insn "*csneg3_insn"
> +(define_insn "csneg3_insn"
>[(set (match_operand:GPI 0 "register_operand" "=r")
>  (if_then_else:GPI
> (match_operand 1 "aarch64_comparison_operation" "")
> diff --git a/gcc/testsuite/gcc.target/aarch64/abs_1.c
> b/gcc/testsuite/gcc.target/aarch64/abs_1.c
> index 938bc84..11f1095 100644
> --- a/gcc/testsuite/gcc.target/aarch64/abs_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/abs_1.c
> @@ -7,15 +7,14 @@ extern void abort (void);
>  long long
>  abs64 (long long a)
>  {
> -  /* { dg-final { scan-assembler "eor\t" } } */
> -  /* { dg-final { scan-assembler "sub\t" } } */
> +  /* { dg-final { scan-assembler "csneg\t" } } */
>return llabs (a);
>  }
> 
>  long long
>  abs64_in_dreg (long long a)
>  {
> -  /* { dg-final { scan-assembler "abs\td\[0-9\]+, d\[0-9\]+" } } */
> +  /* { dg-final { scan-assembler "csneg\t" } } */
>register long long x asm ("d8") = a;
>register long long y asm ("d9");
>asm volatile ("" : : "w" (x));
> --
> 1.9.1




RE: [PATCH][AArch64] Make aarch64_min_divisions_for_recip_mul configurable

2015-04-27 Thread Wilco Dijkstra
ping

> -Original Message-
> From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> Sent: 03 March 2015 18:06
> To: GCC Patches
> Subject: [PATCH][AArch64] Make aarch64_min_divisions_for_recip_mul 
> configurable
> 
> This patch makes aarch64_min_divisions_for_recip_mul configurable for float
> and double. This allows CPUs with really fast or multiple dividers to return
> 3 (or even 4) if that happens to be faster overall. No code generation
> change - bootstrap & regression OK.
> 
> ChangeLog:
> 2015-03-03  Wilco Dijkstra  
> 
>   * gcc/config/aarch64/aarch64-protos.h (tune_params):
>   Add min_div_recip_mul_sf and min_div_recip_mul_df fields.
>   * gcc/config/aarch64/aarch64.c (aarch64_min_divisions_for_recip_mul):
>   Return value depending on target.
>   (generic_tunings): Initialize new target settings.
>   (cortexa53_tunings): Likewise.
>   (cortexa57_tunings): Likewise.
>   (thunderx_tunings): Likewise.
>   (xgene1_tunings): Likewise.
> 
> ---
>  gcc/config/aarch64/aarch64-protos.h |  2 ++
>  gcc/config/aarch64/aarch64.c| 26 +++---
>  2 files changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index 59c5824..4331e5c 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -177,6 +177,8 @@ struct tune_params
>const int int_reassoc_width;
>const int fp_reassoc_width;
>const int vec_reassoc_width;
> +  const int min_div_recip_mul_sf;
> +  const int min_div_recip_mul_df;
>  };
> 
>  HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index e22d72e..42a96f6 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -353,7 +353,9 @@ static const struct tune_params generic_tunings =
>4, /* loop_align.  */
>2, /* int_reassoc_width.  */
>4, /* fp_reassoc_width.  */
> -  1  /* vec_reassoc_width.  */
> +  1, /* vec_reassoc_width.  */
> +  2, /* min_div_recip_mul_sf.  */
> +  2  /* min_div_recip_mul_df.  */
>  };
> 
>  static const struct tune_params cortexa53_tunings =
> @@ -371,7 +373,9 @@ static const struct tune_params cortexa53_tunings =
>4, /* loop_align.  */
>2, /* int_reassoc_width.  */
>4, /* fp_reassoc_width.  */
> -  1  /* vec_reassoc_width.  */
> +  1, /* vec_reassoc_width.  */
> +  2, /* min_div_recip_mul_sf.  */
> +  2  /* min_div_recip_mul_df.  */
>  };
> 
>  static const struct tune_params cortexa57_tunings =
> @@ -389,7 +393,9 @@ static const struct tune_params cortexa57_tunings =
>4, /* loop_align.  */
>2, /* int_reassoc_width.  */
>4, /* fp_reassoc_width.  */
> -  1  /* vec_reassoc_width.  */
> +  1, /* vec_reassoc_width.  */
> +  2, /* min_div_recip_mul_sf.  */
> +  2  /* min_div_recip_mul_df.  */
>  };
> 
>  static const struct tune_params thunderx_tunings =
> @@ -406,7 +412,9 @@ static const struct tune_params thunderx_tunings =
>8, /* loop_align.  */
>2, /* int_reassoc_width.  */
>4, /* fp_reassoc_width.  */
> -  1  /* vec_reassoc_width.  */
> +  1, /* vec_reassoc_width.  */
> +  2, /* min_div_recip_mul_sf.  */
> +  2  /* min_div_recip_mul_df.  */
>  };
> 
>  static const struct tune_params xgene1_tunings =
> @@ -423,7 +431,9 @@ static const struct tune_params xgene1_tunings =
>16,/* loop_align.  */
>2, /* int_reassoc_width.  */
>4, /* fp_reassoc_width.  */
> -  1  /* vec_reassoc_width.  */
> +  1, /* vec_reassoc_width.  */
> +  2, /* min_div_recip_mul_sf.  */
> +  2  /* min_div_recip_mul_df.  */
>  };
> 
>  /* A processor implementing AArch64.  */
> @@ -512,9 +522,11 @@ static const char * const aarch64_condition_codes[] =
>  };
> 
>  static unsigned int
> -aarch64_min_divisions_for_recip_mul (enum machine_mode mode ATTRIBUTE_UNUSED)
> +aarch64_min_divisions_for_recip_mul (enum machine_mode mode)
>  {
> -  return 2;
> +  if (GET_MODE_UNIT_SIZE (mode) == 4)
> +return aarch64_tune_params->min_div_recip_mul_sf;
> +  return aarch64_tune_params->min_div_recip_mul_df;
>  }
> 
>  static int
> --
> 1.9.1





RE: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS

2015-04-27 Thread Wilco Dijkstra
ping

> -Original Message-
> From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> Sent: 04 March 2015 15:38
> To: GCC Patches
> Subject: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS
> 
> Include the cost of op0 and op1 in all cases in PLUS and MINUS in 
> aarch64_rtx_costs.
> Bootstrap & regression OK.
> 
> ChangeLog:
> 2015-03-04  Wilco Dijkstra  
> 
> * gcc/config/aarch64/aarch64.c (aarch64_rtx_costs):
> Calculate cost of op0 and op1 in PLUS and MINUS cases.
> 
> ---
>  gcc/config/aarch64/aarch64.c | 12 +---
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 39921a7..e22d72e 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -5794,6 +5794,8 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
>   op1 = XEXP (x, 1);
> 
>  cost_minus:
> + *cost += rtx_cost (op0, MINUS, 0, speed);
> +
>   /* Detect valid immediates.  */
>   if ((GET_MODE_CLASS (mode) == MODE_INT
>|| (GET_MODE_CLASS (mode) == MODE_CC
> @@ -5801,13 +5803,10 @@ cost_minus:
>   && CONST_INT_P (op1)
>   && aarch64_uimm12_shift (INTVAL (op1)))
> {
> - *cost += rtx_cost (op0, MINUS, 0, speed);
> -
>   if (speed)
> /* SUB(S) (immediate).  */
> *cost += extra_cost->alu.arith;
>   return true;
> -
> }
> 
>   /* Look for SUB (extended register).  */
> @@ -5832,7 +5831,6 @@ cost_minus:
>   *cost += aarch64_rtx_mult_cost (new_op1, MULT,
>   (enum rtx_code) code,
>   speed);
> - *cost += rtx_cost (op0, MINUS, 0, speed);
>   return true;
> }
> 
> @@ -5879,6 +5877,8 @@ cost_plus:
>   return true;
> }
> 
> + *cost += rtx_cost (op1, PLUS, 1, speed);
> +
>   /* Look for ADD (extended register).  */
>  if (aarch64_rtx_arith_op_extract_p (op0, mode))
> {
> @@ -5900,12 +5900,10 @@ cost_plus:
> {
>   *cost += aarch64_rtx_mult_cost (new_op0, MULT, PLUS,
>   speed);
> - *cost += rtx_cost (op1, PLUS, 1, speed);
>   return true;
> }
> 
> - *cost += (rtx_cost (new_op0, PLUS, 0, speed)
> -   + rtx_cost (op1, PLUS, 1, speed));
> + *cost += rtx_cost (new_op0, PLUS, 0, speed);
> 
>   if (speed)
> {
> --
> 1.9.1




Re: [PATCH, ARM] Alternatives and type attributes fixes.

2015-04-27 Thread Yvan Roux
On 27 April 2015 at 15:20, Kyrill Tkachov  wrote:
> Hi Yvan,
>
>
> On 27/04/15 13:25, Yvan Roux wrote:
>>
>> Hi,
>>
>> This is a follow-up of PR64208 where an LRA loop was due to redundancy
>> in insn's alternatives.  I've checked all the insns in the arm backend
>> to avoid latent problems and this patch fixes the issues I've spotted.
>>
>> Only thumb2_movsicc_insn contained duplicated alternatives, and the
>> rest of the changes are related to the type attribute, which were not
>> accurate or used accordingly to their definition.  Notice that the
>> modifications have only a limited impact as in most of the pipeline
>> descriptions, multiple/mov_reg/alu_reg/.. types are used the same way,
>> only cortex a8 seems to have a real difference between alu or multiple
>> instruction and mov.
>>
>> Bootstrapped and regtested on arm-linux-gnueabihf.  Ok for trunk ?
>>
>> Thanks,
>> Yvan
>>
>> 2015-04-27  Yvan Roux
>>
>>  * config/arm/arm.md (*arm_movt): Fix type attribute.
>>  (*movsi_compare0): Likewise.
>>  (*cmpsi_shiftsi): Likewise.
>>  (*cmpsi_shiftsi_swp): Likewise.
>>  (*movsicc_insn): Likewise.
>>  (*cond_move): Likewise.
>>  (*if_plus_move): Likewise.
>>  (*if_move_plus): Likewise.
>>  (*if_arith_move): Likewise.
>>  (*if_move_arith): Likewise.
>>  (*if_shift_move): Likewise.
>>  (*if_move_shift): Likewise.
>>  * config/arm/thumb2.md (*thumb2_movsi_insn): Fix type attribute.
>>  (*thumb2_movsicc_insn): Fix alternative redundancy.
>>  (*thumb2_addsi_short): Split register and immediate alternatives.
>>  (thumb2_addsi3_compare0): Likewise.
>>
>> insn.diff
>>
>>
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index 164ac13..ee23deb 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -5631,7 +5631,7 @@
>> [(set_attr "predicable" "yes")
>>  (set_attr "predicable_short_it" "no")
>>  (set_attr "length" "4")
>> -   (set_attr "type" "mov_imm")]
>> +   (set_attr "type" "alu_sreg")]
>>   )
>
>
> For some context, this is the *arm_movt pattern that generates
> a movt instruction. The documentation in types.md says:
> "This excludes MOV and MVN but includes MOVT." So using alu_sreg
> is correct according to that logic.
> However, don't you want to also update the arm_movtas_ze pattern
> that generates movt but erroneously has the type 'mov_imm'?

Ha, yes, I missed this one!  I'll update it.

>> (define_insn "*arm_movsi_insn"
>> @@ -5919,7 +5919,7 @@
>>  cmp%?\\t%0, #0
>>  sub%.\\t%0, %1, #0"
>> [(set_attr "conds" "set")
>> -   (set_attr "type" "alus_imm,alus_imm")]
>> +   (set_attr "type" "alus_sreg,alus_sreg")]
>>   )
>
>
> Why change that? They are instructions with immediate operands
> so alus_imm should be correct?

Oops, I misread #0 and %0; this one is correct.

>> @@ -486,12 +486,12 @@
>>   )
>> (define_insn_and_split "*thumb2_movsicc_insn"
>> -  [(set (match_operand:SI 0 "s_register_operand"
>> "=l,l,r,r,r,r,r,r,r,r,r")
>> +  [(set (match_operand:SI 0 "s_register_operand"
>> "=l,l,r,r,r,r,r,r,r,r,r,r")
>> (if_then_else:SI
>>  (match_operator 3 "arm_comparison_operator"
>>   [(match_operand 4 "cc_register" "") (const_int 0)])
>> -(match_operand:SI 1 "arm_not_operand" "0 ,lPy,0 ,0,rI,K,rI,rI,K
>> ,K,r")
>> -(match_operand:SI 2 "arm_not_operand" "lPy,0 ,rI,K,0 ,0,rI,K
>> ,rI,K,r")))]
>> +(match_operand:SI 1 "arm_not_operand" "0 ,lPy,0 ,0,rI,K,I ,r,rI,K
>> ,K,r")
>> +(match_operand:SI 2 "arm_not_operand" "lPy,0 ,rI,K,0 ,0,rI,I,K
>> ,rI,K,r")))]
>> "TARGET_THUMB2"
>> "@
>>  it\\t%D3\;mov%D3\\t%0, %2
>> @@ -504,12 +504,14 @@
>>  #
>>  #
>>  #
>> +   #
>>  #"
>>  ; alt 6: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
>> -   ; alt 7: ite\\t%d3\;mov%d3\\t%0, %1\;mvn%D3\\t%0, #%B2
>> -   ; alt 8: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mov%D3\\t%0, %2
>> -   ; alt 9: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mvn%D3\\t%0, #%B2
>> -   ; alt 10: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
>> +   ; alt 7: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
>> +   ; alt 8: ite\\t%d3\;mov%d3\\t%0, %1\;mvn%D3\\t%0, #%B2
>> +   ; alt 9: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mov%D3\\t%0, %2
>> +   ; alt 10: ite\\t%d3\;mvn%d3\\t%0, #%B1\;mvn%D3\\t%0, #%B2
>> +   ; alt 11: ite\\t%d3\;mov%d3\\t%0, %1\;mov%D3\\t%0, %2
>> "&& reload_completed"
>> [(const_int 0)]
>> {
>> @@ -540,8 +542,8 @@
>>  operands[2])));
>>   DONE;
>> }
>> -  [(set_attr "length" "4,4,6,6,6,6,10,10,10,10,6")
>> -   (set_attr "enabled_for_depr_it" "yes,yes,no,no,no,no,no,no,no,no,yes")
>> +  [(set_attr "length" "4,4,6,6,6,6,10,8,10,10,10,6")
>> +   (set_attr "enabled_for_depr_it"
>> "yes,yes,no,no,no,no,no,no,no,no,no,yes")
>>  (set_attr "conds" "use")
>>  (set_attr "type" "multiple")]
>>   )
>
>
> So, I think here for the first 6 alternatives we can give it a more specific
> type,
> like mov_im

Re: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS

2015-04-27 Thread Kyrill Tkachov

Hi Wilco,

On 27/04/15 14:43, Wilco Dijkstra wrote:

ping


-Original Message-
From: Wilco Dijkstra [mailto:wdijk...@arm.com]
Sent: 04 March 2015 15:38
To: GCC Patches
Subject: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS

Include the cost of op0 and op1 in all cases in PLUS and MINUS in 
aarch64_rtx_costs.
Bootstrap & regression OK.

ChangeLog:
2015-03-04  Wilco Dijkstra  

 * gcc/config/aarch64/aarch64.c (aarch64_rtx_costs):
 Calculate cost of op0 and op1 in PLUS and MINUS cases.

---
  gcc/config/aarch64/aarch64.c | 12 +---
  1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 39921a7..e22d72e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5794,6 +5794,8 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
op1 = XEXP (x, 1);

  cost_minus:
+   *cost += rtx_cost (op0, MINUS, 0, speed);
+
/* Detect valid immediates.  */
if ((GET_MODE_CLASS (mode) == MODE_INT
 || (GET_MODE_CLASS (mode) == MODE_CC
@@ -5801,13 +5803,10 @@ cost_minus:
&& CONST_INT_P (op1)
&& aarch64_uimm12_shift (INTVAL (op1)))
  {
-   *cost += rtx_cost (op0, MINUS, 0, speed);
-
if (speed)
  /* SUB(S) (immediate).  */
  *cost += extra_cost->alu.arith;
return true;
-
  }

/* Look for SUB (extended register).  */
@@ -5832,7 +5831,6 @@ cost_minus:
*cost += aarch64_rtx_mult_cost (new_op1, MULT,
(enum rtx_code) code,
speed);
-   *cost += rtx_cost (op0, MINUS, 0, speed);
return true;
  }

@@ -5879,6 +5877,8 @@ cost_plus:
return true;
  }

+   *cost += rtx_cost (op1, PLUS, 1, speed);
+


I don't think this is correct. In the code directly below
(when the aarch64_rtx_arith_op_extract_p condition is true)
we have a shift/extend operation by a constant, so we don't
want to take into account the cost of operand 1 (which is
the extend+shift rtx).

So, I looked at this code myself recently and found that the
wrong logic is addressed by my patch at:
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01617.html
which fixes the case where we forget to cost operand0 in the
MINUS case.

What do you think?
Kyrill


/* Look for ADD (extended register).  */
  if (aarch64_rtx_arith_op_extract_p (op0, mode))
  {
@@ -5900,12 +5900,10 @@ cost_plus:
  {
*cost += aarch64_rtx_mult_cost (new_op0, MULT, PLUS,
speed);
-   *cost += rtx_cost (op1, PLUS, 1, speed);
return true;
  }

-   *cost += (rtx_cost (new_op0, PLUS, 0, speed)
- + rtx_cost (op1, PLUS, 1, speed));
+   *cost += rtx_cost (new_op0, PLUS, 0, speed);

if (speed)
  {
--
1.9.1






Re: [PATCH] [AArch32] Additional bics patterns.

2015-04-27 Thread Alex Velenko

On 24/04/15 16:41, Alex Velenko wrote:

Hi,

This patch adds rtl patterns to generate bics instructions with shift.

Added attribute predicable_short_it since last respin.

Done full regression run on arm-none-eabi and arm-none-gnueabihf.
Bootstrapped on arm-none-gnueabihf.

Is this patch ok?

gcc/config

2015-04-24  Alex Velenko  

  * arm/arm.md (andsi_not_shiftsi_si_scc): New pattern.
  (andsi_not_shiftsi_si_scc_no_reuse): New pattern.

gcc/testsuite

2015-04-24  Alex Velenko 

  * gcc.target/arm/bics_1.c: New testcase.
  * gcc.target/arm/bics_2.c: New testcase.
  * gcc.target/arm/bics_3.c: New testcase.
  * gcc.target/arm/bics_4.c: New testcase.
---
 gcc/config/arm/arm.md | 44 +++
 gcc/testsuite/gcc.target/arm/bics_1.c | 54 +
 gcc/testsuite/gcc.target/arm/bics_2.c | 57 +++
 gcc/testsuite/gcc.target/arm/bics_3.c | 41 +
 gcc/testsuite/gcc.target/arm/bics_4.c | 49 ++
 5 files changed, 245 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/bics_1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/bics_2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/bics_3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/bics_4.c

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 164ac13..9e774c1 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2768,6 +2768,50 @@
   (const_string "logic_shift_reg")))]
 )

+(define_insn "andsi_not_shiftsi_si_scc_no_reuse"
+  [(set (reg:CC_NOOV CC_REGNUM)
+(compare:CC_NOOV
+(and:SI (not:SI (match_operator:SI 0 "shift_operator"
+[(match_operand:SI 1 "s_register_operand" "r")
+ (match_operand:SI 2 "arm_rhs_operand" "rM")]))
+(match_operand:SI 3 "s_register_operand" "r"))
+(const_int 0)))
+   (clobber (match_scratch:SI 4 "=r"))]
+  "TARGET_32BIT"
+  "bic%.%?\\t%4, %3, %1%S0"
+  [(set_attr "predicable" "yes")
+   (set_attr "predicable_short_it" "no")
+   (set_attr "conds" "set")
+   (set_attr "shift" "1")
+   (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
+  (const_string "logic_shift_imm")
+  (const_string "logic_shift_reg")))]
+)
+
+(define_insn "andsi_not_shiftsi_si_scc"
+  [(parallel [(set (reg:CC_NOOV CC_REGNUM)
+(compare:CC_NOOV
+(and:SI (not:SI (match_operator:SI 0 "shift_operator"
+[(match_operand:SI 1 "s_register_operand" "r")
+ (match_operand:SI 2 "arm_rhs_operand" "rM")]))
+(match_operand:SI 3 "s_register_operand" "r"))
+(const_int 0)))
+(set (match_operand:SI 4 "s_register_operand" "=r")
+ (and:SI (not:SI (match_op_dup 0
+ [(match_dup 1)
+  (match_dup 2)]))
+ (match_dup 3)))])]
+  "TARGET_32BIT"
+  "bic%.%?\\t%4, %3, %1%S0"
+  [(set_attr "predicable" "yes")
+   (set_attr "predicable_short_it" "no")
+   (set_attr "conds" "set")
+   (set_attr "shift" "1")
+   (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
+  (const_string "logic_shift_imm")
+  (const_string "logic_shift_reg")))]
+)
+
 (define_insn "*andsi_notsi_si_compare0"
   [(set (reg:CC_NOOV CC_REGNUM)
 (compare:CC_NOOV
diff --git a/gcc/testsuite/gcc.target/arm/bics_1.c b/gcc/testsuite/gcc.target/arm/bics_1.c
new file mode 100644
index 000..173eb89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/bics_1.c
@@ -0,0 +1,54 @@
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps -fno-inline" } */
+/* { dg-require-effective-target arm32 } */
+
+extern void abort (void);
+
+int
+bics_si_test1 (int a, int b, int c)
+{
+  int d = a & ~b;
+
+  /* { dg-final { scan-assembler-times "bics\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+" 2 } } */
+  if (d == 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+bics_si_test2 (int a, int b, int c)
+{
+  int d = a & ~(b << 3);
+
+  /* { dg-final { scan-assembler-times "bics\tr\[0-9\]+, r\[0-9\]+, r\[0-9\]+, .sl \#3" 1 } } */
+  if (d == 0)
+return a + c;
+  else
+return b + d + c;
+}
+
+int
+main ()
+{
+  int x;
+
+  x = bics_si_test1 (29, ~4, 5);
+  if (x != ((29 & 4) + ~4 + 5))
+abort ();
+
+  x = bics_si_test1 (5, ~2, 20);
+  if (x != 25)
+abort ();
+
+x = bics_si_test2 (35, ~4, 5);
+  if (x != ((35 & ~(~4 << 3)) + ~4 + 5))
+abort ();
+
+  x = bics_si_test2 (96, ~2, 20);
+  if (x != 116)
+  abort ();
+
+  return 0;
+}
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/arm/bics_2.c b/gcc/testsuite/gcc.target/arm/bics_2.c
new file mode 100644
index 000..740d7c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/bics_2.c
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-options "-O2 --save-temps -fno-inline" } */
+/* { dg-require-effective-target arm32 } */
+
+extern void abort (void);
+
+int
+bics_si_test1 (int a, int b, int c)
+{
+  int d = 

RE: [PATCH][AArch64] Fix Cortex-A53 shift costs

2015-04-27 Thread Wilco Dijkstra
ping

> -Original Message-
> From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> Sent: 05 March 2015 14:49
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH][AArch64] Fix Cortex-A53 shift costs
> 
> This patch fixes the shift costs for Cortex-A53 so they are more accurate - 
> immediate shifts
> use SBFM/UBFM which takes 2 cycles, register controlled shifts take 1 cycle. 
> Bootstrap and
> regression OK.
> 
> ChangeLog:
> 2015-03-05  Wilco Dijkstra  
> 
>   * gcc/config/arm/aarch-cost-tables.h (cortexa53_extra_costs):
>   Make Cortex-A53 shift costs more accurate.
> 
> ---
>  gcc/config/arm/aarch-cost-tables.h | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/config/arm/aarch-cost-tables.h 
> b/gcc/config/arm/aarch-cost-tables.h
> index 05e96a9..6bb8ede 100644
> --- a/gcc/config/arm/aarch-cost-tables.h
> +++ b/gcc/config/arm/aarch-cost-tables.h
> @@ -130,12 +130,12 @@ const struct cpu_cost_table cortexa53_extra_costs =
>  0,   /* arith.  */
>  0,   /* logical.  */
>  COSTS_N_INSNS (1),   /* shift.  */
> -COSTS_N_INSNS (2),   /* shift_reg.  */
> +0,   /* shift_reg.  */
>  COSTS_N_INSNS (1),   /* arith_shift.  */
> -COSTS_N_INSNS (2),   /* arith_shift_reg.  */
> +COSTS_N_INSNS (1),   /* arith_shift_reg.  */
>  COSTS_N_INSNS (1),   /* log_shift.  */
> -COSTS_N_INSNS (2),   /* log_shift_reg.  */
> -0,   /* extend.  */
> +COSTS_N_INSNS (1),   /* log_shift_reg.  */
> +COSTS_N_INSNS (1),   /* extend.  */
>  COSTS_N_INSNS (1),   /* extend_arith.  */
>  COSTS_N_INSNS (1),   /* bfi.  */
>  COSTS_N_INSNS (1),   /* bfx.  */
> --
> 1.9.1




Re: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS

2015-04-27 Thread Kyrill Tkachov


On 27/04/15 14:53, Kyrill Tkachov wrote:

Hi Wilco,

On 27/04/15 14:43, Wilco Dijkstra wrote:

ping


-Original Message-
From: Wilco Dijkstra [mailto:wdijk...@arm.com]
Sent: 04 March 2015 15:38
To: GCC Patches
Subject: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS

Include the cost of op0 and op1 in all cases in PLUS and MINUS in 
aarch64_rtx_costs.
Bootstrap & regression OK.

ChangeLog:
2015-03-04  Wilco Dijkstra  

  * gcc/config/aarch64/aarch64.c (aarch64_rtx_costs):
  Calculate cost of op0 and op1 in PLUS and MINUS cases.

---
   gcc/config/aarch64/aarch64.c | 12 +---
   1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 39921a7..e22d72e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5794,6 +5794,8 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
op1 = XEXP (x, 1);

   cost_minus:
+   *cost += rtx_cost (op0, MINUS, 0, speed);
+
/* Detect valid immediates.  */
if ((GET_MODE_CLASS (mode) == MODE_INT
 || (GET_MODE_CLASS (mode) == MODE_CC
@@ -5801,13 +5803,10 @@ cost_minus:
&& CONST_INT_P (op1)
&& aarch64_uimm12_shift (INTVAL (op1)))
  {
-   *cost += rtx_cost (op0, MINUS, 0, speed);
-
if (speed)
  /* SUB(S) (immediate).  */
  *cost += extra_cost->alu.arith;
return true;
-
  }

/* Look for SUB (extended register).  */
@@ -5832,7 +5831,6 @@ cost_minus:
*cost += aarch64_rtx_mult_cost (new_op1, MULT,
(enum rtx_code) code,
speed);
-   *cost += rtx_cost (op0, MINUS, 0, speed);
return true;
  }

@@ -5879,6 +5877,8 @@ cost_plus:
return true;
  }

+   *cost += rtx_cost (op1, PLUS, 1, speed);
+

I don't think this is correct. In the code directly below
(when the aarch64_rtx_arith_op_extract_p condition is true)
we have a shift/extend operation by a constant, so we don't
want to take into account the cost of operand 1 (which is
the extend+shift rtx).

So, I looked at this code myself recently and found that the
wrong logic is addressed by my patch at:
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01617.html
which fixes the case where we forget to cost operand0 in the
MINUS case.


Never mind, I misread the code. The case is indeed
missing the costing of operand 1, so your patch is
a superset of mine. I think it's correct, but I can't
approve. If this goes in, then I withdraw my patch at:

https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01617.html


Sorry for the confusion,
Kyrill




What do you think?
Kyrill


/* Look for ADD (extended register).  */
   if (aarch64_rtx_arith_op_extract_p (op0, mode))
  {
@@ -5900,12 +5900,10 @@ cost_plus:
  {
*cost += aarch64_rtx_mult_cost (new_op0, MULT, PLUS,
speed);
-   *cost += rtx_cost (op1, PLUS, 1, speed);
return true;
  }

-   *cost += (rtx_cost (new_op0, PLUS, 0, speed)
- + rtx_cost (op1, PLUS, 1, speed));
+   *cost += rtx_cost (new_op0, PLUS, 0, speed);

if (speed)
  {
--
1.9.1




Re: [PATCH 2/13] musl libc config

2015-04-27 Thread Szabolcs Nagy

On 22/04/15 16:42, Szabolcs Nagy wrote:
> yes, i didn't realize that this depends on the target specific parts
> 
> i will prepare an updated patch that works if the target has no musl
> dynamic linker name defined
> 

Updated the patch, this time I tested it separately.
(The test revealed that -mmusl was not wired up properly
in linux.opt through the "Negative" directive, now fixed).

A default dynamic linker name is defined for musl: "/dev/null".
(The arch-specific headers have to be updated to undef the
MUSL_DYNAMIC_LINKER* macros.)
(Other options were considered, such as printing an error with %e,
but then the user could not override it with -Wl,-dynamic-linker.)

The other parts of the patch (config, include path reordering)
are not changed.

gcc/Changelog

2015-04-27  Gregor Richards  
Szabolcs Nagy  

* config.gcc (LIBC_MUSL): New tm_defines macro.
* config/linux.h (OPTION_MUSL): Define.
(MUSL_DYNAMIC_LINKER, MUSL_DYNAMIC_LINKER32,)
(MUSL_DYNAMIC_LINKER64, MUSL_DYNAMIC_LINKERX32,)
(INCLUDE_DEFAULTS_MUSL_GPP, INCLUDE_DEFAULTS_MUSL_LOCAL,)
(INCLUDE_DEFAULTS_MUSL_PREFIX, INCLUDE_DEFAULTS_MUSL_CROSS,)
(INCLUDE_DEFAULTS_MUSL_TOOL, INCLUDE_DEFAULTS_MUSL_NATIVE): Define.

* config/linux.opt (mmusl): New option.
* configure.ac (gcc_cv_libc_provides_ssp): Add *-*-musl*.
(gcc_cv_target_dl_iterate_phdr): Add *-linux-musl*.

* configure: Regenerate.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index a1df043..0176ca2 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -575,7 +575,7 @@ case ${target} in
 esac
 
 # Common C libraries.
-tm_defines="$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3"
+tm_defines="$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 LIBC_BIONIC=3 LIBC_MUSL=4"
 
 # 32-bit x86 processors supported by --with-arch=.  Each processor
 # MUST be separated by exactly one space.
@@ -725,6 +725,9 @@ case ${target} in
 *-*-*uclibc*)
   tm_defines="$tm_defines DEFAULT_LIBC=LIBC_UCLIBC"
   ;;
+*-*-*musl*)
+  tm_defines="$tm_defines DEFAULT_LIBC=LIBC_MUSL"
+  ;;
 *)
   tm_defines="$tm_defines DEFAULT_LIBC=LIBC_GLIBC"
   ;;
diff --git a/gcc/config/linux.h b/gcc/config/linux.h
index 857389a..b551c56 100644
--- a/gcc/config/linux.h
+++ b/gcc/config/linux.h
@@ -32,10 +32,12 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define OPTION_GLIBC  (DEFAULT_LIBC == LIBC_GLIBC)
 #define OPTION_UCLIBC (DEFAULT_LIBC == LIBC_UCLIBC)
 #define OPTION_BIONIC (DEFAULT_LIBC == LIBC_BIONIC)
+#define OPTION_MUSL   (DEFAULT_LIBC == LIBC_MUSL)
 #else
 #define OPTION_GLIBC  (linux_libc == LIBC_GLIBC)
 #define OPTION_UCLIBC (linux_libc == LIBC_UCLIBC)
 #define OPTION_BIONIC (linux_libc == LIBC_BIONIC)
+#define OPTION_MUSL   (linux_libc == LIBC_MUSL)
 #endif
 
 #define GNU_USER_TARGET_OS_CPP_BUILTINS()			\
@@ -50,21 +52,25 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 } while (0)
 
 /* Determine which dynamic linker to use depending on whether GLIBC or
-   uClibc or Bionic is the default C library and whether
-   -muclibc or -mglibc or -mbionic has been passed to change the default.  */
+   uClibc or Bionic or musl is the default C library and whether
+   -muclibc or -mglibc or -mbionic or -mmusl has been passed to change
+   the default.  */
 
-#define CHOOSE_DYNAMIC_LINKER1(LIBC1, LIBC2, LIBC3, LD1, LD2, LD3)	\
-  "%{" LIBC2 ":" LD2 ";:%{" LIBC3 ":" LD3 ";:" LD1 "}}"
+#define CHOOSE_DYNAMIC_LINKER1(LIBC1, LIBC2, LIBC3, LIBC4, LD1, LD2, LD3, LD4)	\
+  "%{" LIBC2 ":" LD2 ";:%{" LIBC3 ":" LD3 ";:%{" LIBC4 ":" LD4 ";:" LD1 "}}}"
 
 #if DEFAULT_LIBC == LIBC_GLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U, B) \
-  CHOOSE_DYNAMIC_LINKER1 ("mglibc", "muclibc", "mbionic", G, U, B)
+#define CHOOSE_DYNAMIC_LINKER(G, U, B, M) \
+  CHOOSE_DYNAMIC_LINKER1 ("mglibc", "muclibc", "mbionic", "mmusl", G, U, B, M)
 #elif DEFAULT_LIBC == LIBC_UCLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U, B) \
-  CHOOSE_DYNAMIC_LINKER1 ("muclibc", "mglibc", "mbionic", U, G, B)
+#define CHOOSE_DYNAMIC_LINKER(G, U, B, M) \
+  CHOOSE_DYNAMIC_LINKER1 ("muclibc", "mglibc", "mbionic", "mmusl", U, G, B, M)
 #elif DEFAULT_LIBC == LIBC_BIONIC
-#define CHOOSE_DYNAMIC_LINKER(G, U, B) \
-  CHOOSE_DYNAMIC_LINKER1 ("mbionic", "mglibc", "muclibc", B, G, U)
+#define CHOOSE_DYNAMIC_LINKER(G, U, B, M) \
+  CHOOSE_DYNAMIC_LINKER1 ("mbionic", "mglibc", "muclibc", "mmusl", B, G, U, M)
+#elif DEFAULT_LIBC == LIBC_MUSL
+#define CHOOSE_DYNAMIC_LINKER(G, U, B, M) \
+  CHOOSE_DYNAMIC_LINKER1 ("mmusl", "mglibc", "muclibc", "mbionic", M, G, U, B)
 #else
 #error "Unsupported DEFAULT_LIBC"
 #endif /* DEFAULT_LIBC */
@@ -81,24 +87,100 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define BIONIC_DYNAMIC_LINKER32 "/system/bin/linker"
 #define BIONIC_DYNAMIC_LINKER64 "/system/bin/linker64"
 #define BIONIC_DYNAMIC_LINKERX32 "/system/bin/linkerx32"
+/* Should be redefined for each target that supports m

Re: [PATCH 3/13] aarch64 musl support

2015-04-27 Thread Szabolcs Nagy

On 21/04/15 15:16, pins...@gmail.com wrote:
> 
> I don't think you need to check if defaulting to little or big-endian here 
> as the specs always have one or the other passing through. 
> 
> Also, if musl does not support ilp32, you might want to error out.  Or even 
> define the dynamic linker name before support goes into musl. 
> 

Patch v2.

(now _ilp32 abi variant is included too)

gcc/Changelog:

2015-04-27  Gregor Richards  
Szabolcs Nagy  

* config/aarch64/aarch64-linux.h (MUSL_DYNAMIC_LINKER): Define.
diff --git a/gcc/config/aarch64/aarch64-linux.h b/gcc/config/aarch64/aarch64-linux.h
index 9abb252..2d647c3 100644
--- a/gcc/config/aarch64/aarch64-linux.h
+++ b/gcc/config/aarch64/aarch64-linux.h
@@ -23,6 +23,9 @@
 
 #define GLIBC_DYNAMIC_LINKER "/lib/ld-linux-aarch64%{mbig-endian:_be}%{mabi=ilp32:_ilp32}.so.1"
 
+#undef MUSL_DYNAMIC_LINKER
+#define MUSL_DYNAMIC_LINKER "/lib/ld-musl-aarch64%{mbig-endian:_be}%{mabi=ilp32:_ilp32}.so.1"
+
 #undef  ASAN_CC1_SPEC
 #define ASAN_CC1_SPEC "%{%:sanitize(address):-funwind-tables}"
 


Re: [PATCH 5/13] microblaze musl support

2015-04-27 Thread Szabolcs Nagy

On 20/04/15 19:54, Szabolcs Nagy wrote:
> Set up dynamic linker name for microblaze.
> 

Patch v2.
(undef MUSL_DYNAMIC_LINKER that comes from config/linux.h)

gcc/Changelog:

2015-04-24  Gregor Richards  

* config/microblaze/linux.h (MUSL_DYNAMIC_LINKER): Define.
(DYNAMIC_LINKER): Change.
diff --git a/gcc/config/microblaze/linux.h b/gcc/config/microblaze/linux.h
index a7faa7d..3e08138 100644
--- a/gcc/config/microblaze/linux.h
+++ b/gcc/config/microblaze/linux.h
@@ -25,7 +25,22 @@
 #undef TLS_NEEDS_GOT
 #define TLS_NEEDS_GOT 1
 
-#define DYNAMIC_LINKER "/lib/ld.so.1"
+#if TARGET_BIG_ENDIAN_DEFAULT == 0 /* LE */
+#define MUSL_DYNAMIC_LINKER_E "%{mbig-endian:;:el}"
+#else
+#define MUSL_DYNAMIC_LINKER_E "%{mlittle-endian:el}"
+#endif
+
+#undef MUSL_DYNAMIC_LINKER
+#define MUSL_DYNAMIC_LINKER "/lib/ld-musl-microblaze" MUSL_DYNAMIC_LINKER_E ".so.1"
+#define GLIBC_DYNAMIC_LINKER "/lib/ld.so.1"
+
+#if DEFAULT_LIBC == LIBC_MUSL
+#define DYNAMIC_LINKER MUSL_DYNAMIC_LINKER
+#else
+#define DYNAMIC_LINKER GLIBC_DYNAMIC_LINKER
+#endif
+
 #undef  SUBTARGET_EXTRA_SPECS
 #define SUBTARGET_EXTRA_SPECS \
   { "dynamic_linker", DYNAMIC_LINKER }


Re: [PATCH][combine][obvious] Use std::swap instead of manually swapping

2015-04-27 Thread Segher Boessenkool
On Mon, Apr 27, 2015 at 10:55:17AM +0100, Kyrill Tkachov wrote:
> Precedents suggest these changes are considered obvious.
> So I'll commit this in a couple of days unless someone objects.

Yes it's obvious.  One tiny thing...

> @@ -9062,7 +9061,7 @@ known_cond (rtx x, enum rtx_code cond, rtx reg, rtx val)
>else if (COMPARISON_P (x) || COMMUTATIVE_ARITH_P (x))
>  {
>if (rtx_equal_p (XEXP (x, 0), val))
> - cond = swap_condition (cond), temp = val, val = reg, reg = temp;
> + cond = swap_condition (cond), std::swap (val, reg);
>  
>if (rtx_equal_p (XEXP (x, 0), reg) && rtx_equal_p (XEXP (x, 1), val))
>   {

Might as well write this as two statements, like everywhere else, e.g.

> @@ -11454,7 +11453,7 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
>   is already a constant integer.  */
>if (swap_commutative_operands_p (op0, op1))
>  {
> -  tem = op0, op0 = op1, op1 = tem;
> +  std::swap (op0, op1);
>code = swap_condition (code);
>  }

Thanks,


Segher


Re: [PATCH 6/13] mips musl support

2015-04-27 Thread Szabolcs Nagy


On 21/04/15 15:59, Matthew Fortune wrote:
> Rich Felker  writes:
>> On Tue, Apr 21, 2015 at 01:58:02PM +, Matthew Fortune wrote:
>>> There does however appear to be both soft and hard float variants

Patch v2.

Now all the ABI variants musl plans to support are represented.

gcc/Changelog:

2015-04-27  Gregor Richards  
Szabolcs Nagy  

* config/mips/linux.h (MUSL_DYNAMIC_LINKER32): Define.
(MUSL_DYNAMIC_LINKER64, MUSL_DYNAMIC_LINKERN32): Define.
(GNU_USER_DYNAMIC_LINKERN32): Update.
diff --git a/gcc/config/mips/linux.h b/gcc/config/mips/linux.h
index 91df261..6038498 100644
--- a/gcc/config/mips/linux.h
+++ b/gcc/config/mips/linux.h
@@ -37,7 +37,13 @@ along with GCC; see the file COPYING3.  If not see
 #define UCLIBC_DYNAMIC_LINKERN32 \
   "%{mnan=2008:/lib32/ld-uClibc-mipsn8.so.0;:/lib32/ld-uClibc.so.0}"
 
+#undef MUSL_DYNAMIC_LINKER32
+#define MUSL_DYNAMIC_LINKER32 "/lib/ld-musl-mips%{EL:el}%{msoft-float:-sf}.so.1"
+#undef MUSL_DYNAMIC_LINKER64
+#define MUSL_DYNAMIC_LINKER64 "/lib/ld-musl-mips64%{EL:el}.so.1"
+#define MUSL_DYNAMIC_LINKERN32 "/lib/ld-musl-mipsn32%{EL:el}.so.1"
+
 #define BIONIC_DYNAMIC_LINKERN32 "/system/bin/linker32"
 #define GNU_USER_DYNAMIC_LINKERN32 \
   CHOOSE_DYNAMIC_LINKER (GLIBC_DYNAMIC_LINKERN32, UCLIBC_DYNAMIC_LINKERN32, \
- BIONIC_DYNAMIC_LINKERN32)
+ BIONIC_DYNAMIC_LINKERN32, MUSL_DYNAMIC_LINKERN32)


Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-27 Thread Marek Polacek
On Sat, Apr 25, 2015 at 10:18:59PM +0200, Gerald Pfeifer wrote:
> In case this example feels too contrived (even though it is an
> excerpt of Wine code), we now also warn about the following where
> the two types and variables are defined in different places and
> the size of one is set implicitly:
> 
>   typedef int r_fun_t (int);
> 
>   r_fun_t * text_funcs[] = {0,0,0};
> 
>   int report (unsigned t)
>   {
> typedef int s_fun_t (long, char);
> 
> static s_fun_t * GUI_funcs[3];
> 
> return (t < sizeof text_funcs / sizeof text_funcs[0] &&
> t < sizeof GUI_funcs / sizeof GUI_funcs[0]);
>   }
> 
> (I also filed this as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65891
> so that we keep track.)

I'm afraid there isn't an easy solution to this; the problem is that we fold
sizeof early, so the warning sees 

  t < 3 && t < 3

and warns.  Maybe using SIZEOF_EXPR would help...

Marek


Re: [PATCH 8/13] sh musl support

2015-04-27 Thread Szabolcs Nagy


On 20/04/15 21:29, Oleg Endo wrote:
> On Mon, 2015-04-20 at 19:56 +0100, Szabolcs Nagy wrote:
>> Set up dynamic linker name for sh.
>>
> 
> The SH parts are OK for trunk.
> 

Patch v2

It turns out we have to distinguish between the hard-float and nofpu ABIs.
I simply listed the options that turn on nofpu and hard-float code
generation (the single-precision modes are not relevant to musl;
currently musl only supports sh4 and up, because it assumes either
gusa or llsc atomics are available and requires an MMU, but I listed
all the options for the nofpu check for completeness).

gcc/Changelog:

2015-04-27  Gregor Richards  
Szabolcs Nagy  

* config/sh/linux.h (MUSL_DYNAMIC_LINKER): Define.
(MUSL_DYNAMIC_LINKER_E, MUSL_DYNAMIC_LINKER_FP): Define.
diff --git a/gcc/config/sh/linux.h b/gcc/config/sh/linux.h
index 0f5d614..4c167c6 100644
--- a/gcc/config/sh/linux.h
+++ b/gcc/config/sh/linux.h
@@ -43,6 +43,28 @@ along with GCC; see the file COPYING3.  If not see
 
 #define TARGET_ASM_FILE_END file_end_indicate_exec_stack
 
+#if TARGET_ENDIAN_DEFAULT == MASK_LITTLE_ENDIAN
+#define MUSL_DYNAMIC_LINKER_E "%{mb:eb}"
+#else
+#define MUSL_DYNAMIC_LINKER_E "%{!ml:eb}"
+#endif
+
+#if TARGET_CPU_DEFAULT & ( MASK_HARD_SH2A_DOUBLE | MASK_SH4 )
+/* "-nofpu" if any nofpu option is specified */
+#define MUSL_DYNAMIC_LINKER_FP \
+  "%{m1|m2|m2a-nofpu|m3|m4-nofpu|m4-100-nofpu|m4-200-nofpu|m4-300-nofpu|" \
+  "m4-340|m4-400|m4-500|m4al|m5-32media-nofpu|m5-64media-nofpu|" \
+  "m5-compact-nofpu:-nofpu}"
+#else
+/* "-nofpu" if none of the hard fpu options are specified */
+#define MUSL_DYNAMIC_LINKER_FP \
+  "%{m2a|m4|m4-100|m4-200|m4-300|m4a|m5-32media|m5-64media|m5-compact:;:-nofpu}"
+#endif
+
+#undef MUSL_DYNAMIC_LINKER
+#define MUSL_DYNAMIC_LINKER \
+  "/lib/ld-musl-sh" MUSL_DYNAMIC_LINKER_E MUSL_DYNAMIC_LINKER_FP ".so.1"
+
 #define GLIBC_DYNAMIC_LINKER "/lib/ld-linux.so.2"
 
 #undef SUBTARGET_LINK_EMUL_SUFFIX


Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-27 Thread Marek Polacek
On Mon, Apr 27, 2015 at 04:47:12PM +0200, Marek Polacek wrote:
> On Sat, Apr 25, 2015 at 10:18:59PM +0200, Gerald Pfeifer wrote:
> > In case this example feels too contrived (even though it is an
> > excerpt of Wine code), we now also warn about the following where
> > the two types and variables are defined in different places and
> > the size of one is set implicitly:
> > 
> >   typedef int r_fun_t (int);
> > 
> >   r_fun_t * text_funcs[] = {0,0,0};
> > 
> >   int report (unsigned t)
> >   {
> > typedef int s_fun_t (long, char);
> > 
> > static s_fun_t * GUI_funcs[3];
> > 
> > return (t < sizeof text_funcs / sizeof text_funcs[0] &&
> > t < sizeof GUI_funcs / sizeof GUI_funcs[0]);
> >   }
> > 
> > (I also filed this as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65891
> > so that we keep track.)
> 
> I'm afraid there isn't an easy solution to this; the problem is that we fold
> sizeof early, so the warning sees 
> 
>   t < 3 && t < 3
> 
> and warns.  Maybe using SIZEOF_EXPR would help...

That said, this isn't something that regressed with my patch.  Consider e.g.:

struct S { int i; };
struct T { int i; };
int
foo (unsigned int t)
{
  return t < sizeof (struct S) && t > sizeof (struct T);
}

with -Wlogical-op:
w2.c: In function ‘foo’:
w2.c: In function ‘foo’:
w2.c:6:32: warning: logical ‘and’ of mutually exclusive tests is always false [-Wlogical-op]
   return t < sizeof (struct S) && t > sizeof (struct T);
^


Re: [PATCH 9/13] x86 musl support

2015-04-27 Thread Szabolcs Nagy
On 20/04/15 19:57, Szabolcs Nagy wrote:
> Set up dynamic linker name for x86.
> 

Patch v2.

(undef MUSL_DYNAMIC_LINKER* before defining it).

gcc/Changelog:

2015-04-27  Gregor Richards  

* config/i386/linux.h (MUSL_DYNAMIC_LINKER): Define.
* config/i386/linux64.h (MUSL_DYNAMIC_LINKER32): Define.
(MUSL_DYNAMIC_LINKER64, MUSL_DYNAMIC_LINKERX32): Define.
diff --git a/gcc/config/i386/linux.h b/gcc/config/i386/linux.h
index a100963..9c47fcb 100644
--- a/gcc/config/i386/linux.h
+++ b/gcc/config/i386/linux.h
@@ -21,3 +21,5 @@ along with GCC; see the file COPYING3.  If not see
 
 #define GNU_USER_LINK_EMULATION "elf_i386"
 #define GLIBC_DYNAMIC_LINKER "/lib/ld-linux.so.2"
+#undef MUSL_DYNAMIC_LINKER
+#define MUSL_DYNAMIC_LINKER "/lib/ld-musl-i386.so.1"
diff --git a/gcc/config/i386/linux64.h b/gcc/config/i386/linux64.h
index a27d3be..e300480 100644
--- a/gcc/config/i386/linux64.h
+++ b/gcc/config/i386/linux64.h
@@ -30,3 +30,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define GLIBC_DYNAMIC_LINKER32 "/lib/ld-linux.so.2"
 #define GLIBC_DYNAMIC_LINKER64 "/lib64/ld-linux-x86-64.so.2"
 #define GLIBC_DYNAMIC_LINKERX32 "/libx32/ld-linux-x32.so.2"
+
+#undef MUSL_DYNAMIC_LINKER32
+#define MUSL_DYNAMIC_LINKER32 "/lib/ld-musl-i386.so.1"
+#undef MUSL_DYNAMIC_LINKER64
+#define MUSL_DYNAMIC_LINKER64 "/lib/ld-musl-x86_64.so.1"
+#undef MUSL_DYNAMIC_LINKERX32
+#define MUSL_DYNAMIC_LINKERX32 "/lib/ld-musl-x32.so.1"


Re: [PATCH 7/13] powerpc musl support

2015-04-27 Thread Szabolcs Nagy
On 20/04/15 19:55, Szabolcs Nagy wrote:
> Set up dynamic linker name for powerpc.  Musl only supports powerpc
> with secure plt, so appropriate options are passed to the linker by
> default.

Patch v2.

Powerpc does not include the top level linux.h, so I had to
repeat the include order fixes from there in rs6000/sysv4.h.

I corrected the endianness handling (the "le" suffix should
be added correctly now).

gcc/Changelog:

2015-04-24  Gregor Richards  
Szabolcs Nagy  

* config.gcc (secure_plt): Add *-linux*-musl*.
* config/rs6000/linux64.h (MUSL_DYNAMIC_LINKER32): Define.
(MUSL_DYNAMIC_LINKER64): Define.
(GNU_USER_DYNAMIC_LINKER32): Update.
(GNU_USER_DYNAMIC_LINKER64): Update.
(CHOOSE_DYNAMIC_LINKER): Update.

* config/rs6000/secureplt.h (LINK_SECURE_PLT_DEFAULT_SPEC): Define.
* config/rs6000/sysv4.h (GNU_USER_DYNAMIC_LINKER): Update.
(MUSL_DYNAMIC_LINKER, MUSL_DYNAMIC_LINKER_E,)
(INCLUDE_DEFAULTS_MUSL_GPP, INCLUDE_DEFAULTS_MUSL_LOCAL,)
(INCLUDE_DEFAULTS_MUSL_PREFIX, INCLUDE_DEFAULTS_MUSL_CROSS,)
(INCLUDE_DEFAULTS_MUSL_TOOL, INCLUDE_DEFAULTS_MUSL_NATIVE): Define.
(LINK_SECURE_PLT_DEFAULT_SPEC): Define.
(CHOOSE_DYNAMIC_LINKER, LINK_TARGET_SPEC, LINK_OS_LINUX_SPEC): Update.

* config/rs6000/sysv4le.h (MUSL_DYNAMIC_LINKER_E): Define.
diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index 0879e7e..a6ad842 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -365,17 +365,21 @@ extern int dot_symbols;
 #endif
 #define UCLIBC_DYNAMIC_LINKER32 "/lib/ld-uClibc.so.0"
 #define UCLIBC_DYNAMIC_LINKER64 "/lib/ld64-uClibc.so.0"
+#define MUSL_DYNAMIC_LINKER32 "/lib/ld-musl-powerpc" MUSL_DYNAMIC_LINKER_E ".so.1"
+#define MUSL_DYNAMIC_LINKER64 "/lib/ld-musl-powerpc64" MUSL_DYNAMIC_LINKER_E ".so.1"
 #if DEFAULT_LIBC == LIBC_UCLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U) "%{mglibc:" G ";:" U "}"
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) "%{mglibc:" G ";:%{mmusl:" M ";:" U "}}"
 #elif DEFAULT_LIBC == LIBC_GLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U) "%{muclibc:" U ";:" G "}"
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) "%{muclibc:" U ";:%{mmusl:" M ";:" G "}}"
+#elif DEFAULT_LIBC == LIBC_MUSL
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) "%{mglibc:" G ";:%{muclibc:" U ";:" M "}}"
 #else
 #error "Unsupported DEFAULT_LIBC"
 #endif
 #define GNU_USER_DYNAMIC_LINKER32 \
-  CHOOSE_DYNAMIC_LINKER (GLIBC_DYNAMIC_LINKER32, UCLIBC_DYNAMIC_LINKER32)
+  CHOOSE_DYNAMIC_LINKER (GLIBC_DYNAMIC_LINKER32, UCLIBC_DYNAMIC_LINKER32, MUSL_DYNAMIC_LINKER32)
 #define GNU_USER_DYNAMIC_LINKER64 \
-  CHOOSE_DYNAMIC_LINKER (GLIBC_DYNAMIC_LINKER64, UCLIBC_DYNAMIC_LINKER64)
+  CHOOSE_DYNAMIC_LINKER (GLIBC_DYNAMIC_LINKER64, UCLIBC_DYNAMIC_LINKER64, MUSL_DYNAMIC_LINKER64)
 
 #undef  DEFAULT_ASM_ENDIAN
 #if (TARGET_DEFAULT & MASK_LITTLE_ENDIAN)
diff --git a/gcc/config/rs6000/secureplt.h b/gcc/config/rs6000/secureplt.h
index b463463..77edf2a 100644
--- a/gcc/config/rs6000/secureplt.h
+++ b/gcc/config/rs6000/secureplt.h
@@ -18,3 +18,4 @@ along with GCC; see the file COPYING3.  If not see
 .  */
 
 #define CC1_SECURE_PLT_DEFAULT_SPEC "-msecure-plt"
+#define LINK_SECURE_PLT_DEFAULT_SPEC "--secure-plt"
diff --git a/gcc/config/rs6000/sysv4.h b/gcc/config/rs6000/sysv4.h
index 9917c2f..94c4263 100644
--- a/gcc/config/rs6000/sysv4.h
+++ b/gcc/config/rs6000/sysv4.h
@@ -537,6 +537,9 @@ ENDIAN_SELECT(" -mbig", " -mlittle", DEFAULT_ASM_ENDIAN)
 #ifndef CC1_SECURE_PLT_DEFAULT_SPEC
 #define CC1_SECURE_PLT_DEFAULT_SPEC ""
 #endif
+#ifndef LINK_SECURE_PLT_DEFAULT_SPEC
+#define LINK_SECURE_PLT_DEFAULT_SPEC ""
+#endif
 
 /* Pass -G xxx to the compiler.  */
 #undef CC1_SPEC
@@ -586,7 +589,8 @@ ENDIAN_SELECT(" -mbig", " -mlittle", DEFAULT_ASM_ENDIAN)
 
 /* Override the default target of the linker.  */
 #define	LINK_TARGET_SPEC \
-  ENDIAN_SELECT("", " --oformat elf32-powerpcle", "")
+  ENDIAN_SELECT("", " --oformat elf32-powerpcle", "") \
+  "%{!mbss-plt: %{!msecure-plt: %(link_secure_plt_default)}}"
 
 /* Any specific OS flags.  */
 #define LINK_OS_SPEC "\
@@ -762,17 +766,22 @@ ENDIAN_SELECT(" -mbig", " -mlittle", DEFAULT_ASM_ENDIAN)
 
 #define LINK_START_LINUX_SPEC ""
 
+#define MUSL_DYNAMIC_LINKER_E ENDIAN_SELECT("","le","")
+
 #define GLIBC_DYNAMIC_LINKER "/lib/ld.so.1"
 #define UCLIBC_DYNAMIC_LINKER "/lib/ld-uClibc.so.0"
+#define MUSL_DYNAMIC_LINKER "/lib/ld-musl-powerpc" MUSL_DYNAMIC_LINKER_E ".so.1"
 #if DEFAULT_LIBC == LIBC_UCLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U) "%{mglibc:" G ";:" U "}"
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) "%{mglibc:" G ";:%{mmusl:" M ";:" U "}}"
+#elif DEFAULT_LIBC == LIBC_MUSL
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) "%{mglibc:" G ";:%{muclibc:" U ";:" M "}}"
 #elif !defined (DEFAULT_LIBC) || DEFAULT_LIBC == LIBC_GLIBC
-#define CHOOSE_DYNAMIC_LINKER(G, U) "%{muclibc:" U ";:" G "}"
+#define CHOOSE_DYNAMIC_LINKER(G, U, M) "%{muclibc:" U ";:%{mmusl:" M ";:" G "}}"

RE: [PATCH][combine][obvious] Use std::swap instead of manually swapping

2015-04-27 Thread Kyrill Tkachov
On 27/04/15 15:37, Segher Boessenkool wrote:
> On Mon, Apr 27, 2015 at 10:55:17AM +0100, Kyrill Tkachov wrote:
>> Precedents suggest these changes are considered obvious.
>> So I'll commit this in a couple of days unless someone objects.
>
> Yes it's obvious.  One tiny thing...
>
>> @@ -9062,7 +9061,7 @@ known_cond (rtx x, enum rtx_code cond, rtx reg, rtx
val)
>>else if (COMPARISON_P (x) || COMMUTATIVE_ARITH_P (x))
>>  {
>>if (rtx_equal_p (XEXP (x, 0), val))
>> -cond = swap_condition (cond), temp = val, val = reg, reg = temp;
>> +cond = swap_condition (cond), std::swap (val, reg);
>>  
>>if (rtx_equal_p (XEXP (x, 0), reg) && rtx_equal_p (XEXP (x, 1),
val))
>>  {
>
> Might as well write this as two statements, like everywhere else, e.g.
>
>> @@ -11454,7 +11453,7 @@ simplify_comparison (enum rtx_code code, rtx
*pop0, rtx *pop1)
>>   is already a constant integer.  */
>>if (swap_commutative_operands_p (op0, op1))
>>  {
>> -  tem = op0, op0 = op1, op1 = tem;
>> +  std::swap (op0, op1);
>>code = swap_condition (code);
>>  }
>
> Thanks,

Thanks, here's what I committed with r222468.

Kyrill

>
>
>
> Segher
>


combine-swap.patch
Description: Binary data


RE: [PATCH] Fix IRA register preferencing

2015-04-27 Thread Wilco Dijkstra
> Jeff Law wrote:
> On 12/10/14 06:26, Wilco Dijkstra wrote:
> >
> > If recomputing is best does that mean that record_reg_classes should not
> > give a boost to the preferred class in the 2nd pass?
> Perhaps.  I haven't looked deeply at this part of IRA.  I was relaying
> my experiences with (ab)using the ira-reload callbacks to handle
> allocation after splitting -- where getting the costs and classes
> updated in a reasonable manner was clearly important to getting good
> code.  One could probably argue I should have kept testcases from that
> work :-)
> 
> 
> I don't understand
> > what purpose this has - if the preferred class is from the first pass, it
> > is already correct, so all it does is boost the preferred class further.
> > And if the preferred class is wrong (eg. after live range splitting), it
> > will boost the incorrect class even harder, so in the end you never get
> > a different class.
> It may be historical from the old regclass code, not really sure.
> 
> >  From what you're saying, recomputing seems best, and I'd be happy to submit
> > a patch to remove all the preferred class code from record_reg_classes.
> Recomputing certainly helped the cases I was looking at.
> >
> > However there is still the possibility the preferred class is queried before
> > the recomputation happens (I think that is a case Renlin fixed). Either 
> > these
> > should be faulted and fixed by forcing recomputation, or we need to provide 
> > a
> > correct preferred class. That should be a copy of the original class.
> I believe I had copied the original classes, then recomputed them to
> avoid any uninitialized memory reads and the like.  But looking at the
> old branch, I don't see the recomputation for classes (though I do see
> it for costs).  Of course all the backwards walk stuff isn't there
> either, so there's clearly code I worked on extensively, but never
> committed...

Well, looking into this further it does look like the preferences are improved
during the 2nd pass (in particular fewer cases of the bad ALL_REGS preference
that causes all the inefficient code), so it looks like recomputing preferences
is not beneficial and my original patch is the right fix for now.

https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00829.html

Wilco





Re: [PATCH][PR65818][bootstrap,hppa] Return side-effect free result in gimplify_va_arg_internal

2015-04-27 Thread Tom de Vries

On 27-04-15 15:40, Richard Biener wrote:

On Mon, Apr 27, 2015 at 3:06 PM, Tom de Vries  wrote:

On 27-04-15 10:17, Richard Biener wrote:


This patch fixes that by gimplifying the address expression of the
mem-ref

returned by the target hook (borrowing code from gimplify_expr, case
MEM_REF).

Bootstrapped and reg-tested on x86_64.

Bootstrapped and reg-tested on hppa2.0w-hp-hpux11.11.

OK for trunk?


Hmm, that assert looks suspicious...

Can't you simply always do

gimplify_expr (expr, pre_p, post_p, is_gimple_lvalue, fb_lvalue);



It's a bit counter-intuitive for me, using is_gimple_lvalue for something
(the result of va_arg) that we use as an rvalue.


Yeah, choosing that was done because you could assert it's a MEM_REF
and tree-stdarg eventually builds a WITH_SIZE_EXPR around it.

It would possibly be easier to simply "inline" gimplify_va_arg_internal
at its use and only gimplify the assignment.  Though in that case the
original code probably worked - just in the lhs == NULL case it didn't,
which hints at a better place for the fix - in expand_ifn_va_arg_1 do

  if (lhs != NULL_TREE)
{
...
}
  else
gimplify_expr (&expr, &pre, &post, is_gimple_val, fb_either);

So ... if you can re-try that one it's pre-approved.



Using that proposed patch, we run into the following problem.

Consider this testcase with ignored-result-va_arg:
...
#include 

void
f2 (int i, ...)
{
  va_list ap;
  va_start (ap, i);
  va_arg (ap, long);
  va_end (ap);
}
...

after gimplify_va_arg_internal we have:
...
(gdb) call debug_generic_expr (expr)
*(long int * {ref-all}) addr.2
...

In a bit more detail:
...
(gdb) call debug_tree (expr)
 
unit size 
align 64 symtab 0 alias set -1 canonical type 0x764a77e0 precision 
64 min  max 0x764a3f48 9223372036854775807>

pointer_to_this >

arg 0 type long int>
public static unsigned DI size  unit 
size 

align 64 symtab 0 alias set -1 canonical type 0x765e0498>

arg 0 
used unsigned ignored DI file stdarg-1.c line 4 col 1 size 
 unit size 

align 64 context >>
arg 1  
constant 0>>

...

During 'gimplify_expr (&expr, &pre, &post, is_gimple_val, fb_either)' we ICE 
here:
...
8886  gcc_assert ((*gimple_test_f) (*expr_p));
...

At this point expr is:
...
(gdb) call debug_generic_expr (*expr_p)
*addr.2
...

In more detail:
...
(gdb) call debug_tree (*expr_p)
 >

arg 0 
sizes-gimplified public unsigned DI
size 
unit size 
align 64 symtab 0 alias set -1 canonical type 0x764c2150
pointer_to_this >
used unsigned ignored DI file stdarg-1.c line 4 col 1 size 0x764a3ca8 64> unit size 

align 64 context >
arg 1  
constant 0>>

...

The memref is now VOID type. And that's not a gimple_val:
...
(gdb) p is_gimple_val (*expr_p)
$1 = false
...

Attached patch uses is_gimple_lvalue, but in expand_ifn_va_arg_1.

I'll bootstrap and reg-test on x86_64 and commit, unless further comments of 
course.

Thanks,
- Tom
Evaluate side-effects in expand_ifn_va_arg_1

2015-04-27  Tom de Vries  

	PR tree-optimization/65818
	* tree-stdarg.c (expand_ifn_va_arg_1): Ensure that side-effects are
	evaluated.
---
 gcc/tree-stdarg.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/tree-stdarg.c b/gcc/tree-stdarg.c
index 16a9e2c..1356374 100644
--- a/gcc/tree-stdarg.c
+++ b/gcc/tree-stdarg.c
@@ -1078,6 +1078,8 @@ expand_ifn_va_arg_1 (function *fun)
 	   types.  */
 	gimplify_assign (lhs, expr, &pre);
 	  }
+	else
+	  gimplify_expr (&expr, &pre, &post, is_gimple_lvalue, fb_lvalue);
 
 	pop_gimplify_context (NULL);
 
-- 
1.9.1



Re: [PATCH] libmpx: remove AC_PROG_CXX/GCC_LIBSTDCXX_RAW_CXX_FLAGS from configure.ac

2015-04-27 Thread Ilya Enkovich
2015-04-23 12:53 GMT+03:00 Steven Noonan :
> This causes libmpx to fail the configure stage on my build hosts because 
> 'xg++'
> was invoked with the bogus '-funconfigured-libstdc++-v3' flag, and the 
> fallback
> preprocessor '/lib/cpp' did not exist on my systems.
>
> Since libmpx has no C++ code, requiring a C++ preprocessor and compiler at
> configure time doesn't make any sense. Simply remove the directives from
> configure.ac.
>
> I tested this change on my build environments and the libmpx configure 
> failures
> I was seeing went away. Recommend this be applied to trunk and the gcc-5
> branch.

Thank you for the patch.  I applied it to trunk and will apply to
gcc-5 branch later.

Ilya

>
> ChangeLog
>
> 2015-04-23  Steven Noonan 
>
>* configure.ac: Drop AC_PROG_CXX and friends, since libmpx has no C++
>  sources.
>* configure: Regenerate.
> ---
>  libmpx/configure.ac | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/libmpx/configure.ac b/libmpx/configure.ac
> index 3f8b50f..463e855 100644
> --- a/libmpx/configure.ac
> +++ b/libmpx/configure.ac
> @@ -23,7 +23,6 @@ AC_MSG_RESULT($version_specific_libs)
>  AC_CANONICAL_SYSTEM
>  target_alias=${target_alias-$host_alias}
>  AC_SUBST(target_alias)
> -GCC_LIBSTDCXX_RAW_CXX_FLAGS
>
>  # See if supported.
>  unset LIBMPX_SUPPORTED
> @@ -94,7 +93,6 @@ AC_SUBST(toolexeclibdir)
>  m4_rename([_AC_ARG_VAR_PRECIOUS],[real_PRECIOUS])
>  m4_define([_AC_ARG_VAR_PRECIOUS],[])
>  AC_PROG_CC
> -AC_PROG_CXX
>  m4_rename_force([real_PRECIOUS],[_AC_ARG_VAR_PRECIOUS])
>
>  AM_PROG_CC_C_O
> --
> 2.3.6
>


Re: [PATCH v2 3/4] libcc1: Add 'set compile-gcc'

2015-04-27 Thread Pedro Alves
On 04/23/2015 09:38 PM, Jan Kratochvil wrote:
> - configury triplet prefix to the compiler.
> + configury triplet prefix to the compiler; TRIPLET_REGEXP can be
> + also absolute filename  to the computer.

"to the computer" is probably a typo for "to the compiler".
Also, double space after "filename".

>   The arguments are copied by GCC.  ARGV need not be


Thanks,
Pedro Alves


Re: [PATCH, rs6000, testsuite] Fix PR target/64579, __TM_end __builtin_tend failed to return transactional state

2015-04-27 Thread Peter Bergner
On Fri, 2015-04-24 at 11:40 -0400, David Edelsohn wrote:
> On Fri, Apr 24, 2015 at 11:34 AM, Segher Boessenkool
> > All looks great to me now, thanks for the changes,
> 
> Okay for me.

In back porting my patch, I see that the patch doesn't apply
cleanly to the 4.8/4.9 branch code.  Looking at why, I see that it's
because the 4.8/4.9 code doesn't have the following change from
Segher.  I think we want this patch back ported too, don't we?

2014-09-11  Segher Boessenkool  

   * config/rs6000/htm.md (tabort, tabortdc, tabortdci, tabortwc,
   tabortwci, tbegin, tcheck, tend, trechkpt, treclaim, tsr): Use xor
   instead of minus.
   * config/rs6000/vector.md (cr6_test_for_zero_reverse,
   cr6_test_for_lt_reverse): Ditto.

If so, I can back port it and test it along with mine.  Otherwise, I'll
just work around it.  Let me know what you want me to do.

Peter




Re: [PATCH] Make wider use of "v" constraint in i386.md

2015-04-27 Thread Ilya Tocar
On 17 Apr 10:09, Uros Bizjak wrote:
> On Thu, Mar 19, 2015 at 10:24 AM, Ilya Tocar  wrote:
> > Hi,
> >
> > There were some discussion about "x" constraints being too conservative
> > for some patterns in i386.md.
> > Patch below fixes it. This is probably stage1 material.
> >
> > ChangeLog:
> >
> > gcc/
> >
> > 2015-03-19  Ilya Tocar  
> >
> > * config/i386/i386.h (EXT_SSE_REG_P): New.
> > * config/i386/i386.md (*cmpi_mixed): Use 
> > "v"
> > constraint.
> > (*cmpi_sse): Ditto.
> > (*movxi_internal_avx512f): Ditto.
> > (define_split): Check for xmm16+, when splitting scalar 
> > float_extend.
> > (*extendsfdf2_mixed): Use "v" constraint.
> > (*extendsfdf2_sse): Ditto.
> > (define_split): Check for xmm16+, when splitting scalar 
> > float_truncate.
> > (*truncdfsf_fast_sse): Use "v" constraint.
> > (fix_trunc_sse): Ditto.
> > (*float2_sse): Ditto.
> > (define_peephole2): Check for xmm16+, when converting scalar
> > float_truncate.
> > (define_peephole2): Check for xmm16+, when converting scalar
> > float_extend.
> > (*fop__comm_mixed): Use "v" constraint.
> > (*fop__comm_sse): Ditto.
> > (*fop__1_mixed): Ditto.
> > (*sqrt2_sse): Ditto.
> > (*ieee_s3): Ditto.
> 
> I wonder if there are also changes needed in mmx.md. There are a
> couple of patterns that operate on xmm registers, so they should be
> reviewed if they need to change their constraint to "v" to accept
> extended xmm register set.
>

Doesn't look like it. At first glance non-v stuff in mmx.md isn't even
avx enabled, so v is not relevant. Moreover we don't allow mmx modes
(v2sf etc.) in xmm16+ via ix86_hard_regno_mode_ok.


Re: [C/C++ PATCH] Implement -Wshift-negative-value (PR c/65179)

2015-04-27 Thread Marek Polacek
On Sat, Apr 25, 2015 at 08:13:08PM +, Joseph Myers wrote:
> On Sat, 25 Apr 2015, Marek Polacek wrote:
> 
> > +   pedwarn (location, OPT_Wshift_negative_value,
> > +"left shift of negative value");
> 
> Use of pedwarn is always suspect for something only undefined at runtime; 
> it must not produce an error with -pedantic-errors in any context where a 
> constant expression is not required.

Makes sense.  So how about moving the warning into -Wextra?  This way it won't
trigger by default.  One change is that we reject programs that use shift with
undefined behavior in a context where a constant expression is required, thus
e.g. enum E { A = -1 << 0 };
But I hope that's reasonable.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-04-27  Marek Polacek  

PR c/65179
* c-common.c (c_fully_fold_internal): Warn when left shifting a
negative value.
* c.opt (Wshift-negative-value): New option.

* c-typeck.c (build_binary_op): Warn when left shifting a negative
value.

* typeck.c (cp_build_binary_op): Warn when left shifting a negative
value.

* doc/invoke.texi: Document -Wshift-negative-value.

* c-c++-common/Wshift-negative-value-1.c: New test.
* c-c++-common/Wshift-negative-value-2.c: New test.
* c-c++-common/Wshift-negative-value-3.c: New test.
* c-c++-common/Wshift-negative-value-4.c: New test.
* gcc.dg/c99-const-expr-7.c: Add dg-error.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 9797e17..f2fe65e 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -1361,6 +1361,15 @@ c_fully_fold_internal (tree expr, bool in_init, bool 
*maybe_const_operands,
  && !TREE_OVERFLOW_P (op0)
  && !TREE_OVERFLOW_P (op1))
overflow_warning (EXPR_LOCATION (expr), ret);
+  if (code == LSHIFT_EXPR
+ && TREE_CODE (orig_op0) != INTEGER_CST
+ && TREE_CODE (TREE_TYPE (orig_op0)) == INTEGER_TYPE
+ && TREE_CODE (op0) == INTEGER_CST
+ && c_inhibit_evaluation_warnings == 0
+ && tree_int_cst_sgn (op0) < 0
+ && flag_isoc99)
+   warning_at (loc, OPT_Wshift_negative_value,
+   "left shift of negative value");
   if ((code == LSHIFT_EXPR || code == RSHIFT_EXPR)
  && TREE_CODE (orig_op1) != INTEGER_CST
  && TREE_CODE (op1) == INTEGER_CST
diff --git gcc/c-family/c.opt gcc/c-family/c.opt
index 983f4a8..c61dfb1 100644
--- gcc/c-family/c.opt
+++ gcc/c-family/c.opt
@@ -781,6 +781,10 @@ Wshift-count-overflow
 C ObjC C++ ObjC++ Var(warn_shift_count_overflow) Init(1) Warning
 Warn if shift count >= width of type
 
+Wshift-negative-value
+C ObjC C++ ObjC++ Var(warn_shift_negative_value) Warning EnabledBy(Wextra)
+Warn if left shifting a negative value
+
 Wsign-compare
 C ObjC C++ ObjC++ Var(warn_sign_compare) Warning LangEnabledBy(C++ ObjC++,Wall)
 Warn about signed-unsigned comparisons
diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 91735b5..36cebc6 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -10707,6 +10707,15 @@ build_binary_op (location_t location, enum tree_code 
code,
  && code1 == INTEGER_TYPE)
{
  doing_shift = true;
+ if (TREE_CODE (op0) == INTEGER_CST
+ && tree_int_cst_sgn (op0) < 0
+ && flag_isoc99)
+   {
+ int_const = false;
+ if (c_inhibit_evaluation_warnings == 0)
+   warning_at (location, OPT_Wshift_negative_value,
+   "left shift of negative value");
+   }
  if (TREE_CODE (op1) == INTEGER_CST)
{
  if (tree_int_cst_sgn (op1) < 0)
diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index 91db32a..3cb5a8a 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -4326,11 +4326,21 @@ cp_build_binary_op (location_t location,
}
   else if (code0 == INTEGER_TYPE && code1 == INTEGER_TYPE)
{
+ tree const_op0 = fold_non_dependent_expr (op0);
+ if (TREE_CODE (const_op0) != INTEGER_CST)
+   const_op0 = op0;
  tree const_op1 = fold_non_dependent_expr (op1);
  if (TREE_CODE (const_op1) != INTEGER_CST)
const_op1 = op1;
  result_type = type0;
  doing_shift = true;
+ if (TREE_CODE (const_op0) == INTEGER_CST
+ && tree_int_cst_sgn (const_op0) < 0
+ && (complain & tf_warning)
+ && c_inhibit_evaluation_warnings == 0
+ && cxx_dialect >= cxx11)
+   warning (OPT_Wshift_negative_value,
+"left shift of negative value");
  if (TREE_CODE (const_op1) == INTEGER_CST)
{
  if (tree_int_cst_lt (const_op1, integer_zero_node))
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index 7d2f6e5..dcfe4cf 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -271,7 +271,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wpointe

Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-27 Thread Jeff Law

On 04/27/2015 08:47 AM, Marek Polacek wrote:

On Sat, Apr 25, 2015 at 10:18:59PM +0200, Gerald Pfeifer wrote:

In case this example feels too contrived (even though it is an
excerpt of Wine code), we now also warn about the following where
the two types and variables are defined in different places and
the size of one is set implicitly:

   typedef int r_fun_t (int);

   r_fun_t * text_funcs[] = {0,0,0};

   int report (unsigned t)
   {
 typedef int s_fun_t (long, char);

 static s_fun_t * GUI_funcs[3];

 return (t < sizeof text_funcs / sizeof text_funcs[0] &&
 t < sizeof GUI_funcs / sizeof GUI_funcs[0]);
   }

(I also filed this as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65891
so that we keep track.)


I'm afraid there isn't an easy solution to this; the problem is that we fold
sizeof early, so the warning sees

   t < sizeof 4 && t < 4

and warns.  Maybe using SIZEOF_EXPR would help...

Can you file a bug for this for future reference?  We may not tackle 
this specific issue in the current iteration of delayed folding, but it 
certainly helps to have BZs for specific issues that we'd like to fix 
once we can delay folding.


jeff


Re: [PATCH][combine][obvious] Use std::swap instead of manually swapping

2015-04-27 Thread Jeff Law

On 04/27/2015 08:37 AM, Segher Boessenkool wrote:

On Mon, Apr 27, 2015 at 10:55:17AM +0100, Kyrill Tkachov wrote:

Precedents suggest these changes are considered obvious.
So I'll commit this in a couple of days unless someone objects.


Yes it's obvious.  One tiny thing...


@@ -9062,7 +9061,7 @@ known_cond (rtx x, enum rtx_code cond, rtx reg, rtx val)
else if (COMPARISON_P (x) || COMMUTATIVE_ARITH_P (x))
  {
if (rtx_equal_p (XEXP (x, 0), val))
-   cond = swap_condition (cond), temp = val, val = reg, reg = temp;
+   cond = swap_condition (cond), std::swap (val, reg);

if (rtx_equal_p (XEXP (x, 0), reg) && rtx_equal_p (XEXP (x, 1), val))
{


Might as well write this as two statements, like everywhere else, e.g.

Agreed.  I really dislike using ',' like is shown above from a 
readability standpoint.


jeff



Re: [PATCH 6/n] OpenMP 4.0 offloading infrastructure: option handling

2015-04-27 Thread Thomas Schwinge
Hi!

Ping.

On Tue, 10 Mar 2015 13:37:47 +0100, I wrote:
> Ping.
> 
> On Thu, 19 Feb 2015 09:39:52 +0100, I wrote:
> > On Wed, 18 Feb 2015 18:04:21 +0100, I wrote:
> > > On Mon, 13 Oct 2014 14:33:11 +0400, Ilya Verbin  wrote:
> > > > On 13 Oct 12:19, Jakub Jelinek wrote:
> > > > > On Sat, Oct 11, 2014 at 06:49:00PM +0400, Ilya Verbin wrote:
> > > > > > 2. -foffload-abi=[lp64|ilp32]
> > > > > >This option is supposed to tell mkoffload (and offload compiler) 
> > > > > > which ABI is
> > > > > > used in streamed GIMPLE.  This option is desirable, because host 
> > > > > > and offload
> > > > > > compilers must have the same ABI.  The option is generated by the 
> > > > > > host compiler
> > > > > > automatically, it should not be specified by user.
> > > > > 
> > > > > But I'd like to understand why is this one needed.
> > > > > Why should the compilers care?  Aggregates layout and alignment of
> > > > > integral/floating types must match between host and offload 
> > > > > compilers, sure,
> > > > > but isn't that something streamed already in the LTO bytecode?
> > > > > Or is LTO streamer not streaming some types like long_type_node?
> > > > > I'd expect if host and offload compiler disagree on long type size 
> > > > > that
> > > > > you'd just use a different integral type with the same size as long 
> > > > > on the
> > > > > host.
> > > > > Different sized pointers are of course a bigger problem, but can't 
> > > > > you just
> > > > > error out on that during reading of the LTO, or even handle it (just 
> > > > > use
> > > > > some integral type for when is the pointer stored in memory, and just
> > > > > convert to pointer after reads from memory, and convert back before 
> > > > > storing
> > > > > to memory).  Erroring out during LTO streaming in sounds just fine to 
> > > > > me
> > > > > though.
> > > > 
> > > > Actually this option was developed by Bernd, so I think PTX team is 
> > > > going to use
> > > > it somehow.  In MIC's case we're planning just to check in mkoffload 
> > > > that host
> > > > and target compiler's ABI are the same.  Without this check we will 
> > > > crash in LTO
> > > > streamer with ICE, so I'd like to issue an error message, rather than 
> > > > crashing.
> > > 
> > > In gcc/config/i386/intelmic-mkoffload.c, this option is now parsed to
> > > initialize the target_ilp32 variable, which will then be used
> > > (target_ilp32 ? "-m32" : "-m64") when invoking different tools.
> > > 
> > > In nvptx, we've been using the following approach:
> > > 
> > > --- gcc/config/nvptx/nvptx.h
> > > +++ gcc/config/nvptx/nvptx.h
> > > @@ -54,24 +54,28 @@
> > >  
> > >  /* Type Layout.  */
> > >  
> > > +#define TARGET_64BIT \
> > > +  (flag_offload_abi == OFFLOAD_ABI_UNSET ? TARGET_ABI64 \
> > > +   : flag_offload_abi == OFFLOAD_ABI_LP64 ? true : false)
> > > +
> > >  #define DEFAULT_SIGNED_CHAR 1
> > >  
> > >  #define SHORT_TYPE_SIZE 16
> > >  #define INT_TYPE_SIZE 32
> > > -#define LONG_TYPE_SIZE (TARGET_ABI64 ? 64 : 32)
> > > +#define LONG_TYPE_SIZE (TARGET_64BIT ? 64 : 32)
> > >  #define LONG_LONG_TYPE_SIZE 64
> > >  #define FLOAT_TYPE_SIZE 32
> > >  #define DOUBLE_TYPE_SIZE 64
> > >  #define LONG_DOUBLE_TYPE_SIZE 64
> > >  
> > >  #undef SIZE_TYPE
> > > -#define SIZE_TYPE (TARGET_ABI64 ? "long unsigned int" : "unsigned int")
> > > +#define SIZE_TYPE (TARGET_64BIT ? "long unsigned int" : "unsigned int")
> > >  #undef PTRDIFF_TYPE
> > > -#define PTRDIFF_TYPE (TARGET_ABI64 ? "long int" : "int")
> > > +#define PTRDIFF_TYPE (TARGET_64BIT ? "long int" : "int")
> > >  
> > > -#define POINTER_SIZE (TARGET_ABI64 ? 64 : 32)
> > > +#define POINTER_SIZE (TARGET_64BIT ? 64 : 32)
> > >  
> > > -#define Pmode (TARGET_ABI64 ? DImode : SImode)
> > > +#define Pmode (TARGET_64BIT ? DImode : SImode)
> > >  
> > >  /* Registers.  Since ptx is a virtual target, we just define a few
> > > hard registers for special purposes and leave pseudos unallocated.  */
> > > 
> > > Should we settle on one of the two, that is, either pass -m[...] from
> > > mkoffload, or handle flag_offload_abi in the respective backend?  I think
> > > I prefer the intelmic-mkoffload.c approach; this seems cleaner to me:
> > > mkoffload "configures" the offloading compiler.  (Also, the flag 32-bit
> > > vs. 64-bit flag may in fact be needed for tools other than the offloading
> > > compiler).
> > 
> > Well, "for instance", when in -m32 mode, we also need to use it in
> > gcc/config/nvptx/mkoffload.c:compile_native -- otherwise, later on, the
> > linker will not be happy with the request to link a 64-bit object file
> > with 32-bit object files...
> > 
> > > Bernd, is there any specific reason for the approach you had
> > > chosen?
> > 
> > OK to do the following instead?  (Coding style/code copied from
> > gcc/config/i386/intelmic-mkoffload.c for consistency.)
> > 
> > --- gcc/config/nvptx/mkoffload.c
> > +++ gcc/config/nvptx/mkoffload.c
> > @@ -126,6 +126,9 @@ static id_map *var_ids, **vars_tail = &var_ids;
> >  static c

Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)

2015-04-27 Thread Marek Polacek
On Mon, Apr 27, 2015 at 10:06:26AM -0600, Jeff Law wrote:
> On 04/27/2015 08:47 AM, Marek Polacek wrote:
> >On Sat, Apr 25, 2015 at 10:18:59PM +0200, Gerald Pfeifer wrote:
> >>In case this example feels too contrived (even though it is an
> >>excerpt of Wine code), we now also warn about the following where
> >>the two types and variables are defined in different places and
> >>the size of one is set implicitly:
> >>
> >>   typedef int r_fun_t (int);
> >>
> >>   r_fun_t * text_funcs[] = {0,0,0};
> >>
> >>   int report (unsigned t)
> >>   {
> >> typedef int s_fun_t (long, char);
> >>
> >> static s_fun_t * GUI_funcs[3];
> >>
> >> return (t < sizeof text_funcs / sizeof text_funcs[0] &&
> >> t < sizeof GUI_funcs / sizeof GUI_funcs[0]);
> >>   }
> >>
> >>(I also filed this as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65891
> >>so that we keep track.)
> >
> >I'm afraid there isn't an easy solution to this; the problem is that we fold
> >sizeof early, so the warning sees
> >
> >   t < sizeof 4 && t < 4
> >
> >and warns.  Maybe using SIZEOF_EXPR would help...
> Can you file a bug for this for future reference?  We may not tackle this
> specific issue in the current iteration of delayed folding, but it certainly
> helps to have BZs for specific issues that we'd like to fix once we can
> delay folding.

Gerald already opened PR65891 for this :).

Marek


Re: [PATCH][AArch64] Improve spill code - swap order in shl pattern

2015-04-27 Thread James Greenhalgh
On Mon, Apr 27, 2015 at 02:37:12PM +0100, Wilco Dijkstra wrote:
> Various instructions are supported as integer operations as well as SIMD on
> AArch64. When register pressure is high, lra-constraints inserts spill code
> without taking the allocation class into account, and basically chooses the
> first available pattern that matches. Since this instruction has the SIMD
> version first it is usually chosen eventhough some of the operands are
> eventually allocated to integer registers. The result is inefficient code not
> only due to the higher latency of SIMD instructions but also due to the extra
> int<->FP moves. Placing the integer variant first in the shl pattern
> generates far more optimal spill code. A few more patterns are the wrong way
> around, which I'll address in a separate patch. I'm also looking into fixing
> lra-constraints to generate the expected code by taking the allocno class
> into account in the cost calculations during spilling.
> 
> 2015-04-27  Wilco Dijkstra  
> 
> * gcc/config/aarch64/aarch64.md (aarch64_ashl_sisd_or_int_3):
> Place integer variant first.

OK, thanks for the fix.

Cheers,
James



Re: Offloading compilers' libgcc installation

2015-04-27 Thread Thomas Schwinge
Hi!

Ping.  (Or can Bernd just commit this patch
(with my review comment addressed),
given his nvptx architecture maintainership?)

On Tue, 10 Mar 2015 13:34:51 +0100, I wrote:
> All the "offloading folks" agree, but we need someone to "formally
> approve" this patch?
> 
> 
> On Fri, 20 Feb 2015 22:27:43 +0300, Ilya Verbin  wrote:
> > On Fri, Feb 20, 2015 at 10:27:26 +0100, Thomas Schwinge wrote:
> > > On Thu, 19 Feb 2015 10:28:46 +0100, Bernd Schmidt 
> > >  wrote:
> > > > issue when trying to 
> > > > get at the libgcc for the nvptx accel compiler after it's been 
> > > > installed. The libgcc Makefile puts it in the wrong place - 
> > > > gcc/nvptx-none/accel/nvptx-none instead of gcc/host/accel/nvptx-none. 
> > > > The patch below corrects that and removes an intelmicemul special case 
> > > > which I believe has the same effect - Ilya, could you test this?
> > > 
> > > Works fine for me for intelmic (no changes), and nvptx (changes as
> > > expected).
> > 
> > OK to me.


Grüße,
 Thomas


signature.asc
Description: PGP signature


Re: Offloading compilers' libgcc installation

2015-04-27 Thread Jakub Jelinek
On Mon, Apr 27, 2015 at 06:14:36PM +0200, Thomas Schwinge wrote:
> Hi!
> 
> Ping.  (Or can Bernd just commit this patch
> (with my review comment addressed),
> given his nvptx architecture maintainership?)

Ok for trunk.

Jakub


Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Jeff Law

On 04/27/2015 06:46 AM, Richard Biener wrote:

On Mon, 27 Apr 2015, Richard Biener wrote:


On Fri, 24 Apr 2015, Jeff Law wrote:


On 02/17/2015 07:58 AM, Richard Biener wrote:
[ Restarting a old thread... ]


On a closer look the record_const_or_copy_1 hunk is redundant
(record_equality is really a bit obfuscated...).

Agreed.  I'm not entirely sure how it got to this point.


And record_equality is where the SSA_NAME_VALUEs might end up
forming chains?  At least because we might record a SSA_NAME_VALUE
for sth that might be the target of a SSA_NAME_VALUE, as in

   a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
   if (b_2 == i_3(D))
 ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)

thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
can't easily avoid that because we don't keep track of a
reverse mapping (nor would we want to update that).

Right.  In general, the fact that we're in SSA form means that we ought not
get loops in the SSA_NAME_VALUE chain since there's a single static assignment
and we'll visit it once.

That nice model breaks down in two ways.  First we derive equivalences from
equality conditions like you've shown above.  Second, when threading we can
thread through a loop latch and start reprocessing a block.  The interaction
between those two can result in loops in the value chain.
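The interaction described above can be sketched outside GCC. The following is a toy model (illustrative Python, not GCC code) of why a valueization walk over SSA_NAME_VALUE needs a cycle guard once equivalences derived from conditions and latch threading are in play:

```python
def valueize(values, name):
    """Follow an SSA_NAME_VALUE-style mapping to its representative,
    guarding against cycles that threading can introduce.  'values'
    maps a name to the name/constant it was equated to."""
    seen = set()
    while name in values and name not in seen:
        seen.add(name)
        name = values[name]
    return name

# a_1 = b_2 records a_1 -> b_2; under "if (b_2 == i_3)" we temporarily
# record b_2 -> i_3, forming a chain that must be followed transitively.
values = {"a_1": "b_2", "b_2": "i_3"}
```

Without the `seen` set, a cycle such as `{"x": "y", "y": "x"}` would loop forever; with it, the walk terminates at the first revisited name.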

What I'm hoping to do in GCC6 is eliminate all this stuff with a threader that
is independent of the dominator walk (and optimizer). Instead of walking
forward through the dominator tree recording key nuggets, then removing those
nuggets as we pop back up the tree, we instead start with the conditional
and walk up the use-def chains, simplifying as we go, with the goal of
simplifying to a constant, of course.

You can see the beginnings of that with the FSM bits from Sebastian.

Obviously until those bits are in place, we should try to clean up any warts
in the current implementation.



Btw, in record_equality, the == part of the <= check for the loop
depth will just randomly swap x and y while we should prefer
IL canonical order.  If we can't rely on the input being in IL
canonical order we should prepend the whole function with

   if (tree_swap_operands_p (x, y, false))
 std::swap (x, y);

and change <= to < for the loop-depth check.

Agreed.  Makes perfect sense.
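The ordering fix can be modeled compactly. In this sketch, canonical_swap_p is a stand-in for tree_swap_operands_p, and the direction in which the equivalence is recorded is a modeling choice, not the exact GCC semantics:

```python
def canonical_swap_p(x, y):
    # Stand-in for tree_swap_operands_p: plain lexical order here.
    return x > y

def record_equality(x, y, loop_depth):
    """Model of the fixed ordering: canonicalize first, then swap only
    for a *strictly* shallower definition depth ('<', not '<='), so
    equal depths keep canonical order instead of swapping randomly."""
    if canonical_swap_p(x, y):
        x, y = y, x
    if loop_depth[x] < loop_depth[y]:
        x, y = y, x
    return (x, y)  # i.e. record SSA_NAME_VALUE (x) = y
```

With equal loop depths the pair stays in canonical order; with unequal depths the deeper name is valueized to the shallower one.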


I'm now re-bootstrapping and testing the following.


Committed as follows, with XFAILing and re-opening PR65217.

Richard.

2015-04-27  Richard Biener  

* tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
(record_equivalences_from_stmt): Valueize rhs.
(record_equality): Canonicalize x and y order via
tree_swap_operands_p.  Do not swap operands for same loop depth.

* gcc.target/i386/pr65217.c: XFAIL.
Looks good.  I'd spun the same record_equality changes over the weekend and 
have state showing the 65217 regression as well.


65217 is an interesting test.


  if ((n & -n) != n)
__builtin_unreachable();
  return n;

n & -n will (of course) get computed into an SSA_NAME.  We then 
propagate that name for the use of "n" in the return statement rather 
than using the effectively zero cost "n".
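For reference, the guard in the pr65217 test is the standard power-of-two idiom; a quick sketch of what the expression computes (two's complement assumed):

```python
def is_pow2_or_zero(n):
    """The pr65217 guard: n & -n isolates the lowest set bit of n in
    two's complement, so (n & -n) == n holds exactly when n has at
    most one bit set (a power of two, or zero)."""
    return (n & -n) == n
```

The test's `if ((n & -n) != n) __builtin_unreachable();` thus tells the compiler n is a power of two (or zero), and the regression is about which SSA name carries that fact afterwards.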


If we put some smarts into tree_swap_operands_p to order sensibly in 
this kind of case, we end up regressing a different case that I'll be 
looking at today.


jeff




Re: [PATCH][AArch64] Use conditional negate for abs expansion

2015-04-27 Thread James Greenhalgh
On Mon, Apr 27, 2015 at 02:42:36PM +0100, Wilco Dijkstra wrote:
> > -Original Message-
> > From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> > Sent: 03 March 2015 16:19
> > To: GCC Patches
> > Subject: [PATCH][AArch64] Use conditional negate for abs expansion
> > 
> > Expand abs into a compare and conditional negate. This is the most obvious
> > expansion; it enables merging of the comparison into ALU instructions and
> > is faster on all implementations.
> > Bootstrapped & regression tested.
> > 
> > int f(int x) { return abs (x + 1); }
> > 
> > Before:
> > add w0, w0, 1
> > sxtwx0, w0
> > eor x1, x0, x0, asr 63
> > sub x1, x1, x0, asr 63
> > mov x0, x1
> > ret
> > 
> > After:
> > addsw0, w0, 1
> > csneg   w0, w0, w0, pl
> > ret
> > 
> > ChangeLog:
> > 
> > 2015-03-03  Wilco Dijkstra  
> > 
> > * gcc/config/aarch64/aarch64.md (absdi2): optimize abs expansion.
> > (csneg3_insn): enable expansion of pattern.
> > * gcc/testsuite/gcc.target/aarch64/abs_1.c (abs64): update test
> > for new abs expansion.  (abs64_in_dreg): likewise.


This looks like it breaks support for abs in a D register (for example
at the end of a loop, or extracted from Neon Intrinsics, etc).

e.g. (totally contrived...)

  int64x1_t
  abs_max (int64x2_t x, int64_t *wb)
  {
int64_t y = vgetq_lane_s64 (x, 0);
if (y < 0)
  y = -y;
return vdup_n_s64 (y);
  }

Which currently generates:

  abs_max:
  abs d0, d0
  ret

I suppose we don't need to worry too much about that (and the current
implementation doesn't seem to catch it reliably anyway), but it would be
good if we could keep the support - even if it is rarely used.

Thanks,
James

> > 
> > ---
> >  gcc/config/aarch64/aarch64.md| 33 
> > +++-
> >  gcc/testsuite/gcc.target/aarch64/abs_1.c |  5 ++---
> >  2 files changed, 9 insertions(+), 29 deletions(-)
> > 
> > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> > index 1f4169e..46b7a63 100644
> > --- a/gcc/config/aarch64/aarch64.md
> > +++ b/gcc/config/aarch64/aarch64.md
> > @@ -2172,35 +2172,16 @@
> >[(set_attr "type" "alu_ext")]
> >  )
> > 
> > -(define_insn_and_split "absdi2"
> > -  [(set (match_operand:DI 0 "register_operand" "=&r,w")
> > -   (abs:DI (match_operand:DI 1 "register_operand" "r,w")))]
> > +(define_expand "abs2"
> > +  [(match_operand:GPI 0 "register_operand" "")
> > +   (match_operand:GPI 1 "register_operand" "")]
> >""
> > -  "@
> > -   #
> > -   abs\\t%d0, %d1"
> > -  "reload_completed
> > -   && GP_REGNUM_P (REGNO (operands[0]))
> > -   && GP_REGNUM_P (REGNO (operands[1]))"
> > -  [(const_int 0)]
> >{
> > -emit_insn (gen_rtx_SET (VOIDmode, operands[0],
> > -   gen_rtx_XOR (DImode,
> > -gen_rtx_ASHIFTRT (DImode,
> > -  operands[1],
> > -  GEN_INT (63)),
> > -operands[1])));
> > -emit_insn (gen_rtx_SET (VOIDmode,
> > -   operands[0],
> > -   gen_rtx_MINUS (DImode,
> > -  operands[0],
> > -  gen_rtx_ASHIFTRT (DImode,
> > -operands[1],
> > -GEN_INT (63);
> > +rtx ccreg = aarch64_gen_compare_reg (LT, operands[1], const0_rtx);
> > +rtx x = gen_rtx_LT (VOIDmode, ccreg, const0_rtx);
> > +emit_insn (gen_csneg3_insn (operands[0], x, operands[1], 
> > operands[1]));
> >  DONE;
> >}
> > -  [(set_attr "type" "alu_sreg")
> > -   (set_attr "simd" "no,yes")]
> >  )
> > 
> >  (define_insn "neg2"
> > @@ -2879,7 +2860,7 @@
> >[(set_attr "type" "csel")]
> >  )
> > 
> > -(define_insn "*csneg3_insn"
> > +(define_insn "csneg3_insn"
> >[(set (match_operand:GPI 0 "register_operand" "=r")
> >  (if_then_else:GPI
> >   (match_operand 1 "aarch64_comparison_operation" "")
> > diff --git a/gcc/testsuite/gcc.target/aarch64/abs_1.c
> > b/gcc/testsuite/gcc.target/aarch64/abs_1.c
> > index 938bc84..11f1095 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/abs_1.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/abs_1.c
> > @@ -7,15 +7,14 @@ extern void abort (void);
> >  long long
> >  abs64 (long long a)
> >  {
> > -  /* { dg-final { scan-assembler "eor\t" } } */
> > -  /* { dg-final { scan-assembler "sub\t" } } */
> > +  /* { dg-final { scan-assembler "csneg\t" } } */
> >return llabs (a);
> >  }
> > 
> >  long long
> >  abs64_in_dreg (long long a)
> >  {
> > -  /* { dg-final { scan-assembler "abs\td\[0-9\]+, d\[0-9\]+" } } */
> > +  /* { dg-final { scan-assembler "csneg\t" } } */
> >register long long x asm ("d8") = a;
> >register long long y asm (

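For reference, the before/after instruction sequences quoted in this thread implement the same function; a sketch of both identities on 64-bit two's-complement values (illustrative Python, not the backend code):

```python
MASK = (1 << 64) - 1  # model 64-bit wraparound

def abs_bittrick(x):
    """Old absdi2 split: eor/sub with an arithmetic shift by 63,
    i.e. abs(x) = (x ^ (x >> 63)) - (x >> 63)."""
    sign = -1 if x & (1 << 63) else 0  # models the arithmetic shift: 0 or all-ones
    return ((x ^ sign) - sign) & MASK

def abs_csneg(x):
    """New expansion: compare against zero, then conditionally negate
    (adds/csneg)."""
    return (-x if x & (1 << 63) else x) & MASK
```

Both agree everywhere, including the INT64_MIN wraparound case, which is why the expansion change is purely a code-quality improvement.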
Re: [PATCH, combine] Try REG_EQUAL for nonzero_bits

2015-04-27 Thread Jeff Law

On 04/27/2015 04:26 AM, Thomas Preud'homme wrote:

From: Jeff Law [mailto:l...@redhat.com]
Sent: Saturday, April 25, 2015 3:00 AM
Do you have a testcase where this change can result in better generated
code.  If so please add that testcase.  It's OK if it's ARM specific.


Hi Jeff,

Last time I tried, I couldn't reduce the code to a small testcase, but if I
remember correctly it was mostly due to the problem of finding a good test
for creduce (zero extension is not unique enough). I'll try again with a
more manual approach and get back to you.
OK.  No need for heroics -- give it a shot, but don't burn an insane 
amount of time on it.  If we can't get to a reasonable testcase, then so 
be it.


jeff


Re: [PATCH, RFC]: Next stage1, refactoring: propagating rtx subclasses

2015-04-27 Thread Jeff Law

On 04/25/2015 05:49 AM, Richard Sandiford wrote:



@@ -2099,9 +2107,9 @@ fix_crossing_conditional_branches (void)
  emit_label (new_label);

  gcc_assert (GET_CODE (old_label) == LABEL_REF);
- old_label = JUMP_LABEL (old_jump);
- new_jump = emit_jump_insn (gen_jump (old_label));
- JUMP_LABEL (new_jump) = old_label;
+ old_label_insn = JUMP_LABEL_AS_INSN (old_jump);
+ new_jump = emit_jump_insn (gen_jump (old_label_insn));
+ JUMP_LABEL (new_jump) = old_label_insn;

  last_bb = EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb;
  new_bb = create_basic_block (new_label, new_jump, last_bb);


I think the eventual aim would be to have rtx_jump_insn member functions
that get and set the jump label as an rtx_insn, with JUMP_LABEL_AS_INSN
being a stepping stone towards that.  In cases like this it might make
more sense to ensure old_jump has the right type (rtx_jump_insn) and go
straight to the member functions, rather than switching to JUMP_LABEL_AS_INSN
now and then having to rewrite it later.
I'm comfortable with either way, so long as we get there.  I know that 
David certainly found it easier to introduce "scaffolding" early in this 
patch series, then exploit it, then tear down the scaffolding near the 
end of a patch series.



diff --git a/gcc/is-a.h b/gcc/is-a.h
index 58917eb..4fb9dde 100644
--- a/gcc/is-a.h
+++ b/gcc/is-a.h
@@ -46,6 +46,11 @@ TYPE as_a  (pointer)

do_something_with (as_a  *ptr);

+TYPE assert_as_a  (pointer)
+
+Like as_a  (pointer), but uses an assertion that is enabled even in
+non-checking (release) builds.
+
  TYPE safe_as_a  (pointer)

  Like as_a  (pointer), but where pointer could be NULL.  This
@@ -193,6 +198,17 @@ as_a (U *p)
return is_a_helper ::cast (p);
  }

+/* Same as above, but checks the condition even in release build.  */
+
+template 
+inline T
+assert_as_a (U *p)
+{
+  gcc_assert (is_a  (p));
+  return is_a_helper ::cast (p);
+}
+
+
  /* Similar to as_a<>, but where the pointer can be NULL, even if
 is_a_helper doesn't check for NULL.  */


This preserves the behaviour of the original code but I'm not sure
it's worth it.  I doubt the distinction between:

   gcc_assert (JUMP_P (x));

and:

   gcc_checking_assert (JUMP_P (x));

was ever very scientific.  Seems like we should take this refactoring as
an opportunity to make the checking more consistent.
Without some guidelines I suspect usage of gcc_checking_assert would be 
highly inconsistent.


And ultimately we want to get away from the helpers as much as possible, 
instead relying on the static typesystem to detect errors at compile 
time.  So unless there's a compelling reason, I'd prefer not to add more 
"support" for these helpers.


[ snip]



That seems pretty heavy-weight for LRA-local code.  Also, the long-term
plan is for INSN_LIST and rtx_insn to be in separate hierarchies,
at which point we'd have no alias-safe way to distinguish them.
That's certainly what I think ought to happen.  INSN_LIST should turn 
into a standard vector or forward list.  For the use cases in GCC, 
either ought to be acceptable.


Jeff


Re: [PATCH][OpenMP] Fix resolve_device with -foffload=disable

2015-04-27 Thread Ilya Verbin
On Mon, Apr 20, 2015 at 17:16:03 +0300, Ilya Verbin wrote:
> Hi!
> 
> Currently, if a compiler is configured with offloading enabled, the 'devices'
> array in libgomp is properly filled with the number of available devices.
> However, if a program is compiled with -foffload=disable, the resolve_device
> function still returns a pointer to the device, and the host fallback does
> not happen.
> The patch below fixes this issue.
> make check-target-libgomp passed.  OK for trunk?
> 
> 
> libgomp/
>   * libgomp.h (struct gomp_device_descr): Add num_images.
>   * target.c (resolve_device): Call gomp_init_device.  Return NULL if
>   there is no image loaded to the device.
>   (gomp_offload_image_to_device): Increase num_images.
>   (GOMP_offload_unregister): Decrease num_images.
>   (GOMP_target): Don't call gomp_init_device.
>   (GOMP_target_data): Ditto.
>   (GOMP_target_update): Ditto.
>   (gomp_target_init): Set num_images to 0.
>   * testsuite/libgomp.c/target-1-disable.c: New test.

Ping.

  -- Ilya
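The behavior the quoted patch describes can be sketched as follows. This is a hypothetical pseudo-implementation, not libgomp source; the field name num_images comes from the ChangeLog, everything else is invented for illustration:

```python
def resolve_device(devices, device_id):
    """Toy model of the fix: a device only resolves if at least one
    offload image has been loaded for it; otherwise return None,
    which the caller treats as 'fall back to the host'."""
    if device_id < 0 or device_id >= len(devices):
        return None
    dev = devices[device_id]
    if dev.get("num_images", 0) == 0:
        return None  # e.g. -foffload=disable: no images were registered
    return dev
```

With -foffload=disable no image is ever registered, num_images stays 0, and host fallback happens as intended.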


Re: var-tracking vs. pseudo registers (was: Option overriding in the offloading code path)

2015-04-27 Thread Thomas Schwinge
Hi!

On Thu, 26 Feb 2015 11:46:00 +0100, Jakub Jelinek  wrote:
> On Thu, Feb 26, 2015 at 11:27:06AM +0100, Thomas Schwinge wrote:
> > Is this not the right way to skip it, or, Bernd, is this because we're
> > not yet handling some debug stuff in nvptx?  (I tested that
> > 
> > does not help with that.)  The following does make it work (that is,
> > resolve the ICEs), but that feels a bit too much ;-) of a hack:
> > 
> > --- gcc/var-tracking.c
> > +++ gcc/var-tracking.c
> > @@ -10305,7 +10322,8 @@ variable_tracking_main_1 (void)
> >  {
> >bool success;
> >  
> > -  if (flag_var_tracking_assignments < 0)
> > +  if (flag_var_tracking_assignments < 0
> > +  || targetm.no_register_allocation)
> >  {
> >delete_debug_insns ();
> >return 0;
> 
> No, IMHO that is the right fix, not a hack.  But of course would deserve
> a comment.

Jakub has done that in r221693,

-- thanks!


> Plus of course you can also just override
> flag_var_tracking_assignments in the nvptx override option hook, and
> perhaps in tree-streamer-in.c clear flag_var_tracking_assignments in the
> OPTIMIZATION_NODE too (or just add a hook for that, as I said before).

That remains to be done.


Given Jakub's commit got merged into gomp-4_0-branch, I committed the
following to gomp-4_0-branch in r222472:

commit 9973ee1f73fc61ad643e3f7b83331a3e3fef1fd4
Author: tschwinge 
Date:   Mon Apr 27 16:41:31 2015 +

Revert "libgomp Fortran testing: for -g torture testing, disable variable 
tracking."

This reverts commit r221411.

libgomp/
* testsuite/libgomp.fortran/fortran.exp (DG_TORTURE_OPTIONS):
Don't add -fno-var-tracking next to any -g usage.
* testsuite/libgomp.oacc-fortran/fortran.exp (DG_TORTURE_OPTIONS):
Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@222472 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp |7 +++
 libgomp/testsuite/libgomp.fortran/fortran.exp  |4 
 libgomp/testsuite/libgomp.oacc-fortran/fortran.exp |4 
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 3d762bd..0f1b23d 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,3 +1,10 @@
+2015-04-27  Thomas Schwinge  
+
+   * testsuite/libgomp.fortran/fortran.exp (DG_TORTURE_OPTIONS):
+   Don't add -fno-var-tracking next to any -g usage.
+   * testsuite/libgomp.oacc-fortran/fortran.exp (DG_TORTURE_OPTIONS):
+   Likewise.
+
 2015-04-21  Tom de Vries  
Thomas Schwinge  
 
diff --git libgomp/testsuite/libgomp.fortran/fortran.exp 
libgomp/testsuite/libgomp.fortran/fortran.exp
index ead94f3..9e6b643 100644
--- libgomp/testsuite/libgomp.fortran/fortran.exp
+++ libgomp/testsuite/libgomp.fortran/fortran.exp
@@ -21,10 +21,6 @@ dg-init
 # Turn on OpenMP.
 lappend ALWAYS_CFLAGS "additional_flags=-fopenmp"
 
-# TODO: for -g torture testing, disable variable tracking.
-# 
.
-regsub -all -- { -g[^ ]*} $DG_TORTURE_OPTIONS {& -fno-var-tracking} 
DG_TORTURE_OPTIONS
-
 if { $blddir != "" } {
 set lang_source_re {^.*\.[fF](|90|95|03|08)$}
 set lang_include_flags "-fintrinsic-modules-path=${blddir}"
diff --git libgomp/testsuite/libgomp.oacc-fortran/fortran.exp 
libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
index 203106e..a68e039 100644
--- libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
+++ libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
@@ -23,10 +23,6 @@ dg-init
 # Turn on OpenACC.
 lappend ALWAYS_CFLAGS "additional_flags=-fopenacc"
 
-# TODO: for -g torture testing, disable variable tracking.
-# 
.
-regsub -all -- { -g[^ ]*} $DG_TORTURE_OPTIONS {& -fno-var-tracking} 
DG_TORTURE_OPTIONS
-
 if { $blddir != "" } {
 set lang_source_re {^.*\.[fF](|90|95|03|08)$}
 set lang_include_flags "-fintrinsic-modules-path=${blddir}"


Regards,
 Thomas




Re: [PATCH, RFC]: Next stage1, refactoring: propagating rtx subclasses

2015-04-27 Thread Richard Sandiford
Jeff Law  writes:
> On 04/25/2015 05:49 AM, Richard Sandiford wrote:
>> That seems pretty heavy-weight for LRA-local code.  Also, the long-term
>> plan is for INSN_LIST and rtx_insn to be in separate hierarchies,
>> at which point we'd have no alias-safe way to distinguish them.
> That's certainly what I think ought to happen.  INSN_LIST should turn 
> into a standard vector or forward list.  For the use cases in GCC, 
> either ought to be acceptable.

OK.  But I think whatever replaces INSN_LIST will still need to be GCed,
for uses such as nonlocal_goto_handler_labels.  My point was that in this
case we don't want a GCed list, so it'd be better to avoid INSN_LIST
altogether.

Thanks,
Richard



RE: [PATCH][AArch64] Use conditional negate for abs expansion

2015-04-27 Thread Wilco Dijkstra
> James Greenhalgh wrote:
> On Mon, Apr 27, 2015 at 02:42:36PM +0100, Wilco Dijkstra wrote:
> > > -Original Message-
> > > From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> > > Sent: 03 March 2015 16:19
> > > To: GCC Patches
> > > Subject: [PATCH][AArch64] Use conditional negate for abs expansion
> > >
> > > Expand abs into a compare and conditional negate. This is the most
> > > obvious expansion; it enables merging of the comparison into ALU
> > > instructions and is faster on all implementations.
> > > Bootstrapped & regression tested.
> > >
> > > int f(int x) { return abs (x + 1); }
> > >
> > > Before:
> > > add w0, w0, 1
> > > sxtwx0, w0
> > > eor x1, x0, x0, asr 63
> > > sub x1, x1, x0, asr 63
> > > mov x0, x1
> > > ret
> > >
> > > After:
> > > addsw0, w0, 1
> > > csneg   w0, w0, w0, pl
> > > ret
> > >
> > > ChangeLog:
> > >
> > > 2015-03-03  Wilco Dijkstra  
> > >
> > >   * gcc/config/aarch64/aarch64.md (absdi2): optimize abs expansion.
> > >   (csneg3_insn): enable expansion of pattern.
> > >   * gcc/testsuite/gcc.target/aarch64/abs_1.c (abs64): update test
> > >   for new abs expansion.  (abs64_in_dreg): likewise.
> 
> 
> This looks like it breaks support for abs in a D register (for example
> at the end of a loop, or extracted from Neon Intrinsics, etc).
> 
> e.g. (totally contrived...)
> 
>   int64x1_t
>   abs_max (int64x2_t x, int64_t *wb)
>   {
> int64_t y = vgetq_lane_s64 (x, 0);
> if (y < 0)
>   y = -y;
> return vdup_n_s64 (y);
>   }
> 
> Which currently generates:
> 
>   abs_max:
>   abs d0, d0
>   ret
> 
> I suppose we don't need to worry too much about that (and the current
> implementation doesn't seem to catch it reliably anyway), but it would be
> good if we could keep the support - even if it is rarely used.

Well it may be possible to write a peephole for this rare case, but I hope
people would use a vabs if that's what they wanted...

Wilco




C PATCH to reject va_arg (ap, void) (PR c/65901)

2015-04-27 Thread Marek Polacek
As reported in the PR, va_arg (ap, void) probably doesn't make any sense and
should thus be rejected.  cc1plus and clang reject va_arg with an incomplete
type.  This patch makes cc1 reject it as well.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-04-27  Marek Polacek  

PR c/65901
* c-typeck.c (c_build_va_arg): Require TYPE be a complete type.

* gcc.c-torture/compile/pr48767.c (foo): Add dg-error.
* gcc.dg/pr65901.c: New test.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 91735b5..c58e918 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -12648,6 +12648,11 @@ c_build_va_arg (location_t loc, tree expr, tree type)
   if (warn_cxx_compat && TREE_CODE (type) == ENUMERAL_TYPE)
 warning_at (loc, OPT_Wc___compat,
"C++ requires promoted type, not enum type, in %");
+  if (type == error_mark_node || !COMPLETE_TYPE_P (type))
+{
+  c_incomplete_type_error (NULL_TREE, type);
+  return error_mark_node;
+}
   return build_va_arg (loc, expr, type);
 }
 
diff --git gcc/testsuite/gcc.c-torture/compile/pr48767.c 
gcc/testsuite/gcc.c-torture/compile/pr48767.c
index 66cb348..c8fef35 100644
--- gcc/testsuite/gcc.c-torture/compile/pr48767.c
+++ gcc/testsuite/gcc.c-torture/compile/pr48767.c
@@ -3,5 +3,5 @@
 void
 foo (__builtin_va_list ap)
 {
-  __builtin_va_arg (ap, void);
+  __builtin_va_arg (ap, void);  /* { dg-error "invalid use of void expression" 
} */
 }
diff --git gcc/testsuite/gcc.dg/pr65901.c gcc/testsuite/gcc.dg/pr65901.c
index e69de29..8708a1e 100644
--- gcc/testsuite/gcc.dg/pr65901.c
+++ gcc/testsuite/gcc.dg/pr65901.c
@@ -0,0 +1,16 @@
+/* PR c/65901 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+struct S;
+enum E;
+union U;
+
+void
+foo (__builtin_va_list ap)
+{
+  __builtin_va_arg (ap, void);  /* { dg-error "invalid use of void expression" 
} */
+  __builtin_va_arg (ap, struct S);  /* { dg-error "invalid use of undefined 
type" } */
+  __builtin_va_arg (ap, enum E);  /* { dg-error "invalid use of undefined 
type" } */
+  __builtin_va_arg (ap, union U);  /* { dg-error "invalid use of undefined 
type" } */
+}

Marek


Re: [Patch, fortran, pr59678, v1] -- [F03] Segfault on equalizing variables of a complex derived type

2015-04-27 Thread Andre Vehreschild
Hi Paul, hi all,

Paul, thanks for the review. I have integrated all your comments (i) to (iii)
and committed as r222477.

Regards,
Andre

On Sat, 25 Apr 2015 16:42:52 +0200
Paul Richard Thomas  wrote:

> Dear Andre,
> 
> The patch is OK with three changes:
> (i) Put the PR line in the testsuite ChangeLog;
> (ii) Put the dg-do header information in the testcase, together with
> lines to say which PR it fixes and who the contributor is; and
> (iii) Add the testcase for PR65841 since your patch for pr65792 breaks
> this side-effect fix.
> 
> I will turn my attention to your patch for pr65792 next and try to
> figure out why (iii) is necessary.
> 
> Thanks for the patch
> 
> Paul
> 
> 
> 
> On 18 April 2015 at 12:55, Andre Vehreschild  wrote:
> > Hi all,
> >
> > this patch fixes a deep copy issue that occurred when allocatable components
> > of an entity were not allocated. Before the patch, the deep copy was run
> > without checking whether the component is actually allocated, and the
> > program crashed because a null pointer was dereferenced. Furthermore, the
> > code to copy a structure component was not checking the correct ref to
> > determine whether a component was allocated when allocatable components
> > were nested. Example:
> >
> > type InnerT
> >   integer, allocatable :: inner_I
> > end type
> > type T
> >   type(InnerT), allocatable :: in
> > end type
> >
> > The pseudo pseudo code generated for this was something like:
> >
> > subroutine copy(src,dst)
> >   dst = src
> >   if (allocated (src.in.inner_I)) // crash
> > allocate (dst.in)
> >   end if
> >
> >   dst.in.inner_I = src.in.inner_I // crash
> > end subroutine
> >
> > The patch fixes this by generating:
> >
> > subroutine copy(src,dst)
> >   dst = src
> >   if (allocated (src.in))
> > allocate (dst.in)
> > dst.in= src.in
> > if (allocated (src.in.inner_I))
> >   allocate (dst.in.inner_I)
> >   dst.in.inner_I = src.in.inner_I
> > end
> >   end
> > end subroutine
> >
> > Of course this pseudo pseudo code is shortened dramatically to show just
> > the necessary bits.
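The guarded copy in the quoted pseudo code corresponds to the following model of the nesting (illustrative Python with invented names, not the generated code):

```python
class Inner:
    def __init__(self, inner_i=None):
        self.inner_i = inner_i  # models "integer, allocatable :: inner_I"

class Outer:
    def __init__(self, inner=None):
        self.inner = inner      # models "type(InnerT), allocatable :: in"

def deep_copy(src):
    """Check allocation at each level before descending, as in the
    patched copy helper; unallocated components stay unallocated."""
    dst = Outer()
    if src.inner is not None:          # "if (allocated (src.in))"
        dst.inner = Inner()            # "allocate (dst.in)"
        if src.inner.inner_i is not None:
            dst.inner.inner_i = src.inner.inner_i
    return dst
```

The pre-patch bug amounts to testing `src.inner.inner_i` before checking `src.inner` itself, which dereferences a null pointer.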
> >
> > Bootstraps and regtests ok on x86_64-linux-gnu/F21.
> >
> > Ok, for trunk?
> >
> > Thanks to Dominique for identifying the pr addressed by this patch.
> >
> > Regards,
> > Andre
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de
> 
> 
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/testsuite/gfortran.dg/alloc_comp_deep_copy_2.f03
===
--- gcc/testsuite/gfortran.dg/alloc_comp_deep_copy_2.f03	(Revision 0)
+++ gcc/testsuite/gfortran.dg/alloc_comp_deep_copy_2.f03	(Revision 222477)
@@ -0,0 +1,21 @@
+! { dg-do run }
+!
+! Testcase for PR fortran/65841
+! Contributed by Damian Rousson
+!
+program alloc_comp_deep_copy_2
+  type a
+real, allocatable :: f
+  end type
+  type b
+type(a), allocatable :: g
+  end type
+
+  type(b) c,d
+
+  c%g=a(1.) 
+  d=c
+  if (d%g%f /= 1.0) call abort()
+  d%g%f = 2.0
+  if (d%g%f /= 2.0) call abort()
+end program
Index: gcc/testsuite/gfortran.dg/alloc_comp_deep_copy_1.f03
===
--- gcc/testsuite/gfortran.dg/alloc_comp_deep_copy_1.f03	(Revision 0)
+++ gcc/testsuite/gfortran.dg/alloc_comp_deep_copy_1.f03	(Revision 222477)
@@ -0,0 +1,270 @@
+! { dg-do run }
+!
+! Check fix for correctly deep copying allocatable components.
+! PR fortran/59678
+! Contributed by Andre Vehreschild  
+!
+program alloc_comp_copy_test
+
+  type InnerT
+integer :: ii
+integer, allocatable :: ai
+integer, allocatable :: v(:)
+  end type InnerT
+
+  type T
+integer :: i
+integer, allocatable :: a_i
+type(InnerT), allocatable :: it
+type(InnerT), allocatable :: vec(:)
+  end type T
+
+  type(T) :: o1, o2
+  class(T), allocatable :: o3, o4
+  o1%i = 42
+
+  call copyO(o1, o2)
+  if (o2%i /= 42) call abort ()
+  if (allocated(o2%a_i)) call abort()
+  if (allocated(o2%it)) call abort()
+  if (allocated(o2%vec)) call abort()
+
+  allocate (o1%a_i, source=2)
+  call copyO(o1, o2)
+  if (o2%i /= 42) call abort ()
+  if (.not. allocated(o2%a_i)) call abort()
+  if (o2%a_i /= 2) call abort()
+  if (allocated(o2%it)) call abort()
+  if (allocated(o2%vec)) call abort()
+
+  allocate (o1%it)
+  o1%it%ii = 3
+  call copyO(o1, o2)
+  if (o2%i /= 42) call abort ()
+  if (.not. allocated(o2%a_i)) call abort()
+  if (o2%a_i /= 2) call abort()
+  if (.not. allocated(o2%it)) call abort()
+  if (o2%it%ii /= 3) call abort()
+  if (allocated(o2%it%ai)) call abort()
+  if (allocated(o2%it%v)) call abort()
+  if (allocated(o2%vec)) call abort()
+
+  allocate (o1%it%ai)
+  o1%it%ai = 4
+  call copyO(o1, o2)
+  if (o2%i /= 42) call abort ()
+  if (.not. allocated(o2%a_i)) call abort()
+  if (o2%a_i /= 2) call abort()
+  if (.not. allocated(o2%it)) call abort()
+  if (o2%it%ii /= 3) call abort()
+  if (.not. allocated(o2%it%ai)) call abort()
+  if (o2%it%ai /= 4) call abort()
+  if (allocated(o2

Re: [commited, Patch, Fortran, pr60322, addendum] was: [Patch 1/2, Fortran, pr60322] [OOP] Incorrect bounds on polymorphic dummy array

2015-04-27 Thread Andre Vehreschild
Hi all,

sorry, I forgot to svn-add the testcases for the patch of pr60322. I fixed this
with commit r222478.

My apologies for the oversight.

Regards,
Andre

On Thu, 23 Apr 2015 13:34:16 +0200
Andre Vehreschild  wrote:

> Hi Paul, hi all,
> 
> Paul, thanks for the review. I have committed this as r222361.
> 
> Regards,
>   Andre
> 
> On Thu, 16 Apr 2015 21:13:31 +0200
> Paul Richard Thomas  wrote:
> 
> > Hi Andre,
> > 
> > The delta patch is OK for trunk and eventual backport to 5.2.
> > 
> > Thanks for all the hard work
> > 
> > Paul
> > 
> > On 14 April 2015 at 19:00, Andre Vehreschild  wrote:
> > > Hi all,
> > >
> > > during further testing of a big piece of Fortran software I encountered two
> > > bugs with class arrays that are somehow connected to pr60322. I therefore
> > > propose an extended patch for pr60322. Because Paul has already reviewed
> > > most of the extended patch, I give you two patches:
> > >
> > > 1. a full patch, fixing all the issues connected to pr60322, and
> > > 2. a delta patch to get from the reviewed patch to the latest version.
> > >
> > > With the second patch I hope to get a faster review, because it is
> > > significantly shorter.
> > >
> > > Now what was the issue? To be precise there were two issues:
> > >
> > > i. a pointer to a class array (CLASS_DATA(sym).attr.class_pointer == 1)
> > > was dereferenced, which led to an ICE (the patch for this in the delta is
> > > chunk 5 in gfc_conv_expr_descriptor), and
> > >
> > > ii. (and this was a severe brain cracker) in chains of references
> > > consisting of more than one class-(array)-ref, always the _vptr of the
> > > first symbol was taken and not the _vptr of the currently dereferenced
> > > class object. This occurred when Fortran code similar to this was
> > > executed:
> > >
> > > type innerT
> > >   integer, allocatable :: arr(:)
> > > end type
> > >
> > > type T
> > >   class(innerT) :: mat(:,:)
> > > end type
> > >
> > > class(T) :: o
> > >
> > > allocate(o%mat(2,2))
> > > allocate(o%mat(:,:)%arr(10)) ! This is obviously pseudo code,
> > > ! but I think you get what is meant.
> > >
> > > o%mat(1,1)%arr(1) = 1
> > >
> > > In the last line the address to get to arr(1) was computed using the
> > > _vptr->size of o and not of o%mat(1,1). To fix this gfc_component_ref ()
> > > now computes the class' _vptr-ref whenever it does a _data-ref (chunk 1 of
> > > trans-expr.c in the delta patch). The _vptr-ref is stored in gfc_se,
> > > where I added the new member class_vptr. The gfc_se->class_vptr is then
> > > used in array-refs (chunk 2 of trans.c) to get the size of the array
> > > elements of the correct level.
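The second fix can be modeled as plain address arithmetic: the stride for each level of a nested class reference must come from the _vptr->_size of the object currently being dereferenced, not from the outermost symbol's. A minimal sketch with hypothetical names:

```python
def elem_address(base, strides, ref_chain, indices):
    """Walk a chain of class refs; at each level the stride (the
    element size from that level's vtable) is looked up for the ref
    being dereferenced.  The pre-patch bug used the first symbol's
    stride for every level."""
    addr = base
    for level, idx in zip(ref_chain, indices):
        addr += idx * strides[level]
    return addr

# o%mat elements are 16 bytes, o%mat(..)%arr elements 4 bytes (made up).
strides = {"o%mat": 16, "o%mat%arr": 4}
```

Using strides["o%mat"] for the inner level too would overshoot, which is exactly the wrong-_vptr symptom described above.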
> > >
> > > The other chunks of the delta patch are:
> > > - parameter passing fixes, and
> > > - documentation fixes as requested for the version 5 of the pr60322 patch.
> > >
> > > I hope this helps in getting the patch reviewed quickly.
> > >
> > > Bootstraps and regtests ok on x86_64-linux-gnu/F21.
> > >
> > > Ok for trunk -> 6.0?
> > > Ok, for backport to 5.2, once available?
> > >
> > > Note, the patches may apply with shifts, as I forgot to update before
> > > taking the diffs.
> > >
> > > Regards,
> > > Andre
> > >
> > > On Thu, 9 Apr 2015 14:37:09 +0200
> > > Andre Vehreschild  wrote:
> > >
> > >> Hi Paul, hi all,
> > >>
> > >> Paul, thanks for the review. Answers to your questions are inline below:
> > >>
> > >> On Sun, 5 Apr 2015 11:13:05 +0200
> > >> Paul Richard Thomas  wrote:
> > >> 
> > >> > +  /* The dummy is returned for pointer, allocatable or assumed rank
> > >> > arrays.
> > >> > + The check for pointerness needs to be repeated here (it is done
> > >> > in
> > >> > + IS_CLASS_ARRAY (), too), because for class arrays that are
> > >> > pointers, as
> > >> > + is the one of the sym, which is incorrect here.  */
> > >> >
> > >> > What does this mean, please?
> > >>
> > >> The first sentence is about regular arrays and should be unchanged from
> > >> the original source. Then I have to check for class (arrays) that are
> > >> pointers, i.e., independent of whether the sym is a class array or a
> > >> regular pointer to a class object. (The latter shouldn't make it into
> > >> the routine anyway.) IS_CLASS_ARRAY () returns false for too many
> > >> reasons to be of use here. I have to apologize and confess that the
> > >> comment was a mere note to myself to not return to use is_classarray in
> > >> the if below. Let me rephrase the comment to be:
> > >>
> > >> /* The dummy is returned for pointer, allocatable or assumed rank arrays.
> > >>For class arrays the information if sym is an allocatable or pointer
> > >>object needs to be checked explicitly (IS_CLASS_ARRAY can be false for
> > >>too many reasons to be of use here).  */
> > >>
> > >> > +  /* Returning the descriptor for dummy class arrays is hazardous,
> > >> > because
> > >> > + some caller is expecting an expression to apply the component
> > >> > refs to.
> > >> > + Therefore the descri

Re: C PATCH to reject va_arg (ap, void) (PR c/65901)

2015-04-27 Thread Jeff Law

On 04/27/2015 11:30 AM, Marek Polacek wrote:

As reported in the PR, va_arg (ap, void) probably doesn't make any sense and
should be rejected thus.  cc1plus and clang reject va_arg with an incomplete
type.  This patch makes cc1 reject it as well.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-04-27  Marek Polacek  

PR c/65901
* c-typeck.c (c_build_va_arg): Require TYPE be a complete type.

* gcc.c-torture/compile/pr48767.c (foo): Add dg-error.
* gcc.dg/pr65901.c: New test.

OK.
jeff



Re: Extend list of codes accepted by address_operand

2015-04-27 Thread Jeff Law

On 04/27/2015 04:07 AM, Richard Sandiford wrote:

While testing the patch that I'm about to post, I found that the list
of address_operand codes in gensupport.c was incomplete.  This patch
fills in some "obvious" missing entries.

It isn't exactly clear which codes are allowed at the top level of
an address.  decompose_address is known to be general enough for all LRA
targets, but it also technically accepts any unary operator as the outer
code, whereas I think only SIGN_EXTEND and ZERO_EXTEND are used at the
top level.  The routine accepts ZERO_EXTRACT and SIGN_EXTRACT, but I
think in practice they're only used for index values and not full addresses.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* gensupport.c (std_preds): Add missing codes to address_operand entry.

OK.
jeff



Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Richard Biener
On April 27, 2015 6:24:47 PM GMT+02:00, Jeff Law  wrote:
>On 04/27/2015 06:46 AM, Richard Biener wrote:
>> On Mon, 27 Apr 2015, Richard Biener wrote:
>>
>>> On Fri, 24 Apr 2015, Jeff Law wrote:
>>>
 On 02/17/2015 07:58 AM, Richard Biener wrote:
 [ Restarting a old thread... ]

> On a closer look the record_const_or_copy_1 hunk is redundant
> (record_equality is really a bit obfuscated...).
 Agreed.  I'm not entirely sure how it got to this point.

> And record_equality is where the SSA_NAME_VALUEs might end up
> forming chains?  At least because we might record a SSA_NAME_VALUE
> for sth that might be the target of a SSA_NAME_VALUE, as in
>
>a_1 = b_2;  SSA_NAME_VALUE (a_1) == b_2;
>if (b_2 == i_3(D))
>  ... temporarily SSA_NAME_VALUE (b_2) == i_3(D)
>
> thus under if (b_2 == i_3(D)) SSA_NAME_VALUE (a_1) == b_2 and
> SSA_NAME_VALUE (SSA_NAME_VALUE (a_1)) == i_3(D)?  I guess we
> can't easily avoid that because we don't keep track of a
> reverse mapping (nor would we want to update that).
 Right.  In general, the fact that we're in SSA form means that we
>ought not
 get loops in the SSA_NAME_VALUE chain since there's a single static
>assignment
 and we'll visit it once.

 That nice model breaks down in two ways.  First we derive
>equivalences from
 equality conditions like you've shown above.  Second, when
>threading we can
 thread through a loop latch and start reprocessing a block.  The
>interaction
 between those two can result in loops in the value chain.

 What I'm hoping to do in GCC6 is eliminate all this stuff with a
>threader that
 is independent of the dominator walk (and optimizer). Instead of
>walking
 forward through the dominator tree recording key nuggets, then
>removing those
 nuggets as we pop back up the tree, we instead we start with the
>conditional
 and walk up the use-def chains, simplifying as we go, with the goal
>of
 simplifying to a constant, of course.

 You can see the beginnings of that with the FSM bits from
>Sebastian.

 Obviously until those bits are in place, we should try to clean up
>any warts
 in the current implementation.

>
> Btw, in record_equality, the == part of the <= check for the loop
> depth will just randomly swap x and y while we should prefer
> IL canonical order.  If we can't rely on the input being in IL
> canonical order we should prepend the whole function with
>
>if (tree_swap_operands_p (x, y, false))
>  std::swap (x, y);
>
> and change <= to < for the loop-depth check.
 Agreed.  Makes perfect sense.
>>>
>>> I'm now re-bootstrapping and testing the following.
>>
>> Committed as follows, with XFAILing and re-opening PR65217.
>>
>> Richard.
>>
>> 2015-04-27  Richard Biener  
>>
>>  * tree-ssa-dom.c (record_equivalences_from_phis): Valueize PHI arg.
>>  (record_equivalences_from_stmt): Valueize rhs.
>>  (record_equality): Canonicalize x and y order via
>>  tree_swap_operands_p.  Do not swap operands for same loop depth.
>>
>>  * gcc.target/i386/pr65217.c: XFAIL.
>Looks good.  I'd spun the same record_equality changes over the weekend
>
>and have state on regressing 65217.
>
>65217 is an interesting test.
>
>
>   if ((n & -n) != n)
> __builtin_unreachable();
>   return n;
>
>n & -n will (of course) get computed into an SSA_NAME.  We then 
>propagate that name for the use of "n" in the return statement rather 
>than using the effectively zero cost "n".
>
>If we put some smarts into tree_swap_operands_p to order sensibly in 
>this kind of case, we end up regressing a different case that I'll be 
>looking at today.

In this case the temporary we propagate has a single use (in the comparison).  
Might be interesting to disallow this case by extra checks in record equality.  
I wouldn't change tree_swap_operands_p.

Richard.

>
>jeff




Re: [Patch, Fortran] Simplify lbound

2015-04-27 Thread Thomas Koenig
Am 25.04.2015 um 20:12 schrieb Mikael Morin:

> I've double-checked in the standard, and it seems it is not possible to
> simplify after all:
> 
>   If ARRAY is a whole array and either ARRAY is an assumed-size
>   array of rank DIM or dimension DIM of ARRAY has nonzero extent,
>   LBOUND (ARRAY, DIM) has a value equal to the lower bound for
>   subscript DIM of ARRAY. Otherwise the result value is 1.
> 
> We can't tell whether the array is zero-sized, so we can't tell the
> lbound value.

So it is only possible to simplify LBOUND if the lower bound is
equal to one, both for assumed-shape and explicit-shape arrays...
OK.

The attached patch does that, including a test case which catches
that particular case.

> As you may want to simplify in the limited scope of the matmul inlining,
> I'm giving comments about the patch (otherwise you can ignore them):
>  - No need to check for allocatable or pointer, it should be excluded by
> as->type == AS_ASSUMED_SHAPE (but does no harm either).

Actually, no.  You can have assumed-shape allocatable or pointer
dummy arguments which keep their original lbound; see the subroutine
'bar' in the test case.

>  - Please modify the early return condition:
>  if (as && (as->type == AS_DEFERRED || as->type == AS_ASSUMED_SHAPE
>   || as->type == AS_ASSUMED_RANK))
>return NULL;
>and let the existing code do the simplification work.

That is not part of my patch.

> Or drop the lbound simplification idea, and fetch the lbound "by hand"
> at matmul inline time.

I will probably do so as a future optimization, but I think that most
people will see no reason for using different lower bounds, so it is
OK for the time being to (slightly) pessimize this case.

So... here is the new patch.  OK for trunk?

Thomas

2015-04-25  Thomas Koenig  

PR fortran/37131
* simplify.c (simplify_bound): Get constant lower bounds of one
from array spec for assumed and explicit shape arrays if
the lower bounds are indeed one.

2015-04-25  Thomas Koenig  

PR fortran/37131
* gfortran.dg/coarray_lib_this_image_2.f90:  Adjust
scan pattern.
* gfortran.dg/bound_9.f90:  New test case.

P.S:

In an earlier version, I also added

Index: trans-array.c
===
--- trans-array.c   (Revision 222431)
+++ trans-array.c   (Arbeitskopie)
@@ -5693,6 +5693,7 @@ gfc_trans_array_bounds (tree type, gfc_symbol * sy
 to being zero size.  */
  tmp = fold_build2_loc (input_location, GE_EXPR, boolean_type_node,
 stride, gfc_index_zero_node);
+ tmp = gfc_likely (tmp, PRED_FORTRAN_SIZE_ZERO);
  tmp = fold_build3_loc (input_location, COND_EXPR,
 gfc_array_index_type, tmp,
 stride, gfc_index_zero_node);

but that caused the condition to always return true.  I haven't figured
out why, but either I am misunderstanding something here, or gfc_likely
is buggy, or both.

Index: simplify.c
===
--- simplify.c	(Revision 222431)
+++ simplify.c	(Arbeitskopie)
@@ -3445,6 +3445,39 @@ simplify_bound (gfc_expr *array, gfc_expr *dim, gf
 
  done:
 
+  /* If the array shape is assumed shape or explicit, we can simplify lbound
+ to 1 if the given lower bound is one because this matches what lbound
+ should return for an empty array.  */
+
+  if (!upper && as && dim && dim->expr_type == EXPR_CONSTANT
+  && (as->type == AS_ASSUMED_SHAPE || as->type == AS_EXPLICIT) 
+  && ref->u.ar.type != AR_SECTION)
+{
+  /* Watch out for allocatable or pointer dummy arrays, they can have
+	 lower bounds that are not equal to one.  */
+  if (!(array->symtree && array->symtree->n.sym
+	&& (array->symtree->n.sym->attr.allocatable
+		|| array->symtree->n.sym->attr.pointer)))
+	{
+	  unsigned long int ndim;
+	  gfc_expr *lower, *res;
+
+	  ndim = mpz_get_si (dim->value.integer) - 1;
+	  lower = as->lower[ndim];
+	  if (lower->expr_type == EXPR_CONSTANT
+	  && mpz_cmp_si (lower->value.integer, 1) == 0)
+	{
+	  res = gfc_copy_expr (lower);
+	  if (kind)
+		{
+		  int nkind = mpz_get_si (kind->value.integer);
+		  res->ts.kind = nkind;
+		}
+	  return res;
+	}
+	}
+}
+
   if (as && (as->type == AS_DEFERRED || as->type == AS_ASSUMED_SHAPE
 	 || as->type == AS_ASSUMED_RANK))
 return NULL;
! { dg-do  run }
! { dg-options "-fdump-tree-original" }
! Check for different combinations of lbound for dummy arrays,
! stressing empty arrays.  The assignments with "one =" should
! be simplified at compile time.
module tst
  implicit none
contains
  subroutine foo (a, b, one, m)
integer, dimension(:), intent(in) :: a
integer, dimension (-2:), intent(in) :: b
integer, intent(out) :: one, m
one = lbound(a,1)
m = lbound(b,1)
  

Re: [PATCH] Properly valueize values we value-number to

2015-04-27 Thread Jeff Law

On 04/27/2015 12:29 PM, Richard Biener wrote:

On April 27, 2015 6:24:47 PM GMT+02:00, Jeff Law 

n & -n will (of course) get computed into an SSA_NAME.  We then
propagate that name for the use of "n" in the return statement
rather than using the effectively zero cost "n".

If we put some smarts into tree_swap_operands_p to order sensibly
in this kind of case, we end up regressing a different case that
I'll be looking at today.


In this case the temporary we propagate has a single use (in the
comparison).  Might be interesting to disallow this case by extra
checks in record equality.  I wouldn't change tree_swap_operands_p.

Yea, I keep going back and forth over whether or not the heuristic 
should go in tree_swap_operand_p or in record_equality.


It's a fairly narrow issue, so maybe record_equality would be a better spot.

jeff


Re: [PATCH] Add new target h8300-*-linux

2015-04-27 Thread Jeff Law

On 04/26/2015 12:21 AM, Yoshinori Sato wrote:

Updated.
Some cleanup.

Thanks.  Committed to the trunk.
jeff



Re: [Patch, Fortran] Simplify lbound

2015-04-27 Thread Thomas Koenig
Hello world,

here is a slight correction: This patch includes the change to
the test case.

Regards

Thomas


Index: fortran/simplify.c
===
--- fortran/simplify.c	(Revision 222431)
+++ fortran/simplify.c	(Arbeitskopie)
@@ -3445,6 +3445,39 @@ simplify_bound (gfc_expr *array, gfc_expr *dim, gf
 
  done:
 
+  /* If the array shape is assumed shape or explicit, we can simplify lbound
+ to 1 if the given lower bound is one because this matches what lbound
+ should return for an empty array.  */
+
+  if (!upper && as && dim && dim->expr_type == EXPR_CONSTANT
+  && (as->type == AS_ASSUMED_SHAPE || as->type == AS_EXPLICIT) 
+  && ref->u.ar.type != AR_SECTION)
+{
+  /* Watch out for allocatable or pointer dummy arrays, they can have
+	 lower bounds that are not equal to one.  */
+  if (!(array->symtree && array->symtree->n.sym
+	&& (array->symtree->n.sym->attr.allocatable
+		|| array->symtree->n.sym->attr.pointer)))
+	{
+	  unsigned long int ndim;
+	  gfc_expr *lower, *res;
+
+	  ndim = mpz_get_si (dim->value.integer) - 1;
+	  lower = as->lower[ndim];
+	  if (lower->expr_type == EXPR_CONSTANT
+	  && mpz_cmp_si (lower->value.integer, 1) == 0)
+	{
+	  res = gfc_copy_expr (lower);
+	  if (kind)
+		{
+		  int nkind = mpz_get_si (kind->value.integer);
+		  res->ts.kind = nkind;
+		}
+	  return res;
+	}
+	}
+}
+
   if (as && (as->type == AS_DEFERRED || as->type == AS_ASSUMED_SHAPE
 	 || as->type == AS_ASSUMED_RANK))
 return NULL;
Index: testsuite/gfortran.dg/coarray_lib_this_image_2.f90
===
--- testsuite/gfortran.dg/coarray_lib_this_image_2.f90	(Revision 222431)
+++ testsuite/gfortran.dg/coarray_lib_this_image_2.f90	(Arbeitskopie)
@@ -20,7 +20,7 @@ end
 ! { dg-final { scan-tree-dump-times "mylcobound = 5;" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "parm...dim\\\[1\\\].lbound = 5;" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "myucobound =\[^\n\r\]* parm...dim\\\[1\\\].lbound \\+ \[^\n\r\]*_gfortran_caf_num_images \\(0, -1\\).? \\+ -?\[0-9\]+\\);" 1 "original" } }
-! { dg-final { scan-tree-dump-times "mylbound = parm...dim\\\[0\\\].stride >= 0 && parm...dim\\\[0\\\].ubound >= parm...dim\\\[0\\\].lbound \\|\\| parm...dim\\\[0\\\].stride < 0 \\?\[^\n\r\]* parm...dim\\\[0\\\].lbound : 1;" 1 "original" } }
+! { dg-final { scan-tree-dump-times "mylbound = 1;" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "mythis_image = _gfortran_caf_this_image \\(0\\);" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "bar \\(&parm.\[0-9\]+, caf_token.\[0-9\]+, \\(integer\\(kind=\[48\]\\)\\) parm.\[0-9\]+.data - \\(integer\\(kind=\[48\]\\)\\) x\\);" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "_gfortran_caf_init \\(&argc, &argv\\);" 1 "original" } }


Re: [PATCH 2/13] musl libc config

2015-04-27 Thread Jeff Law

On 04/27/2015 08:30 AM, Szabolcs Nagy wrote:


On 22/04/15 16:42, Szabolcs Nagy wrote:

yes, i didn't realize that this depends on the target specific parts

i will prepare an updated patch that works if the target has no musl
dynamic linker name defined



Updated the patch, this time I tested it separately.
(The test revealed that -mmusl was not wired up properly
in linux.opt through the "Negative" directive, now fixed).

A default dynamic linker name is defined for musl: "/dev/null".
(The arch specific headers have to be updated to undef the
MUSL_DYNAMIC_LINKER* macros).
(Other options were considered such as print an error with %e,
but then the user cannot override it with -Wl,-dynamic-linker)

The other parts of the patch (config, include path reordering)
are not changed.

gcc/Changelog

2015-04-27  Gregor Richards  
Szabolcs Nagy  

* config.gcc (LIBC_MUSL): New tm_defines macro.
* config/linux.h (OPTION_MUSL): Define.
(MUSL_DYNAMIC_LINKER, MUSL_DYNAMIC_LINKER32,)
(MUSL_DYNAMIC_LINKER64, MUSL_DYNAMIC_LINKERX32,)
(INCLUDE_DEFAULTS_MUSL_GPP, INCLUDE_DEFAULTS_MUSL_LOCAL,)
(INCLUDE_DEFAULTS_MUSL_PREFIX, INCLUDE_DEFAULTS_MUSL_CROSS,)
(INCLUDE_DEFAULTS_MUSL_TOOL, INCLUDE_DEFAULTS_MUSL_NATIVE): Define.

* config/linux.opt (mmusl): New option.
* configure.ac (gcc_cv_libc_provides_ssp): Add *-*-musl*.
(gcc_cv_target_dl_iterate_phdr): Add *-linux-musl*.

* configure: Regenerate.

Still OK :-)

I'm going to leave review of the target config changes to the target 
maintainers.


jeff



Re: gcc make clean fixes

2015-04-27 Thread Jeff Law

On 04/23/2015 09:34 PM, Jim Wilson wrote:

I noticed that make clean isn't removing C++ binaries.  It was broken here
 https://gcc.gnu.org/ml/gcc-patches/2007-01/msg00214.html
Removing stagestuff accidentally removed those items from the
mostlyclean rule.  They need to be re-added to the language dependent
mostlyclean rules.  All of the old languages are affected, except for
java, which someone already fixed, plus the new language go, which
inherits the same mistake from the older languages.

Also, the new language jit has no clean rules at all, so I added the bare
minimum that we should have.

This was tested by building every language except ada, running make
clean, and looking at what was left.  This gave the expected result.
There are still a few misc things around, but this patch gets most of
it.

I didn't build ada because when I try to install gnat, apt-get tells
me it has to remove gdb.  I'm not willing to try that on my main
development machine.  But the ada patch follows the exact same rule as
the others, which is to add stuff that used to be in stagestuff, and
appears to be trivially correct.

Looks good to me.  Please install if you haven't already done so.

jeff


Re: [PATCH libstdc++] Fix for std::uncaught_exception (PR 62258)

2015-04-27 Thread Jonathan Wakely

On 02/02/15 03:37 +0100, Michael Hanselmann wrote:

Calls to `std::uncaught_exception` after calling
`std::rethrow_exception' always return `true' when
`std::uncaught_exception' should return `false' unless an exception is
in flight. `std::rethrow_exception' does not update
`__cxa_eh_globals::uncaughtExceptions' while the following call to
`__cxa_begin_catch' decrements it. This fixes PR 62258.

The original two-line patch was created by Dmitry Prokoptsev. Michael
Hanselmann implemented a testcase.


Committed to trunk - thanks for the patch and test.

commit 1cfba1875f21f485fe82e8c118b2355c4714bcdc
Author: Jonathan Wakely 
Date:   Mon Apr 27 13:48:23 2015 +0100

2015-04-27  Dmitry Prokoptsev  
	Michael Hanselmann  

	PR libstdc++/62258
	* libsupc++/eh_ptr.cc (rethrow_exception): Increment count of
	uncaught exceptions.
	* testsuite/18_support/exception_ptr/62258.cc: New.

diff --git a/libstdc++-v3/libsupc++/eh_ptr.cc b/libstdc++-v3/libsupc++/eh_ptr.cc
index 685c9e0..dae9246 100644
--- a/libstdc++-v3/libsupc++/eh_ptr.cc
+++ b/libstdc++-v3/libsupc++/eh_ptr.cc
@@ -245,6 +245,9 @@ std::rethrow_exception(std::exception_ptr ep)
   __GXX_INIT_DEPENDENT_EXCEPTION_CLASS(dep->unwindHeader.exception_class);
   dep->unwindHeader.exception_cleanup = __gxx_dependent_exception_cleanup;
 
+  __cxa_eh_globals *globals = __cxa_get_globals ();
+  globals->uncaughtExceptions += 1;
+
 #ifdef _GLIBCXX_SJLJ_EXCEPTIONS
   _Unwind_SjLj_RaiseException (&dep->unwindHeader);
 #else
diff --git a/libstdc++-v3/testsuite/18_support/exception_ptr/62258.cc b/libstdc++-v3/testsuite/18_support/exception_ptr/62258.cc
new file mode 100644
index 000..3734f19
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/exception_ptr/62258.cc
@@ -0,0 +1,61 @@
+// { dg-options "-std=gnu++11" }
+// { dg-require-atomic-builtins "" }
+
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// PR libstdc++/62258
+
+#include <exception>
+#include <testsuite_hooks.h>
+
+struct check_on_destruct
+{
+  ~check_on_destruct();
+};
+
+check_on_destruct::~check_on_destruct()
+{
+  VERIFY(std::uncaught_exception());
+}
+
+int main ()
+{
+  VERIFY(!std::uncaught_exception());
+
+  try
+{
+  check_on_destruct check;
+
+  try
+{
+  throw 1;
+}
+  catch (...)
+{
+  VERIFY(!std::uncaught_exception());
+
+  std::rethrow_exception(std::current_exception());
+}
+}
+  catch (...)
+{
+  VERIFY(!std::uncaught_exception());
+}
+
+  VERIFY(!std::uncaught_exception());
+}

