Re: RFC: [build, ada] Centralize PICFLAG configuration

2011-08-23 Thread Paolo Bonzini

On 08/22/2011 07:11 PM, Rainer Orth wrote:

> installed, thanks.
>
> Do I need to sync the config and libiberty parts to src manually or does
> this happen by some sort of magic?


I'll take care of that.

Paolo


Re: Vector Comparison patch

2011-08-23 Thread Richard Guenther
On Mon, Aug 22, 2011 at 11:11 PM, Artem Shinkarov
 wrote:
> I'll just send you my current version. I'll be a little bit more specific.
>
> The problem starts when you try to lower the following expression:
>
> x = a > b;
> x1 = vcond 
> vcond 
>
> Now, you go from the beginning to the end of the block, and you cannot
> leave a > b, because only vconds are valid expressions to expand.
>
> Now, you meet a > b first. You try to transform it into vcond <a > b,
> -1, 0>, you build this expression, then you try to gimplify it, and
> you see that you have something like:
>
> x' = a >b;
> x = vcond 
> x1 = vcond 
> vcond 
>
> and your gsi stands at the x1 now, so the gimplification created a
> comparison that optab would not understand. And I am not really sure
> that you would be able to solve this problem easily.
>
> It would help if you could create vcond, but you
> can't, and x op y is a single tree that must be gimplified, and I am not
> sure that you can persuade gimplifier to leave this expression
> untouched.
>
> In the attachment the current version of the patch.

I can't reproduce it with your patch.  For

#define vector(elcount, type)  \
__attribute__((vector_size((elcount)*sizeof(type)))) type

vector (4, float) x, y;
vector (4, int) a,b;
int
main (int argc, char *argv[])
{
  vector (4, int) i0 = x < y;
  vector (4, int) i1 = i0 ? a : b;
  return 0;
}

I get from the C frontend:

  vector(4) int i0 =  VEC_COND_EXPR < SAVE_EXPR  , { -1, -1,
-1, -1 } , { 0, 0, 0, 0 } > ;
  vector(4) int i1 =  VEC_COND_EXPR < SAVE_EXPR  , SAVE_EXPR  ,
SAVE_EXPR  > ;

but I have expected i0 != 0 in the second VEC_COND_EXPR.
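
For reference, the per-lane semantics under discussion can be sketched in
plain scalar C. This is only an illustration under the assumptions stated
in the thread (a comparison lane is -1 when true and 0 when false, and a
mask that is not a comparison is tested with != 0). The helper names
emulate_vec_lt and emulate_vec_select are made up for this sketch and are
not part of the patch or of GCC.

```c
#include <stddef.h>

/* Hypothetical helper: elementwise x < y, producing -1 (all ones) for a
   true lane and 0 for a false lane, i.e. the effect of
   VEC_COND_EXPR <x < y, {-1}, {0}> on each element.  */
static void
emulate_vec_lt (const float *x, const float *y, int *mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    mask[i] = x[i] < y[i] ? -1 : 0;
}

/* Hypothetical helper: per-lane select, i.e. the effect of
   VEC_COND_EXPR <mask != 0, a, b> on each element.  */
static void
emulate_vec_select (const int *mask, const int *a, const int *b,
                    int *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = mask[i] != 0 ? a[i] : b[i];
}
```

With x = {1, 5, 3, 7} and y = {2, 4, 3, 8} the mask is {-1, 0, 0, -1}, so
the select takes lanes 0 and 3 from a and lanes 1 and 2 from b.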

I do see that the gimplifier pulls away the condition for the first
VEC_COND_EXPR though:

  x.0 = x;
  y.1 = y;
  D.2735 = x.0 < y.1;
  D.2734 = D.2735;
  D.2736 = D.2734;
  i0 = [vec_cond_expr]  VEC_COND_EXPR < D.2736 , { -1, -1, -1, -1 } ,
{ 0, 0, 0, 0 } > ;

which is, I believe because of the SAVE_EXPR wrapped around the
comparison.  Why do you bother wrapping all operands in save-exprs?

With that the

  /* Currently the expansion of VEC_COND_EXPR does not allow
     expressions where the type of vectors you compare differs
     from the type of vectors you select from. For the time
     being we insert implicit conversions.  */
  if ((COMPARISON_CLASS_P (ifexp)
       && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != TREE_TYPE (op1))
      || TREE_TYPE (ifexp) != TREE_TYPE (op1))

checks will fail (because ifexp is a SAVE_EXPR).

I'll run into errors when not adding the SAVE_EXPR around the ifexp,
the transform into x < y ? {-1,...} : {0,...} is not happening.

>
> Thanks,
> Artem.
>
>
> On Mon, Aug 22, 2011 at 9:58 PM, Richard Guenther
>  wrote:
>> On Mon, Aug 22, 2011 at 10:49 PM, Artem Shinkarov
>>  wrote:
>>> On Mon, Aug 22, 2011 at 9:42 PM, Richard Guenther
>>>  wrote:
 On Mon, Aug 22, 2011 at 5:58 PM, Artem Shinkarov
  wrote:
> On Mon, Aug 22, 2011 at 4:50 PM, Richard Guenther
>  wrote:
>> On Mon, Aug 22, 2011 at 5:43 PM, Artem Shinkarov
>>  wrote:
>>> On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther
>>>  wrote:
 On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov
  wrote:
> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther
>  wrote:
>> On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov
>>  wrote:
>>> On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther
>>>  wrote:
 On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov
  wrote:
> Richard
>
> I formalized the approach a little bit; it now works without target
> hooks, but some polishing is still required. I want you to comment on
> the several important approaches that I use in the patch.
>
> So how does it work?
> 1) All the vector comparisons at the level of the type-checker are
> introduced using VEC_COND_EXPR with constant selection operands being
> {-1} and {0}. For example v0 > v1 is transformed into
> VEC_COND_EXPR<v0 > v1, {-1}, {0}>.
>
> 2) When optabs expand VEC_COND_EXPR, two cases are considered:
> 2.a) the first operand of VEC_COND_EXPR is a comparison; in that case
> nothing changes.
> 2.b) the first operand is something else; in that case we specially mark
> this case, recognize it in the backend, and do not create a
> comparison, but use the mask as if it were the result of some comparison.
>
> 3) In order to make sure that the mask in VEC_COND_EXPR is a
> vector comparison we use the is_vector_comparison function; if it returns
> false, then we replace mask with mask != {0}.
>
> So we end up with the following functionality:
> VEC_COND_EXPR -- if we know that

[PATCH][1/n] Wading through data-dependence analysis

2011-08-23 Thread Richard Guenther

I'm somewhat stuck with fixing PR50067 because when I fix some bugs
I get missed-optimizations because we do rely on those bugs ...

Well, I'm still trying to not dump in one mega-patch rewriting it
all, so this is the only piece that passed bootstrap and regtest
individually (heh ...).  It will avoid some (but unfortunately
not all) fallout from followups.  And it adds some more debugging
printing.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2011-08-23  Richard Guenther  

* tree-data-ref.c (dr_analyze_indices): Add comments, handle
REALPART_EXPR and IMAGPART_EXPR similar to ARRAY_REFs.
(create_data_ref): Also dump access functions for the created
data-ref.

Index: gcc/tree-data-ref.c
===
--- gcc/tree-data-ref.c (revision 177949)
+++ gcc/tree-data-ref.c (working copy)
@@ -844,20 +844,35 @@ dr_analyze_indices (struct data_referenc
   if (nest)
 before_loop = block_before_loop (nest);
 
+  /* Analyze access functions of dimensions we know to be independent.  */
   while (handled_component_p (aref))
 {
+  /* For ARRAY_REFs the base is the reference with the index replaced
+by zero.  */
   if (TREE_CODE (aref) == ARRAY_REF)
{
  op = TREE_OPERAND (aref, 1);
  if (nest)
{
- access_fn = analyze_scalar_evolution (loop, op);
+ access_fn = analyze_scalar_evolution (loop, op);
  access_fn = instantiate_scev (before_loop, loop, access_fn);
  VEC_safe_push (tree, heap, access_fns, access_fn);
}
-
  TREE_OPERAND (aref, 1) = build_int_cst (TREE_TYPE (op), 0);
}
+  /* REALPART_EXPR and IMAGPART_EXPR can be handled like accesses
+into a two element array with a constant index.  The base is
+then just the immediate underlying object.  */
+  else if (TREE_CODE (aref) == REALPART_EXPR)
+   {
+ ref = TREE_OPERAND (ref, 0);
+ VEC_safe_push (tree, heap, access_fns, integer_zero_node);
+   }
+  else if (TREE_CODE (aref) == IMAGPART_EXPR)
+   {
+ ref = TREE_OPERAND (ref, 0);
+ VEC_safe_push (tree, heap, access_fns, integer_one_node);
+   }
 
   aref = TREE_OPERAND (aref, 0);
 }
@@ -956,6 +971,7 @@ create_data_ref (loop_p nest, loop_p loo
 
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
+  unsigned i;
   fprintf (dump_file, "\tbase_address: ");
   print_generic_expr (dump_file, DR_BASE_ADDRESS (dr), TDF_SLIM);
   fprintf (dump_file, "\n\toffset from base address: ");
@@ -969,6 +985,11 @@ create_data_ref (loop_p nest, loop_p loo
   fprintf (dump_file, "\n\tbase_object: ");
   print_generic_expr (dump_file, DR_BASE_OBJECT (dr), TDF_SLIM);
   fprintf (dump_file, "\n");
+  for (i = 0; i < DR_NUM_DIMENSIONS (dr); i++)
+   {
+ fprintf (dump_file, "\tAccess function %d: ", i);
+ print_generic_stmt (dump_file, DR_ACCESS_FN (dr, i), TDF_SLIM);
+   }
 }
 
   return dr;
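
As a source-level illustration of the comment added above (REALPART_EXPR
and IMAGPART_EXPR handled like accesses into a two-element array with a
constant index), the following sketch relies on the C99 guarantee that a
complex type has the same representation as an array of two elements of
the corresponding real type, real part first. The helper name
complex_part is hypothetical and only serves this example; it says nothing
about how tree-data-ref.c itself represents the access.

```c
#include <complex.h>

/* Hypothetical helper: read part IDX (0 = real, 1 = imaginary) of *Z by
   viewing the complex object as a two-element array of double.  This
   mirrors the constant access functions 0 and 1 pushed by the patch.  */
static double
complex_part (const double complex *z, int idx)
{
  const double *parts = (const double *) z;
  return parts[idx];
}
```

Index 0 corresponds to integer_zero_node for the real part and index 1 to
integer_one_node for the imaginary part.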


Re: Add __builtin_clrsb, similar to clz/ctz

2011-08-23 Thread Jakub Jelinek
On Mon, Jun 20, 2011 at 09:38:22PM +0200, Bernd Schmidt wrote:
> D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
> showed a problem with some of the existing simplify_const_unop cases:
> for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
> operand, rather than the mode of the operation. This limits what we can
> do in that function, since op_mode is sometimes VOIDmode - we really
> should add builtin folders for these at some point.

>   * simplify-rtx.c (simplify_const_unary_operation): Likewise.
>   Use op_mode rather than mode when optimizing ffs, clz, ctz, parity
>   and popcount.

This change is IMHO wrong, see e.g.
PR50161 where we have (subreg:SI (popcount:DI (const_int -1))).  This
is supposed to yield 64, but with your changes
it yields 128 - the op_mode here is VOIDmode, so the first if that used
to handle it is no longer used, but as width is <= 2 *
HOST_BITS_PER_WIDE_INT, it is treated as TImode constant.
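
To make the 64 versus 128 discrepancy concrete, here is a small sketch in
plain C. It assumes a 64-bit HOST_WIDE_INT and models a sign-extended
constant as two 64-bit words; popcount_in_width is a hypothetical helper
written for this illustration, not a function from simplify-rtx.c.

```c
#include <stdint.h>

/* Hypothetical helper: count the set bits in the low WIDTH bits
   (1..128) of a constant held as two 64-bit words LO and HI.  A
   CONST_INT -1 sign-extended to double-word width has both words
   all ones.  */
static unsigned
popcount_in_width (uint64_t lo, uint64_t hi, unsigned width)
{
  unsigned count = 0;
  for (unsigned bit = 0; bit < width; bit++)
    {
      uint64_t word = bit < 64 ? lo : hi;
      count += (unsigned) ((word >> (bit % 64)) & 1);
    }
  return count;
}
```

Counting in the mode of the operation (DImode, width 64) gives 64 for the
all-ones constant, while counting over the double-word width gives 128,
which is the wrong answer described above.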

IMHO the best approach would be to mandate that for unary ops like
FFS, CLZ, CLRSB, CTZ, POPCOUNT, PARITY, BSWAP the operand has the same mode
(or VOIDmode) as the unary rtx and that the operation is carried out in
the unop's mode. It shouldn't be hard to adjust the few
*.md patterns (mainly in avr.md, bfin.md).
I think it is bad enough that ZERO_EXTEND must not have CONST_INT argument,
making CONST_INT undefined also for all these unary ops is unnecessary.
I think for NEG/NOT we already have such a guarantee (and thus your change
pessimizes it anyway).
avr.md/bfin.md etc. can use (subreg:HI (popcount:SI (match_operand:SI ...)))
(or (zero_extend:HI (popcount:QI (match_operand:QI ...)))) and similar.

Or the
  /* We can do some operations on integer CONST_DOUBLEs.  Also allow
 for a DImode operation on a CONST_INT.  */
  else if (GET_MODE (op) == VOIDmode
   && width <= HOST_BITS_PER_WIDE_INT * 2
   && (GET_CODE (op) == CONST_DOUBLE
   || CONST_INT_P (op)))
case would need to change too to test that op_width ==
HOST_BITS_PER_WIDE_INT * 2 (but then, it would again pessimize at least
NEG/NOT/ABS that are defined sanely).  But we'd also need to change many
other places, e.g. cse_process_notes_1, that currently special case
ZERO_EXTEND/SUBREG (and sometimes SIGN_EXTEND) and pessimize them because
those rtxes aren't allowed to have VOIDmode arguments.  cse_process_notes_1
perhaps could be changed for VOIDmode new_rtx to try to
simplify_replace_rtx it...

Jakub


Re: Add __builtin_clrsb, similar to clz/ctz

2011-08-23 Thread Bernd Schmidt
On 08/23/11 11:05, Jakub Jelinek wrote:
> On Mon, Jun 20, 2011 at 09:38:22PM +0200, Bernd Schmidt wrote:
>> D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
>> showed a problem with some of the existing simplify_const_unop cases:
>> for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
>> operand, rather than the mode of the operation. This limits what we can
>> do in that function, since op_mode is sometimes VOIDmode - we really
>> should add builtin folders for these at some point.
> 
>>  * simplify-rtx.c (simplify_const_unary_operation): Likewise.
>>  Use op_mode rather than mode when optimizing ffs, clz, ctz, parity
>>  and popcount.
> 
> This change is IMHO wrong,

Conceptually, I think it is exactly right. It may however be
inconvenient in some cases.

> see e.g.
> PR50161 where we have (subreg:SI (popcount:DI (const_int -1))).  This
> is supposed to yield 64, but with your changes
> it yields 128 - the op_mode here is VOIDmode,

This is what shouldn't happen.

> cse_process_notes_1
> perhaps could be changed for VOIDmode new_rtx to try to
> simplify_replace_rtx it...

Is this where the problem came from? Sounds like it's worth a try.

Wasn't Richard S. working on a patch to give constants modes?


Bernd



Re: Vector Comparison patch

2011-08-23 Thread Artem Shinkarov
On Tue, Aug 23, 2011 at 9:17 AM, Richard Guenther
 wrote:
> On Mon, Aug 22, 2011 at 11:11 PM, Artem Shinkarov
>  wrote:
>> I'll just send you my current version. I'll be a little bit more specific.
>>
>> The problem starts when you try to lower the following expression:
>>
>> x = a > b;
>> x1 = vcond 
>> vcond 
>>
>> Now, you go from the beginning to the end of the block, and you cannot
>> leave a > b, because only vconds are valid expressions to expand.
>>
>> Now, you meet a > b first. You try to transform it into vcond <a > b,
>> -1, 0>, you build this expression, then you try to gimplify it, and
>> you see that you have something like:
>>
>> x' = a >b;
>> x = vcond 
>> x1 = vcond 
>> vcond 
>>
>> and your gsi stands at the x1 now, so the gimplification created a
>> comparison that optab would not understand. And I am not really sure
>> that you would be able to solve this problem easily.
>>
>> It would help if you could create vcond, but you
>> can't, and x op y is a single tree that must be gimplified, and I am not
>> sure that you can persuade gimplifier to leave this expression
>> untouched.
>>
>> In the attachment the current version of the patch.
>
> I can't reproduce it with your patch.  For
>
> #define vector(elcount, type)  \
>    __attribute__((vector_size((elcount)*sizeof(type)))) type
>
> vector (4, float) x, y;
> vector (4, int) a,b;
> int
> main (int argc, char *argv[])
> {
>  vector (4, int) i0 = x < y;
>  vector (4, int) i1 = i0 ? a : b;
>  return 0;
> }
>
> I get from the C frontend:
>
>  vector(4) int i0 =  VEC_COND_EXPR < SAVE_EXPR  , { -1, -1,
> -1, -1 } , { 0, 0, 0, 0 } > ;
>  vector(4) int i1 =  VEC_COND_EXPR < SAVE_EXPR  , SAVE_EXPR  ,
> SAVE_EXPR  > ;
>
> but I have expected i0 != 0 in the second VEC_COND_EXPR.

I don't put it there. This patch adds != 0 rather than removing it. But this
could be changed.

> I do see that the gimplifier pulls away the condition for the first
> VEC_COND_EXPR though:
>
>  x.0 = x;
>  y.1 = y;
>  D.2735 = x.0 < y.1;
>  D.2734 = D.2735;
>  D.2736 = D.2734;
>  i0 = [vec_cond_expr]  VEC_COND_EXPR < D.2736 , { -1, -1, -1, -1 } ,
> { 0, 0, 0, 0 } > ;
>
> which is, I believe because of the SAVE_EXPR wrapped around the
> comparison.  Why do you bother wrapping all operands in save-exprs?

I bother because they could be MAYBE_CONST which breaks the
gimplifier. But I don't really know if you can do it better. I can
always do this checking on operands of constructed vcond...

You are right, that if you just put a comparison of variables there
then we are fine. My point is that whenever gimplifier is pulling out
the comparison from the first operand, replacing it with the variable,
then we are screwed, because there is no chance to put it back, and
that is exactly what happens in expand_vector_comparison, if you
uncomment the replacement -- comparison is always represented as
x = a > b.

> With that the
>
>  /* Currently the expansion of VEC_COND_EXPR does not allow
>     expressions where the type of vectors you compare differs
>     from the type of vectors you select from. For the time
>     being we insert implicit conversions.  */
>  if ((COMPARISON_CLASS_P (ifexp)
>       && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != TREE_TYPE (op1))
>      || TREE_TYPE (ifexp) != TREE_TYPE (op1))
>
> checks will fail (because ifexp is a SAVE_EXPR).
>
> I'll run into errors when not adding the SAVE_EXPR around the ifexp,
> the transform into x < y ? {-1,...} : {0,...} is not happening.
>>
>> Thanks,
>> Artem.
>>
>>
>> On Mon, Aug 22, 2011 at 9:58 PM, Richard Guenther
>>  wrote:
>>> On Mon, Aug 22, 2011 at 10:49 PM, Artem Shinkarov
>>>  wrote:
 On Mon, Aug 22, 2011 at 9:42 PM, Richard Guenther
  wrote:
> On Mon, Aug 22, 2011 at 5:58 PM, Artem Shinkarov
>  wrote:
>> On Mon, Aug 22, 2011 at 4:50 PM, Richard Guenther
>>  wrote:
>>> On Mon, Aug 22, 2011 at 5:43 PM, Artem Shinkarov
>>>  wrote:
 On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther
  wrote:
> On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov
>  wrote:
>> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther
>>  wrote:
>>> On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov
>>>  wrote:
 On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther
  wrote:
> On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov
>  wrote:
>> Richard
>>
>> I formalized the approach a little bit; it now works without target
>> hooks, but some polishing is still required. I want you to comment on
>> the several important approaches that I use in the patch.
>>
>> So how does it work?
>> 1) All the vector comparisons at the level of the type-checker are
>> introduced using VEC_COND_EXPR with constant selection operands being
>> {-1} and {0}. For example v0 > v1 is transformed

[Patch, Fortran] PR 31600 - Better diagnosis when redeclaring used-assoc symbol

2011-08-23 Thread Tobias Burnus

Before, one got the following error for the attached test case:

integer :: bar
  1
Error: Symbol 'bar' at (1) already has basic type of INTEGER


Which can be a bit puzzling in larger programs. With the patch, one gets:

use_16.f90:15.14:
integer :: bar
  1
use_16.f90:13.83:
use a
 2
Error: Symbol 'bar' at (1) conflicts with symbol from module 'a', 
use-associated at (2)




The module change is a bit unrelated but makes the error a bit more 
readable. Instead of having:


 dg-error "Symbol 'bar' at \\(1\\) conflicts with symbol from module 'a'" }
   2

One now has:

use a ! { dg-error "Symbol 'bar' at \\(1\\) conflicts with symbol from 
module '

2


Built and regtested on x86-64-linux.
OK for the trunk?

Tobias
2011-08-23  Tobias Burnus  

	PR fortran/31600
	* symbol.c (gfc_add_type): Better diagnostic if redefining
	use-associated symbol.
	* module.c (gfc_use_module): Use module name as locus.

2011-08-23  Tobias Burnus  

	PR fortran/31600
	* gfortran.dg/use_16.f90: New.

diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index aef3404..4250a17 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -5727,6 +5727,9 @@ gfc_use_module (void)
   int c, line, start;
   gfc_symtree *mod_symtree;
   gfc_use_list *use_stmt;
+  locus old_locus = gfc_current_locus;
+
+  gfc_current_locus = use_locus;
 
   filename = (char *) alloca (strlen (module_name) + strlen (MODULE_EXTENSION)
 			  + 1);
@@ -5748,6 +5751,7 @@ gfc_use_module (void)
 			 "intrinsic module at %C") != FAILURE)
{
 	 use_iso_fortran_env_module ();
+	 gfc_current_locus = old_locus;
 	 return;
}
 
@@ -5756,6 +5760,7 @@ gfc_use_module (void)
 			 "ISO_C_BINDING module at %C") != FAILURE)
 	{
 	  import_iso_c_binding_module();
+	  gfc_current_locus = old_locus;
 	  return;
 	}
 
@@ -5845,6 +5850,8 @@ gfc_use_module (void)
   gfc_rename_list = NULL;
   use_stmt->next = gfc_current_ns->use_stmts;
   gfc_current_ns->use_stmts = use_stmt;
+
+  gfc_current_locus = old_locus;
 }
 
 
diff --git a/gcc/fortran/symbol.c b/gcc/fortran/symbol.c
index 4463460..126a52b 100644
--- a/gcc/fortran/symbol.c
+++ b/gcc/fortran/symbol.c
@@ -1672,7 +1672,12 @@ gfc_add_type (gfc_symbol *sym, gfc_typespec *ts, locus *where)
 
   if (type != BT_UNKNOWN && !(sym->attr.function && sym->attr.implicit_type))
 {
-  gfc_error ("Symbol '%s' at %L already has basic type of %s", sym->name,
+  if (sym->attr.use_assoc)
+	gfc_error ("Symbol '%s' at %L conflicts with symbol from module '%s', "
+		   "use-associated at %L", sym->name, where, sym->module,
+		   &sym->declared_at);
+  else
+	gfc_error ("Symbol '%s' at %L already has basic type of %s", sym->name,
 		 where, gfc_basic_typename (type));
   return FAILURE;
 }
--- /dev/null	2011-08-23 07:28:57.751883742 +0200
+++ gcc/gcc/testsuite/gfortran.dg/use_16.f90	2011-08-23 09:59:19.0 +0200
@@ -0,0 +1,18 @@
+! { dg-do compile }
+!
+! PR fortran/31600
+!
+module a
+implicit none
+contains
+  integer function bar()
+bar = 42
+  end function
+end module a
+
+use a ! { dg-error "Symbol 'bar' at \\(1\\) conflicts with symbol from module 'a'" }
+implicit none
+integer :: bar ! { dg-error "Symbol 'bar' at \\(1\\) conflicts with symbol from module 'a'" }
+end
+
+! { dg-final { cleanup-modules "a" } }


Re: Add __builtin_clrsb, similar to clz/ctz

2011-08-23 Thread Richard Guenther
On Tue, Aug 23, 2011 at 11:35 AM, Bernd Schmidt  wrote:
> On 08/23/11 11:05, Jakub Jelinek wrote:
>> On Mon, Jun 20, 2011 at 09:38:22PM +0200, Bernd Schmidt wrote:
>>> D'oh. Blackfin has a (clrsb:HI (operand:SI)) instruction, so adding this
>>> showed a problem with some of the existing simplify_const_unop cases:
>>> for ffs/clz/ctz/clrsb/parity/popcount, we should look at the mode of the
>>> operand, rather than the mode of the operation. This limits what we can
>>> do in that function, since op_mode is sometimes VOIDmode - we really
>>> should add builtin folders for these at some point.
>>
>>>      * simplify-rtx.c (simplify_const_unary_operation): Likewise.
>>>      Use op_mode rather than mode when optimizing ffs, clz, ctz, parity
>>>      and popcount.
>>
>> This change is IMHO wrong,
>
> Conceptually, I think it is exactly right. It may however be
> inconvenient in some cases.
>
>> see e.g.
>> PR50161 where we have (subreg:SI (popcount:DI (const_int -1))).  This
>> is supposed to yield 64, but with your changes
>> it yields 128 - the op_mode here is VOIDmode,
>
> This is what shouldn't happen.

If it shouldn't happen, does some verifier catch it?

>> cse_process_notes_1
>> perhaps could be changed for VOIDmode new_rtx to try to
>> simplify_replace_rtx it...
>
> Is this where the problem came from? Sounds like it's worth a try.
>
> Wasn't Richard S. working on a patch to give constants modes?
>
>
> Bernd
>
>


Re: Add __builtin_clrsb, similar to clz/ctz

2011-08-23 Thread Jakub Jelinek
On Tue, Aug 23, 2011 at 11:35:07AM +0200, Bernd Schmidt wrote:
> > cse_process_notes_1
> > perhaps could be changed for VOIDmode new_rtx to try to
> > simplify_replace_rtx it...
> 
> Is this where the problem came from? Sounds like it's worth a try.

In this case, yes.  But there are many other places all around the
compiler that need to disallow unary op with VOIDmode operand.
In cse.c alone e.g. fold_rtx (twice), in combine.c e.g. in do_SUBST,
subst, etc.  Do we want to special case all those 7 unary ops there too?
Is it really worth it to save one subreg or truncate in the md patterns
for rarely used rtxes?

> Wasn't Richard S. working on a patch to give constants modes?

I don't think this is achievable for 4.7...

Jakub


Re: Add __builtin_clrsb, similar to clz/ctz

2011-08-23 Thread Bernd Schmidt
On 08/23/11 11:52, Jakub Jelinek wrote:
> On Tue, Aug 23, 2011 at 11:35:07AM +0200, Bernd Schmidt wrote:
>>> cse_process_notes_1
>>> perhaps could be changed for VOIDmode new_rtx to try to
>>> simplify_replace_rtx it...
>>
>> Is this where the problem came from? Sounds like it's worth a try.
> 
> In this case, yes.  But there are many other places all around the
> compiler that need to disallow unary op with VOIDmode operand.
> In cse.c alone e.g. fold_rtx (twice), in combine.c e.g. in do_SUBST,
> subst, etc.  Do we want to special case all those 7 unary ops there too?
> Is it really worth it to save one subreg or truncate in the md patterns
> for rarely used rtxes?

Maybe not. I'll approve a patch to change it back, even if I think it's
not a good representation.


Bernd


Re: Add __builtin_clrsb, similar to clz/ctz

2011-08-23 Thread Richard Sandiford
Bernd Schmidt  writes:
> Wasn't Richard S. working on a patch to give constants modes?

A whole series more like.  Don't hold your breath!

Richard


Re: Vector Comparison patch

2011-08-23 Thread Richard Guenther
On Tue, Aug 23, 2011 at 11:44 AM, Artem Shinkarov
 wrote:
> On Tue, Aug 23, 2011 at 9:17 AM, Richard Guenther
>  wrote:
>> On Mon, Aug 22, 2011 at 11:11 PM, Artem Shinkarov
>>  wrote:
>>> I'll just send you my current version. I'll be a little bit more specific.
>>>
>>> The problem starts when you try to lower the following expression:
>>>
>>> x = a > b;
>>> x1 = vcond 
>>> vcond 
>>>
>>> Now, you go from the beginning to the end of the block, and you cannot
>>> leave a > b, because only vconds are valid expressions to expand.
>>>
>>> Now, you meet a > b first. You try to transform it into vcond <a > b,
>>> -1, 0>, you build this expression, then you try to gimplify it, and
>>> you see that you have something like:
>>>
>>> x' = a >b;
>>> x = vcond 
>>> x1 = vcond 
>>> vcond 
>>>
>>> and your gsi stands at the x1 now, so the gimplification created a
>>> comparison that optab would not understand. And I am not really sure
>>> that you would be able to solve this problem easily.
>>>
>>> It would help if you could create vcond, but you
>>> can't, and x op y is a single tree that must be gimplified, and I am not
>>> sure that you can persuade gimplifier to leave this expression
>>> untouched.
>>>
>>> In the attachment the current version of the patch.
>>
>> I can't reproduce it with your patch.  For
>>
>> #define vector(elcount, type)  \
>>    __attribute__((vector_size((elcount)*sizeof(type)))) type
>>
>> vector (4, float) x, y;
>> vector (4, int) a,b;
>> int
>> main (int argc, char *argv[])
>> {
>>  vector (4, int) i0 = x < y;
>>  vector (4, int) i1 = i0 ? a : b;
>>  return 0;
>> }
>>
>> I get from the C frontend:
>>
>>  vector(4) int i0 =  VEC_COND_EXPR < SAVE_EXPR  , { -1, -1,
>> -1, -1 } , { 0, 0, 0, 0 } > ;
>>  vector(4) int i1 =  VEC_COND_EXPR < SAVE_EXPR  , SAVE_EXPR  ,
>> SAVE_EXPR  > ;
>>
>> but I have expected i0 != 0 in the second VEC_COND_EXPR.
>
> I don't put it there. This patch adds != 0 rather than removing it. But this
> could be changed.

?

>> I do see that the gimplifier pulls away the condition for the first
>> VEC_COND_EXPR though:
>>
>>  x.0 = x;
>>  y.1 = y;
>>  D.2735 = x.0 < y.1;
>>  D.2734 = D.2735;
>>  D.2736 = D.2734;
>>  i0 = [vec_cond_expr]  VEC_COND_EXPR < D.2736 , { -1, -1, -1, -1 } ,
>> { 0, 0, 0, 0 } > ;
>>
>> which is, I believe because of the SAVE_EXPR wrapped around the
>> comparison.  Why do you bother wrapping all operands in save-exprs?
>
> I bother because they could be MAYBE_CONST which breaks the
> gimplifier. But I don't really know if you can do it better. I can
> always do this checking on operands of constructed vcond...

Err, the patch does

+  /* Avoid C_MAYBE_CONST in VEC_COND_EXPR.  */
+  tmp = c_fully_fold (ifexp, false, &maybe_const);
+  ifexp = save_expr (tmp);
+  wrap &= maybe_const;

why is

  ifexp = save_expr (tmp);

necessary here?  SAVE_EXPR is if you need to protect side-effects
from being evaluated twice if you use an operand twice.  But all
operands are just used a single time.
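
As a source-level analogy for the point about SAVE_EXPR (it only matters
when an operand with side effects is used more than once), consider the
sketch below. It is an illustration in ordinary C, not a description of
how GCC implements SAVE_EXPR; the names SQUARE_UNSAVED, square_saved,
next_value and calls are all made up for this example.

```c
/* The operand appears twice in the expansion, so any side effects in it
   are evaluated twice.  */
#define SQUARE_UNSAVED(e) ((e) * (e))

/* Counts how often next_value has been called.  */
static int calls;

/* An operand with a side effect: it bumps the call counter.  */
static int
next_value (void)
{
  calls++;
  return 3;
}

/* The argument is evaluated once and the saved value is reused, which is
   the protection SAVE_EXPR provides for an operand used twice.  */
static int
square_saved (int e)
{
  return e * e;
}
```

SQUARE_UNSAVED (next_value ()) calls next_value twice, while
square_saved (next_value ()) calls it once. When every operand is used
only a single time, as in the VEC_COND_EXPR being built here, there is
nothing for the saving step to protect.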

And I expected, instead of

+  if ((COMPARISON_CLASS_P (ifexp)
+       && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != TREE_TYPE (op1))
+      || TREE_TYPE (ifexp) != TREE_TYPE (op1))
+    {
+      tree comp_type = COMPARISON_CLASS_P (ifexp)
+                      ? TREE_TYPE (TREE_OPERAND (ifexp, 0))
+                      : TREE_TYPE (ifexp);
+
+      op1 = convert (comp_type, op1);
+      op2 = convert (comp_type, op2);
+      vcond = build3 (VEC_COND_EXPR, comp_type, ifexp, op1, op2);
+      vcond = convert (TREE_TYPE (op1), vcond);
+    }
+  else
+    vcond = build3 (VEC_COND_EXPR, TREE_TYPE (op1), ifexp, op1, op2);

  if (!COMPARISON_CLASS_P (ifexp))
    ifexp = build2 (NE_EXPR, TREE_TYPE (ifexp), ifexp,
                    build_vector_from_val (TREE_TYPE (ifexp), 0));

  if (TREE_TYPE (ifexp) != TREE_TYPE (op1))
    {
...

> You are right, that if you just put a comparison of variables there
> then we are fine. My point is that whenever gimplifier is pulling out
> the comparison from the first operand, replacing it with the variable,
> then we are screwed, because there is no chance to put it back, and
> that is exactly what happens in expand_vector_comparison, if you
> uncomment the replacement -- comparison is always represented as
> x = a > b.
>
>> With that the
>>
>>  /* Currently the expansion of VEC_COND_EXPR does not allow
>>     expressions where the type of vectors you compare differs
>>     from the type of vectors you select from. For the time
>>     being we insert implicit conversions.  */
>>  if ((COMPARISON_CLASS_P (ifexp)
>>       && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != TREE_TYPE (op1))
>>      || TREE_TYPE (ifexp) != TREE_TYPE (op1))
>>
>> checks will fail (because ifexp is a SAVE_EXPR).
>>
>> I'll run into errors when not adding the SAVE_EXPR around the ifexp,
>> the transform into x < y ? {-1,...} : {0,...} is not happening.


PING: allow match_test to be used for attribute rtxes

2011-08-23 Thread Richard Sandiford
Ping for:

http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01181.html
http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01182.html

which allow attributes to use (match_test ...) instead of
(ne (symbol_ref ..) (const_int 0)).

Uros has already approved the x86 part (thanks).

Richard


Ping: Rename across basic block boundaries

2011-08-23 Thread Bernd Schmidt
Ping for the patch at
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01595.html

> This patch requires
>   http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02193.html
> as a prerequisite, and supersedes
>   http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02194.html
> 
> The idea here is to allow regrename to operate across basic block
> boundaries. This helps for targets that use sched_ebb (such as C6X), and
> by exposing more chains, we also help on targets that can benefit from
> PREFERRED_RENAME_CLASS (again, C6X).


Bernd


[PATCH] Fix PR50162

2011-08-23 Thread Richard Guenther

We fail to properly lookup the last argument of a pair of vectorized
args when vectorizing a packing function call.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
so far, to eventually catch fallout.

Richard.

2011-08-23  Richard Guenther  

PR tree-optimization/50162
* tree-vect-stmts.c (vectorizable_call): Fix argument lookup.

Index: gcc/tree-vect-stmts.c
===
--- gcc/tree-vect-stmts.c   (revision 177983)
+++ gcc/tree-vect-stmts.c   (working copy)
@@ -1697,7 +1697,7 @@ vectorizable_call (gimple stmt, gimple_s
}
  else
{
- vec_oprnd1 = gimple_call_arg (new_stmt, 2*i);
+ vec_oprnd1 = gimple_call_arg (new_stmt, 2*i + 1);
  vec_oprnd0
= vect_get_vec_def_for_stmt_copy (dt[i], vec_oprnd1);
  vec_oprnd1


Re: Vector Comparison patch

2011-08-23 Thread Artem Shinkarov
On Tue, Aug 23, 2011 at 11:08 AM, Richard Guenther
 wrote:
> On Tue, Aug 23, 2011 at 11:44 AM, Artem Shinkarov
>  wrote:
>> On Tue, Aug 23, 2011 at 9:17 AM, Richard Guenther
>>  wrote:
>>> On Mon, Aug 22, 2011 at 11:11 PM, Artem Shinkarov
>>>  wrote:
 I'll just send you my current version. I'll be a little bit more specific.

 The problem starts when you try to lower the following expression:

 x = a > b;
 x1 = vcond 
 vcond 

 Now, you go from the beginning to the end of the block, and you cannot
 leave a > b, because only vconds are valid expressions to expand.

 Now, you meet a > b first. You try to transform it into vcond <a > b,
 -1, 0>, you build this expression, then you try to gimplify it, and
 you see that you have something like:

 x' = a >b;
 x = vcond 
 x1 = vcond 
 vcond 

 and your gsi stands at the x1 now, so the gimplification created a
 comparison that optab would not understand. And I am not really sure
 that you would be able to solve this problem easily.

 It would help if you could create vcond, but you
 can't, and x op y is a single tree that must be gimplified, and I am not
 sure that you can persuade gimplifier to leave this expression
 untouched.

 In the attachment the current version of the patch.
>>>
>>> I can't reproduce it with your patch.  For
>>>
>>> #define vector(elcount, type)  \
>>>    __attribute__((vector_size((elcount)*sizeof(type)))) type
>>>
>>> vector (4, float) x, y;
>>> vector (4, int) a,b;
>>> int
>>> main (int argc, char *argv[])
>>> {
>>>  vector (4, int) i0 = x < y;
>>>  vector (4, int) i1 = i0 ? a : b;
>>>  return 0;
>>> }
>>>
>>> I get from the C frontend:
>>>
>>>  vector(4) int i0 =  VEC_COND_EXPR < SAVE_EXPR  , { -1, -1,
>>> -1, -1 } , { 0, 0, 0, 0 } > ;
>>>  vector(4) int i1 =  VEC_COND_EXPR < SAVE_EXPR  , SAVE_EXPR  ,
>>> SAVE_EXPR  > ;
>>>
>>> but I have expected i0 != 0 in the second VEC_COND_EXPR.
>>
>> I don't put it there. This patch adds != 0 rather than removing it. But this
>> could be changed.
>
> ?
>
>>> I do see that the gimplifier pulls away the condition for the first
>>> VEC_COND_EXPR though:
>>>
>>>  x.0 = x;
>>>  y.1 = y;
>>>  D.2735 = x.0 < y.1;
>>>  D.2734 = D.2735;
>>>  D.2736 = D.2734;
>>>  i0 = [vec_cond_expr]  VEC_COND_EXPR < D.2736 , { -1, -1, -1, -1 } ,
>>> { 0, 0, 0, 0 } > ;
>>>
>>> which is, I believe because of the SAVE_EXPR wrapped around the
>>> comparison.  Why do you bother wrapping all operands in save-exprs?
>>
>> I bother because they could be MAYBE_CONST which breaks the
>> gimplifier. But I don't really know if you can do it better. I can
>> always do this checking on operands of constructed vcond...
>
> Err, the patch does
>
> +  /* Avoid C_MAYBE_CONST in VEC_COND_EXPR.  */
> +  tmp = c_fully_fold (ifexp, false, &maybe_const);
> +  ifexp = save_expr (tmp);
> +  wrap &= maybe_const;
>
> why is
>
>  ifexp = save_expr (tmp);
>
> necessary here?  SAVE_EXPR is if you need to protect side-effects
> from being evaluated twice if you use an operand twice.  But all
> operands are just used a single time.

Again, the only reason save_expr is there is to keep MAYBE_CONST
nodes from breaking the gimplification. Maybe it is the wrong way of
doing it, but it does the job.

> And I expected, instead of
>
> +  if ((COMPARISON_CLASS_P (ifexp)
> +       && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != TREE_TYPE (op1))
> +      || TREE_TYPE (ifexp) != TREE_TYPE (op1))
> +    {
> +      tree comp_type = COMPARISON_CLASS_P (ifexp)
> +                      ? TREE_TYPE (TREE_OPERAND (ifexp, 0))
> +                      : TREE_TYPE (ifexp);
> +
> +      op1 = convert (comp_type, op1);
> +      op2 = convert (comp_type, op2);
> +      vcond = build3 (VEC_COND_EXPR, comp_type, ifexp, op1, op2);
> +      vcond = convert (TREE_TYPE (op1), vcond);
> +    }
> +  else
> +    vcond = build3 (VEC_COND_EXPR, TREE_TYPE (op1), ifexp, op1, op2);
>
>  if (!COMPARISON_CLASS_P (ifexp))
>    ifexp = build2 (NE_EXPR, TREE_TYPE (ifexp), ifexp,
>                         build_vector_from_val (TREE_TYPE (ifexp), 0));
>
>  if (TREE_TYPE (ifexp) != TREE_TYPE (op1))
>    {
> ...
>
Why?
This is a function to construct any vcond. The result of ifexp is
always a signed integer vector if it is a comparison, but we need to
make sure that all the elements of vcond have the same type.

And I didn't really understand whether we can guarantee that the vector
comparison will not be lifted out by the gimplifier. It happens when
I put this save_expr there, and it could possibly happen in some other
cases. How can we prevent that?


Artem.

>> You are right, that if you just put a comparison of variables there
>> then we are fine. My point is that whenever gimplifier is pulling out
>> the comparison from the first operand, replacing it with the variable,
>> then we are screwed, because there is no chance to put it back, and
>> that is exactly what happens in expand_vector_comparison, if you

Re: Vector Comparison patch

2011-08-23 Thread Richard Guenther
On Tue, Aug 23, 2011 at 12:24 PM, Artem Shinkarov
 wrote:
> Why?
> This is a function to construct any vcond. The result of ifexp is
> always a signed integer vector if it is a comparison, but we need to
> make sure that all the elements of vcond have the same type.
>
> And I didn't really understand whether we can guarantee that the vector
> comparison will not be lifted out by the gimplifier. It happens when
> I put this save_expr there, and it could possibly happen in some other
> cases. How can we prevent that?

We don't need to prevent it.  If the C frontend makes sure that the
mask of a VEC_COND_EXPR is always {-1,...} or {0,} by expanding
mask ? v1 : v2 to V

[rs6000] Fix creation of invalid CONST_VECTORs

2011-08-23 Thread Richard Sandiford
My patches to more "accurately" detect the number of zero elements in a
compound initialiser caused pr34856 to trigger on powerpc*-darwin:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34856
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49987

The problem is the same as it was on i386 and spu: the backend can
create CONST_VECTORs with symbolic elements, which in the 34856 trail
above was decided to be invalid.  Although a patch was written for
powerpc at the same time, the problem apparently didn't trigger on
powerpc targets until after my patch.

Tested by Dominique on powerpc-apple-darwin9.8.0 (thanks).  OK to install?

Richard


gcc/
PR target/49987
* config/rs6000/rs6000.c (paired_expand_vector_init): Check for
valid CONST_VECTOR operands.
(rs6000_expand_vector_init): Likewise.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  2011-08-18 13:37:31.395814534 +0100
+++ gcc/config/rs6000/rs6000.c  2011-08-23 11:35:07.417677742 +0100
@@ -4503,7 +4503,9 @@ paired_expand_vector_init (rtx target, r
   for (i = 0; i < n_elts; ++i)
 {
   x = XVECEXP (vals, 0, i);
-  if (!CONSTANT_P (x))
+  if (!(CONST_INT_P (x)
+   || GET_CODE (x) == CONST_DOUBLE
+   || GET_CODE (x) == CONST_FIXED))
++n_var;
 }
   if (n_var == 0)
@@ -4655,7 +4657,9 @@ rs6000_expand_vector_init (rtx target, r
   for (i = 0; i < n_elts; ++i)
 {
   x = XVECEXP (vals, 0, i);
-  if (!CONSTANT_P (x))
+  if (!(CONST_INT_P (x)
+   || GET_CODE (x) == CONST_DOUBLE
+   || GET_CODE (x) == CONST_FIXED))
++n_var, one_var = i;
   else if (x != CONST0_RTX (inner_mode))
all_const_zero = false;
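
The effect of narrowing the predicate can be sketched with a toy model. Everything below is hypothetical: the enum and the toy_* functions are simplified stand-ins, not GCC's rtx codes, CONSTANT_P or CONST_INT_P. The point is only that a symbolic element passes the broad test but is counted as variable by the narrow one, so it never ends up inside a constant vector.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for a few RTL codes; not GCC's real definitions. */
enum toy_code { TOY_CONST_INT, TOY_CONST_DOUBLE, TOY_CONST_FIXED,
                TOY_SYMBOL_REF, TOY_REG };

/* Broad test: anything that is not a register, symbols included. */
static int toy_constant_p(enum toy_code c)
{
    return c != TOY_REG;
}

/* Narrow test in the spirit of the patch: numeric literals only. */
static int toy_valid_vector_elt_p(enum toy_code c)
{
    return c == TOY_CONST_INT || c == TOY_CONST_DOUBLE || c == TOY_CONST_FIXED;
}

/* Count the elements that must be treated as variable under predicate OK. */
static size_t toy_count_var(const enum toy_code *elts, size_t n,
                            int (*ok)(enum toy_code))
{
    size_t n_var = 0;
    for (size_t i = 0; i < n; i++)
        if (!ok(elts[i]))
            ++n_var;
    return n_var;
}
```

For the element list { int, symbol, double, int } the broad predicate reports zero variable elements, while the narrow one reports one.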


Re: Vector Comparison patch

2011-08-23 Thread Artem Shinkarov
On Tue, Aug 23, 2011 at 11:33 AM, Richard Guenther
 wrote:

Re: Vector Comparison patch

2011-08-23 Thread Richard Guenther
On Tue, Aug 23, 2011 at 12:45 PM, Artem Shinkarov
 wrote:

Re: [Patch, Fortran] PR 31600 - Better diagnosis when redeclaring used-assoc symbol

2011-08-23 Thread Mikael Morin
On Tuesday 23 August 2011 11:48:27 Tobias Burnus wrote:
> Build and regtested on x86-64-linux.
> OK for the trunk?
> 
OK.

Mikael



Re: Vector Comparison patch

2011-08-23 Thread Artem Shinkarov
On Tue, Aug 23, 2011 at 11:56 AM, Richard Guenther
 wrote:

Re: Vector Comparison patch

2011-08-23 Thread Artem Shinkarov
Sorry, not
rhs = gimplify_build3 (gsi, VEC_COND_EXPR, a, b, {-1}, {0}>

but rather

rhs = gimplify_build3 (gsi, VEC_COND_EXPR, build2 (GT_EXPR, type, a,
b), {-1}, {0}>


Artem.


Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-23 Thread Kirill Yukhin
Sorry, lost body for previous message:

Here is the last patch to add initial support of AVX2 in GCC.
It contains a bunch of tests for built-ins.
All tests pass under the simulator and are ignored when AVX2 is unavailable.

The patch and testsuite/ChangeLog entry are attached.

Is it OK?

Thanks, K

On Tue, Aug 23, 2011 at 2:54 PM, Kirill Yukhin  wrote:
>  Hi,
>  Here is the last patch to add initial support of AVX2 in GCC.
>  It contains a bunch of tests for built-ins.
>  All tests pass under the simulator and are ignored when AVX2 is unavailable.
>
>  The patch and testsuite/ChangeLog entry are attached.
>
>  Is it OK?
>
>  Thanks, K
>
>> On Mon, Aug 22, 2011 at 5:59 PM, Kirill Yukhin  
>> wrote:
>>> Thanks!
>>>
>>> K
>>>
>>> On Mon, Aug 22, 2011 at 5:57 PM, H.J. Lu  wrote:
>>>> On Mon, Aug 22, 2011 at 6:18 AM, Kirill Yukhin
>>>> wrote:
>>>>> Hi,
>>>>> thanks for input, Uros. Spaces were fixed.
>>>>>
>>>>> Updated patch is attached. ChangeLog entry is attached.
>>>>>
>>>>> Could anybody please commit it?
>>>>>
>>>>
>>>> I checked in for you.
>>>>
>>>>
>>>> --
>>>> H.J.
>>>>
>>>
>>
>


Re: Vector Comparison patch

2011-08-23 Thread Richard Guenther
On Tue, Aug 23, 2011 at 1:11 PM, Artem Shinkarov
 wrote:
> On Tue, Aug 23, 2011 at 11:56 AM, Richard Guenther
>  wrote:
>> On Tue, Aug 23, 2011 at 12:45 PM, Artem Shinkarov
>>  wrote:
>>> I'm confused.
>>> There is a set of problems which are tightly connected and you address
>>> only one of them.
>>>
>>> I need to do something with the C_MAYBE_CONST_EXPR node to allow the
>>> gimplification of the expression. In order to achieve that I am
>>> wrapping any expression which can contain a C_MAYBE_CONST_EXPR node in a
>>> SAVE_EXPR. This works fine, but the vector condition is lifted out.
>>> So the question is how to get rid of C_MAYBE_CONST_EXPR nodes while making
>>> sure that the expression is still inside VEC_COND_EXPR?
>>
>> I can't answer this, but no C_MAYBE_CONST_EXPR nodes may survive
>> until gimplification.  I thought c_fully_fold is exactly used (instead
>> of c_save_expr) because it _doesn't_ wrap things in C_MAYBE_CONST_EXPR
>> nodes.  Instead you delay that (well, commented out in your patch).
>
> Ok. So for the time being save_expr is the only way that we know to
> avoid C_MAYBE_CONST_EXPR nodes.
>
>>> All the rest is fine -- a > b is transformed to VEC_COND_EXPR of the
>>> integer type, and when we are using it we can add != 0 to the mask, no
>>> problem. The problem is to make sure that the vector expression is not
>>> lifted out from the VEC_COND_EXPR and that C_MAYBE_CONST_EXPRs are
>>> also not there at the same time.
>>
>> Well, for example for floating-point comparisons and -fnon-call-exceptions
>> you _will_ get comparisons lifted out of the VEC_COND_EXPR.  But
>> that shouldn't be an issue because C semantics are ensured for
>> the mask ? v0 : v1 source form by changing it to mask != 0 ? v0 : v1 and
>> the VEC_COND_EXPR semantic for a non-comparison mask operand
>> is (v0 & mask) | (v1 & ~mask).  Which means that we have to be able to
>> expand mask = v0 < v1 anyway, but we'll simply expand it if it were
>> VEC_COND_EXPR .
>
> Richard, I think you almost get it, but there is a tiny thing you have missed.
> Look, let's assume that, for some reason, when we gimplified a > b, the
> comparison was lifted out. So we have the following situation:
>
> D.1 = a > b;
> comp = vcond
> ...
>
> Ok?
> Now, I fully agree that we want to treat the lifted a > b as VCOND. Now,
> what I am doing in veclower is this: when I meet a vector comparison a >
> b, I wrap it in a VCOND, otherwise it would not be recognized by
> optabs. Literally I am doing:
>
> rhs = gimplify_build3 (gsi, VEC_COND_EXPR, a, b, {-1}, {0}>
>
> And here is where the devil hides. For some reason, when this expression is
> gimplified, a > b is lifted again and is left outside the
> VEC_COND_EXPR, and that is the problem I am trying to fight. Do you have
> any ideas what could be done here?

Well, don't do it.  Check if the target can expand

 D.1 = a > b;

via feeding it vcond  and if not, expand it piecewise
in veclower.  If it can handle it - leave it alone!

In expand_expr_real_2 add to the EQ_EXPR (etc.) case the case
of a vector-typed comparison and use the vcond optab for it, again
via vcond .  If you look at the EQ_EXPR case
it dispatches to do_store_flag - that's the best place to handle
vector-typed compares.

Richard.
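
The two ideas in this exchange, the piecewise fallback for a vector comparison and the mask semantics (v0 & mask) | (v1 & ~mask), can be sketched at the source level with the GNU C vector extension. This is only an illustration under stated assumptions: it is not the veclower or expand code, the names v4si, v4sf, cmp_lt_piecewise and select_by_mask are made up for the example, and it assumes a compiler that supports vector_size types with element subscripting.

```c
#include <assert.h>

/* Four-lane vector types using the GNU C vector extension. */
typedef int   v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));

/* Piecewise lowering of a vector "less than": each lane becomes -1
   (all bits set) when the comparison holds and 0 otherwise. */
static v4si cmp_lt_piecewise(v4sf a, v4sf b)
{
    v4si r = {0, 0, 0, 0};
    for (int i = 0; i < 4; i++)
        r[i] = (a[i] < b[i]) ? -1 : 0;
    return r;
}

/* Select lanes of V0 where MASK is all-ones and lanes of V1 where it is
   zero, using (v0 & mask) | (v1 & ~mask). */
static v4si select_by_mask(v4si mask, v4si v0, v4si v1)
{
    /* The formula is only a lane select when each lane is 0 or -1. */
    for (int i = 0; i < 4; i++)
        assert(mask[i] == 0 || mask[i] == -1);
    return (v0 & mask) | (v1 & ~mask);
}
```

For x = {1, 5, 3, 8} and y = {2, 4, 3, 9} the mask is {-1, 0, 0, -1}, so selecting between a = {10, 11, 12, 13} and b = {20, 21, 22, 23} gives {10, 21, 22, 13}.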


Re: [PATCH v3, i386] BMI2 support for GCC, mulx, rorx, x part

2011-08-23 Thread Uros Bizjak
On Tue, Aug 23, 2011 at 1:07 PM, Kirill Yukhin  wrote:
> Hi,
> I've slightly updated mulx split to avoid ICE.
> Updated patch, ChangeLog entry (with Uros's contribution) and
> ChangeLog.testsuite entry are attached.
>
> Bootstrapped and make-checked.
>
> All tests pass under the simulator (except one, but that is a simulator issue).
>
> Uros, you asked if BMI2 is inherited from BMI. The answer is no, these
> 2 extensions are not connected.
>
> Is it OK?

+{
+  operands[3] = gen_lowpart (mode, operands[0]);
+  operands[4] = gen_highpart (mode, operands[0]);
+  operands[5] = GEN_INT (GET_MODE_BITSIZE (mode));
+})

Please change this part to:

{
  split_double_mode (mode, &operands[0], 1, &operands[3], &operands[4]);

  operands[5] = GEN_INT (GET_MODE_BITSIZE (mode));
})

Please also add -mbmi2 to gcc.target/i386/sse-{12,13,14,22,23}.c files.

Please also change some entries in the ChangeLog to:

* config/i386/i386-c.c (ix86_target_macros_internal):
Conditionally define __BMI2__.
* config/i386/i386.c (ix86_option_override_internal): Define PTA_BMI2.
Handle BMI2 option.
(ix86_valid_target_attribute_inner_p): Handle BMI2 option.

OK with these changes.

Thanks,
Uros.


Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-23 Thread Uros Bizjak
On Tue, Aug 23, 2011 at 1:22 PM, Kirill Yukhin  wrote:

> Here is the last patch to add initial support of AVX2 in GCC.
> It contains a bunch of tests for built-ins.
> All tests pass under the simulator and are ignored when AVX2 is unavailable.
>
> The patch and testsuite/ChangeLog entry are attached.
>
> Is it OK?

Please do not change existing vect-104 and avoid -O0 in the testsuite
unless really necessary.

You should also add -mavx2 to gcc.target/i386/sse-{13,14,22,23} to
check AVX2 intrinsics.

Uros.


Re: [4.7][google]Support for getting CPU type and feature information at run-time. (issue4893046)

2011-08-23 Thread Michael Matz
Hi,

On Mon, 22 Aug 2011, H.J. Lu wrote:

> > void __attribute__((constructor)) bla(void)
> > {
> >  __cpu_indicator_init ();
> > }
> >
> > I don't see any complication.?
> >
> 
> Order of constructors.  A constructor may call functions
> which use __cpu_indicator.

That's why I wrote also:

> The initializer function has to be callable from pre-.init contexts, e.g.
> ifunc dispatchers.

It obviously has to be guarded against multiple calls.  The ctor in libgcc 
would be mere convenience because then non-ctor code can rely on the data 
being initialized, and only (potential) ctor code has to check and call 
the init function on demand.


Ciao,
Michael.
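
A minimal sketch of the arrangement described above: an initializer guarded against multiple calls, a constructor that calls it for convenience, and an accessor that calls it on demand for code that may run before the constructor. All names (cpu_info_init, cpu_info_ctor, cpu_info_get and the two counters) are hypothetical and are not the interface of the patch under discussion. The guard here is a plain flag, so the sketch assumes no concurrent callers.

```c
#include <assert.h>

static int cpu_info_ready;      /* 0 until the data has been filled in */
static int cpu_info_init_runs;  /* how many times the real work ran */

/* Hypothetical initializer: safe to call any number of times and from
   any context, because it returns early once the data is ready. */
static void cpu_info_init(void)
{
    if (cpu_info_ready)
        return;
    ++cpu_info_init_runs;       /* stand-in for the actual CPU probing */
    cpu_info_ready = 1;
}

/* Convenience constructor so ordinary code finds the data initialized. */
static void __attribute__((constructor)) cpu_info_ctor(void)
{
    cpu_info_init();
}

/* Accessor for code that may itself run before the constructor above. */
static int cpu_info_get(void)
{
    cpu_info_init();
    return cpu_info_ready;
}
```

However many times cpu_info_init or cpu_info_get is called afterwards, the probing stand-in runs exactly once.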

Re: [var-tracking] small speed-ups

2011-08-23 Thread Dimitrios Apostolou

Hi Jakub,

On Mon, 22 Aug 2011, Jakub Jelinek wrote:

On Mon, Aug 22, 2011 at 01:30:33PM +0300, Dimitrios Apostolou wrote:


@@ -1191,7 +1189,7 @@ dv_uid2hash (dvuid uid)
 static inline hashval_t
 dv_htab_hash (decl_or_value dv)
 {
-  return dv_uid2hash (dv_uid (dv));
+  return (hashval_t) (dv_uid (dv));
 }


Why?  dv_uid2hash is an inline that does exactly that.


@@ -1202,7 +1200,7 @@ variable_htab_hash (const void *x)
 {
   const_variable const v = (const_variable) x;

-  return dv_htab_hash (v->dv);
+  return (hashval_t) (dv_uid (v->dv));
 }


Why?


 /* Compare the declaration of variable X with declaration Y.  */
@@ -1211,9 +1209,8 @@ static int
 variable_htab_eq (const void *x, const void *y)
 {
   const_variable const v = (const_variable) x;
-  decl_or_value dv = CONST_CAST2 (decl_or_value, const void *, y);

-  return (dv_as_opaque (v->dv) == dv_as_opaque (dv));
+  return (v->dv) == y;
 }


Why?


I was hoping you'd ask so I'll ask back :-) Why are we doing it that way?
Why so much indirection in the first place? Why create inline functions
just to typecast, and why do we need this CONST_CAST2 ugliness in C code? I
bet there are things I don't understand so I'd be happy to listen...


The reason I did these simplifications (and many more I didn't publish)
within var-tracking is that it hurt my brain to follow the
logic. Even with the help of TAGS I have a specific stack depth before I
forget where I began diving into TAG declarations. Well, in var-tracking
this limit was surpassed by far...





@@ -1397,19 +1398,40 @@ shared_var_p (variable var, shared_hash
  || shared_hash_shared (vars));
 }

+/* Copy all variables from hash table SRC to hash table DST without rehashing
+   any values.  */
+
+static htab_t
+htab_dup (htab_t src)
+{
+  htab_t dst;
+
+  dst = (htab_t) xmalloc (sizeof (*src));
+  memcpy (dst, src, sizeof (*src));
+  dst->entries = (void **) xmalloc (src->size * sizeof (*src->entries));
+  memcpy (dst->entries, src->entries,
+ src->size * sizeof (*src->entries));
+  return dst;
+}
+


This certainly doesn't belong here, it should go into libiberty/hashtab.c
and prototype into include/hashtab.h.  It relies on hashtab.c
implementation details.


OK, I'll do that in the future. Should I also move some other htab
functions I saw in var-tracking and rtl? FOR_EACH_HTAB_ELEMENT comes to
mind, probably others too.





@@ -2034,7 +2041,8 @@ val_resolve (dataflow_set *set, rtx val,
 static void
 dataflow_set_init (dataflow_set *set)
 {
-  init_attrs_list_set (set->regs);
+  /* Initialize the set (array) SET of attrs to empty lists.  */
+  memset (set->regs, 0, sizeof (set->regs));
   set->vars = shared_hash_copy (empty_shared_hash);
   set->stack_adjust = 0;
   set->traversed_vars = NULL;


I'd say you should instead just implement init_attrs_list_set inline using
memset.


It's used only once, that's why I deleted the function. I'll bring it back 
if you think it helps.





   dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
   dst->vars->refcount = 1;
   dst->vars->htab
-= htab_create (MAX (src1_elems, src2_elems), variable_htab_hash,
+= htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
   variable_htab_eq, variable_htab_free);


This looks wrong, 2 * max is definitely too much.


For a hash table to fit N elements, it has to have at least 4/3*N 
slots, or 2*N slots if htab has the 50% load factor I was proposing.
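
To make the arithmetic above concrete, here is a minimal sketch. It is not the libiberty code: min_slots is a hypothetical helper, and it assumes the table only grows once the given load factor is exceeded. It computes the smallest slot count that keeps N elements at or below a load factor of NUM/DEN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libiberty: smallest number of slots
   that holds N elements without exceeding a load factor of NUM/DEN.
   Computes ceil (N * DEN / NUM) in integer arithmetic; overflow of
   N * DEN is ignored in this sketch.  */
static size_t
min_slots (size_t n, size_t num, size_t den)
{
  assert (num > 0 && den >= num);
  return (n * den + num - 1) / num;
}
```

With a 3/4 load factor, 300 elements need min_slots (300, 3, 4) = 400 slots; with a 1/2 load factor the same 300 elements need 600.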



@@ -8996,11 +9006,13 @@ vt_finalize (void)

   FOR_ALL_BB (bb)
 {
-  dataflow_set_destroy (&VTI (bb)->in);
-  dataflow_set_destroy (&VTI (bb)->out);
+  /* The "false" do_free parameter means to not bother to iterate and free
+all hash table elements, since we'll destroy the pools. */
+  dataflow_set_destroy (&VTI (bb)->in, false);
+  dataflow_set_destroy (&VTI (bb)->out, false);
   if (VTI (bb)->permp)
{
- dataflow_set_destroy (VTI (bb)->permp);
+ dataflow_set_destroy (VTI (bb)->permp, false);
  XDELETE (VTI (bb)->permp);
}
 }



How much does this actually speed things up (the not freeing pool allocated
stuff during finalization)?  Is it really worth it?


In total for dataflow_set_destroy I can see that calls to
attrs_list_clear() have been reduced from 500K to 250K, and I can also see
a reduction of free() calls from htab_delete(), from 30K to 10K. I'm
willing to bet that much of this is due to this change: I kept
only the changes that showed a difference, and I clearly remember that
var-tracking iterates over hash tables too much, either directly or
from htab_traverse()/htab_delete().



Thanks,
Dimitris



Re: Vector Comparison patch

2011-08-23 Thread Artem Shinkarov
On Tue, Aug 23, 2011 at 12:23 PM, Richard Guenther
 wrote:
> On Tue, Aug 23, 2011 at 1:11 PM, Artem Shinkarov
>  wrote:
>> On Tue, Aug 23, 2011 at 11:56 AM, Richard Guenther
>>  wrote:
>>> On Tue, Aug 23, 2011 at 12:45 PM, Artem Shinkarov
>>>  wrote:
>>>> I'm confused.
>>>> There is a set of problems which are tightly connected and you address
>>>> only one of them.
>>>>
>>>> I need to do something with C_MAYBE_CONST_EXPR node to allow the
>>>> gimplification of the expression. In order to achieve that I am
>>>> wrapping expression which can contain C_MAYBE_EXPR_NODE into
>>>> SAVE_EXPR. This works fine, but, the vector condition is lifted out.
>>>> So the question is how to get rid of C_MAYBE_CONST_EXPR nodes, making
>>>> sure that the expression is still inside VEC_COND_EXPR?
>>>
>>> I can't answer this, but no C_MAYBE_CONST_EXPR nodes may survive
>>> until gimplification.  I thought c_fully_fold is exactly used (instead
>>> of c_save_expr) because it _doesn't_ wrap things in C_MAYBE_CONST_EXPR
>>> nodes.  Instead you delay that (well, commented out in your patch).
>>
>> Ok. So for the time being save_expr is the only way that we know to
>> avoid C_MAYBE_CONST_EXPR nodes.
>>
>>>> All the rest is fine -- a > b is transformed to VEC_COND_EXPR of the
>>>> integer type, and when we are using it we can add != 0 to the mask, no
>>>> problem. The problem is to make sure that the vector expression is not
>>>> lifted out from the VEC_COND_EXPR and that C_MAYBE_CONST_EXPRs are
>>>> also not there at the same time.
>>>
>>> Well, for example for floating-point comparisons and -fnon-call-exceptions
>>> you _will_ get comparisons lifted out of the VEC_COND_EXPR.  But
>>> that shouldn't be an issue because C semantics are ensured for
>>> the mask ? v0 : v1 source form by changing it to mask != 0 ? v0 : v1 and
>>> the VEC_COND_EXPR semantic for a non-comparison mask operand
>>> is (v0 & mask) | (v1 & ~mask).  Which means that we have to be able to
>>> expand mask = v0 < v1 anyway, but we'll simply expand it if it were
>>> VEC_COND_EXPR .
>>
>> Richard, I think you almost get it, but there is a tiny thing you have
>> missed.
>> Look, let's assume that for some reason when we gimplified a > b, the
>> comparison was lifted out. So we have the following situation:
>>
>> D.1 = a > b;
>> comp = vcond
>> ...
>>
>> Ok?
>> Now, I fully agree that we want to treat lifted a > b as VCOND. Now,
>> what I am doing in the veclower is when I meet vector comparison a >
>> b, I wrap it in the VCOND, otherwise it would not be recognized by
>> optabs. Literally I am doing:
>>
>> rhs = gimplify_build3 (gsi, VEC_COND_EXPR, a, b, {-1}, {0}>
>>
>> And here the devil is hidden. For some reason, when this expression is
>> gimplified, a > b is lifted again and is left outside the
>> VEC_COND_EXPR, and that is the problem I am trying to fight. Do you have
>> any ideas what could be done here?
>
> Well, don't do it.  Check if the target can expand
>
>  D.1 = a > b;
>
> via feeding it vcond  and if not, expand it 
> piecewise
> in veclower.  If it can handle it - leave it alone!
>
> In expand_expr_real_2 add to the EQ_EXPR (etc.) case the case
> of a vector-typed comparison and use the vcond optab for it, again
> via vcond .  If you look at the EQ_EXPR case
> it dispatches to do_store_flag - that's the best place to handle
> vector-typed compares.
>
> Richard.
>
That sounds like a plan. I'll investigate if it can be done.
Also, if we can handle a > b, then we don't need to construct
vcond <a > b, {-1}, {0}>; we will know that it would be constructed correctly
when expanding.
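
The mask semantics quoted above, (v0 & mask) | (v1 & ~mask), can be sketched element by element in plain C. This is only an illustration with fixed-size arrays, not the GCC expander code; vec_gt and vec_select are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

#define NELTS 4

/* Element-wise a > b: all-ones (-1) where true, 0 where false,
   i.e. what vcond <a > b, {-1}, {0}> would produce.  */
static void
vec_gt (const int32_t *a, const int32_t *b, int32_t *mask)
{
  assert (a && b && mask);
  for (int i = 0; i < NELTS; i++)
    mask[i] = a[i] > b[i] ? -1 : 0;
}

/* Bitwise select for a non-comparison mask operand:
   out = (v0 & mask) | (v1 & ~mask), per element.  */
static void
vec_select (const int32_t *mask, const int32_t *v0, const int32_t *v1,
            int32_t *out)
{
  assert (mask && v0 && v1 && out);
  for (int i = 0; i < NELTS; i++)
    out[i] = (v0[i] & mask[i]) | (v1[i] & ~mask[i]);
}
```

Because the comparison yields only -1 or 0 per element, feeding its result into the bitwise select picks whole elements from v0 or v1, which is why a lifted comparison does not change the result.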


Thanks for your help,
Artem.


[trans-mem] Add method groups and change TM method lifecycle and selection.

2011-08-23 Thread Torvald Riegel
The patch adds method groups for TM methods, which group methods that
can run concurrently together. The lifecycle and state management
responsibilities of method groups and methods get documented.

For now, there is just a method group for all serial methods, more will
follow when further TM methods are added.

A new default dispatch (aka method) is maintained, and there is a very
simple runtime adaptation scheme that uses dispatch_serialirr() if there
is just one registered thread, and the default dispatch (currently
serialirr_onwrite) if there is more than one. The user can override this
by specifying a dispatch via the ITM_DEFAULT_METHOD environment
variable, which then makes libitm always use this method.
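
The selection rule described above can be sketched as follows. This is an illustration in plain C, not the libitm code: the enum, choose_method, and the treatment of a zero thread count are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative names only; they mirror the description above, not the
   actual libitm declarations.  */
enum tm_method { METHOD_SERIALIRR, METHOD_SERIALIRR_ONWRITE };

/* Pick the method for the current number of registered threads.
   USER_OVERRIDE points to the method requested by the user (for
   example via an environment variable), or is NULL if none was given.
   Treating zero threads like one thread is an assumption.  */
static enum tm_method
choose_method (unsigned nthreads, const enum tm_method *user_override)
{
  if (user_override != NULL)
    return *user_override;
  if (nthreads <= 1)
    return METHOD_SERIALIRR;
  return METHOD_SERIALIRR_ONWRITE;
}

/* Usage sketch covering the three cases from the description.  */
static void
choose_method_example (void)
{
  enum tm_method forced = METHOD_SERIALIRR_ONWRITE;

  assert (choose_method (1, NULL) == METHOD_SERIALIRR);
  assert (choose_method (2, NULL) == METHOD_SERIALIRR_ONWRITE);
  assert (choose_method (1, &forced) == METHOD_SERIALIRR_ONWRITE);
}
```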

This should be the last major step of the refactoring. Future patches
can now add further TM methods, and those methods should fit in more
easily.

OK for branch?
commit 431f3c067873e41f536022f6bcfc307464bf91fd
Author: Torvald Riegel 
Date:   Tue Aug 23 13:44:19 2011 +0200

Add method groups and change TM method lifecycle and selection.

* retry.cc (GTM::gtm_thread::decide_retry_strategy): Cleanup. Fix
restarting without switching to serial mode.
(GTM::gtm_thread::decide_begin_dispatch): Let the caller set the
transaction state. Choose closed-nesting alternative if available.
(GTM::gtm_thread::set_default_dispatch): New.
(parse_default_method): New.
(GTM::gtm_thread::number_of_threads_changed): New.
* method-serial.cc (GTM::serial_mg): New method group class.
(GTM::serialirr_dispatch): Belongs to serial_mg. Remove reinit and
fini.
(GTM::serial_dispatch): Same.
(GTM::serialirr_onwrite_dispatch): Same.
(GTM::gtm_thread::serialirr_mode): Remove calls to fini.
* beginend.cc (GTM::gtm_thread::~gtm_thread): Maintain number of
registered threads.
(GTM::gtm_thread::gtm_thread): Same.
(_ITM_abortTransaction): Remove calls to abi_dispatch::fini().
(GTM::gtm_thread::trycommit): Same. Reset number of restarts.
(GTM::gtm_thread::begin_transaction): Let decide_begin_dispatch()
choose dispatch but set state according to dispatch here.
* dispatch.h (GTM::abi_dispatch::fini): Move to method group.
(GTM::method_group): New class.
(GTM::abi_dispatch): Add comments. Maintain pointer to method_group.
* libitm_i.h (GTM::gtm_thread): Add declarations for new members.
* libitm.texi: Document TM methods, method groups, method life cycle.
Rename method sets to method groups.

diff --git a/libitm/beginend.cc b/libitm/beginend.cc
index e53ea6c..cc25d17 100644
--- a/libitm/beginend.cc
+++ b/libitm/beginend.cc
@@ -34,6 +34,7 @@ extern __thread gtm_thread_tls _gtm_thr_tls;
 
 gtm_rwlock GTM::gtm_thread::serial_lock;
 gtm_thread *GTM::gtm_thread::list_of_threads = 0;
+unsigned GTM::gtm_thread::number_of_threads = 0;
 
 gtm_stmlock GTM::gtm_stmlock_array[LOCK_ARRAY_SIZE];
 gtm_version GTM::gtm_clock;
@@ -103,6 +104,8 @@ GTM::gtm_thread::~gtm_thread()
   break;
 }
 }
+  number_of_threads--;
+  number_of_threads_changed(number_of_threads + 1, number_of_threads);
   serial_lock.write_unlock ();
 }
 
@@ -117,6 +120,8 @@ GTM::gtm_thread::gtm_thread ()
   serial_lock.write_lock ();
   next_thread = list_of_threads;
   list_of_threads = this;
+  number_of_threads++;
+  number_of_threads_changed(number_of_threads - 1, number_of_threads);
   serial_lock.write_unlock ();
 
   if (pthread_once(&thr_release_once, thread_exit_init))
@@ -226,27 +231,16 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb)
   else
 {
   // Outermost transaction
-  // TODO Pay more attention to prop flags (eg, *omitted) when selecting
-  // dispatch.
-  if ((prop & pr_doesGoIrrevocable) || !(prop & pr_instrumentedCode))
-tx->state = (STATE_SERIAL | STATE_IRREVOCABLE);
-
-  else
-disp = tx->decide_begin_dispatch (prop);
-
-  if (tx->state & STATE_SERIAL)
+  disp = tx->decide_begin_dispatch (prop);
+  if (disp == dispatch_serialirr() || disp == dispatch_serial())
 {
+  tx->state = STATE_SERIAL;
+  if (disp == dispatch_serialirr())
+tx->state |= STATE_IRREVOCABLE;
   serial_lock.write_lock ();
-
-  if (tx->state & STATE_IRREVOCABLE)
-disp = dispatch_serialirr ();
-  else
-disp = dispatch_serial ();
 }
   else
-{
-  serial_lock.read_lock (tx);
-}
+serial_lock.read_lock (tx);
 
   set_abi_disp (disp);
 }
@@ -387,7 +381,6 @@ _ITM_abortTransaction (_ITM_abortReason reason)
   gtm_jmpbuf longjmp_jb = tx->jb;
 
   tx->rollback (cp);
-  abi_disp()->fini ();
 
   // Jump to nested transaction (use the saved jump buffer).
   GTM_longjmp (&longjmp_jb, a_abortTransaction | a_restoreLiveVariables,
@@ -397,7 +390,6 @@ _ITM_

[PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161)

2011-08-23 Thread Jakub Jelinek
On Tue, Aug 23, 2011 at 11:57:58AM +0200, Bernd Schmidt wrote:
> On 08/23/11 11:52, Jakub Jelinek wrote:
> > On Tue, Aug 23, 2011 at 11:35:07AM +0200, Bernd Schmidt wrote:
> >>> cse_process_notes_1
> >>> perhaps could be changed for VOIDmode new_rtx to try to
> >>> simplify_replace_rtx it...
> >>
> >> Is this where the problem came from? Sounds like it's worth a try.
> > 
> > In this case, yes.  But there are many other places all around the
> > compiler that need to disallow unary op with VOIDmode operand.
> > In cse.c alone e.g. fold_rtx (twice), in combine.c e.g. in do_SUBST,
> > subst, etc.  Do we want to special case all those 7 unary ops there too?
> > Is it really worth it to save one subreg or truncate in the md patterns
> > for rarely used rtxes?
> 
> Maybe not. I'll approve a patch to change it back, even if I think it's
> not a good representation.

We can remove that restriction again once CONST_INTs are no longer VOIDmode.

Here is an untested patch, will bootstrap/regtest it now on x86_64-linux
and i686-linux, on c6x it should make no difference IMHO (looked like a typo
in the expander which wasn't used anyway), can somebody test it on AVR and
BFIN?  My grepping through *.md didn't find any other places where the
operand wouldn't have the same mode as operation.
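
To illustrate why the mode matters when the operand is a mode-less constant, here is a small sketch of the CLZ folding formula, precision - floor_log2 (value) - 1, applied after masking the value to the mode's width. The helpers floor_log2_u64 and clz_for_precision are hypothetical stand-ins, not GCC functions, and the result for zero is an assumption of this sketch (real targets may define a different value or none).

```c
#include <assert.h>
#include <stdint.h>

/* Position of the highest set bit, or -1 for zero.  */
static int
floor_log2_u64 (uint64_t x)
{
  int n = -1;
  while (x)
    {
      n++;
      x >>= 1;
    }
  return n;
}

/* Hypothetical stand-in for the constant folding: count leading zeros
   of VALUE viewed as an integer PRECISION bits wide.  The value is
   masked to PRECISION bits first.  Zero yields PRECISION here, which
   is an assumption of this sketch.  */
static int
clz_for_precision (uint64_t value, int precision)
{
  assert (precision > 0 && precision <= 64);
  uint64_t mask = precision == 64
                  ? UINT64_MAX
                  : (UINT64_C (1) << precision) - 1;
  value &= mask;
  return precision - floor_log2_u64 (value) - 1;
}
```

The same constant 1 has 7 leading zeros in an 8-bit mode and 31 in a 32-bit mode, so a constant without a mode of its own cannot be folded without knowing the mode of the operation.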

2011-08-23  Jakub Jelinek  

PR middle-end/50161
* simplify-rtx.c (simplify_const_unary_operation): If
op is CONST_INT, don't look at op_mode, but use instead
mode.
* optabs.c (add_equal_note): For FFS, CLZ, CTZ,
CLRSB, POPCOUNT, PARITY and BSWAP use operand mode for
operation and TRUNCATE/ZERO_EXTEND if needed.
* doc/rtl.texi (ffs, clrsb, clz, ctz, popcount, parity, bswap):
Document that operand mode must be same as operation mode,
or VOIDmode.
* config/avr/avr.md (paritysi2, *parityqihi2.libgcc,
*paritysihi2.libgcc, popcountsi2, *popcountsi2.libgcc,
*popcountqihi2.libgcc, clzsi2, *clzsihi2.libgcc, ctzsi2,
*ctzsihi2.libgcc, ffssi2, *ffssihi2.libgcc): For unary ops
use the mode of operand for the operation and add truncate
or zero_extend around if needed.
* config/c6x/c6x.md (ctzdi2): Likewise.
* config/bfin/bfin.md (clrsbsi2, signbitssi2): Likewise.

* gcc.dg/pr50161.c: New test.

--- gcc/simplify-rtx.c.jj   2011-08-22 08:17:07.0 +0200
+++ gcc/simplify-rtx.c  2011-08-23 13:24:08.0 +0200
@@ -1373,8 +1373,7 @@ simplify_const_unary_operation (enum rtx
 }
 
   if (CONST_INT_P (op)
-  && width <= HOST_BITS_PER_WIDE_INT
-  && op_width <= HOST_BITS_PER_WIDE_INT && op_width > 0)
+  && width <= HOST_BITS_PER_WIDE_INT && width > 0)
 {
   HOST_WIDE_INT arg0 = INTVAL (op);
   HOST_WIDE_INT val;
@@ -1394,50 +1393,50 @@ simplify_const_unary_operation (enum rtx
  break;
 
case FFS:
- arg0 &= GET_MODE_MASK (op_mode);
+ arg0 &= GET_MODE_MASK (mode);
  val = ffs_hwi (arg0);
  break;
 
case CLZ:
- arg0 &= GET_MODE_MASK (op_mode);
- if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (op_mode, val))
+ arg0 &= GET_MODE_MASK (mode);
+ if (arg0 == 0 && CLZ_DEFINED_VALUE_AT_ZERO (mode, val))
;
  else
-   val = GET_MODE_PRECISION (op_mode) - floor_log2 (arg0) - 1;
+   val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 1;
  break;
 
case CLRSB:
- arg0 &= GET_MODE_MASK (op_mode);
+ arg0 &= GET_MODE_MASK (mode);
  if (arg0 == 0)
-   val = GET_MODE_PRECISION (op_mode) - 1;
+   val = GET_MODE_PRECISION (mode) - 1;
  else if (arg0 >= 0)
-   val = GET_MODE_PRECISION (op_mode) - floor_log2 (arg0) - 2;
+   val = GET_MODE_PRECISION (mode) - floor_log2 (arg0) - 2;
  else if (arg0 < 0)
-   val = GET_MODE_PRECISION (op_mode) - floor_log2 (~arg0) - 2;
+   val = GET_MODE_PRECISION (mode) - floor_log2 (~arg0) - 2;
  break;
 
case CTZ:
- arg0 &= GET_MODE_MASK (op_mode);
+ arg0 &= GET_MODE_MASK (mode);
  if (arg0 == 0)
{
  /* Even if the value at zero is undefined, we have to come
 up with some replacement.  Seems good enough.  */
- if (! CTZ_DEFINED_VALUE_AT_ZERO (op_mode, val))
-   val = GET_MODE_PRECISION (op_mode);
+ if (! CTZ_DEFINED_VALUE_AT_ZERO (mode, val))
+   val = GET_MODE_PRECISION (mode);
}
  else
val = ctz_hwi (arg0);
  break;
 
case POPCOUNT:
- arg0 &= GET_MODE_MASK (op_mode);
+ arg0 &= GET_MODE_MASK (mode);
  val = 0;
  while (arg0)
val++, arg0 &= arg0 - 1;
  break;
 
case PARITY:
- arg0 &= GET_MODE_MASK (op_mode);
+ arg0 &= GET_MODE_MASK (mode);
  val = 0;
  while (arg0)
  

[PATCH][2/n] Wading through data-dependence analysis

2011-08-23 Thread Richard Guenther

This patch tries to disentangle the loop vs. non-loop code in
data dependence analysis.  For loop data-dependences we always
want (and need) to compute a distance vector if two references
from any loop iteration may alias.  Thus we have to disable
all offset-based analysis.  This is not yet what this patch does;
it merely makes the non-loop code stronger, which
relies (*sigh*) on those invalid disambiguations (well, invalid
only for the loop case ...).

Thus we employ something slightly stronger than refs_may_alias_p,
namely what loop invariant motion does (but cheaper, without
actually expanding things).

This allows us to do nothing for the non-loop case in
dr_analyze_indices, which makes sense.

Bootstrapped and tested on x86_64-unknown-linux-gnu, re-bootstrapping
after a stylistic change (s/VEC() *loop_nest/bool/ for dr_may_alias_p).

Richard.

2011-08-23  Richard Guenther  

* Makefile.in (tree-data-ref.o): Add tree-affine.h dependency.
* tree-affine.h (aff_comb_cannot_overlap_p): Declare.
* tree-affine.c (aff_comb_cannot_overlap_p): New function, moved
from ...
* tree-ssa-loop-im.c (cannot_overlap_p): ... here.
(mem_refs_may_alias_p): Adjust.
* tree-data-ref.h (dr_may_alias_p): Adjust.
* tree-data-ref.c: Include tree-affine.h.
(dr_analyze_indices): Do nothing for the non-loop case.
(dr_may_alias_p): Distinguish loop and non-loop case.  Disambiguate
more cases in the non-loop case.
* graphite-sese-to-poly.c (write_alias_graph_to_ascii_dimacs): Adjust
calls to dr_may_alias_p.
(write_alias_graph_to_ascii_ecc): Likewise.
(write_alias_graph_to_ascii_dot): Likewise.
(build_alias_set_optimal_p): Likewise.

Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 177983)
+++ gcc/Makefile.in (working copy)
@@ -2690,7 +2690,7 @@ tree-scalar-evolution.o : tree-scalar-ev
$(TREE_PASS_H) $(PARAMS_H) gt-tree-scalar-evolution.h
 tree-data-ref.o : tree-data-ref.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
gimple-pretty-print.h $(TREE_FLOW_H) $(CFGLOOP_H) $(TREE_DATA_REF_H) \
-   $(TREE_PASS_H) langhooks.h
+   $(TREE_PASS_H) langhooks.h tree-affine.h
 sese.o : sese.c sese.h $(CONFIG_H) $(SYSTEM_H) coretypes.h tree-pretty-print.h \
$(TREE_FLOW_H) $(CFGLOOP_H) $(TREE_DATA_REF_H) tree-pass.h value-prof.h
 graphite.o : graphite.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(DIAGNOSTIC_CORE_H) \
Index: gcc/tree-affine.c
===
--- gcc/tree-affine.c   (revision 177983)
+++ gcc/tree-affine.c   (working copy)
@@ -887,3 +887,30 @@ get_inner_reference_aff (tree ref, aff_t
   *size = shwi_to_double_int ((bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT);
 }
 
+/* Returns true if a region of size SIZE1 at position 0 and a region of
+   size SIZE2 at position DIFF cannot overlap.  */
+
+bool
+aff_comb_cannot_overlap_p (aff_tree *diff, double_int size1, double_int size2)
+{
+  double_int d, bound;
+
+  /* Unless the difference is a constant, we fail.  */
+  if (diff->n != 0)
+return false;
+
+  d = diff->offset;
+  if (double_int_negative_p (d))
+{
+  /* The second object is before the first one, we succeed if the last
+element of the second object is before the start of the first one.  */
+  bound = double_int_add (d, double_int_add (size2, double_int_minus_one));
+  return double_int_negative_p (bound);
+}
+  else
+{
+  /* We succeed if the second object starts after the first one ends.  */
+  return double_int_scmp (size1, d) <= 0;
+}
+}
+
Index: gcc/tree-affine.h
===
--- gcc/tree-affine.h   (revision 177983)
+++ gcc/tree-affine.h   (working copy)
@@ -76,6 +76,7 @@ void tree_to_aff_combination_expand (tre
 struct pointer_map_t **);
 void get_inner_reference_aff (tree, aff_tree *, double_int *);
 void free_affine_expand_cache (struct pointer_map_t **);
+bool aff_comb_cannot_overlap_p (aff_tree *, double_int, double_int);
 
 /* Debugging functions.  */
 void print_aff (FILE *, aff_tree *);
Index: gcc/tree-ssa-loop-im.c
===
--- gcc/tree-ssa-loop-im.c  (revision 177983)
+++ gcc/tree-ssa-loop-im.c  (working copy)
@@ -1835,33 +1835,6 @@ analyze_memory_references (void)
   create_vop_ref_mapping ();
 }
 
-/* Returns true if a region of size SIZE1 at position 0 and a region of
-   size SIZE2 at position DIFF cannot overlap.  */
-
-static bool
-cannot_overlap_p (aff_tree *diff, double_int size1, double_int size2)
-{
-  double_int d, bound;
-
-  /* Unless the difference is a constant, we fail.  */
-  if (diff->n != 0)
-return false;
-
-  d = diff->offset;
-  if (double_int_negative_p (d))
-{
-  /* The second object is before the first one, we succe

[Patch, Fortran] PR 50163 - ICE with nonconst expr in init expr

2011-08-23 Thread Tobias Burnus
The bug is a regression: An error was printed with 4.1.x but since 4.3.x
one gets an ICE. [No idea what GCC 4.2 does.] The solution is simple:
return if there is a MATCH_ERROR.


See PR (esp. comment 2) for a more detailed description:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50163#c0

Built and regtested on x86-64-linux.
OK for the trunk? And to which version should it be backported? Only 
4.6? Also 4.5? Or even 4.4?


Tobias
2011-08-23  Tobias Burnus  

	PR fortran/50163
	* expr.c (check_init_expr): Return when an error occurred.

2011-08-23  Tobias Burnus  

	PR fortran/50163
	* gfortran.dg/initialization_28.f90: New.

diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index 9922094..b050b11 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -2481,6 +2481,9 @@ check_init_expr (gfc_expr *e)
 	m = MATCH_ERROR;
 	  }
 
+	if (m == MATCH_ERROR)
+	  return FAILURE;
+
 	/* Try to scalarize an elemental intrinsic function that has an
 	   array argument.  */
 	isym = gfc_find_function (e->symtree->n.sym->name);
--- /dev/null	2011-08-23 07:28:57.751883742 +0200
+++ gcc/gcc/testsuite/gfortran.dg/initialization_28.f90	2011-08-23 14:02:02.0 +0200
@@ -0,0 +1,9 @@
+! { dg-do compile }
+!
+! PR fortran/50163
+!
+! Contributed by Philip Mason
+!
+character(len=2) :: xx ='aa'
+integer :: iloc=index(xx,'bb') ! { dg-error "has not been declared or is a variable" }
+end


Re: [trans-mem] Use __x86_64__ instead of __LP64__.

2011-08-23 Thread Torvald Riegel
On Mon, 2011-08-22 at 14:42 -0700, Richard Henderson wrote:
> On 08/22/2011 02:42 AM, Torvald Riegel wrote:
> > Use __x86_64__ instead of __LP64__ in setjmp/longjmp and TLS
> > definitions.
> > 
> > H.J.: Is that sufficient for x32, or do we need entirely different code?
> > If so, can you please provide the required changes?
> 
> The SJLJ part should be ok for x32.
> 
> The TLS part needs to use a 32-bit load and "*4".

Hmm, like in the attached patch? (I'm just guessing here ...!)
commit 3f10f6882e8dd19ca0f11a0f9d953aebe6027ead
Author: Torvald Riegel 
Date:   Mon Aug 22 11:21:03 2011 +0200

Use __x86_64__ instead of __LP64__.

* config/x86/tls.h: Use __x86_64__ instead of __LP64__.
Add X32 support.
* config/x86/sjlj.S: Same.

diff --git a/libitm/config/x86/sjlj.S b/libitm/config/x86/sjlj.S
index 0e9c246..725ffec 100644
--- a/libitm/config/x86/sjlj.S
+++ b/libitm/config/x86/sjlj.S
@@ -1,4 +1,4 @@
-/* Copyright (C) 2008, 2009 Free Software Foundation, Inc.
+/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc.
Contributed by Richard Henderson .
 
This file is part of the GNU Transactional Memory Library (libitm).
@@ -29,7 +29,7 @@
 
 _ITM_beginTransaction:
.cfi_startproc
-#ifdef __LP64__
+#ifdef __x86_64__
leaq8(%rsp), %rax
movq(%rsp), %r8
subq$72, %rsp
@@ -72,7 +72,7 @@ _ITM_beginTransaction:
 
 GTM_longjmp:
.cfi_startproc
-#ifdef __LP64__
+#ifdef __x86_64__
movq(%rdi), %rcx
movq8(%rdi), %rdx
movq16(%rdi), %rbx
diff --git a/libitm/config/x86/tls.h b/libitm/config/x86/tls.h
index 03fdab2..3d247e3 100644
--- a/libitm/config/x86/tls.h
+++ b/libitm/config/x86/tls.h
@@ -37,6 +37,7 @@
 #if defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2, 10)
 namespace GTM HIDDEN {
 
+#ifdef __x86_64__
 #ifdef __LP64__
 # define SEG_READ(OFS) "movq\t%%fs:(" #OFS "*8),%0"
 # define SEG_WRITE(OFS)"movq\t%0,%%fs:(" #OFS "*8)"
@@ -47,6 +48,17 @@ namespace GTM HIDDEN {
"rolq\t$17,%0\n\t" \
SEG_WRITE(OFS)
 #else
+// For X32.
+# define SEG_READ(OFS)  "movl\t%%fs:(" #OFS "*4),%0"
+# define SEG_WRITE(OFS) "movl\t%0,%%fs:(" #OFS "*4)"
+# define SEG_DECODE_READ(OFS)   SEG_READ(OFS) "\n\t" \
+"rorl\t$9,%0\n\t" \
+"xorl\t%%fs:24,%0"
+# define SEG_ENCODE_WRITE(OFS)  "xorl\t%%fs:24,%0\n\t" \
+"roll\t$9,%0\n\t" \
+SEG_WRITE(OFS)
+#endif
+#else
 # define SEG_READ(OFS)  "movl\t%%gs:(" #OFS "*4),%0"
 # define SEG_WRITE(OFS) "movl\t%0,%%gs:(" #OFS "*4)"
 # define SEG_DECODE_READ(OFS)  SEG_READ(OFS) "\n\t" \


[google] Remove timestamped line from gengtype state file comment headers

2011-08-23 Thread Simon Baldwin
Remove the timestamped line from gengtype state file comment headers.

Gcc builds after r177358 include a file .../plugin/gtype.state as part of
their binary installation.  The file contains a comment line that includes
the current date and time.  Variations in the file contents due to only
changes in the timestamp can be an issue for build and packaging systems
that prefer or insist on binary compatibility.

This patch removes the comment line, to provide binary reproducibility for
any generated gtype.state files.

Tested for x86 and PowerPC, no bootstrap in both cases.

OK for google/integration?  Also, OK for trunk?

libstdc++-v3/ChangeLog:
2011-05-20  Simon Baldwin  

* scripts/extract_symvers.in: Handle processor/OS specific or
unknown symbol binding strings from readelf.


Index: gcc/gengtype-state.c
===
--- gcc/gengtype-state.c(revision 177984)
+++ gcc/gengtype-state.c(working copy)
@@ -1194,8 +1194,6 @@ write_state (const char *state_path)
   fprintf (state_file,
   ";;; This file should be parsed by the same %s which wrote it.\n",
   progname);
-  fprintf (state_file, ";;; file %s generated on %s\n", state_path,
-  ctime (&now));
   /* The first non-comment significant line gives the version string.  */
   write_state_version (version_string);
   write_state_srcdir ();


Re: [google] Remove timestamped line from gengtype state file comment headers

2011-08-23 Thread Richard Guenther
On Tue, Aug 23, 2011 at 2:36 PM, Simon Baldwin  wrote:
> Remove the timestamped line from gengtype state file comment headers.
>
> Gcc builds after r177358 include a file .../plugin/gtype.state as part of
> their binary installation.  The file contains a comment line that includes
> the current date and time.  Variations in the file contents due to only
> changes in the timestamp can be an issue for build and packaging systems
> that prefer or insist on binary compatibility.
>
> This patch removes the comment line, to provide binary reproducibility for
> any generated gtype.state files.
>
> Tested for x86 and PowerPC, no bootstrap in both cases.
>
> OK for google/integration?  Also, OK for trunk?

Ok for trunk.

Richard.

> libstdc++-v3/ChangeLog:
> 2011-05-20  Simon Baldwin  
>
>        * scripts/extract_symvers.in: Handle processor/OS specific or
>        unknown symbol binding strings from readelf.
>
>
> Index: gcc/gengtype-state.c
> ===
> --- gcc/gengtype-state.c        (revision 177984)
> +++ gcc/gengtype-state.c        (working copy)
> @@ -1194,8 +1194,6 @@ write_state (const char *state_path)
>   fprintf (state_file,
>           ";;; This file should be parsed by the same %s which wrote it.\n",
>           progname);
> -  fprintf (state_file, ";;; file %s generated on %s\n", state_path,
> -          ctime (&now));
>   /* The first non-comment significant line gives the version string.  */
>   write_state_version (version_string);
>   write_state_srcdir ();
>


Re: [google] Remove timestamped line from gengtype state file comment headers

2011-08-23 Thread Michael Matz
Hi,

On Tue, 23 Aug 2011, Richard Guenther wrote:

> > This patch removes the comment line, to provide binary reproducibility for
> > any generated gtype.state files.
> >
> > Tested for x86 and PowerPC, no bootstrap in both cases.
> >
> > OK for google/integration?  Also, OK for trunk?
> 
> Ok for trunk.

But perhaps with a different ChangeLog entry :)

> > libstdc++-v3/ChangeLog:
> > 2011-05-20  Simon Baldwin  
> >
> >        * scripts/extract_symvers.in: Handle processor/OS specific or
> >        unknown symbol binding strings from readelf.


Ciao,
Michael.

Re: [var-tracking] small speed-ups

2011-08-23 Thread Jakub Jelinek
On Tue, Aug 23, 2011 at 02:40:56PM +0300, Dimitrios Apostolou wrote:
> I was hoping you'd ask so I'll ask back :-) Why are we doing it that
> way? Why so much indirection in the first place? Why create inline
> functions just to typecast and why do we need this CONST_CAST2
> ugliness in C code. I bet there are things I don't understand so I'd
> be happy to listen...

It is not indirection, it is abstraction, which should make the code more
readable and allow changing the implementation details.

> OK I'll do that in the future. Should I also move some other htab
> functions I saw in var-tracking and rtl? FOR_EACH_HTAB_ELEMENT comes
> to mind, probably other too.

I guess FOR_EACH_HTAB_ELEMENT could move too (together with all the support
though - tree-flow.h and tree-flow-inline.h related stuff).

> It's used only once, that's why I deleted the function. I'll bring
> it back if you think it helps.

Yes, please.

> >>   dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
> >>   dst->vars->refcount = 1;
> >>   dst->vars->htab
> >>-= htab_create (MAX (src1_elems, src2_elems), variable_htab_hash,
> >>+= htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
> >>   variable_htab_eq, variable_htab_free);
> >
> >This looks wrong, 2 * max is definitely too much.
> 
> For a hash table to fit N elements, it has to have at least 4/3*N
> slots, or 2*N slots if htab has the 50% load factor I was proposing.

For var-tracking, 50% load factor is IMHO a very bad idea, memory
consumption of var-tracking is already high right now, we IMHO don't have
the luxury to waste more RAM.

> In total for dataflow_set_destroy I can see that calls to
> attrs_list_clear() have been reduced from 500K to 250K, and I can
> also see a reduction of free() calls from htab_delete(), from 30K to

free calls?  If you avoid free calls, then you end up with a memory leak.
I can understand when you avoid pool_free calls...

Jakub


Re: [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161)

2011-08-23 Thread Georg-Johann Lay
Jakub Jelinek wrote:
> On Tue, Aug 23, 2011 at 11:57:58AM +0200, Bernd Schmidt wrote:
>> On 08/23/11 11:52, Jakub Jelinek wrote:
>>> On Tue, Aug 23, 2011 at 11:35:07AM +0200, Bernd Schmidt wrote:
>>>>> cse_process_notes_1
>>>>> perhaps could be changed for VOIDmode new_rtx to try to
>>>>> simplify_replace_rtx it...
>>>> Is this where the problem came from? Sounds like it's worth a try.
>>> In this case, yes.  But there are many other places all around the
>>> compiler that need to disallow unary op with VOIDmode operand.
>>> In cse.c alone e.g. fold_rtx (twice), in combine.c e.g. in do_SUBST,
>>> subst, etc.  Do we want to special case all those 7 unary ops there too?
>>> Is it really worth it to save one subreg or truncate in the md patterns
>>> for rarely used rtxes?
>> Maybe not. I'll approve a patch to change it back, even if I think it's
>> not a good representation.
> 
> We can remove that restriction again once CONST_INTs are no longer VOIDmode.
> 
> Here is an untested patch, will bootstrap/regtest it now on x86_64-linux
> and i686-linux, on c6x it should make no difference IMHO (looked like a typo
> in the expander which wasn't used anyway), can somebody test it on AVR and

Tested your patch against r177949 on avr-unknown-none for C/C++.
There are no regressions and the new test case passes fine.

Johann

> BFIN?  My grepping through *.md didn't find any other places where the
> operand wouldn't have the same mode as operation.
> 
> 2011-08-23  Jakub Jelinek  
> 
>   PR middle-end/50161
>   * simplify-rtx.c (simplify_const_unary_operation): If
>   op is CONST_INT, don't look at op_mode, but use instead
>   mode.
>   * optabs.c (add_equal_note): For FFS, CLZ, CTZ,
>   CLRSB, POPCOUNT, PARITY and BSWAP use operand mode for
>   operation and TRUNCATE/ZERO_EXTEND if needed.
>   * doc/rtl.texi (ffs, clrsb, clz, ctz, popcount, parity, bswap):
>   Document that operand mode must be same as operation mode,
>   or VOIDmode.
>   * config/avr/avr.md (paritysi2, *parityqihi2.libgcc,
>   *paritysihi2.libgcc, popcountsi2, *popcountsi2.libgcc,
>   *popcountqihi2.libgcc, clzsi2, *clzsihi2.libgcc, ctzsi2,
>   *ctzsihi2.libgcc, ffssi2, *ffssihi2.libgcc): For unary ops
>   use the mode of operand for the operation and add truncate
>   or zero_extend around if needed.
>   * config/c6x/c6x.md (ctzdi2): Likewise.
>   * config/bfin/bfin.md (clrsbsi2, signbitssi2): Likewise.
> 
>   * gcc.dg/pr50161.c: New test.



Re: [var-tracking] small speed-ups

2011-08-23 Thread Dimitrios Apostolou

On Tue, 23 Aug 2011, Jakub Jelinek wrote:

On Tue, Aug 23, 2011 at 02:40:56PM +0300, Dimitrios Apostolou wrote:


  dst->vars = (shared_hash) pool_alloc (shared_hash_pool);
  dst->vars->refcount = 1;
  dst->vars->htab
-= htab_create (MAX (src1_elems, src2_elems), variable_htab_hash,
+= htab_create (2 * MAX (src1_elems, src2_elems), variable_htab_hash,
   variable_htab_eq, variable_htab_free);


This looks wrong, 2 * max is definitely too much.


For a hash table to fit N elements, it has to have at least 4/3*N
slots, or 2*N slots if htab has the 50% load factor I was proposing.


For var-tracking, 50% load factor is IMHO a very bad idea, memory
consumption of var-tracking is already high right now, we IMHO don't have
the luxury to waste more RAM.


Agreed, then I'll change it to 4/3 * MAX so that I avoid expansions.
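
The sizing arithmetic discussed above can be sketched as follows. This is
only an illustration: min_slots is a hypothetical helper, not part of the
libiberty hashtab API, and the 3/4 and 1/2 load factors are simply the two
figures quoted in this thread.

```cpp
#include <cassert>
#include <cstddef>

// Smallest slot count that holds `elems` entries without exceeding a
// load factor of num/den (for example 3/4 or 1/2).
// Hypothetical helper for illustration only.
constexpr std::size_t
min_slots (std::size_t elems, std::size_t num, std::size_t den)
{
  // ceil (elems * den / num), computed with integers only.
  return (elems * den + num - 1) / num;
}

// 300 elements need 400 slots at 75% load and 600 slots at 50% load.
static_assert (min_slots (300, 3, 4) == 400, "4/3 * N at 75% load");
static_assert (min_slots (300, 1, 2) == 600, "2 * N at 50% load");
```

The difference between the two results is the extra memory that a 50% load
factor would cost for the same number of elements.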


In total for dataflow_set_destroy I can see that calls to
attrs_list_clear() have been reduced from 500K to 250K, and I can
also see a reduction of free() calls from htab_delete(), from 30K to


free calls?  If you avoid free calls, then you end up with a memory leak.
I can understand when you avoid pool_free calls...


You are right, I just mentioned the total difference in free() calls from 
all my patches. But in this part there is no free() involved, so the major 
gain should be from avoiding htab_delete() iterating many times over the 
hash tables. Annotated source from the callgrind profiler (shows 
instruction count):


Before:

         .  void
15,820,597  htab_delete (htab_t htab)
   250,914  {
    41,819    size_t size = htab_size (htab);
    41,819    PTR *entries = htab->entries;
         .    int i;
         .
    83,638    if (htab->del_f)
66,493,884      for (i = size - 1; i >= 0; i--)
49,825,998        if (entries[i] != HTAB_EMPTY_ENTRY && entries[i] != HTAB_DELETED_ENTRY)
 1,813,018          (*htab->del_f) (entries[i]);
 1,800,731  => tree-into-ssa.c:def_blocks_free (10988x)
   345,910  => tree-into-ssa.c:repl_map_free (2815x)
    53,950  => tree-scalar-evolution.c:del_scev_info (1197x)
   139,354  => tree-ssa-loop-im.c:memref_free (198x)
   281,777  => tree-ssa-sccvn.c:free_phi (3788x)
    81,726  => tree-ssa-uncprop.c:equiv_free (463x)
17,359,572  => var-tracking.c:variable_htab_free (835512x)
    42,161  => cfgloop.c:loop_exit_free (720x)
   284,904  => ira-costs.c:cost_classes_del (2270x)
       157  => tree-ssa-loop-im.c:vtoe_free (1x)
11,454,221  => ???:free (37228x)
 1,460,684  => tree-ssa-sccvn.c:free_reference (11329x)
         .
   125,457    if (htab->free_f != NULL)



After:

         .  void
 6,543,474  htab_delete (htab_t htab)
   250,914  {
    41,819    size_t size = htab_size (htab);
    41,819    PTR *entries = htab->entries;
         .    int i;
         .
    83,638    if (htab->del_f)
29,288,268      for (i = size - 1; i >= 0; i--)
21,927,330        if (entries[i] != HTAB_EMPTY_ENTRY && entries[i] != HTAB_DELETED_ENTRY)
 1,738,584          (*htab->del_f) (entries[i]);
   139,344  => tree-ssa-loop-im.c:memref_free (198x)
       157  => tree-ssa-loop-im.c:vtoe_free (1x)
    81,762  => tree-ssa-uncprop.c:equiv_free (463x)
   281,884  => tree-ssa-sccvn.c:free_phi (3788x)
    42,145  => cfgloop.c:loop_exit_free (720x)
   345,870  => tree-into-ssa.c:repl_map_free (2815x)
 1,462,000  => tree-ssa-sccvn.c:free_reference (11329x)
 1,800,951  => tree-into-ssa.c:def_blocks_free (10988x)
16,518,004  => var-tracking.c:variable_htab_free (824010x)
   284,979  => ira-costs.c:cost_classes_del (2270x)
 9,080,245  => ???:free (11513x)
    53,921  => tree-scalar-evolution.c:del_scev_info (1197x)
         .
   125,457    if (htab->free_f != NULL)



So if the part relevant to this patch is the number of calls to 
variable_htab_free, I suppose it won't make a big difference.


But if I take a look at the top callers and top callees of htab_delete():

Before:

 19,590,149  < tree-ssa-structalias.c:solve_constraints (2058x) [cc1]
 33,299,064  < var-tracking.c:shared_hash_destroy (6923x) [cc1]
 68,941,719  < tree-ssa-pre.c:execute_pre (2058x) [cc1]
134,915,334  *  hashtab.c:htab_delete [cc1]
 17,359,572  >   var-tracking.c:variable_htab_free (835512x) [cc1]
 11,454,221  >   ???:free (108456x) [libc-2.12.so]
  1,800,731  >   tree-into-ssa.c:def_blocks_free (10988x) [cc1]


After:

  8,756,328  < tree-ssa-structalias.c:delete_points_to_sets (1029x) [cc1]
  9,316,165  < tree-ssa-dom.c:tree_ssa_dominator_optimize (562x) [cc1]
 83,696,211  < var-tracking.c:shared_hash_destroy (6923x) [cc1]
 60,459,493  *  hashtab.c:htab_delete [cc1]
 16,518,004  >   var-tracking.c:variable_htab_free (824010x) [cc1]
  9,080,245  >   ???:free (82741x) [libc-2.12.so]
  1,800,951  >   tree-into-ssa.c:def_blocks_free (10988x) [cc1]


Then I'm more confused: htab_delete() seems to be much colder, probably 
due to fewer traversals of hash tables, but it's not clear how much the 
var-tracking changes are responsible for this. I'll keep in mind to revert 
the do_free parameter and report back. 

Re: [rfa] Set alignment of pseudos via get_pointer_alignment

2011-08-23 Thread Michael Matz
Hi,

> > Like so.  Regstrapped on x86_64-linux (all languages + Ada).  Okay for
> > trunk?
> 
> Ok.

r177989 (JFYI, since this was some time ago, to make searching the archives 
easier).


Ciao,
Michael.

Re: [google] Remove timestamped line from gengtype state file comment headers

2011-08-23 Thread Simon Baldwin
On 23 August 2011 15:34, Michael Matz  wrote:
>
> Hi,
>
> On Tue, 23 Aug 2011, Richard Guenther wrote:
>
> > > This patch removes the comment line, to provide binary reproducibility for
> > > any generated gtype.state files.
> > >
> > > Tested for x86 and PowerPC, no bootstrap in both cases.
> > >
> > > OK for google/integration?  Also, OK for trunk?
> >
> > Ok for trunk.
>
> But perhaps with a different ChangeLog entry :)

Heh!  Thanks, yes.  Plainly I'm out of practice with gcc patches.
Corrected version below, for completeness.

--
Remove the timestamped line from gengtype state file comment headers.

Gcc builds after r177358 include a file .../plugin/gtype.state as part of
their binary installation.  The file contains a comment line that includes
the current date and time.  Variations in the file contents due to only
changes in the timestamp can be an issue for build and packaging systems
that prefer or insist on binary compatibility.

This patch removes the comment line, to provide binary reproducibility for
any generated gtype.state files.

Tested for x86 and PowerPC, no bootstrap in both cases.

OK for google/integration?  Also, OK for trunk?

gcc/ChangeLog:
2011-08-23  Simon Baldwin  

* gengtype-state.c (write_state): Remove timestamped header line.


Index: gcc/gengtype-state.c
===
--- gcc/gengtype-state.c(revision 177984)
+++ gcc/gengtype-state.c(working copy)
@@ -1194,8 +1194,6 @@ write_state (const char *state_path)
   fprintf (state_file,
   ";;; This file should be parsed by the same %s which wrote it.\n",
   progname);
-  fprintf (state_file, ";;; file %s generated on %s\n", state_path,
-  ctime (&now));
   /* The first non-comment significant line gives the version string.  */
   write_state_version (version_string);
   write_state_srcdir ();

--
Google UK Limited | Registered Office: Belgrave House, 76 Buckingham
Palace Road, London SW1W 9TQ | Registered in England Number: 3977902


Re: [trans-mem] Use __x86_64__ instead of __LP64__.

2011-08-23 Thread H.J. Lu
On Mon, Aug 22, 2011 at 6:44 AM, H.J. Lu  wrote:
> On Mon, Aug 22, 2011 at 2:42 AM, Torvald Riegel  wrote:
>> Use __x86_64__ instead of __LP64__ in setjmp/longjmp and TLS
>> definitions.
>>
>> H.J.: Is that sufficient for x32, or do we need entirely different code?
>> If so, can you please provide the required changes?
>>
>
> I need to take a look.
>

transactional-memory is from 2010-04-13 and doesn't support x32
at all.



-- 
H.J.


Re: [PATCH] Fix configure --with-cloog

2011-08-23 Thread Romain Geissler
Ping


Re: [Patch] Properly find getopt system declaration

2011-08-23 Thread Romain Geissler
2011/8/10 Romain GEISSLER :
> Hi
>
> Thanks to the recent changes made to stage 2 and 3 (now built with g++), I
> noticed a little error in the configure script that checks for the
> system getopt declaration. Indeed, if your system defines it in a system
> header file named "getopt.h" (for example /usr/include/getopt.h on a Red Hat
> 4 configuration), the configure script will incorrectly load
> /path/to/gcc/src/include/getopt.h instead, and thus find no getopt
> declaration.
>
> This can be solved by changing the appropriate -I${srcdir}/../include to
> -iquote ${srcdir}/../include. I added a configure check to verify that the
> compiler accepts the -iquote switch (and falls back to -I otherwise).
> Note that this only solves the getopt case, but -I is certainly often misused
> (instead of -iquote, which would prevent errors with system headers having the
> same name as a gcc header).
>
> The attached patch has been tested for regression with a native x86_64
> bootstrap.
>
> config/
>
> 2011-08-10  Romain Geissler  
>
>        * acx.m4 (ACX_CHECK_CC_ACCEPTS_IQUOTE) : Define.
>
>
> gcc/
>
> 2011-08-10  Romain Geissler  
>
>        * configure.ac (acx_cv_cc_accepts_iquote): Define through a call
>        to ACX_CHECK_CC_ACCEPTS_IQUOTE.
>        (CFLAGS for gcc_AC_CHECK_DECLS): Use $acx_cv_cc_accepts_iquote
>        instead of "-I".
>        * configure: Regenerate.
>
>
> Romain Geissler
>

Ping


Re: [PATCH, i386, testsuite] FMA intrinsics

2011-08-23 Thread Uros Bizjak
On Tue, Aug 23, 2011 at 4:19 PM, Ilya Tocar  wrote:
> I removed unnecessary expands/builtins and tests are now compiled with -O2.
> Is this version ok?

OK with minor comments:

- Please remove the extra blank lines you introduced in sse.md
- Also, I'd recommend passing the new testcases through the "indent"
command to fix formatting.

Thanks,
Uros.


Re: [PATCH] PR c++/50055: Location information for the throw() specification in a function may be incorrect

2011-08-23 Thread Jason Merrill

On 08/12/2011 03:18 PM, Siddhesh Poyarekar wrote:

When the location for a throw() exception specification is not the same
as that of the function it is written against, gcov gives incorrect
results. See bug 50055 for details. The following
patch makes sure that the exception specification block (nothrow or
otherwise) is always associated with the function definition line
number.


I'm applying this patch, thanks.

This patch is small enough not to need it, but for future contributions 
please file a copyright assignment with the FSF.  Send email to 
ass...@gnu.org for more information.


Jason


Re: [PATCH] PR c++/50055: Location information for the throw() specification in a function may be incorrect

2011-08-23 Thread Jakub Jelinek
On Tue, Aug 23, 2011 at 10:56:26AM -0400, Jason Merrill wrote:
> This patch is small enough not to need it, but for future
> contributions please file a copyright assignment with the FSF.  Send
> email to ass...@gnu.org for more information.

I think Siddhesh should be covered by the Red Hat assignment (it would help
if the patch has been mailed from a redhat.com address to notice that).

Jakub


Re: [PATCH] PR c++/50055: Location information for the throw() specification in a function may be incorrect

2011-08-23 Thread Siddhesh Poyarekar
On Tue, Aug 23, 2011 at 8:33 PM, Jakub Jelinek  wrote:
> On Tue, Aug 23, 2011 at 10:56:26AM -0400, Jason Merrill wrote:
>> This patch is small enough not to need it, but for future
>> contributions please file a copyright assignment with the FSF.  Send
>> email to ass...@gnu.org for more information.
>
> I think Siddhesh should be covered by the Red Hat assignment (it would help
> if the patch has been mailed from a redhat.com address to notice that).
>

Thanks! I will keep that in mind for future submissions.


-- 
Siddhesh Poyarekar
http://siddhesh.in


[PATCH, MELT] Calling unloaded shared library

2011-08-23 Thread Alexandre Lissy
Hello,

The following patch fixes a 'small' issue I have been facing for a
couple of days: MELT would not be able to call the start_module_melt()
symbol that is present in warmelt-first(...).so. It turned out that the
shared libraries were *unloaded* at run time, each one being loaded and
then unloaded.

This was due to a misplaced else branch which is supposed to check whether
the handle returned by dlopen is valid and, if it is not, unload the
library; this check had been displaced so that it became the alternative to
the presence of debug options. Since I was not using debug, the else branch
was taken and thus the shared library was unloaded.
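
The control-flow mistake described above can be sketched in isolation. This
is a simplified stand-in, not the melt-runtime.c code: Handle, load_buggy
and load_fixed are hypothetical names, and a plain flag takes the place of
the real dlopen()/dlclose() calls.

```cpp
#include <cassert>

// Stand-in for a dlopen() result.  `valid` models the validity checks on
// the loaded module, `open` tells whether the library is still loaded.
struct Handle { bool valid; bool open; };

// Buggy shape: the else pairs with the verbosity test, so a valid
// handle is closed whenever verbose/debug output is off.
inline Handle load_buggy (bool valid, bool verbose)
{
  Handle h = { valid, true };
  if (h.valid)
    {
      if (verbose)
        { /* print module information */ }
      else
        h.open = false;   // plays the role of dlclose (dlh)
    }
  return h;
}

// Fixed shape: the else pairs with the validity test, so only an
// invalid handle is closed.
inline Handle load_fixed (bool valid, bool verbose)
{
  Handle h = { valid, true };
  if (h.valid)
    {
      if (verbose)
        { /* print module information */ }
    }
  else
    h.open = false;       // plays the role of dlclose (dlh)
  return h;
}
```

With verbose off, load_buggy (true, false) leaves the library closed while
load_fixed (true, false) keeps it open, which matches the symptom reported
here: the failure only shows up when debug output is disabled.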



[PATCH] Fix .so handling on failure

2011-08-23 Thread Alexandre Lissy

When dlopen() returned an invalid handle, the corresponding else branch
was not in the right place: it was attached to the
'if (!quiet_flag || flag_melt_debug)' condition instead. This led to the
shared library just loaded by the MELT runtime being unloaded whenever debug
was not enabled, so the next call to mi->mmi_startrout() in
meltgc_start_module_by_index would refer to an invalid address (the shared
library was already unloaded at the time of the call).
---
 gcc/ChangeLog.MELT |3 +++
 gcc/melt-runtime.c |   10 +-
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/gcc/ChangeLog.MELT b/gcc/ChangeLog.MELT
index f4cc6a5..4282321 100644
--- a/gcc/ChangeLog.MELT
+++ b/gcc/ChangeLog.MELT
@@ -1,3 +1,6 @@
+2011-08-23  Alexandre Lissy 
+	* melt-runtime.c (melt_load_module_index): Correct handling of invalid
+	dlopen handle.
 
 2011-08-05  Basile Starynkevitch  
 	* melt-runtime.c (melt_run_make_for_plugin): Don't use fullbinfile.
diff --git a/gcc/melt-runtime.c b/gcc/melt-runtime.c
index 8eea8f1..ca580a4 100644
--- a/gcc/melt-runtime.c
+++ b/gcc/melt-runtime.c
@@ -8764,11 +8764,11 @@ melt_load_module_index (const char*srcbase, const char*flavor)
 		  MELTDESCR_REQUIRED(melt_gen_timestamp), 
 		  MELTDESCR_REQUIRED(melt_build_timestamp));
   }
-  else 
-	{
-	  debugeprintf ("melt_load_module_index invalid dlh %p sopath %s", dlh, sopath);
-	  dlclose (dlh), dlh = NULL;
-	}
+}
+else 
+{
+	debugeprintf ("melt_load_module_index invalid dlh %p sopath %s", dlh, sopath);
+	dlclose (dlh), dlh = NULL;
 }
  end:
   if (srcpath) 


C++ PATCH to allow 'this' in constexpr member functions

2011-08-23 Thread Jason Merrill
In a pre-standard version of the constexpr specification use of 'this' 
was not valid in a constexpr function except as part of a member access. 
 We dropped the notion of potential constant expression from the 
standard, in favor of just saying that a constexpr function that can 
never produce a constant expression is ill-formed, no diagnostic 
required.  We still use potential_constant_expression_1 to give a 
diagnostic where reasonable, but clearly it is not reasonable in this 
case and we should drop this check from that function.  Any actually 
problematic use of 'this' will show up when we try to actually get a 
constant value.
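
As an illustration of the kind of code this permits, here is a minimal
sketch (Counter and its members are made-up names, not taken from the
patch or from the tuple implementation) in which 'this' is used other than
as the left operand of a member access:

```cpp
#include <cassert>

struct Counter
{
  int value;

  // Uses 'this' as a whole pointer value rather than as the object
  // expression of a member access.
  constexpr const Counter *self () const { return this; }

  // Dereferences 'this' explicitly before reading the member.
  constexpr int get () const { return (*this).value; }
};

constexpr Counter c = { 42 };

// Both calls are usable in constant expressions.
static_assert (c.self () == &c, "address of a constexpr object");
static_assert (c.get () == 42, "value read through *this");
```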


Tested x86_64-pc-linux-gnu, applying to trunk.  I'm not bothering to add 
a testcase because Benjamin needs this for tuple, so it will be tested 
there.
commit 3d177986ce1b5fcf030a0928e155673459547410
Author: Jason Merrill 
Date:   Tue Aug 16 23:22:57 2011 -0400

	* semantics.c (potential_constant_expression_1): Allow 'this'.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 59b25e5..1f6b49a 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7681,21 +7681,7 @@ potential_constant_expression_1 (tree t, bool want_rval, tsubst_flags_t flags)
 case IDENTIFIER_NODE:
   /* We can see a FIELD_DECL in a pointer-to-member expression.  */
 case FIELD_DECL:
-  return true;
-
 case PARM_DECL:
-  /* -- this (5.1) unless it appears as the postfix-expression in a
-class member access expression, including the result of the
-implicit transformation in the body of the non-static
-member function (9.3.1);  */
-  /* FIXME this restriction seems pointless since the standard dropped
-	 "potential constant expression".  */
-  if (is_this_parameter (t))
-{
-  if (flags & tf_error)
-error ("%qE is not a potential constant expression", t);
-  return false;
-}
   return true;
 
 case AGGR_INIT_EXPR:


C++ PATCH for c++/50024 (ICE with new int{})

2011-08-23 Thread Jason Merrill
maybe_constant_value was failing to recognize that we can't ask for the 
constant value of an initializer list because it has no type.  After 
fixing that, we still incorrectly rejected the testcase, so I had to fix 
a couple of other spots as well.


Tested x86_64-pc-linux-gnu, applying to trunk and perhaps 4.6.
commit 1738cacbeb65b9fb0e155fa7a1369e647674082c
Author: Jason Merrill 
Date:   Fri Aug 19 00:52:38 2011 -0400

	PR c++/50024
	* semantics.c (maybe_constant_value): Don't try to fold { }.
	* pt.c (build_non_dependent_expr): Don't wrap { }.
	* init.c (build_value_init): Allow scalar value-init in templates.

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 4fa627b..847f519 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -330,7 +330,7 @@ build_value_init (tree type, tsubst_flags_t complain)
  constructor.  */
 
   /* The AGGR_INIT_EXPR tweaking below breaks in templates.  */
-  gcc_assert (!processing_template_decl);
+  gcc_assert (!processing_template_decl || SCALAR_TYPE_P (type));
 
   if (CLASS_TYPE_P (type))
 {
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index ed4fe72..3f9a4c0 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -19669,6 +19669,10 @@ build_non_dependent_expr (tree expr)
   if (TREE_CODE (expr) == THROW_EXPR)
 return expr;
 
+  /* Don't wrap an initializer list, we need to be able to look inside.  */
+  if (BRACE_ENCLOSED_INITIALIZER_P (expr))
+return expr;
+
   if (TREE_CODE (expr) == COND_EXPR)
 return build3 (COND_EXPR,
 		   TREE_TYPE (expr),
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 1f6b49a..2f62e35 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7542,6 +7542,7 @@ maybe_constant_value (tree t)
 
   if (type_dependent_expression_p (t)
   || type_unknown_p (t)
+  || BRACE_ENCLOSED_INITIALIZER_P (t)
   || !potential_constant_expression (t)
   || value_dependent_expression_p (t))
 {
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-initlist5.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-initlist5.C
new file mode 100644
index 000..97f0399
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-initlist5.C
@@ -0,0 +1,15 @@
+// PR c++/50024
+// { dg-options -std=c++0x }
+
+template< class T >
+struct Container
+{
+  Container(){
+int* ptr = new int{};
+  }
+};
+
+int main() {
+Container< int > c;
+}
+


C++ PATCH for core issue 975 (extended lambda return type deduction)

2011-08-23 Thread Jason Merrill
At the Bloomington meeting last week the committee finally accepted my 
proposal to allow return type deduction from lambdas of arbitrary form 
as long as the deduced type is the same for all return statements, as 
implemented in G++.  My implementation did the type comparison at 
template definition time, but others on the committee thought that the 
comparison should happen at instantiation time.  This is indeed more 
user-friendly, and I went ahead and implemented the final resolution.
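
A small sketch of what the resolution allows; sign_of and clamp_low are
hypothetical names chosen for this example, not code from the patch:

```cpp
#include <cassert>

// Several return statements, all of type int, so the lambda's return
// type is deduced as int without a trailing '-> int'.
inline int sign_of (int x)
{
  auto sign = [] (int v) {
    if (v < 0)
      return -1;
    if (v > 0)
      return 1;
    return 0;
  };
  return sign (x);
}

// The same idea inside a template: the returned types depend on T, so
// they can only be compared once T is known, at instantiation time.
template <class T>
T clamp_low (T x, T low)
{
  auto pick = [] (T a, T b) {
    if (a < b)
      return b;
    return a;
  };
  return pick (x, low);
}
```

Had the two return statements in pick deduced different types for some T,
the error would be reported only for that instantiation.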


Tested x86_64-pc-linux-gnu, applying to trunk.
commit f9451c5652862c301ecfe838fb335fccf03ed56a
Author: Jason Merrill 
Date:   Wed Aug 17 22:48:39 2011 -0400

	Core 975
	* decl.c (cxx_init_decl_processing): Initialize
	dependent_lambda_return_type_node.
	* cp-tree.h (cp_tree_index): Add CPTI_DEPENDENT_LAMBDA_RETURN_TYPE.
	(dependent_lambda_return_type_node): Define.
	(DECLTYPE_FOR_LAMBDA_RETURN): Remove.
	* semantics.c (lambda_return_type): Handle overloaded function.
	Use dependent_lambda_return_type_node instead of
	DECLTYPE_FOR_LAMBDA_RETURN.
	(apply_lambda_return_type): Don't check dependent_type_p.
	* pt.c (tsubst_copy_and_build): Handle lambda return type deduction.
	(instantiate_class_template_1): Likewise.
	(tsubst): Don't use DECLTYPE_FOR_LAMBDA_RETURN.
	* mangle.c (write_type): Likewise.
	* typeck.c (structural_comptypes): Likewise.
	(check_return_expr): Handle dependent_lambda_return_type_node.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index ff5509e..8595943 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -83,7 +83,6 @@ c-common.h, not after.
   STMT_IS_FULL_EXPR_P (in _STMT)
   TARGET_EXPR_LIST_INIT_P (in TARGET_EXPR)
   LAMBDA_EXPR_MUTABLE_P (in LAMBDA_EXPR)
-  DECLTYPE_FOR_LAMBDA_RETURN (in DECLTYPE_TYPE)
   DECL_FINAL_P (in FUNCTION_DECL)
   QUALIFIED_NAME_IS_TEMPLATE (in SCOPE_REF)
2: IDENTIFIER_OPNAME_P (in IDENTIFIER_NODE)
@@ -775,6 +774,7 @@ enum cp_tree_index
 CPTI_CLASS_TYPE,
 CPTI_UNKNOWN_TYPE,
 CPTI_INIT_LIST_TYPE,
+CPTI_DEPENDENT_LAMBDA_RETURN_TYPE,
 CPTI_VTBL_TYPE,
 CPTI_VTBL_PTR_TYPE,
 CPTI_STD,
@@ -846,6 +846,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
 #define class_type_node			cp_global_trees[CPTI_CLASS_TYPE]
 #define unknown_type_node		cp_global_trees[CPTI_UNKNOWN_TYPE]
 #define init_list_type_node		cp_global_trees[CPTI_INIT_LIST_TYPE]
+#define dependent_lambda_return_type_node cp_global_trees[CPTI_DEPENDENT_LAMBDA_RETURN_TYPE]
 #define vtbl_type_node			cp_global_trees[CPTI_VTBL_TYPE]
 #define vtbl_ptr_type_node		cp_global_trees[CPTI_VTBL_PTR_TYPE]
 #define std_node			cp_global_trees[CPTI_STD]
@@ -3425,12 +3426,10 @@ more_aggr_init_expr_args_p (const aggr_init_expr_arg_iterator *iter)
   (DECLTYPE_TYPE_CHECK (NODE))->type_common.string_flag
 
 /* These flags indicate that we want different semantics from normal
-   decltype: lambda capture just drops references, lambda return also does
-   type decay, lambda proxies look through implicit dereference.  */
+   decltype: lambda capture just drops references, lambda proxies look
+   through implicit dereference.  */
 #define DECLTYPE_FOR_LAMBDA_CAPTURE(NODE) \
   TREE_LANG_FLAG_0 (DECLTYPE_TYPE_CHECK (NODE))
-#define DECLTYPE_FOR_LAMBDA_RETURN(NODE) \
-  TREE_LANG_FLAG_1 (DECLTYPE_TYPE_CHECK (NODE))
 #define DECLTYPE_FOR_LAMBDA_PROXY(NODE) \
   TREE_LANG_FLAG_2 (DECLTYPE_TYPE_CHECK (NODE))
 
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index c125f05..c375cf7 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -3597,6 +3597,10 @@ cxx_init_decl_processing (void)
   init_list_type_node = make_node (LANG_TYPE);
   record_unknown_type (init_list_type_node, "init list");
 
+  dependent_lambda_return_type_node = make_node (LANG_TYPE);
+  record_unknown_type (dependent_lambda_return_type_node,
+		   "undeduced lambda return type");
+
   {
 /* Make sure we get a unique function type, so we can give
its pointer type a name.  (This wins for gdb.) */
diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 53d4bc6..4c7cc79 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -1953,7 +1953,7 @@ write_type (tree type)
 case DECLTYPE_TYPE:
 	  /* These shouldn't make it into mangling.  */
 	  gcc_assert (!DECLTYPE_FOR_LAMBDA_CAPTURE (type)
-			  && !DECLTYPE_FOR_LAMBDA_RETURN (type));
+			  && !DECLTYPE_FOR_LAMBDA_PROXY (type));
 
 	  /* In ABI <5, we stripped decltype of a plain decl.  */
 	  if (!abi_version_at_least (5)
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 3f9a4c0..6b970f9 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8887,7 +8887,16 @@ instantiate_class_template_1 (tree type)
 }
 
   if (CLASSTYPE_LAMBDA_EXPR (type))
-maybe_add_lambda_conv_op (type);
+{
+  tree lambda = CLASSTYPE_LAMBDA_EXPR (type);
+  if (LAMBDA_EXPR_DEDUCE_RETURN_TYPE_P (lambda))
+	{
+	  apply_lambda_return_type (lambda, void_type_node);
+	  LAMBDA_EXPR_RETURN_TYPE (lambda) = NULL_

Re: [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161)

2011-08-23 Thread Jakub Jelinek
On Tue, Aug 23, 2011 at 02:06:05PM +0200, Jakub Jelinek wrote:
> We can remove that restriction again once CONST_INTs are no longer VOIDmode.
> 
> Here is an untested patch, will bootstrap/regtest it now on x86_64-linux
> and i686-linux, on c6x it should make no difference IMHO (looked like a typo
> in the expander which wasn't used anyway), can somebody test it on AVR and
> BFIN?  My grepping through *.md didn't find any other places where the
> operand wouldn't have the same mode as operation.

Now successfully bootstrapped/regtested on x86_64-linux and i686-linux
and tested on AVR by Georg-Johann, ok for trunk?

Jakub


[C++ PATCH] Fix -Wunused-but-set-* on RHS of x op= rhs if rhs has side-effects (PR c++/50158)

2011-08-23 Thread Jakub Jelinek
Hi!

The recent cp_build_modify_expr change broke the following testcase,
we now complain that b is set but not used.  The problem is that
when rhs has side-effects, stabilize_expr returns a temporary (+ creates
a TARGET_EXPR), which means that mark_exp_read isn't actually called on the
original RHS expression.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2011-08-23  Jakub Jelinek  

PR c++/50158
* typeck.c (cp_build_modify_expr): Call mark_rvalue_use on rhs
if it has side-effects and needs to be preevaluated.

* g++.dg/warn/Wunused-var-16.C: New test.

--- gcc/cp/typeck.c.jj  2011-08-18 08:35:38.0 +0200
+++ gcc/cp/typeck.c 2011-08-23 14:36:42.0 +0200
@@ -6692,6 +6692,8 @@ cp_build_modify_expr (tree lhs, enum tre
 side effect associated with any single compound assignment
 operator. -- end note ]  */
  lhs = stabilize_reference (lhs);
+ if (TREE_SIDE_EFFECTS (rhs))
+   rhs = mark_rvalue_use (rhs);
  rhs = stabilize_expr (rhs, &init);
  newrhs = cp_build_binary_op (input_location,
   modifycode, lhs, rhs,
--- gcc/testsuite/g++.dg/warn/Wunused-var-16.C.jj   2011-08-23 14:37:31.0 +0200
+++ gcc/testsuite/g++.dg/warn/Wunused-var-16.C  2011-08-23 14:37:19.0 +0200
@@ -0,0 +1,13 @@
+// PR c++/50158
+// { dg-do compile }
+// { dg-options "-Wunused" }
+
+int bar (int);
+
+int
+foo (int a)
+{
+  int b[] = { a, -a };
+  a += b[bar (a) < a];
+  return a;
+}

Jakub


Re: [PATCH] For FFS/CLZ/CTZ/CLRSB/POPCOUNT/PARITY/BSWAP require operand mode equal to operation mode (or VOIDmode) (PR middle-end/50161)

2011-08-23 Thread Bernd Schmidt
On 08/23/11 17:42, Jakub Jelinek wrote:
> On Tue, Aug 23, 2011 at 02:06:05PM +0200, Jakub Jelinek wrote:
>> We can remove that restriction again once CONST_INTs are no longer VOIDmode.
>>
>> Here is an untested patch, will bootstrap/regtest it now on x86_64-linux
>> and i686-linux, on c6x it should make no difference IMHO (looked like a typo
>> in the expander which wasn't used anyway), can somebody test it on AVR and
>> BFIN?  My grepping through *.md didn't find any other places where the
>> operand wouldn't have the same mode as operation.
> 
> Now successfully bootstrapped/regtested on x86_64-linux and i686-linux
> and tested on AVR by Georg-Johann, ok for trunk?

Ok. Will fix up bfin/c6x if necessary.


Bernd



C++ PATCH for core issue 903 (C++11 null pointer constant)

2011-08-23 Thread Jason Merrill
C++11 greatly expands the set of constant expressions, which aggravates 
the existing issue with overloading and null pointer constants.  If an 
expression could potentially be a constant expression, we need to find 
its constant value in order to determine how it interacts with overload 
resolution.  In C++03 that doesn't involve much beyond the constant 
folding we already do, but in C++11 that means substituting into 
constexpr functions, so we decided to limit null pointer constants in 
C++11 to literal 0 (or 0L, etc).


This patch doesn't attempt to treat things like 0+0 as non-null pointer 
constants yet, just avoids doing anything beyond the usual constant folding.
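
A minimal sketch of the distinction core issue 903 draws, written for this
digest rather than taken from the patch (literal_zero_is_null is a
hypothetical name; zero() mirrors the function in the constexpr-nullptr.C
test below):

```cpp
#include <cassert>

constexpr int zero () { return 0; }
static_assert (zero () == 0, "zero() is a constant expression");

// Literal zeros of integer type are still null pointer constants.
inline bool literal_zero_is_null ()
{
  void *p = 0;
  void *q = 0L;
  return p == nullptr && q == nullptr;
}

// Under core issue 903 the following is ill-formed even though zero()
// is a constant expression with value 0, so it is left commented out:
//   void *r = zero ();
```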


Tested x86_64-pc-linux-gnu, applying to trunk.
commit af3382cf8702c1dfb5a98b2b2093aac21a5c4dff
Author: Jason Merrill 
Date:   Fri Aug 19 10:38:12 2011 -0400

	Core 903 (partial)
	* call.c (null_ptr_cst_p): Only 0 qualifies in C++11.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index d2700cb..e5f65b3 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -541,20 +541,14 @@ null_ptr_cst_p (tree t)
 return true;
   if (CP_INTEGRAL_TYPE_P (TREE_TYPE (t)))
 {
-  if (cxx_dialect >= cxx0x)
-	{
-	  t = fold_non_dependent_expr (t);
-	  t = maybe_constant_value (t);
-	  if (TREE_CONSTANT (t) && integer_zerop (t))
-	return true;
-	}
-  else
+  /* Core issue 903 says only literal 0 is a null pointer constant.  */
+  if (cxx_dialect < cxx0x)
 	{
 	  t = integral_constant_value (t);
 	  STRIP_NOPS (t);
-	  if (integer_zerop (t) && !TREE_OVERFLOW (t))
-	return true;
 	}
+  if (integer_zerop (t) && !TREE_OVERFLOW (t))
+	return true;
 }
   return false;
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-nullptr.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-nullptr.C
index 7ac53db..6381323 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-nullptr.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-nullptr.C
@@ -2,5 +2,5 @@
 
 constexpr int zero() { return 0; }
 
-void* ptr1 = zero(); // #1
-constexpr void* ptr2 = zero(); // #2
+void* ptr1 = zero();		// { dg-error "int" }
+constexpr void* ptr2 = zero();	// { dg-error "int" }


Re: [C++ PATCH] Fix -Wunused-but-set-* on RHS of x op= rhs if rhs has side-effects (PR c++/50158)

2011-08-23 Thread Jason Merrill

OK.

Jason


C++ PATCH for a fixme in build_functional_cast

2011-08-23 Thread Jason Merrill
While looking at another issue I noticed this fixme.  A TARGET_EXPR of 
literal type should have TREE_CONSTANT set iff the initializer is 
constant, and this patch implements that.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 21aebcf68d774827647731368a5708e1c60c31af
Author: Jason Merrill 
Date:   Fri Aug 19 10:14:12 2011 -0400

	* tree.c (build_target_expr): Set TREE_CONSTANT on
	literal TARGET_EXPR if the value is constant.
	* typeck2.c (build_functional_cast): Don't set it here.

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 4ef89c4..00598ce 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -288,6 +288,7 @@ static tree
 build_target_expr (tree decl, tree value, tsubst_flags_t complain)
 {
   tree t;
+  tree type = TREE_TYPE (decl);
 
 #ifdef ENABLE_CHECKING
   gcc_assert (VOID_TYPE_P (TREE_TYPE (value))
@@ -302,12 +303,14 @@ build_target_expr (tree decl, tree value, tsubst_flags_t complain)
   t = cxx_maybe_build_cleanup (decl, complain);
   if (t == error_mark_node)
 return error_mark_node;
-  t = build4 (TARGET_EXPR, TREE_TYPE (decl), decl, value, t, NULL_TREE);
+  t = build4 (TARGET_EXPR, type, decl, value, t, NULL_TREE);
   /* We always set TREE_SIDE_EFFECTS so that expand_expr does not
  ignore the TARGET_EXPR.  If there really turn out to be no
  side-effects, then the optimizer should be able to get rid of
  whatever code is generated anyhow.  */
   TREE_SIDE_EFFECTS (t) = 1;
+  if (literal_type_p (type))
+TREE_CONSTANT (t) = TREE_CONSTANT (value);
 
   return t;
 }
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 79aa354..97f98ab 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1684,9 +1684,6 @@ build_functional_cast (tree exp, tree parms, tsubst_flags_t complain)
 {
   exp = build_value_init (type, complain);
   exp = get_target_expr_sfinae (exp, complain);
-  /* FIXME this is wrong */
-  if (literal_type_p (type))
-	TREE_CONSTANT (exp) = true;
   return exp;
 }
 


C++ PATCH for c++/49045 (core 1321, equivalence of dependent names)

2011-08-23 Thread Jason Merrill
In this PR, g++ was treating two uses of "swap" in a decltype as 
different because the result of unqualified name lookup changed between 
them.  At the meeting last week the committee decided that two dependent 
names should be considered equivalent even if that happens.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 00ece4f71ac12eddf4e5c0c44ead47bd2aa41475
Author: Jason Merrill 
Date:   Fri Aug 19 13:08:47 2011 -0400

	PR c++/49045
	Core 1321
	* tree.c (dependent_name): New.
	(cp_tree_equal): Two calls with the same dependent name are
	equivalent even if the overload sets are different.

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 00598ce..13421a4 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1450,6 +1450,21 @@ is_overloaded_fn (tree x)
 	   || TREE_CODE (x) == OVERLOAD);
 }
 
+/* X is the CALL_EXPR_FN of a CALL_EXPR.  If X represents a dependent name
+   (14.6.2), return the IDENTIFIER_NODE for that name.  Otherwise, return
+   NULL_TREE.  */
+
+static tree
+dependent_name (tree x)
+{
+  if (TREE_CODE (x) == IDENTIFIER_NODE)
+return x;
+  if (TREE_CODE (x) != COMPONENT_REF
+  && is_overloaded_fn (x))
+return DECL_NAME (get_first_fn (x));
+  return NULL_TREE;
+}
+
 /* Returns true iff X is an expression for an overloaded function
whose type cannot be known without performing overload
resolution.  */
@@ -2187,7 +2202,12 @@ cp_tree_equal (tree t1, tree t2)
   {
 	tree arg1, arg2;
 	call_expr_arg_iterator iter1, iter2;
-	if (!cp_tree_equal (CALL_EXPR_FN (t1), CALL_EXPR_FN (t2)))
+	/* Core 1321: dependent names are equivalent even if the
+	   overload sets are different.  */
+	tree name1 = dependent_name (CALL_EXPR_FN (t1));
+	tree name2 = dependent_name (CALL_EXPR_FN (t2));
+	if (!(name1 && name2 && name1 == name2)
+	&& !cp_tree_equal (CALL_EXPR_FN (t1), CALL_EXPR_FN (t2)))
 	  return false;
 	for (arg1 = first_call_expr_arg (t1, &iter1),
 	   arg2 = first_call_expr_arg (t2, &iter2);
diff --git a/gcc/testsuite/g++.dg/cpp0x/overload2.C b/gcc/testsuite/g++.dg/cpp0x/overload2.C
new file mode 100644
index 000..ff8ad22
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/overload2.C
@@ -0,0 +1,24 @@
+// Core 1321
+// { dg-options -std=c++0x }
+// Two dependent names are equivalent even if the overload sets found by
+// phase 1 lookup are different.  Merging them keeps the earlier set.
+
+int g1(int);
+template <class T> decltype(g1(T())) f1();
+int g1();
+template <class T> decltype(g1(T())) f1()
+{ return g1(T()); }
+int i1 = f1<int>();	// OK, g1(int) was declared before the first f1
+
+template <class T> decltype(g2(T())) f2();
+int g2(int);
+template <class T> decltype(g2(T())) f2() // { dg-error "g2. was not declared" }
+{ return g2(T()); }
+int i2 = f2<int>();			  // { dg-error "no match" }
+
+int g3();
+template <class T> decltype(g3(T())) f3();
+int g3(int);
+template <class T> decltype(g3(T())) f3() // { dg-error "too many arguments" }
+{ return g3(T()); }
+int i3 = f3<int>();			  // { dg-error "no match" }


C++ PATCH to make build_functional_cast call build_value_init unconditionally

2011-08-23 Thread Jason Merrill
The conditions for value-initialization of a class type have gotten more 
complicated, so we shouldn't try to duplicate them in the caller.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit b45bd43f5bf3aceb81d467be40bc97bafacb5d1e
Author: Jason Merrill 
Date:   Fri Aug 19 16:35:40 2011 -0400

	* typeck2.c (build_functional_cast): Don't try to avoid calling
	build_value_init.
	* pt.c (instantiate_class_template_1): Don't copy TYPE_HAS_* flags.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 6b970f9..3c6b2c5 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8503,16 +8503,6 @@ instantiate_class_template_1 (tree type)
   input_location = DECL_SOURCE_LOCATION (TYPE_NAME (type)) =
 DECL_SOURCE_LOCATION (typedecl);
 
-  TYPE_HAS_USER_CONSTRUCTOR (type) = TYPE_HAS_USER_CONSTRUCTOR (pattern);
-  TYPE_HAS_NEW_OPERATOR (type) = TYPE_HAS_NEW_OPERATOR (pattern);
-  TYPE_HAS_ARRAY_NEW_OPERATOR (type) = TYPE_HAS_ARRAY_NEW_OPERATOR (pattern);
-  TYPE_GETS_DELETE (type) = TYPE_GETS_DELETE (pattern);
-  TYPE_HAS_COPY_ASSIGN (type) = TYPE_HAS_COPY_ASSIGN (pattern);
-  TYPE_HAS_CONST_COPY_ASSIGN (type) = TYPE_HAS_CONST_COPY_ASSIGN (pattern);
-  TYPE_HAS_COPY_CTOR (type) = TYPE_HAS_COPY_CTOR (pattern);
-  TYPE_HAS_CONST_COPY_CTOR (type) = TYPE_HAS_CONST_COPY_CTOR (pattern);
-  TYPE_HAS_DEFAULT_CONSTRUCTOR (type) = TYPE_HAS_DEFAULT_CONSTRUCTOR (pattern);
-  TYPE_HAS_CONVERSION (type) = TYPE_HAS_CONVERSION (pattern);
   TYPE_PACKED (type) = TYPE_PACKED (pattern);
   TYPE_ALIGN (type) = TYPE_ALIGN (pattern);
   TYPE_USER_ALIGN (type) = TYPE_USER_ALIGN (pattern);
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 97f98ab..901e4ee 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1677,10 +1677,7 @@ build_functional_cast (tree exp, tree parms, tsubst_flags_t complain)
  void type, creates an rvalue of the specified type, which is
  value-initialized.  */
 
-  if (parms == NULL_TREE
-  /* If there's a user-defined constructor, value-initialization is
-	 just calling the constructor, so fall through.  */
-  && !TYPE_HAS_USER_CONSTRUCTOR (type))
+  if (parms == NULL_TREE)
 {
   exp = build_value_init (type, complain);
   exp = get_target_expr_sfinae (exp, complain);
diff --git a/gcc/testsuite/g++.dg/template/crash7.C b/gcc/testsuite/g++.dg/template/crash7.C
index 88d3af8..5bd275e 100644
--- a/gcc/testsuite/g++.dg/template/crash7.C
+++ b/gcc/testsuite/g++.dg/template/crash7.C
@@ -5,10 +5,11 @@
 // PR c++/10108: ICE in tsubst_decl for error due to non-existence
 // nested type.
 
-template  struct A	// { dg-message "A.void.::A.const A" }
+template  struct A
 {
 template  A(typename A::X) {} // { dg-error "no type" }
 };
 
-A a;	// { dg-error "required|no match" }
-// { dg-prune-output "note" }
+// We currently don't give the "no match" error because we don't add the
+// invalid constructor template to TYPE_METHODS.
+A a;			// { dg-message "required" }


Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-23 Thread Uros Bizjak
On Tue, Aug 23, 2011 at 6:03 PM, Kirill Yukhin  wrote:

> thanks for inputs! I've applied all.
>
> Also I fixed a bug (which produced an ICE).
>
> ChangeLog entry:
> 2011-08-23  Kirill Yukhin  
>
>        * config/i386/sse.md (mul3_highpart): Update.
>
> Patch and testsuite/ChangeLog entry are attached.
>
> Is it OK?

Please remove #define DEBUG from the testcases; it is set dynamically
with -DDEBUG during the testsuite run.
Also, pass the new testcases through the "indent" program.

OK with these changes.

Thanks,
Uros.


Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-23 Thread Kirill Yukhin
Thanks, done.
Updated patch attached.

--
K

On Tue, Aug 23, 2011 at 8:16 PM, Uros Bizjak  wrote:
> On Tue, Aug 23, 2011 at 6:03 PM, Kirill Yukhin  
> wrote:
>
>> thanks for inputs! I've applied all.
>>
>> Also I fixed a bug (which produced an ICE).
>>
>> ChangeLog entry:
>> 2011-08-23  Kirill Yukhin  
>>
>>        * config/i386/sse.md (mul3_highpart): Update.
>>
>> Patch and testsuite/ChangeLog entry are attached.
>>
>> Is it OK?
>
> Please remove #define DEBUG from the testcases; it is set dynamically
> with -DDEBUG during the testsuite run.
> Also, pass the new testcases through the "indent" program.
>
> OK with these changes.
>
> Thanks,
> Uros.
>


avx2-8.gcc.patch.tgz
Description: GNU Zip compressed data


Re: [PATCH v3, i386] BMI2 support for GCC, mulx, rorx, x part

2011-08-23 Thread Uros Bizjak
On Tue, Aug 23, 2011 at 6:22 PM, Kirill Yukhin  wrote:

> thanks. I've applied your inputs.
>
> Updated patch, ChangeLog, testsuite/ChangeLog are attached.
>
> Are they ok now?

OK for mainline.

Thanks,
Uros.


Re: [trans-mem] Add method groups and change TM method lifecycle and selection.

2011-08-23 Thread Richard Henderson
On 08/23/2011 05:01 AM, Torvald Riegel wrote:
> Add method groups and change TM method lifecycle and selection.
> 
>   * retry.cc (GTM::gtm_thread::decide_retry_strategy): Cleanup. Fix
>   restarting without switching to serial mode.
>   (GTM::gtm_thread::decide_begin_dispatch): Let the caller set the
>   transaction state. Choose closed-nesting alternative if available.
>   (GTM::gtm_thread::set_default_dispatch): New.
>   (parse_default_method): New.
>   (GTM::gtm_thread::number_of_threads_changed): New.
>   * method-serial.cc (GTM::serial_mg): New method group class.
>   (GTM::serialirr_dispatch): Belongs to serial_mg. Remove reinit and
>   fini.
>   (GTM::serial_dispatch): Same.
>   (GTM::serialirr_onwrite_dispatch): Same.
>   (GTM::gtm_thread::serialirr_mode): Remove calls to fini.
>   * beginend.cc (GTM::gtm_thread::~gtm_thread): Maintain number of
>   registered threads.
>   (GTM::gtm_thread::gtm_thread): Same.
>   (_ITM_abortTransaction): Remove calls to abi_dispatch::fini().
>   (GTM::gtm_thread::trycommit): Same. Reset number of restarts.
>   (GTM::gtm_thread::begin_transaction): Let decide_begin_dispatch()
>   choose dispatch but set state according to dispatch here.
>   * dispatch.h (GTM::abi_dispatch::fini): Move to method group.
>   (GTM::method_group): New class.
>   (GTM::abi_dispatch): Add comments. Maintain pointer to method_group.
>   * libitm_i.h (GTM::gtm_thread): Add declarations for new members.
>   * libitm.texi: Document TM methods, method groups, method life cycle.
>   Rename method sets to method groups.

Ok.


r~


Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-23 Thread Uros Bizjak
On Tue, Aug 23, 2011 at 6:31 PM, Kirill Yukhin  wrote:
> Thanks, done.
> Updated patch attached.

OK for mainline.

Thanks,
Uros.


Re: [PATCH v3, i386] BMI2 support for GCC, mulx, rorx, x part

2011-08-23 Thread Kirill Yukhin
Great! Thanks.

Could anybody please commit that?

K

On Tue, Aug 23, 2011 at 8:53 PM, Uros Bizjak  wrote:
> On Tue, Aug 23, 2011 at 6:22 PM, Kirill Yukhin  
> wrote:
>
>> thanks. I've applied your inputs.
>>
>> Updated patch, ChangeLog, testsuite/ChangeLog are attached.
>>
>> Are they ok now?
>
> OK for mainline.
>
> Thanks,
> Uros.
>


Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-23 Thread Kirill Yukhin
Thanks,

could anybody please commit that?


K

On Tue, Aug 23, 2011 at 8:54 PM, Uros Bizjak  wrote:
> On Tue, Aug 23, 2011 at 6:31 PM, Kirill Yukhin  
> wrote:
>> Thanks, done.
>> Updated patch attached.
>
> OK for mainline.
>
> Thanks,
> Uros.
>


Re: [trans-mem] Use __x86_64__ instead of __LP64__.

2011-08-23 Thread Richard Henderson
On 08/23/2011 05:33 AM, Torvald Riegel wrote:
> Use __x86_64__ instead of __LP64__.
> 
>   * config/x86/tls.h: Use __x86_64__ instead of __LP64__.
>   Add X32 support.
>   * config/x86/sjlj.S: Same.

Ok.

At some point we really should merge from mainline, so that we
can test this properly...

r~


Re: [PATCH v3, i386] BMI2 support for GCC, mulx, rorx, x part

2011-08-23 Thread H.J. Lu
On Tue, Aug 23, 2011 at 9:55 AM, Kirill Yukhin  wrote:
> Great! Thanks.
>
> Could anybody please commit that?

Done.

Thanks.

> K
>
> On Tue, Aug 23, 2011 at 8:53 PM, Uros Bizjak  wrote:
>> On Tue, Aug 23, 2011 at 6:22 PM, Kirill Yukhin  
>> wrote:
>>
>>> thanks. I've applied your inputs.
>>>
>>> Updated patch, ChangeLog, testsuite/ChangeLog are attached.
>>>
>>> Are they ok now?
>>
>> OK for mainline.
>>
>> Thanks,
>> Uros.
>>
>



-- 
H.J.



Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-23 Thread H.J. Lu
On Tue, Aug 23, 2011 at 9:55 AM, Kirill Yukhin  wrote:
> Thanks,
>
> could anybody please commit that?
>

Please regenerate AVX2 patch with the current trunk
since your change won't apply after BMI2 checkin.

Thanks.


-- 
H.J.


[PATCH] Add infrastructure to merge standard builtin enums with backend builtins

2011-08-23 Thread Michael Meissner
Over the years, it has been a problem for ports like the PowerPC that have
more builtins than the standard list.  Code that references the built_in_decls
and implicit_built_in_decls arrays is supposed to check that DECL_BUILT_IN_CLASS
is BUILT_IN_NORMAL first, but every so often code skips this check and
reads random pieces of memory, because the backend builtin function number
is greater than the number of machine independent builtins.

This patch is an infrastructure patch that lets backends decide to merge their
builtin enumerations at the end of the standard set of enumerations.  Only the
PowerPC port is modified with this patch.  If this goes in, the port
maintainers for the other ports with a lot of builtins (x86, spu, etc.) can
decide whether to move to this new infrastructure.
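
To make the idea concrete, here is a small sketch in C of appending machine builtins to the end of a standard enumeration, so that one table can be indexed safely by either kind.  All TOY_* names are hypothetical and stand in for builtins.def and a backend .def file; this is not the macro layout the patch uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lists standing in for builtins.def and a backend
   .def file; each applies X to every builtin it defines.  */
#define TOY_STANDARD_BUILTINS(X) \
  X (TOY_BUILT_IN_MEMCPY)        \
  X (TOY_BUILT_IN_STRLEN)
#define TOY_MACHINE_BUILTINS(X)  \
  X (TOY_MD_VEC_ADD)             \
  X (TOY_MD_VEC_MUL)

#define TOY_ENUM_ENTRY(name) name,
#define TOY_NAME_ENTRY(name) [name] = #name,

/* One enumeration: index 0 is reserved, the standard builtins come
   next, and the machine builtins are merged in after a marker.  */
enum toy_builtin
{
  TOY_BUILT_IN_NONE,
  TOY_STANDARD_BUILTINS (TOY_ENUM_ENTRY)
  TOY_END_STANDARD,
  TOY_MACHINE_BUILTINS (TOY_ENUM_ENTRY)
  TOY_END_BUILTINS
};

/* A single table sized for both sets, so a machine builtin code can
   never index past the end.  Unlisted slots stay NULL.  */
const char *const toy_builtin_names[TOY_END_BUILTINS] =
{
  TOY_STANDARD_BUILTINS (TOY_NAME_ENTRY)
  TOY_MACHINE_BUILTINS (TOY_NAME_ENTRY)
};

/* Look up the name for CODE; the assertion documents the bound.  */
const char *
toy_builtin_name (enum toy_builtin code)
{
  assert ((unsigned) code < (unsigned) TOY_END_BUILTINS);
  return toy_builtin_names[code];
}

/* True if CODE lies in the machine dependent part of the range.  */
bool
toy_machine_builtin_p (enum toy_builtin code)
{
  return code > TOY_END_STANDARD && code < TOY_END_BUILTINS;
}
```

Because both sets share one numbering, a lookup that forgets to check the builtin class still stays inside the table instead of reading unrelated memory.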

This patch grows the bit field that holds the builtin function number by one
bit.  Strictly speaking for the PowerPC, we don't need it just yet, but it
gives us a margin of safety.  Right now, there are about 730 machine
independent builtins and 950 PowerPC builtins, which would leave a margin of
about 350 more builtins before the field is full if I didn't grow the size of
the builtin function bitfield.
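
The headroom arithmetic can be checked with a few lines of C.  This is a sketch under an assumption: the figures quoted above are consistent with an 11-bit field (2048 codes), but that width is inferred from the numbers here and is not stated in the message.  The function names are hypothetical.

```c
#include <assert.h>

/* Number of distinct codes an unsigned bit-field of WIDTH bits holds.  */
unsigned long
toy_field_capacity (unsigned width)
{
  assert (width < 32);
  return 1UL << width;
}

/* Codes still free in a WIDTH-bit field once USED codes are taken.
   A negative result would mean the field has overflowed.  */
long
toy_field_headroom (unsigned width, unsigned long used)
{
  return (long) toy_field_capacity (width) - (long) used;
}
```

With roughly 730 + 950 = 1680 codes in use, an 11-bit field would leave 368 free, close to the "350" quoted, and one extra bit would raise that to 2416.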

When I was documenting the tm-bu-funcs.def file built by the Makefile, I
noticed there was a FIXME comment asking why tm_p.h existed, so I added an
explanation.

Originally I wanted to let the MD file have all of its builtins
initialized when the main builtins are set up.  This would have fit into the
infrastructure, by having MD versions of builtin-attrs.def and
builtin-types.def.  However, the problem is Fortran doesn't use the C, C++, and
LTO common builtin infrastructure, but it does want to initialize the target
builtins via the targetm.init_builtins hook.  So I decided not to include that
support in this patch.

These are meant to be committed as a single patch, but I have separated the
patches into a machine independent patch and one that moves the PowerPC to use
this new infrastructure.

I anticipate there will be additional patches in the powerpc builtin area to
allow target attributes and pragmas to enable new builtins, but that will be in
a later patch.  In this patch I wanted a fairly minimal set of
changes.

I have bootstrapped and done make checks on the PowerPC with no regressions.
In addition, I have bootstrapped the x86_64 to make sure it continues to work
for a port that wasn't modified.  Are these patches ok to commit?

[gcc]
2011-08-23  Michael Meissner  

* doc/configfiles.texi (tm_p.h): Document why tm_p.h is needed.
(tm-bu-funcs.def): Document the include file that includes the
machine dependent builtin functions.

* tree.h (struct tree_function_decl): Grow function code field by
1 bit to allow for machines with lots of builtins.

* builtins.def (BUILT_IN_NONE): Reserve builtin index 0 so it is
not a legitimate builtin.
(DEF_BUILTIN_MD): New macro for defining machine dependent
builtins.
(toplevel): Include tm-bu-funcs.def.

* configure.ac (tm_builtin_funcs): New autoconf variable to merge
backend builtins into the main builtin handling.  Include
rs6000-builtin.def on rs6000.
* configure: Regenerate.
* config.gcc (rs6000*, powerpc*): Ditto.

* Makefile.in (BUILTINS_DEF): Add support for merging machine
dependent builtins at the end of the standard builtins.
(BUILTIN_FUNCS_MD): Ditto.
(c-family/c-common.o): Ditto.
(mostlyclean): Ditto.
(tm-bu-funcs.def): New header built that includes machine
dependent builtins.

* config/rs6000/rs6000-protos.h (rs6000_builtin_types): Move here
from rs6000.h.  Adjust for merging the rs6000 builtins after the
standard builtins.
(rs6000_builtin_decls): Ditto.

* config/rs6000/rs6000-builtin.def (toplevel): Add support for
being included in builtins.def to define all rs6000 builtins after
the standard builtins.  Delete RS6000_BUILTIN_EQUATE.
(RS6000_BUILTIN_FIRST): New macros to mark start and end of
various classes of builtins.  Replace existing overload start and
end markers.
(ALTIVEC_BUILTIN_FIRST): Ditto.
(ALTIVEC_BUILTIN_OVERLOADED_FIRST): Ditto.
(ALTIVEC_BUILTIN_OVERLOADED_LAST): Ditto.
(ALTIVEC_BUILTIN_LAST): Ditto.
(SPE_BUILTIN_FIRST): Ditto.
(SPE_BUILTIN_LAST): Ditto.
(PAIRED_BUILTIN_FIRST): Ditto.
(PAIRED_BUILTIN_LAST): Ditto.
(VSX_BUILTIN_FIRST): Ditto.
(VSX_BUILTIN_OVERLOADED_FIRST): Ditto.
(VSX_BUILTIN_OVERLOADED_LAST): Ditto.
(VSX_BUILTIN_LAST): Ditto.
(RS6000_BUILTIN_LAST): Ditto.
(VECTOR_BUILTIN_*): Move so the builtins are in the Altivec
range.

* config/rs6000/rs6000-c.c (struct altivec_builtin_types): Adjust
for merging the rs6000 builtins after th

Re: [rs6000] Fix creation of invalid CONST_VECTORs

2011-08-23 Thread Mike Stump
On Aug 23, 2011, at 3:44 AM, Richard Sandiford wrote:
> My patches to more "accurately" detect the number of zero elements in a
> compound initialiser caused pr34856 to trigger on powerpc*-darwin:
> 
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34856

Thanks for fixing this...  I don't think that is darwin code, so I can't 
approve it...


[PATCH] Handle CLRSB in nonzero_bits1

2011-08-23 Thread Jakub Jelinek
Hi!

While looking at PR50168, I've noticed that CLRSB isn't handled
in nonzero_bits1 (while FFS/POPCOUNT/PARITY/CLZ/CTZ are).
It means that combine can't optimize say clrsb insn followed by
zero or sign extension of the result.

Fixed thusly, ok for trunk?

2011-08-23  Jakub Jelinek  

* rtlanal.c (nonzero_bits1): Handle CLRSB.

--- gcc/rtlanal.c.jj2011-08-22 08:17:07.0 +0200
+++ gcc/rtlanal.c   2011-08-23 19:46:13.0 +0200
@@ -1,7 +1,7 @@
 /* Analyze RTL for GNU compiler.
Copyright (C) 1987, 1988, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
-   1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
-   Free Software Foundation, Inc.
+   1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
+   2011 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -4273,6 +4273,11 @@ nonzero_bits1 (const_rtx x, enum machine
nonzero = -1;
   break;
 
+case CLRSB:
+  /* This is at most the number of bits in the mode minus 1.  */
+  nonzero = ((unsigned HOST_WIDE_INT) 1 << (floor_log2 (mode_width))) - 1;
+  break;
+
 case PARITY:
   nonzero = 1;
   break;

Jakub
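
For readers who want to check the bound used in the patch, here is a small self-contained sketch in C.  The toy_* helpers are hypothetical reimplementations written for this note (GCC's own floor_log2 and the CLRSB rtx are not used), and the sketch only considers power-of-two mode widths.

```c
#include <assert.h>
#include <stdint.h>

/* floor (log2 (X)) for X > 0.  */
int
toy_floor_log2 (unsigned x)
{
  int n = 0;
  assert (x > 0);
  while (x >>= 1)
    n++;
  return n;
}

/* Count the leading redundant sign bits of the low WIDTH bits of
   VALUE: the number of bits directly below the sign bit that are
   equal to the sign bit.  */
int
toy_clrsb (uint64_t value, int width)
{
  int count = 0;
  uint64_t sign;
  assert (width >= 2 && width <= 64);
  sign = (value >> (width - 1)) & 1;
  for (int bit = width - 2; bit >= 0; bit--)
    {
      if (((value >> bit) & 1) != sign)
        break;
      count++;
    }
  return count;
}

/* The mask computed in the patch: all bits that a CLRSB result can
   possibly have set for a mode of WIDTH bits.  */
uint64_t
toy_clrsb_nonzero_mask (int width)
{
  assert (width > 0);
  return ((uint64_t) 1 << toy_floor_log2 ((unsigned) width)) - 1;
}
```

For a 32-bit mode the largest result is 31 (all bits equal to the sign bit) and the mask is also 31, so every result fits in the low five bits; that is what lets combine drop a following zero or sign extension.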


Re: [PATCH] Handle CLRSB in nonzero_bits1

2011-08-23 Thread Richard Henderson
On 08/23/2011 10:52 AM, Jakub Jelinek wrote:
> 2011-08-23  Jakub Jelinek  
> 
>   * rtlanal.c (nonzero_bits1): Handle CLRSB.

Ok.


r~


[pph] Remove fixed FIXME in p4eabi.h (issue4939045)

2011-08-23 Thread Gabriel Charette
This memory problem was probably due to the big line_table being created with 
the incorrect code before.

This test now passes and this FIXME was irrelevant, removed it.

Trivial patch, already discussed with Diego, committed to pph.

2011-08-23  Gabriel Charette  

* g++.dg/pph/p4eabi1.h: Remove fixed FIXME.

diff --git a/gcc/testsuite/g++.dg/pph/p4eabi1.h 
b/gcc/testsuite/g++.dg/pph/p4eabi1.h
index 781de86..8c949ea 100644
--- a/gcc/testsuite/g++.dg/pph/p4eabi1.h
+++ b/gcc/testsuite/g++.dg/pph/p4eabi1.h
@@ -1,5 +1,4 @@
 // { dg-options "-w -fpermissive" }
-// FIXME pph - Enabling PPH for this file causes memory problems in cc1plus.
 // c1eabi1.h   c1eabi1.pph
 
 #ifndef C4EABI1_H

--
This patch is available for review at http://codereview.appspot.com/4939045


[libcpp,lto,fortran PATCH] Fix linemap_add use and remove unnecessary kludge

2011-08-23 Thread Dodji Seketeli
Hello,

There are a couple of places in the compiler that need to create line
maps directly by calling linemap_add.

The problem is, sometimes we wrongly do so by passing LC_RENAME as the
reason argument even when creating the first map of a given file.  But
then linemap_add has some code to change that LC_RENAME argument back
into LC_ENTER, as it should have been done initially.

This patch fixes the few calling spots (in pch, lto and fortran) and
turns the then useless fixup kludge in linemap_add into an assert.
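
The invariant can be sketched with a toy model in C.  Everything named toy_* is hypothetical and much simpler than libcpp's line_maps; the reason selection mirrors the expression in the Fortran hunk of the patch, and the validity check is one plausible reading of the new assert, not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum toy_lc_reason { TOY_LC_ENTER, TOY_LC_LEAVE, TOY_LC_RENAME };

/* Minimal stand-in for a line map set: DEPTH is the include nesting
   level, USED the number of maps created so far.  */
struct toy_line_maps
{
  unsigned depth;
  unsigned used;
};

/* The first map of a main file must be created with an ENTER reason;
   RENAME and LEAVE only make sense once a file has been entered.  */
bool
toy_reason_valid (const struct toy_line_maps *set, enum toy_lc_reason reason)
{
  return reason == TOY_LC_ENTER || set->depth > 0;
}

/* Add a map.  Instead of silently rewriting a bad reason, assert.  */
void
toy_linemap_add (struct toy_line_maps *set, enum toy_lc_reason reason)
{
  assert (toy_reason_valid (set, reason));
  if (reason == TOY_LC_ENTER)
    set->depth++;
  else if (reason == TOY_LC_LEAVE)
    set->depth--;
  set->used++;
}

/* Reason selection as in the Fortran hunk: an initial file that was
   not preprocessed continues the existing map (RENAME); an included
   file or a preprocessed initial file starts a new one (ENTER).  */
enum toy_lc_reason
toy_fortran_reason (const char *realfilename, const char *displayedname,
                    bool initial)
{
  bool preprocessed_p = (realfilename && displayedname
                         && strcmp (realfilename, displayedname) != 0);
  return (initial && !preprocessed_p) ? TOY_LC_RENAME : TOY_LC_ENTER;
}
```

The design choice is the same as in the patch: callers are expected to pass the right reason, and a wrong one fails loudly rather than being patched up inside the library.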

I'd like this to be applied to trunk as Jason suggested[1] that
linemap_add drops that kludge while reviewing my macro location tracking
patch set[1].

Bootstrapped with --enable-languages=all,ada --enable-checking and
tested on x86_64-unknown-linux-gnu against trunk.

OK for trunk?

[1]: http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01538.html


libcpp/

* line-map.c (linemap_add): Assert that reason must not be
LC_RENAME when called for the first time on a "main input file".

c-family/

* c-pch.c (c_common_read_pch): Call linemap_add with LC_ENTER as it's
the first time it's being called on this main TU.

gcc/lto/

* lto-lang.c (lto_init): Likewise.  Also, avoid calling
linemap_add twice.

gcc/fortran/

* scanner.c (load_file): Don't abuse LC_RENAME reason while
(indirectly) calling linemap_add.
---
 gcc/c-family/c-pch.c  |2 +-
 gcc/fortran/scanner.c |   24 ++--
 gcc/lto/lto-lang.c|3 +--
 libcpp/line-map.c |9 -
 4 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/gcc/c-family/c-pch.c b/gcc/c-family/c-pch.c
index 3c2fd18..7a289d6 100644
--- a/gcc/c-family/c-pch.c
+++ b/gcc/c-family/c-pch.c
@@ -446,7 +446,7 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
   fclose (f);
 
   line_table->trace_includes = saved_trace_includes;
-  linemap_add (line_table, LC_RENAME, 0, saved_loc.file, saved_loc.line);
+  linemap_add (line_table, LC_ENTER, 0, saved_loc.file, saved_loc.line);
 
   /* Give the front end a chance to take action after a PCH file has
  been loaded.  */
diff --git a/gcc/fortran/scanner.c b/gcc/fortran/scanner.c
index 0c127d4..120d550 100644
--- a/gcc/fortran/scanner.c
+++ b/gcc/fortran/scanner.c
@@ -1887,6 +1887,11 @@ load_file (const char *realfilename, const char 
*displayedname, bool initial)
   int len, line_len;
   bool first_line;
   const char *filename;
+  /* If realfilename and displayedname are different and non-null then
+ surely realfilename is the preprocessed form of
+ displayedname.  */
+  bool preprocessed_p = (realfilename && displayedname
+&& strcmp (realfilename, displayedname));
 
   filename = displayedname ? displayedname : realfilename;
 
@@ -1925,9 +1930,24 @@ load_file (const char *realfilename, const char 
*displayedname, bool initial)
}
 }
 
-  /* Load the file.  */
+  /* Load the file.
 
-  f = get_file (filename, initial ? LC_RENAME : LC_ENTER);
+ A "non-initial" file means a file that is being included.  In
+ that case we are creating an LC_ENTER map.
+
+ An "initial" file means a main file; one that is not included.
+ That file has already got at least one (surely more) line map(s)
+ created by gfc_init.  So the subsequent map created in that case
+ must have LC_RENAME reason.
+
+ This latter case is not true for a preprocessed file.  In that
+ case, although the file is "initial", the line maps created by
+ gfc_init was used during the preprocessing of the file.  Now that
+ the preprocessing is over and we are being fed the result of that
+ preprocessing, we need to create a brand new line map for the
+ preprocessed file, so the reason is going to be LC_ENTER.  */
+
+  f = get_file (filename, (initial && !preprocessed_p) ? LC_RENAME : LC_ENTER);
   if (!initial)
 add_file_change (f->filename, f->inclusion_line);
   current_file = f;
diff --git a/gcc/lto/lto-lang.c b/gcc/lto/lto-lang.c
index 83c41e6..d469fb9 100644
--- a/gcc/lto/lto-lang.c
+++ b/gcc/lto/lto-lang.c
@@ -1081,8 +1081,7 @@ lto_init (void)
   flag_generate_lto = flag_wpa;
 
   /* Initialize libcpp line maps for gcc_assert to work.  */
-  linemap_add (line_table, LC_RENAME, 0, NULL, 0);
-  linemap_add (line_table, LC_RENAME, 0, NULL, 0);
+  linemap_add (line_table, LC_ENTER, 0, NULL, 0);
 
   /* Create the basic integer types.  */
   build_common_tree_nodes (flag_signed_char, /*short_double=*/false);
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index dd3f11c..2a0749a 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -114,11 +114,10 @@ linemap_add (struct line_maps *set, enum lc_reason reason,
   if (reason == LC_RENAME_VERBATIM)
 reason = LC_RENAME;
 
-  /* If we don't keep our line maps consistent, we can easily
- segfault.  Don't rely on the client to do it for us.  */
-  if (set->depth == 0)
-reason = LC_ENTER;
-  else if (reason == LC_LEA

Re: [PATCH 4/7] Support -fdebug-cpp option

2011-08-23 Thread Dodji Seketeli
Alexandre Oliva  writes:

> On Jul 16, 2011, Dodji Seketeli  wrote:
>
>> This patch adds -fdebug-cpp option. When used with -E this dumps the
>> relevant macro map before every single token. This clutters the output
>> a lot but has proved to be invaluable in tracking some bugs during the
>> development of the virtual location support.
>
> Any way to read that back in while compiling a preprocessed file, so
> that ccache et al can use this flag to get the same location information
> that would have been gotten without separate preprocessing?

Jakub Jelinek  writes:

>
> For ccache and friends I think it would be better to have a preprocessing
> mode that would output all lines as is (i.e. no macro replacement), except
> for processing #include/#include_next directives.

Would that be enough for, say, when people submit bug reports to GCC?  I
think it would but maybe I am missing some corner cases.

Tom Tromey  writes:

>> "Jakub" == Jakub Jelinek  writes:
>
> Jakub> For ccache and friends I think it would be better to have a
> Jakub> preprocessing mode that would output all lines as is (i.e. no
> Jakub> macro replacement), except for processing #include/#include_next
> Jakub> directives.
>
> That exists -- -fdirectives-only.
>
> Tom

Jakub Jelinek  writes:

> On Mon, Aug 22, 2011 at 08:16:45AM -0600, Tom Tromey wrote:
>> > "Jakub" == Jakub Jelinek  writes:
>> 
>> Jakub> For ccache and friends I think it would be better to have a
>> Jakub> preprocessing mode that would output all lines as is (i.e. no
>> Jakub> macro replacement), except for processing #include/#include_next
>> Jakub> directives.
>> 
>> That exists -- -fdirectives-only.
>
> It isn't exactly what would be needed, as e.g. \\\n are removed
> from #defines and thus the tokens get different locations.

Would it be acceptable to just change the output of -fdirectives-only to fit?
Or are we bound to not breaking existing consumers?

>
> BTW, I guess we should do something about parsing such an output,
> we emit e.g.
> # 1 ""
> # 1 ""
> #define __STDC__ 1
> #define __STDC_HOSTED__ 1
> #define __GNUC__ 4
> #define __GNUC_MINOR__ 6
> #define __GNUC_PATCHLEVEL__ 0
> ...
>
> For <built-in> we really should assume line 0 for all the defines
> in that "file".

Right now BUILTIN_LOCATION is defined to 1 at least in the C/C++ FEs.  Are
you saying defines should have UNKNOWN_LOCATION (which is 0) instead?
OTOH, c_builtin_function already sets the location of c builtins to
UNKNOWN_LOCATION.  Which makes me wonder what BUILTIN_LOCATION is
actually good for then.  :-)

-- 
Dodji


Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-23 Thread Kirill Yukhin
Pulled down. Updated patch attached.

--
Thanks, K

On Tue, Aug 23, 2011 at 9:06 PM, H.J. Lu  wrote:
> On Tue, Aug 23, 2011 at 9:55 AM, Kirill Yukhin  
> wrote:
>> Thanks,
>>
>> could anybody please commit that?
>>
>
> Please regenerate AVX2 patch with the current trunk
> since your change won't apply after BMI2 checkin.
>
> Thanks.
>
>
> --
> H.J.
>


avx2-9.gcc.patch.tgz
Description: GNU Zip compressed data


Include $(CFLAGS-$@) in ALL_CXXFLAGS

2011-08-23 Thread Joseph S. Myers
I noticed that in gcc/Makefile.in, ALL_CFLAGS included $(CFLAGS-$@)
but ALL_CXXFLAGS was missing it.  I don't see any reason for this
difference; this patch makes the two agree more closely.

Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  OK to
commit?

2011-08-23  Joseph Myers  

* Makefile.in (ALL_CXXFLAGS): Include $(CFLAGS-$@).

Index: Makefile.in
===
--- Makefile.in (revision 177999)
+++ Makefile.in (working copy)
@@ -1029,7 +1029,7 @@
   $(CFLAGS) $(INTERNAL_CFLAGS) $(COVERAGE_FLAGS) $(WARN_CFLAGS) @DEFS@
 
 # The C++ version.
-ALL_CXXFLAGS = $(T_CFLAGS) $(CXXFLAGS) $(INTERNAL_CFLAGS) \
+ALL_CXXFLAGS = $(T_CFLAGS) $(CFLAGS-$@) $(CXXFLAGS) $(INTERNAL_CFLAGS) \
   $(COVERAGE_FLAGS) $(WARN_CXXFLAGS) @DEFS@
 
 # Likewise.  Put INCLUDES at the beginning: this way, if some autoconf macro

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-23 Thread H.J. Lu
On Tue, Aug 23, 2011 at 12:04 PM, Kirill Yukhin  wrote:
> Pulled down. Updated patch attached.

I checked it in.

Thanks.

> --
> Thanks, K
>
> On Tue, Aug 23, 2011 at 9:06 PM, H.J. Lu  wrote:
>> On Tue, Aug 23, 2011 at 9:55 AM, Kirill Yukhin  
>> wrote:
>>> Thanks,
>>>
>>> could anybody please commit that?
>>>
>>
>> Please regenerate AVX2 patch with the current trunk
>> since your change won't apply after BMI2 checkin.
>>
>> Thanks.
>>
>>
>> --
>> H.J.
>>
>



-- 
H.J.


Re: Include $(CFLAGS-$@) in ALL_CXXFLAGS

2011-08-23 Thread DJ Delorie

There are some gcc flags that are not legal g++ flags, though...


Re: Include $(CFLAGS-$@) in ALL_CXXFLAGS

2011-08-23 Thread Joseph S. Myers
On Tue, 23 Aug 2011, DJ Delorie wrote:

> There are some gcc flags that are not legal g++ flags, though...

True, but not relevant to this patch, since every flag currently in 
$(CFLAGS-$@) is valid for both (and I'm working on a follow-up patch 
moving more flags into $(CFLAGS-$@), that will be valid and required for 
both C and C++ compilations).

-- 
Joseph S. Myers
jos...@codesourcery.com


[lra] Rewriting caller saves subpass.

2011-08-23 Thread Vladimir Makarov
   After some experiments I found that the current LRA caller saves subpass
gives practically no improvement in generated code over the subpass
implemented in this patch.  The current subpass was based on an iterated
global forward/backward solution of the problem of placing save/restore
code in the program, excluding placement on CFG edges.  I could offer
several hypotheses for this, but imho the most probable explanation is
that live range splitting (which can be seen as save/restore code around
calls in some cases) is already done at the most important program points
(loop borders).


  The proposed subpass is EBB based.  It speeds up LRA because the code
is much simpler and faster and because there is no need to call DF
analysis after the subpass (LRA itself can easily update DF-info).


  The patch also speeds LRA up by removing one lra-lives.c subpass run
in some cases.


  The patch was successfully tested on bootstrap of x86_64 and ppc64 
and improved GCC compile speed (in release mode) by about 1%.


2011-08-23  Vladimir Makarov 

* lra-int.h (lra_debug_save_data): Remove.
(lra_save_restore): Change the prototype.

* lra.c (lra): Don't call df_analyze after lra_save_restore.
Don't call lra_create_live_ranges before the loop. Call
lra_create_live_ranges before lra_spill if necessary.

* lra-saves.c: Rewrite.



caller-saves.patch.gz
Description: GNU Zip compressed data


Re: Include $(CFLAGS-$@) in ALL_CXXFLAGS

2011-08-23 Thread DJ Delorie

Hmm... ok, I'm just a tad paranoid about using the name "CFLAGS" for
g++, someone's bound to do the stupid thing eventually.


[google] With FDO/LIPO inline some cold callsites

2011-08-23 Thread Mark Heffernan
The following patch changes the inliner callsite filter with FDO/LIPO.
 Previously, cold callsites were unconditionally rejected.  Now the
callsite may still be inlined if the _caller_ is sufficiently hot (max
count of any bb in the function is above hot threshold).  This gives
about 0.5 - 1% geomean performance on x86-64 (depending on microarch)
on internal benchmarks with < 1% average code size increase.
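
A minimal sketch of the filter described above, in C.  The body of edge_hot_enough_p is not shown in this message, so the toy_* functions below are a guess at the stated rule, with the hot threshold passed in as a plain parameter (an assumption).  The scaling helper follows the arithmetic in the cgraph_clone_node hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t toy_count;

/* Largest execution count of any basic block in a function.  */
toy_count
toy_max_bb_count (const toy_count *bb_counts, size_t n)
{
  toy_count max = 0;
  for (size_t i = 0; i < n; i++)
    if (bb_counts[i] > max)
      max = bb_counts[i];
  return max;
}

/* A callsite is a candidate if it is hot itself, or if it is cold but
   its caller contains at least one block above the hot threshold.  */
bool
toy_edge_hot_enough_p (toy_count edge_count, toy_count caller_max_bb_count,
                       toy_count hot_threshold)
{
  assert (hot_threshold > 0);
  if (edge_count > hot_threshold)
    return true;
  return caller_max_bb_count > hot_threshold;
}

/* Scale the hottest-block count of a clone in proportion to the share
   of the original's executions the clone takes over.  */
toy_count
toy_scale_max_bb_count (toy_count clone_count, toy_count orig_count,
                        toy_count orig_max_bb_count)
{
  if (orig_count == 0)
    return clone_count;
  return clone_count * orig_max_bb_count / orig_count;
}
```

For a caller whose hottest block ran 5000 times, a callsite executed only twice still passes with a threshold of 1000, while the same callsite in a caller whose hottest block ran 10 times is rejected.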

Bootstrapped and reg tested.  Ok for google/gcc-4_6?

Mark

2011-08-23  Mark Heffernan  

* basic-block.h (maybe_hot_frequency_p): Add prototype.
* cgraph.c (dump_cgraph_node): Add field to dump.
(cgraph_clone_node): Handle new field.
* cgraph.h (cgraph_node): New field max_bb_count.
* cgraphbuild.c (rebuild_cgraph_edges): Compute max_bb_count.
* cgraphunit.c (cgraph_copy_node_for_versioning): Handle new field.
* common.opt (finline-hot-caller): New option.
* ipa-inline.c (cgraph_mark_inline_edge): Update max_bb_count.
(edge_hot_enough_p): New function.
(cgraph_decide_inlining_of_small_functions): Call edge_hot_enough_p.
* predict.c (maybe_hot_frequency_p): Remove static keyword and
guard with profile_info check.
* testsuite/gcc.dg/tree-prof/inliner-1.c: Add flag.
* testsuite/gcc.dg/tree-prof/lipo/inliner-1_0.c: Add flag.
Index: cgraphbuild.c
===
--- cgraphbuild.c	(revision 177964)
+++ cgraphbuild.c	(working copy)
@@ -591,9 +591,12 @@ rebuild_cgraph_edges (void)
   ipa_remove_all_references (&node->ref_list);
 
   node->count = ENTRY_BLOCK_PTR->count;
+  node->max_bb_count = 0;
 
   FOR_EACH_BB (bb)
 {
+  if (bb->count > node->max_bb_count)
+	node->max_bb_count = bb->count;
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
 	{
 	  gimple stmt = gsi_stmt (gsi);
Index: cgraph.c
===
--- cgraph.c	(revision 177964)
+++ cgraph.c	(working copy)
@@ -1904,6 +1904,9 @@ dump_cgraph_node (FILE *f, struct cgraph
   if (node->count)
 fprintf (f, " executed "HOST_WIDEST_INT_PRINT_DEC"x",
 	 (HOST_WIDEST_INT)node->count);
+  if (node->max_bb_count)
+fprintf (f, " hottest bb executed "HOST_WIDEST_INT_PRINT_DEC"x",
+	 (HOST_WIDEST_INT)node->max_bb_count);
   if (node->local.inline_summary.self_time)
 fprintf (f, " %i time, %i benefit", node->local.inline_summary.self_time,
 	node->local.inline_summary.time_inlining_benefit);
@@ -2234,6 +2237,9 @@ cgraph_clone_node (struct cgraph_node *n
   new_node->global = n->global;
   new_node->rtl = n->rtl;
   new_node->count = count;
+  new_node->max_bb_count = count;
+  if (n->count)
+new_node->max_bb_count = count * n->max_bb_count / n->count;
   new_node->is_versioned_clone = n->is_versioned_clone;
   new_node->frequency = n->frequency;
   new_node->clone = n->clone;
@@ -2252,6 +2258,9 @@ cgraph_clone_node (struct cgraph_node *n
   n->count -= count;
   if (n->count < 0)
 	n->count = 0;
+  n->max_bb_count -= new_node->max_bb_count;
+  if (n->max_bb_count < 0)
+	n->max_bb_count = 0;
 }
 
   FOR_EACH_VEC_ELT (cgraph_edge_p, redirect_callers, i, e)
Index: cgraph.h
===
--- cgraph.h	(revision 177964)
+++ cgraph.h	(working copy)
@@ -235,6 +235,8 @@ struct GTY((chain_next ("%h.next"), chai
 
   /* Expected number of executions: calculated in profile.c.  */
   gcov_type count;
+  /* Maximum count of any basic block in the function.  */
+  gcov_type max_bb_count;
   /* How to scale counts at materialization time; used to merge
  LTO units with different number of profile runs.  */
   int count_materialization_scale;
Index: cgraphunit.c
===
--- cgraphunit.c	(revision 177964)
+++ cgraphunit.c	(working copy)
@@ -2187,6 +2187,7 @@ cgraph_copy_node_for_versioning (struct 
new_version->rtl = old_version->rtl;
new_version->reachable = true;
new_version->count = old_version->count;
+   new_version->max_bb_count = old_version->max_bb_count;
new_version->is_versioned_clone = true;
 
for (e = old_version->callees; e; e=e->next_callee)
Index: testsuite/gcc.dg/tree-prof/inliner-1.c
===
--- testsuite/gcc.dg/tree-prof/inliner-1.c	(revision 177964)
+++ testsuite/gcc.dg/tree-prof/inliner-1.c	(working copy)
@@ -1,4 +1,4 @@
-/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-options "-O2 -fno-inline-hot-caller -fdump-tree-optimized" } */
 int a;
 int b[100];
 void abort (void);
@@ -34,7 +34,7 @@ main ()
   return 0;
 }
 
-/* cold function should be inlined, while hot function should not.  
+/* cold function should be not inlined, while hot function should be.
Look for "cold_function () [tail call];" call statement not for the
declaration or other apperances of the stri

[PATCH, i386]: Introduce Yp register constraint and merge *_lea add and ashift patterns with base

2011-08-23 Thread Uros Bizjak
Hello!

Attached patch introduces Yp register constraint, conditionalized on
TARGET_PARTIAL_REG_STALL.  Using this constraint, several *_lea
patterns can be merged with their base patterns, resulting in many
removed lines of code.

No functional changes otherwise.
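
The effect of a constraint conditionalized on a tuning flag can be modeled in plain C.  This is only a sketch under assumptions: the enum and function names are hypothetical, and the real definition lives in config/i386/constraints.md, which is not shown in this message.  The mapping below is inferred from the merged patterns, where the Yp alternatives replace the ones that previously required !TARGET_PARTIAL_REG_STALL.

```c
#include <stdbool.h>

/* Hypothetical register classes; GCC's real enum reg_class is
   target-specific.  */
enum reg_class_model { MODEL_NO_REGS, MODEL_GENERAL_REGS };

/* Model of a tuning-conditional register constraint: when partial
   register stalls must be avoided, the constraint matches no register
   at all; otherwise it behaves like a general register constraint.  */
static enum reg_class_model
yp_constraint_class (bool partial_reg_stall)
{
  return partial_reg_stall ? MODEL_NO_REGS : MODEL_GENERAL_REGS;
}

/* An alternative whose operands use the constraint is usable only if
   the constraint names a non-empty register class.  */
static bool
lea_alternative_usable_p (bool partial_reg_stall)
{
  return yp_constraint_class (partial_reg_stall) != MODEL_NO_REGS;
}
```

Under this model one pattern can carry both sets of alternatives, and the tuning flag decides which of them can ever match.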

2011-08-23  Uros Bizjak  

* config/i386/constraints.md (Yp): New register constraint.
* config/i386/i386.md (*addhi_1): Merge with *addhi_1_lea using
Yp register constraint.
(*addqi_1): Merge with *addqi_1_lea using Yp register constraint.
(*ashlhi3_1): Merge with *ashlhi3_1_lea using Yp register constraint.
(*ashlqi3_1): Merge with *ashlqi3_1_lea using Yp register constraint.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32}, committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 178001)
+++ i386.md (working copy)
@@ -5650,52 +5650,14 @@
(set_attr "mode" "SI")])
 
 (define_insn "*addhi_1"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r")
-   (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0")
-(match_operand:HI 2 "general_operand" "rn,rm")))
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=rm,r,r,Yp")
+   (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,r,Yp")
+(match_operand:HI 2 "general_operand" "rn,rm,0,ln")))
(clobber (reg:CC FLAGS_REG))]
-  "TARGET_PARTIAL_REG_STALL
-   && ix86_binary_operator_ok (PLUS, HImode, operands)"
+  "ix86_binary_operator_ok (PLUS, HImode, operands)"
 {
   switch (get_attr_type (insn))
 {
-case TYPE_INCDEC:
-  if (operands[2] == const1_rtx)
-   return "inc{w}\t%0";
-  else
-{
- gcc_assert (operands[2] == constm1_rtx);
- return "dec{w}\t%0";
-   }
-
-default:
-  if (x86_maybe_negate_const_int (&operands[2], HImode))
-   return "sub{w}\t{%2, %0|%0, %2}";
-
-  return "add{w}\t{%2, %0|%0, %2}";
-}
-}
-  [(set (attr "type")
- (if_then_else (match_operand:HI 2 "incdec_operand" "")
-   (const_string "incdec")
-   (const_string "alu")))
-   (set (attr "length_immediate")
-  (if_then_else
-   (and (eq_attr "type" "alu") (match_operand 2 "const128_operand" ""))
-   (const_string "1")
-   (const_string "*")))
-   (set_attr "mode" "HI")])
-
-(define_insn "*addhi_1_lea"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,rm,r,r")
-   (plus:HI (match_operand:HI 1 "nonimmediate_operand" "%0,0,r,r")
-(match_operand:HI 2 "general_operand" "rmn,rn,0,ln")))
-   (clobber (reg:CC FLAGS_REG))]
-  "!TARGET_PARTIAL_REG_STALL
-   && ix86_binary_operator_ok (PLUS, HImode, operands)"
-{
-  switch (get_attr_type (insn))
-{
 case TYPE_LEA:
   return "#";
 
@@ -5739,63 +5701,16 @@
(const_string "*")))
(set_attr "mode" "HI,HI,HI,SI")])
 
-;; %%% Potential partial reg stall on alternative 2.  What to do?
+;; %%% Potential partial reg stall on alternatives 3 and 4.  What to do?
 (define_insn "*addqi_1"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,r")
-   (plus:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,0")
-(match_operand:QI 2 "general_operand" "qn,qmn,rn")))
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=qm,q,q,r,r,Yp")
+   (plus:QI (match_operand:QI 1 "nonimmediate_operand" "%0,0,q,0,r,Yp")
+(match_operand:QI 2 "general_operand" "qn,qm,0,rn,0,ln")))
(clobber (reg:CC FLAGS_REG))]
-  "TARGET_PARTIAL_REG_STALL
-   && ix86_binary_operator_ok (PLUS, QImode, operands)"
+  "ix86_binary_operator_ok (PLUS, QImode, operands)"
 {
-  int widen = (which_alternative == 2);
-  switch (get_attr_type (insn))
-{
-case TYPE_INCDEC:
-  if (operands[2] == const1_rtx)
-   return widen ? "inc{l}\t%k0" : "inc{b}\t%0";
-  else
-   {
- gcc_assert (operands[2] == constm1_rtx);
- return widen ? "dec{l}\t%k0" : "dec{b}\t%0";
-   }
+  bool widen = (which_alternative == 3 || which_alternative == 4);
 
-default:
-  if (x86_maybe_negate_const_int (&operands[2], QImode))
-   {
- if (widen)
-   return "sub{l}\t{%2, %k0|%k0, %2}";
- else
-   return "sub{b}\t{%2, %0|%0, %2}";
-   }
-  if (widen)
-return "add{l}\t{%k2, %k0|%k0, %k2}";
-  else
-return "add{b}\t{%2, %0|%0, %2}";
-}
-}
-  [(set (attr "type")
- (if_then_else (match_operand:QI 2 "incdec_operand" "")
-   (const_string "incdec")
-   (const_string "alu")))
-   (set (attr "length_immediate")
-  (if_then_else
-   (and (eq_attr "type" "alu") (match_operand 2 "const128_operand" ""))
-   (const_string "1")
-   (const_string "*")))
-   (set_attr "mode" "QI,QI,SI")])
-
-;; %%% Potential partial reg stall on alternatives 3 and 4.  What to do?
-(define_insn "*addqi_1_lea"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=q,q

Re: [google] With FDO/LIPO inline some cold callsites

2011-08-23 Thread Xinliang David Li
The patch makes sense as inlining cold callsites can sharpen analysis
in the hot caller leading to more aggressive optimization.

Ok for google branch.

Thanks,

David

On Tue, Aug 23, 2011 at 12:56 PM, Mark Heffernan  wrote:
> The following patch changes the inliner callsite filter with FDO/LIPO.
>  Previously, cold callsites were unconditionally rejected.  Now the
> callsite may still be inlined if the _caller_ is sufficiently hot (max
> count of any bb in the function is above hot threshold).  This gives
> about 0.5 - 1% geomean performance on x86-64 (depending on microarch)
> on internal benchmarks with < 1% average code size increase.
>
> Bootstrapped and reg tested.  Ok for google/gcc-4_6?
>
> Mark
>
> 2011-08-23  Mark Heffernan  
>
>        * basic-block.h (maybe_hot_frequency_p): Add prototype.
>        * cgraph.c (dump_cgraph_node): Add field to dump.
>        (cgraph_clone_node) Handle new field.
>        * cgraph.h (cgraph_node): New field max_bb_count.
>        * cgraphbuild.c (rebuild_cgraph_edges): Compute max_bb_count.
>        * cgraphunit.c (cgraph_copy_node_for_versioning) Handle new field.
>        * common.opt (finline-hot-caller): New option.
>        * ipa-inline.c (cgraph_mark_inline_edge) Update max_bb_count.
>        (edge_hot_enough_p) New function.
>        (cgraph_decide_inlining_of_small_functions) Call edge_hot_enough_p.
>        * predict.c (maybe_hot_frequency_p): Remove static keyword and
>        guard with profile_info check.
>        * testsuite/gcc.dg/tree-prof/inliner-1.c: Add flag.
>        * testsuite/gcc.dg/tree-prof/lipo/inliner-1_0.c: Add flag.
>


Re: [PATCH, i386]: Introduce Yp register constraint and merge *_lea add and ashift patterns with base

2011-08-23 Thread Richard Henderson
On 08/23/2011 12:59 PM, Uros Bizjak wrote:
>   * config/i386/constraints.md (Yp): New register constraint.
>   * config/i386/i386.md (*addhi_1): Merge with *addhi_1_lea using
>   Yp register constraint.
>   (*addqi_1): Merge with *addqi_1_lea using Yp register constraint.
>   (*ashlhi3_1): Merge with *ashlhi3_1_lea using Yp register constraint.
>   (*ashlqi3_1): Merge with *ashlqi3_1_lea using Yp register constraint.

You can still make use of attribute enabled...


r~


Re: [PATCH, i386]: Introduce Yp register constraint and merge *_lea add and ashift patterns with base

2011-08-23 Thread Uros Bizjak
On Tue, Aug 23, 2011 at 10:04 PM, Richard Henderson  wrote:

>>       * config/i386/constraints.md (Yp): New register constraint.
>>       * config/i386/i386.md (*addhi_1): Merge with *addhi_1_lea using
>>       Yp register constraint.
>>       (*addqi_1): Merge with *addqi_1_lea using Yp register constraint.
>>       (*ashlhi3_1): Merge with *ashlhi3_1_lea using Yp register constraint.
>>       (*ashlqi3_1): Merge with *ashlqi3_1_lea using Yp register constraint.
>
> You can still make use of attribute enabled...

Yes, I am aware of this attribute, but OTOH, I propose that we use it
for ISA changes and don't mix everything together. I was thinking of
moving Y2, Y3 and Y4 out of register constraints to "enabled"
attribute, but they clashed with TARGET_AVX somehow.

So, at the end of the day, it was much easier to leave pre-AVX
register constraints as they were, while post-AVX and AVX-independent
ISAs should be handled via "enabled" attribute.

Uros.


Re: [PATCH, i386]: Introduce Yp register constraint and merge *_lea add and ashift patterns with base

2011-08-23 Thread Jakub Jelinek
On Tue, Aug 23, 2011 at 10:23:54PM +0200, Uros Bizjak wrote:
> On Tue, Aug 23, 2011 at 10:04 PM, Richard Henderson  wrote:
> 
> >>       * config/i386/constraints.md (Yp): New register constraint.
> >>       * config/i386/i386.md (*addhi_1): Merge with *addhi_1_lea using
> >>       Yp register constraint.
> >>       (*addqi_1): Merge with *addqi_1_lea using Yp register constraint.
> >>       (*ashlhi3_1): Merge with *ashlhi3_1_lea using Yp register constraint.
> >>       (*ashlqi3_1): Merge with *ashlqi3_1_lea using Yp register constraint.
> >
> > You can still make use of attribute enabled...
> 
> Yes, I am aware of this attribute, but OTOH, I propose that we use it
> for ISA changes and don't mix everything together. I was thinking of
> moving Y2, Y3 and Y4 out of register constraints to "enabled"
> attribute, but they clashed with TARGET_AVX somehow.

The advantage of the enabled attribute over new special constraints is IMHO
that constraints are exposed to users, who can then use them in inline
assembly, whereas the enabled attribute is internal to the machine description.
You could mix two or three attributes to compute enabled etc. if needed.

Jakub


Re: [PATCH, i386]: Introduce Yp register constraint and merge *_lea add and ashift patterns with base

2011-08-23 Thread Richard Henderson
On 08/23/2011 01:23 PM, Uros Bizjak wrote:
> Yes, I am aware of this attribute, but OTOH, I propose that we use it
> for ISA changes and don't mix everything together.

I think that's more confusing than not.  We can use other
sub-attributes than ISA if that makes it easier for you,
but I do think that these artificial ISA restrictions do
have the quality of a real restricted ISA like coldfire.


r~


Re: [RFC] Cleanup DW_CFA_GNU_args_size handling

2011-08-23 Thread Richard Henderson
On 08/21/2011 02:51 PM, Eric Botcazou wrote:
>> I'm afraid this patch causes i386 bootstraps to fail:
>>
>>   Comparing stages 2 and 3
>>   warning: gcc/cc1-checksum.o differs
>>   warning: gcc/cc1plus-checksum.o differs
>>   warning: gcc/cc1obj-checksum.o differs
>>   Bootstrap comparison failure!
>>   libiberty/pic/cplus-dem.o differs
>>   libiberty/pic/crc32.o differs
> 
> Probably stating the obvious, but the outcome is the same for i586.
> 

I've been trying for 2 days to replicate this with various
configurations and none have failed.


r~


[patch, hpux, committed] Fix libstdc++/50153, hppa*-hp-hpux11* build problem

2011-08-23 Thread Steve Ellcey

The recent change for libstdc++/1773 included a change that defined
__cplusplus to be 199711L instead of 1.  That caused a problem with the
integer abs() definition on hppa*-hp-hpux11*.  It did not cause a
problem on ia64-hp-hpux11* because of a fixincludes rule (hpux11_abs)
that was applied to ia64-hp-hpux11*.  This patch extends that rule
to the hppa platform by changing 'ia64-hp-hpux11*' to '*-hp-hpux11*'.

I tested the fix on hppa2.0w-hp-hpux11.11 and since this only affects
HP-UX platforms I will go ahead and check it in.  Since I am changing
the 'mach' entry for a fix and not adding a new fix no new test is
needed and I verified that making check-fixincludes still worked.
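
To illustrate why the wider pattern picks up the hppa target, here is a small C sketch using POSIX fnmatch().  It assumes the 'mach' entry is treated as a shell-style glob matched against the target triplet; mach_matches_p is a hypothetical helper, not a fixincludes function.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Return true if the target triplet MACHINE matches the shell-style
   glob PATTERN.  fnmatch() returns 0 on a match.  */
static bool
mach_matches_p (const char *pattern, const char *machine)
{
  return fnmatch (pattern, machine, 0) == 0;
}
```

With this helper, "hppa2.0w-hp-hpux11.11" does not match "ia64-hp-hpux11*" but does match "*-hp-hpux11*", while a non-HP-UX triplet such as "x86_64-pc-linux-gnu" matches neither.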

Steve Ellcey
s...@cup.hp.com


2011-08-23  Steve Ellcey  

PR libstdc++/50153
* inclhack.def (hpux11_abs): Extend to all hpux machines.
* fixincl.x: Regenerate.


Index: inclhack.def
===
--- inclhack.def	(revision 177990)
+++ inclhack.def	(working copy)
@@ -1862,7 +1862,7 @@ fix = {
  */
 fix = {
 hackname  = hpux11_abs;
-mach  = "ia64-hp-hpux11*";
+mach  = "*-hp-hpux11*";
 files = stdlib.h;
 select= "ifndef _MATH_INCLUDED";
 c_fix = format;


Re: [PATCH, i386]: Introduce Yp register constraint and merge *_lea add and ashift patterns with base

2011-08-23 Thread Bernd Schmidt
On 08/23/11 22:23, Uros Bizjak wrote:
> On Tue, Aug 23, 2011 at 10:04 PM, Richard Henderson  wrote:
> 
>>>   * config/i386/constraints.md (Yp): New register constraint.
>>>   * config/i386/i386.md (*addhi_1): Merge with *addhi_1_lea using
>>>   Yp register constraint.
>>>   (*addqi_1): Merge with *addqi_1_lea using Yp register constraint.
>>>   (*ashlhi3_1): Merge with *ashlhi3_1_lea using Yp register constraint.
>>>   (*ashlqi3_1): Merge with *ashlqi3_1_lea using Yp register constraint.
>>
>> You can still make use of attribute enabled...
> 
> Yes, I am aware of this attribute, but OTOH, I propose that we use it
> for ISA changes and don't mix everything together.

You could have sub-attributes, "isa_enabled" and "tuning_enabled", which
are then anded together.
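
A minimal C model of that suggestion, purely as a sketch: the struct and function names are hypothetical, and in a real machine description these would be insn attributes rather than C fields.

```c
#include <stdbool.h>

/* Hypothetical per-alternative flags mirroring the two proposed
   sub-attributes.  */
struct alternative_model
{
  bool isa_enabled;      /* the required instruction set is available */
  bool tuning_enabled;   /* the current tuning allows this alternative */
};

/* An alternative is enabled only when both sub-attributes allow it.  */
static bool
alternative_enabled_p (const struct alternative_model *alt)
{
  return alt->isa_enabled && alt->tuning_enabled;
}
```

Keeping the two conditions separate means ISA availability and tuning preferences can each be reasoned about on their own before being combined.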


Bernd



[Patch, fortran] Fix PR fortran/50050 breakage: ICE on valid with null pointer initialization

2011-08-23 Thread Mikael Morin
Hello,

this is an attempt to fix my recent breakage for PR50050.
I forgot that shape can't always be known, and thus, that for some 
expressions, the shape field is a NULL pointer.

This patch adds an early return in gfc_free_shape in case shape is NULL.
Some external NULL shape checks then become redundant and can be removed.
I added asserts in the cases where there was no check before, so that the code
is strictly equivalent.
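
The shape of the fix can be sketched in self-contained C.  This is a simplified model, not the gfortran code: free_shape_model is a hypothetical name, the array holds plain long values instead of mpz_t, and the zeroing loop merely stands in for the per-element clearing done by gfc_clear_shape.

```c
#include <stdlib.h>

/* Free the array *SHAPE of RANK elements and reset the pointer.
   Doing nothing for a NULL shape lets callers drop their own checks.  */
static void
free_shape_model (long **shape, int rank)
{
  if (*shape == NULL)
    return;

  /* Stand-in for clearing each element; without the early return this
     loop would dereference a NULL pointer.  */
  for (int i = 0; i < rank; i++)
    (*shape)[i] = 0;

  free (*shape);
  *shape = NULL;
}
```

Because the pointer is reset to NULL, calling the function a second time on the same variable is harmless.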

Neither bootstrapped nor regression tested yet, but it is in progress. My machine
does its best (which is not a lot) to have this properly compiled and tested
(and then committed) as soon as possible.
Otherwise OK for 4.{4..7}?

Mikael

PS: Sorry for the breakage, and thanks to Andrew Benson for the early report
(with a reduced testcase!). I was about to break the 4.5 branch as well
before I saw it.
2011-08-22  Mikael Morin  

PR fortran/50050
* expr.c (gfc_free_shape): Do nothing if shape is NULL.
(free_expr0): Remove redundant NULL shape check.
* resolve.c (check_host_association): Ditto.
* trans-expr.c (gfc_trans_subarray_assign): Assert that shape is
non-NULL.
* trans-io.c (transfer_array_component): Ditto.

2011-08-22  Mikael Morin  

* gfortran.dg/pointer_comp_init_1.f90: New test.
Index: trans-expr.c
===
--- trans-expr.c	(revision 177956)
+++ trans-expr.c	(working copy)
@@ -4411,6 +4411,7 @@ gfc_trans_subarray_assign (tree dest, gfc_componen
   gfc_add_block_to_block (&block, &loop.pre);
   gfc_add_block_to_block (&block, &loop.post);
 
+  gcc_assert (lss->shape != NULL);
   gfc_free_shape (&lss->shape, cm->as->rank);
   gfc_cleanup_loop (&loop);
 
Index: expr.c
===
--- expr.c	(revision 177956)
+++ expr.c	(working copy)
@@ -409,6 +409,9 @@ gfc_clear_shape (mpz_t *shape, int rank)
 void
 gfc_free_shape (mpz_t **shape, int rank)
 {
+  if (*shape == NULL)
+return;
+
   gfc_clear_shape (*shape, rank);
   free (*shape);
   *shape = NULL;
@@ -490,8 +493,7 @@ free_expr0 (gfc_expr *e)
 }
 
   /* Free a shape array.  */
-  if (e->shape != NULL)
-gfc_free_shape (&e->shape, e->rank);
+  gfc_free_shape (&e->shape, e->rank);
 
   gfc_free_ref_list (e->ref);
 
Index: resolve.c
===
--- resolve.c	(revision 177956)
+++ resolve.c	(working copy)
@@ -5198,8 +5198,7 @@ check_host_association (gfc_expr *e)
 	  && sym->attr.contained)
 	{
 	  /* Clear the shape, since it might not be valid.  */
-	  if (e->shape != NULL)
-	gfc_free_shape (&e->shape, e->rank);
+	  gfc_free_shape (&e->shape, e->rank);
 
 	  /* Give the expression the right symtree!  */
 	  gfc_find_sym_tree (e->symtree->name, NULL, 1, &st);
Index: trans-io.c
===
--- trans-io.c	(revision 177956)
+++ trans-io.c	(working copy)
@@ -1999,6 +1999,7 @@ transfer_array_component (tree expr, gfc_component
   gfc_add_block_to_block (&block, &loop.pre);
   gfc_add_block_to_block (&block, &loop.post);
 
+  gcc_assert (ss->shape != NULL);
   gfc_free_shape (&ss->shape, cm->as->rank);
   gfc_cleanup_loop (&loop);
 
! { dg-do compile }
!
! PR fortran/50050
! ICE whilst trying to access NULL shape.

! Reduced from the FoX library http://www1.gly.bris.ac.uk/~walker/FoX/
! Contributed by Andrew Benson 

module m_common_attrs
  implicit none

  type dict_item
  end type dict_item

  type dict_item_ptr
 type(dict_item), pointer :: d => null()
  end type dict_item_ptr

contains

  subroutine add_item_to_dict()
type(dict_item_ptr), pointer :: tempList(:)
integer :: n

allocate(tempList(0:n+1)) 
  end subroutine add_item_to_dict

end module m_common_attrs

! { dg-final { cleanup-modules "m_common_attrs" } }


Reduce duplication of compilation commands

2011-08-23 Thread Joseph S. Myers
This patch, relative to a tree with my patch
 applied,
reduces the number of explicit compilation rules in gcc/ and
subdirectory makefiles by using CFLAGS-$@ settings when a target needs
extra compiler options.

Not fixed are rules for building driver files (involving SHLIB_LINK)
or those where the simple relation between .c and .o file names does
not apply (many of which could be fixed by moving the .o files in the
object tree to locations directly corresponding to the source files).
One linking rule that used $(COMPILER) instead of $(LINKER) for no
apparent reason was changed to use $(LINKER) instead of masquerading
as a compilation rule.

This is inspired by some of the cleanups in Tom's automatic dependency
generation patch that had to be reverted in 2008, the idea being that
such cleanups are of value on their own even without automatic
dependency generation and would also facilitate any future attempt at
automatic dependency generation.

Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  OK to
commit (the first patch this is relative to and this one)?

2011-08-23  Joseph Myers  

* Makefile.in (CFLAGS-collect2.o, CFLAGS-c-family/c-opts.o)
(CFLAGS-c-family/c-pch.o, CFLAGS-prefix.o, CFLAGS-version.o)
(CFLAGS-lto-compress.o, CFLAGS-toplev.o, CFLAGS-intl.o)
(CFLAGS-cppbuiltin.o, CFLAGS-cppdefault.o): New.
(collect2.o, c-family/c-cppbuiltin.o, c-family/c-opts.o)
(c-family/c-pch.o, prefix.o, version.o, lto-compress.o, toplev.o)
(intl.o, cppbuiltin.o, cppdefault.o): Remove explicit compilation
rules.
(lto-wrapper$(exeext)): Use $(LINKER) not $(COMPILER).

ada/gcc-interface:
2011-08-23  Joseph Myers  

* Make-lang.in (CFLAGS-ada/tracebak.o, CFLAGS-ada/targext.o)
(CFLAGS-ada/cio.o, CFLAGS-ada/init.o, CFLAGS-ada/initialize.o)
(CFLAGS-ada/raise.o): New.
(ada/tracebak.o, ada/targext.o, ada/cio.o, ada/init.o)
(ada/initialize.o, ada/raise.o): Remove explicit compilation rules.

fortran:
2011-08-23  Joseph Myers  

* Make-lang.in (fortran/cpp.o): Remove explicit compilation rule.

go:
2011-08-23  Joseph Myers  

* Make-lang.in (CFLAGS-go/go-lang.o): New.
(go/go-lang.o): Remove explicit compilation rule.

java:
2011-08-23  Joseph Myers  

* Make-lang.in (CFLAGS-java/jcf-io.o, CFLAGS-java/jcf-path.o):
New.
(java/jcf-io.o, java/jcf-path.o): Remove explicit compilation
rules.

diff -rupN --exclude=.svn gcc-mainline-0/gcc/Makefile.in gcc-mainline/gcc/Makefile.in
--- gcc-mainline-0/gcc/Makefile.in  2011-08-23 12:10:52.818130203 -0700
+++ gcc-mainline/gcc/Makefile.in  2011-08-23 12:38:13.23810 -0700
@@ -2052,12 +2052,11 @@ collect2$(exeext): $(COLLECT2_OBJS) $(LI
$(COLLECT2_OBJS) $(LIBS) $(COLLECT2_LIBS)
mv -f T$@ $@
 
+CFLAGS-collect2.o += -DTARGET_MACHINE=\"$(target_noncanonical)\" \
+   @TARGET_SYSTEM_ROOT_DEFINE@
 collect2.o : collect2.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) intl.h \
$(OBSTACK_H) $(DEMANGLE_H) collect2.h collect2-aix.h version.h \
$(DIAGNOSTIC_H)
-   $(COMPILER) $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS)  \
-   -DTARGET_MACHINE=\"$(target_noncanonical)\" \
-   -c $(srcdir)/collect2.c $(OUTPUT_OPTION) @TARGET_SYSTEM_ROOT_DEFINE@
 
 collect2-aix.o : collect2-aix.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
 collect2-aix.h
@@ -2066,7 +2065,7 @@ tlink.o: tlink.c $(DEMANGLE_H) $(HASHTAB
 $(OBSTACK_H) collect2.h intl.h $(DIAGNOSTIC_CORE_H)
 
 lto-wrapper$(exeext): lto-wrapper.o $(LIBDEPS)
-   +$(COMPILER) $(ALL_COMPILERFLAGS) $(LDFLAGS) -o T$@ lto-wrapper.o $(LIBS)
+   +$(LINKER) $(ALL_COMPILERFLAGS) $(LDFLAGS) -o T$@ lto-wrapper.o $(LIBS)
mv -f T$@ $@
 
 lto-wrapper.o: lto-wrapper.c $(CONFIG_H) $(SYSTEM_H) coretypes.h intl.h \
@@ -2088,8 +2087,6 @@ c-family/c-cppbuiltin.o : c-family/c-cpp
coretypes.h $(TM_H) $(TREE_H) version.h $(C_COMMON_H) $(C_PRAGMA_H) \
$(FLAGS_H) output.h $(TREE_H) $(TARGET_H) $(COMMON_TARGET_H) \
$(TM_P_H) debug.h $(CPP_ID_DATA_H) cppbuiltin.h
-   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) \
-   $< $(OUTPUT_OPTION)
 
 c-family/c-dump.o : c-family/c-dump.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TM_H) $(TREE_H) $(TREE_DUMP_H)
@@ -2113,20 +2110,18 @@ c-family/c-lex.o : c-family/c-lex.c $(CO
 c-family/c-omp.o : c-family/c-omp.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
$(TREE_H) $(C_COMMON_H) $(GIMPLE_H) langhooks.h
 
+CFLAGS-c-family/c-opts.o += @TARGET_SYSTEM_ROOT_DEFINE@
 c-family/c-opts.o : c-family/c-opts.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 $(TREE_H) $(C_PRAGMA_H) $(FLAGS_H) toplev.h langhooks.h \
 $(DIAGNOSTIC_H) intl.h debug.h $(C_COMMON_H) $(C_TARGET_H) \
 $(OPTS_H) $(OPTIONS_H) $(MKDEPS_H) incpath.h cppdefault.h
-   $(COMPILER) -c $(ALL_COMPILERFLAGS) $(AL

Re: [cxx-mem-model] Atomic C++ header file changes

2011-08-23 Thread Andrew MacLeod

On 08/18/2011 06:33 PM, Richard Henderson wrote:
> On 08/17/2011 08:39 AM, Andrew MacLeod wrote:
>> !   return __sync_mem_load (const_cast<__int_type *>(&_M_i), __m);
>
> This suggests the builtin is incorrectly defined.
> It ought to be const itself.

Err, right.

This patch declares the function properly and the casts are no longer
needed.
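
The underlying language rule can be shown with a small C sketch.  load_4_model and read_const_object are hypothetical names used only for illustration; this is not the real __sync_mem_load builtin, and the memory-model argument is omitted.

```c
#include <stdint.h>

/* Model of a load routine whose pointer parameter is const volatile,
   so it accepts pointers to const objects directly.  */
static int32_t
load_4_model (const volatile void *ptr)
{
  return *(const volatile int32_t *) ptr;
}

/* A caller holding only a pointer to const needs no cast, because
   adding const and volatile qualifiers is an implicit conversion.  */
static int32_t
read_const_object (const int32_t *obj)
{
  return load_4_model (obj);
}
```

Had the parameter been declared as plain volatile void *, the call in read_const_object would discard the const qualifier and draw a diagnostic, which is what forced the cast in the header.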


Andrew

* builtin-types.def (BT_CONST_VOLATILE_PTR): New primitive type.
(BT_FN_I{1,2,4,8,16}_VPTR_INT): Change prototype to be const.
* sync-builtins.def (BUILT_IN_SYNC_MEM_LOAD_*): Change to be const.

* fortran/types.def (BUILT_IN_SYNC_MEM_LOAD_*): Change to be const.

Index: builtin-types.def
===
*** builtin-types.def   (revision 177737)
--- builtin-types.def   (working copy)
*** DEF_PRIMITIVE_TYPE (BT_VOLATILE_PTR,
*** 95,100 
--- 95,104 
build_pointer_type
 (build_qualified_type (void_type_node,
TYPE_QUAL_VOLATILE)))
+ DEF_PRIMITIVE_TYPE (BT_CONST_VOLATILE_PTR,
+   build_pointer_type
+(build_qualified_type (void_type_node,
+ TYPE_QUAL_VOLATILE|TYPE_QUAL_CONST)))
  DEF_PRIMITIVE_TYPE (BT_PTRMODE, (*lang_hooks.types.type_for_mode)(ptr_mode, 0))
  DEF_PRIMITIVE_TYPE (BT_INT_PTR, integer_ptr_type_node)
  DEF_PRIMITIVE_TYPE (BT_FLOAT_PTR, float_ptr_type_node)
*** DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_LONGPTR_
*** 315,325 
 BT_BOOL, BT_PTR_LONG, BT_PTR_LONG)
  DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_ULONGLONGPTR_ULONGLONGPTR,
 BT_BOOL, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG)
! DEF_FUNCTION_TYPE_2 (BT_FN_I1_VPTR_INT, BT_I1, BT_VOLATILE_PTR, BT_INT)
! DEF_FUNCTION_TYPE_2 (BT_FN_I2_VPTR_INT, BT_I2, BT_VOLATILE_PTR, BT_INT)
! DEF_FUNCTION_TYPE_2 (BT_FN_I4_VPTR_INT, BT_I4, BT_VOLATILE_PTR, BT_INT)
! DEF_FUNCTION_TYPE_2 (BT_FN_I8_VPTR_INT, BT_I8, BT_VOLATILE_PTR, BT_INT)
! DEF_FUNCTION_TYPE_2 (BT_FN_I16_VPTR_INT, BT_I16, BT_VOLATILE_PTR, BT_INT)
  DEF_FUNCTION_TYPE_2 (BT_FN_VOID_VPTR_INT, BT_VOID, BT_VOLATILE_PTR, BT_INT)
  DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_VPTR_INT, BT_BOOL, BT_VOLATILE_PTR, BT_INT)
  
--- 319,334 
 BT_BOOL, BT_PTR_LONG, BT_PTR_LONG)
  DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_ULONGLONGPTR_ULONGLONGPTR,
 BT_BOOL, BT_PTR_ULONGLONG, BT_PTR_ULONGLONG)
! DEF_FUNCTION_TYPE_2 (BT_FN_I1_CONST_VPTR_INT, BT_I1, BT_CONST_VOLATILE_PTR,
!BT_INT)
! DEF_FUNCTION_TYPE_2 (BT_FN_I2_CONST_VPTR_INT, BT_I2, BT_CONST_VOLATILE_PTR,
!BT_INT)
! DEF_FUNCTION_TYPE_2 (BT_FN_I4_CONST_VPTR_INT, BT_I4, BT_CONST_VOLATILE_PTR,
!BT_INT)
! DEF_FUNCTION_TYPE_2 (BT_FN_I8_CONST_VPTR_INT, BT_I8, BT_CONST_VOLATILE_PTR,
!BT_INT)
! DEF_FUNCTION_TYPE_2 (BT_FN_I16_CONST_VPTR_INT, BT_I16, BT_CONST_VOLATILE_PTR,
!BT_INT)
  DEF_FUNCTION_TYPE_2 (BT_FN_VOID_VPTR_INT, BT_VOID, BT_VOLATILE_PTR, BT_INT)
  DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_VPTR_INT, BT_BOOL, BT_VOLATILE_PTR, BT_INT)
  
Index: sync-builtins.def
===
*** sync-builtins.def   (revision 177737)
--- sync-builtins.def   (working copy)
*** DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD
*** 283,301 
  BT_FN_VOID_VAR, ATTR_NOTHROW_LEAF_LIST)
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_1,
  "__sync_mem_load_1",
! BT_FN_I1_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_2,
  "__sync_mem_load_2",
! BT_FN_I2_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_4,
  "__sync_mem_load_4",
! BT_FN_I4_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_8,
  "__sync_mem_load_8",
! BT_FN_I8_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_16,
  "__sync_mem_load_16",
! BT_FN_I16_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
  
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_COMPARE_EXCHANGE_N,
  "__sync_mem_compare_exchange",
--- 283,301 
  BT_FN_VOID_VAR, ATTR_NOTHROW_LEAF_LIST)
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_1,
  "__sync_mem_load_1",
! BT_FN_I1_CONST_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_2,
  "__sync_mem_load_2",
! BT_FN_I2_CONST_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_4,
  "__sync_mem_load_4",
! BT_FN_I4_CONST_VPTR_INT, ATTR_NOTHROW_LEAF_LIST)
  DEF_SYNC_BUILTIN (BUILT_IN_SYNC_MEM_LOAD_8,
  "__sync_mem_load_8",
! BT_FN_I8_CONST_VPTR_INT, ATTR_NOTHROW_
