Re: [Patch, libfortran] PR 49296 List read of file without EOR

2011-07-12 Thread Janne Blomqvist
PING

On Mon, Jul 4, 2011 at 00:57, Janne Blomqvist  wrote:
> Hi,
>
> the attached patch fixes the remaining cases of handling input that
> ends in EOF instead of a normal separator for list formatted read of
> the primitive types. Ok for trunk and 4.6?
>
> 2011-07-04  Janne Blomqvist  
>
>        PR libfortran/49296
>        * io/list_read.c (read_logical): Don't error out if a valid value
>        is followed by EOF instead of a normal separator.
>        (read_integer): Likewise.
>
> testsuite:
>
> 2011-07-04  Janne Blomqvist  
>
>        PR libfortran/49296
>        * gfortran.dg/read_list_eof_1.f90: Add tests for integer, real,
>        and logical reads.
>
>
> --
> Janne Blomqvist
>



-- 
Janne Blomqvist


Re: [Ada] Fix --enable-build-with-cxx build

2011-07-12 Thread Rainer Orth
Eric Botcazou  writes:

> gcc/
>   * prefix.h: Wrap up in extern "C" block.
>
> ada/
>   * adadecode.c: Likewise.

No `Likewise.' in different ChangeLogs :-)

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [dwarf2cfi] Cleanup interpretation of cfa.reg

2011-07-12 Thread Andreas Schwab
Richard Henderson  writes:

> @@ -261,6 +262,15 @@ extern void dwarf2out_set_demangle_name_func (const char *(*) (const char *));
>  extern void dwarf2out_vms_debug_main_pointer (void);
>  #endif
>  
> +/* Unfortunately, DWARF_FRAME_REGNUM is not universally defined in such a
> +   way as to force an unsigned return type.  Do that via inline wrapper.  */
> +
> +static inline unsigned
> +dwarf_frame_regnum (unsigned regnum)
> +{
> +  return DWARF_FRAME_REGNUM (regnum);
> +}
> +  

I think this has caused the bootstrap failure on ia64:

In file included from ../../gcc/dwarf2cfi.c:31:0:
../../gcc/dwarf2out.h: In function 'dwarf_frame_regnum':
../../gcc/dwarf2out.h:271:3: error: implicit declaration of function 'ia64_dbx_register_number' [-Werror=implicit-function-declaration]
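
For readers following along, here is a minimal sketch of the failure mode and of one possible way around it. It is not the actual GCC fix. All names below (demo_dbx_register_number, DEMO_FRAME_REGNUM, demo_frame_regnum) are hypothetical stand-ins: when a target macro expands to a function call, any header that wraps the macro in a static inline function needs that function's prototype to be visible first.

```c
/* Hypothetical stand-in for a target hook such as ia64_dbx_register_number.
   Without this prototype ahead of the wrapper, the wrapper below would
   trigger -Werror=implicit-function-declaration.  */
extern int demo_dbx_register_number (int regno);

/* Hypothetical stand-in for a target's DWARF_FRAME_REGNUM definition.  */
#define DEMO_FRAME_REGNUM(REG) demo_dbx_register_number (REG)

/* Inline wrapper forcing an unsigned return type, shaped like the one in
   the quoted hunk.  */
static inline unsigned
demo_frame_regnum (unsigned regnum)
{
  return DEMO_FRAME_REGNUM (regnum);
}

/* Toy mapping (an assumption for this example only) so the sketch links.  */
int
demo_dbx_register_number (int regno)
{
  return regno + 16;
}
```

Calling demo_frame_regnum (3) goes through the macro to the toy mapping; removing the extern declaration reproduces the kind of diagnostic quoted above.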

Andreas.

-- 
Andreas Schwab, sch...@redhat.com
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84  5EC7 45C6 250E 6F00 984E
"And now for something completely different."


Re: ARM: Clear icache when creating a closure

2011-07-12 Thread Richard Earnshaw
On 11/07/11 17:23, Andrew Haley wrote:
> On a multicore ARM, you really do have to clear both caches, not just the
> dcache.  This bug may exist in other ports too.
> 
> Andrew.
> 
> 
> 2011-07-11  Andrew Haley  
> 
> * src/arm/ffi.c (FFI_INIT_TRAMPOLINE): Clear icache.
> 
> diff --git a/src/arm/ffi.c b/src/arm/ffi.c
> index 885a9cb..b2e7667 100644
> --- a/src/arm/ffi.c
> +++ b/src/arm/ffi.c
> @@ -558,12 +558,16 @@ ffi_closure_free (void *ptr)
>  ({ unsigned char *__tramp = (unsigned char*)(TRAMP);   \
> unsigned int  __fun = (unsigned int)(FUN);  \
> unsigned int  __ctx = (unsigned int)(CTX);  \
> +   unsigned char *insns = (unsigned char *)(CTX);   \
> *(unsigned int*) &__tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
> *(unsigned int*) &__tramp[4] = 0xe59f0000; /* ldr r0, [pc] */   \
> *(unsigned int*) &__tramp[8] = 0xe59ff000; /* ldr pc, [pc] */   \
> *(unsigned int*) &__tramp[12] = __ctx;  \
> *(unsigned int*) &__tramp[16] = __fun;  \
> -   __clear_cache((&__tramp[0]), (&__tramp[19]));   \
> +   __clear_cache((&__tramp[0]), (&__tramp[19])); /* Clear data mapping.  */ \
> +   __clear_cache(insns, insns + 3 * sizeof (unsigned int)); \
> + /* Clear instruction   \
> +mapping.  */\
>   })
> 
>  #endif
> 
> 


Your patch looks sane, but I'll observe here that the poking of
instruction values is wrong on cores that run in BE-8 mode (where
instructions are always little-endian).
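
To make the concern concrete: a plain word store writes each opcode in the program's data byte order, which on a BE-8 core does not match the little-endian instruction stream. A minimal sketch of one way to sidestep that is to emit the opcodes byte by byte. The helper names are hypothetical, this is not what libffi does, and it would be wrong for a BE-32 configuration, where instructions are big-endian too.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit ARM opcode in little-endian byte
   order, whatever the data endianness of the running program is.  */
static void
put_insn_le (unsigned char *p, uint32_t insn)
{
  p[0] = (unsigned char) (insn & 0xff);
  p[1] = (unsigned char) ((insn >> 8) & 0xff);
  p[2] = (unsigned char) ((insn >> 16) & 0xff);
  p[3] = (unsigned char) ((insn >> 24) & 0xff);
}

/* Write the three trampoline instructions named in the quoted macro's
   comments.  The context and function words at offsets 12 and 16 are
   data, so they would still be stored in the native data byte order.  */
static void
write_tramp_insns (unsigned char *tramp)
{
  put_insn_le (tramp + 0, 0xe92d000f);  /* stmfd sp!, {r0-r3} */
  put_insn_le (tramp + 4, 0xe59f0000);  /* ldr r0, [pc] */
  put_insn_le (tramp + 8, 0xe59ff000);  /* ldr pc, [pc] */
}
```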

R.



Re: ARM: Clear icache when creating a closure

2011-07-12 Thread Andrew Haley
On 12/07/11 10:12, Richard Earnshaw wrote:
> On 11/07/11 17:23, Andrew Haley wrote:
>> On a multicore ARM, you really do have to clear both caches, not just the
>> dcache.  This bug may exist in other ports too.
>>
>> Andrew.
>>
>>
>> 2011-07-11  Andrew Haley  
>>
>> * src/arm/ffi.c (FFI_INIT_TRAMPOLINE): Clear icache.
>>
>> diff --git a/src/arm/ffi.c b/src/arm/ffi.c
>> index 885a9cb..b2e7667 100644
>> --- a/src/arm/ffi.c
>> +++ b/src/arm/ffi.c
>> @@ -558,12 +558,16 @@ ffi_closure_free (void *ptr)
>>  ({ unsigned char *__tramp = (unsigned char*)(TRAMP);   \
>> unsigned int  __fun = (unsigned int)(FUN);  \
>> unsigned int  __ctx = (unsigned int)(CTX);  \
>> +   unsigned char *insns = (unsigned char *)(CTX);   \
>> *(unsigned int*) &__tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
>> *(unsigned int*) &__tramp[4] = 0xe59f0000; /* ldr r0, [pc] */   \
>> *(unsigned int*) &__tramp[8] = 0xe59ff000; /* ldr pc, [pc] */   \
>> *(unsigned int*) &__tramp[12] = __ctx;  \
>> *(unsigned int*) &__tramp[16] = __fun;  \
>> -   __clear_cache((&__tramp[0]), (&__tramp[19]));   \
>> +   __clear_cache((&__tramp[0]), (&__tramp[19])); /* Clear data mapping.  */ \
>> +   __clear_cache(insns, insns + 3 * sizeof (unsigned int)); \
>> + /* Clear instruction   \
>> +mapping.  */\
>>   })
>>
>>  #endif
>>
>>
> 
> 
> Your patch looks sane, but I'll observe here that the poking of
> instruction values is wrong on cores that run in BE-8 mode (where
> instructions are always little-endian).

Oh dear.  How would one test for BE-8 mode on a Linux system?

Thanks,
Andrew.



[PATCH] Fix ICE in gen_lsm_tmp_name (PR tree-optimization/49712)

2011-07-12 Thread Jakub Jelinek
Hi!

Now that LIM is scheduled after IVOPTS too, it needs to be prepared to
handle TARGET_MEM_REFs IVOPTS creates.
Tested on x86_64-linux, committed to trunk as obvious.

2011-07-12  Jakub Jelinek  

PR tree-optimization/49712
* tree-ssa-loop-im.c (gen_lsm_tmp_name): Handle TARGET_MEM_REF.

* gcc.c-torture/execute/pr49712.c: New test.

--- gcc/tree-ssa-loop-im.c.jj   2011-05-17 13:32:20.0 +0200
+++ gcc/tree-ssa-loop-im.c  2011-07-12 08:48:22.0 +0200
@@ -1982,6 +1982,7 @@ gen_lsm_tmp_name (tree ref)
   switch (TREE_CODE (ref))
 {
 case MEM_REF:
+case TARGET_MEM_REF:
   gen_lsm_tmp_name (TREE_OPERAND (ref, 0));
   lsm_tmp_name_add ("_");
   break;
--- gcc/testsuite/gcc.c-torture/execute/pr49712.c.jj   2011-07-12 08:52:12.0 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr49712.c   2011-07-12 08:51:44.0 +0200
@@ -0,0 +1,28 @@
+/* PR tree-optimization/49712 */
+
+int a[2], b, c, d, e;
+
+void
+foo (int x, int y)
+{
+}
+
+int
+bar (void)
+{
+  int i;
+  for (; d <= 0; d = 1)
+for (i = 0; i < 4; i++)
+  for (e = 0; e; e = 1)
+   ;
+  return 0;
+}
+
+int
+main ()
+{
+  for (b = 0; b < 2; b++)
+while (c)
+  foo (a[b] = 0, bar ());
+  return 0;
+}

Jakub


Re: [patch tree-optimization]: [2 of 3]: Boolify compares & more

2011-07-12 Thread Richard Guenther
On Mon, Jul 11, 2011 at 5:37 PM, Kai Tietz  wrote:
> 2011/7/8 Richard Guenther :
>> On Fri, Jul 8, 2011 at 1:32 PM, Kai Tietz  wrote:
>>> 2011/7/8 Richard Guenther :
 On Thu, Jul 7, 2011 at 6:07 PM, Kai Tietz  wrote:
> Hello,
>
> This patch - second of series - adds boolification of comparisons in
> the gimplifier.  For this, casts from/to boolean are marked as
> not-useless, and in fold_unary_loc casts to non-boolean integral types
> are preserved.
> The hunk in tree-ssa-forwprop.c in combine_cond_expr_cond is not strictly
> necessary - as long as fold-const handles 1-bit precision
> bitwise-expressions with truth-logic - but it has been shown to short-cut
> some more expensive folding.  So I kept it within this patch.

 Please split it out.  Also ...

>
> The adjusted testcase gcc.dg/uninit-15.c indicates that due to
> optimization we lose the variable's declaration in this case.  But this
> might be expected.
>
> In vectorization we have a regression in the gcc.dg/vect/vect-cond-3.c
> test-case.  It's caused by always having boolean-type on conditions.  So
> the vectorizer sees different types, which aren't handled by the
> vectorizer right now.  Maybe this issue could be special-cased for
> boolean-types in tree-vect-loop, by making the operand for the used
> condition equal to vector-type.
> But this is a subject for a different patch and not addressed by this
> series.
>
> There is a regression in tree-ssa/vrp47.c, and the fix is addressed
> by the 3rd patch of this series.
>
> Bootstrapped and regression tested for all standard-languages (plus
> Ada and Obj-C++) on host x86_64-pc-linux-gnu.
>
> Ok for apply?
>
> Regards,
> Kai
>
>
> ChangeLog
>
> 2011-07-07  Kai Tietz  
>
>        * fold-const.c (fold_unary_loc): Preserve
>        non-boolean-typed casts.
>        * gimplify.c (gimple_boolify): Handle boolification
>        of comparisons.
>        (gimplify_expr): Boolify non aggregate-typed
>        comparisons.
>        * tree-cfg.c (verify_gimple_comparison): Check result
>        type of comparison expression.
>        * tree-ssa.c (useless_type_conversion_p): Preserve incompatible
>        casts from/to boolean.
>        * tree-ssa-forwprop.c (combine_cond_expr_cond): Add simplification
>        support for one-bit-precision typed X for cases X != 0 and X == 0.
>        (forward_propagate_comparison): Adjust test of condition
>        result.
>
>
>        * gcc.dg/tree-ssa/builtin-expect-5.c: Adjusted.
>        * gcc.dg/tree-ssa/pr21031.c: Likewise.
>        * gcc.dg/tree-ssa/pr30978.c: Likewise.
>        * gcc.dg/tree-ssa/ssa-fre-6.c: Likewise.
>        * gcc.dg/binop-xor1.c: Mark it as expected fail.
>        * gcc.dg/binop-xor3.c: Likewise.
>        * gcc.dg/uninit-15.c: Adjust reported message.
>
> Index: gcc-head/gcc/fold-const.c
> ===
> --- gcc-head.orig/gcc/fold-const.c
> +++ gcc-head/gcc/fold-const.c
> @@ -7665,11 +7665,11 @@ fold_unary_loc (location_t loc, enum tre
>             non-integral type.
>             Do not fold the result as that would not simplify further, 
> also
>             folding again results in recursions.  */
> -         if (INTEGRAL_TYPE_P (type))
> +         if (TREE_CODE (type) == BOOLEAN_TYPE)
>            return build2_loc (loc, TREE_CODE (op0), type,
>                               TREE_OPERAND (op0, 0),
>                               TREE_OPERAND (op0, 1));
> -         else
> +         else if (!INTEGRAL_TYPE_P (type))
>            return build3_loc (loc, COND_EXPR, type, op0,
>                               fold_convert (type, boolean_true_node),
>                               fold_convert (type, boolean_false_node));
> Index: gcc-head/gcc/gimplify.c
> ===
> --- gcc-head.orig/gcc/gimplify.c
> +++ gcc-head/gcc/gimplify.c
> @@ -2842,18 +2842,23 @@ gimple_boolify (tree expr)
>
>     case TRUTH_NOT_EXPR:
>       TREE_OPERAND (expr, 0) = gimple_boolify (TREE_OPERAND (expr, 0));
> -      /* FALLTHRU */
>
> -    case EQ_EXPR: case NE_EXPR:
> -    case LE_EXPR: case GE_EXPR: case LT_EXPR: case GT_EXPR:
>       /* These expressions always produce boolean results.  */
> -      TREE_TYPE (expr) = boolean_type_node;
> +      if (TREE_CODE (type) != BOOLEAN_TYPE)
> +       TREE_TYPE (expr) = boolean_type_node;
>       return expr;
>
>     default:
> +      if (COMPARISON_CLASS_P (expr))
> +       {
> +         /* There expressions always prduce boolean results.  */
> +         if (TREE_CODE (type) != BOOLEAN_TY

Re: [ARM] Tighten predicates for misaligned loads and stores

2011-07-12 Thread Ramana Radhakrishnan
> Tested on arm-linux-gnueabi.  OK to install?

This is OK.

cheers
Ramana


Re: [dwarf2cfi] Cleanup interpretation of cfa.reg

2011-07-12 Thread Richard Earnshaw
On 12/07/11 10:05, Andreas Schwab wrote:
> Richard Henderson  writes:
> 
>> @@ -261,6 +262,15 @@ extern void dwarf2out_set_demangle_name_func (const char *(*) (const char *));
>>  extern void dwarf2out_vms_debug_main_pointer (void);
>>  #endif
>>  
>> +/* Unfortunately, DWARF_FRAME_REGNUM is not universally defined in such a
>> +   way as to force an unsigned return type.  Do that via inline wrapper.  */
>> +
>> +static inline unsigned
>> +dwarf_frame_regnum (unsigned regnum)
>> +{
>> +  return DWARF_FRAME_REGNUM (regnum);
>> +}
>> +  
> 
> I think this has caused the bootstrap failure on ia64:
> 
> In file included from ../../gcc/dwarf2cfi.c:31:0:
> ../../gcc/dwarf2out.h: In function 'dwarf_frame_regnum':
> ../../gcc/dwarf2out.h:271:3: error: implicit declaration of function 'ia64_dbx_register_number' [-Werror=implicit-function-declaration]
> 
> Andreas.
> 

And on ARM (PR49713)

R.



Re: [patch tree-optimization]: [2 of 3]: Boolify compares & more

2011-07-12 Thread Kai Tietz
2011/7/12 Richard Guenther :
> On Mon, Jul 11, 2011 at 5:37 PM, Kai Tietz  wrote:
>> 2011/7/8 Richard Guenther :
>>> On Fri, Jul 8, 2011 at 1:32 PM, Kai Tietz  wrote:
 2011/7/8 Richard Guenther :
> On Thu, Jul 7, 2011 at 6:07 PM, Kai Tietz  wrote:
>> Hello,
>>
>> This patch - second of series - adds boolification of comparisons in
>> the gimplifier.  For this, casts from/to boolean are marked as
>> not-useless, and in fold_unary_loc casts to non-boolean integral types
>> are preserved.
>> The hunk in tree-ssa-forwprop.c in combine_cond_expr_cond is not strictly
>> necessary - as long as fold-const handles 1-bit precision
>> bitwise-expressions with truth-logic - but it has been shown to short-cut
>> some more expensive folding.  So I kept it within this patch.
>
> Please split it out.  Also ...
>
>>
>> The adjusted testcase gcc.dg/uninit-15.c indicates that due to
>> optimization we lose the variable's declaration in this case.  But this
>> might be expected.
>>
>> In vectorization we have a regression in the gcc.dg/vect/vect-cond-3.c
>> test-case.  It's caused by always having boolean-type on conditions.  So
>> the vectorizer sees different types, which aren't handled by the
>> vectorizer right now.  Maybe this issue could be special-cased for
>> boolean-types in tree-vect-loop, by making the operand for the used
>> condition equal to vector-type.
>> But this is a subject for a different patch and not addressed by this
>> series.
>>
>> There is a regression in tree-ssa/vrp47.c, and the fix is addressed
>> by the 3rd patch of this series.
>>
>> Bootstrapped and regression tested for all standard-languages (plus
>> Ada and Obj-C++) on host x86_64-pc-linux-gnu.
>>
>> Ok for apply?
>>
>> Regards,
>> Kai
>>
>>
>> ChangeLog
>>
>> 2011-07-07  Kai Tietz  
>>
>>        * fold-const.c (fold_unary_loc): Preserve
>>        non-boolean-typed casts.
>>        * gimplify.c (gimple_boolify): Handle boolification
>>        of comparisons.
>>        (gimplify_expr): Boolify non aggregate-typed
>>        comparisons.
>>        * tree-cfg.c (verify_gimple_comparison): Check result
>>        type of comparison expression.
>>        * tree-ssa.c (useless_type_conversion_p): Preserve incompatible
>>        casts from/to boolean.
>>        * tree-ssa-forwprop.c (combine_cond_expr_cond): Add simplification
>>        support for one-bit-precision typed X for cases X != 0 and X == 0.
>>        (forward_propagate_comparison): Adjust test of condition
>>        result.
>>
>>
>>        * gcc.dg/tree-ssa/builtin-expect-5.c: Adjusted.
>>        * gcc.dg/tree-ssa/pr21031.c: Likewise.
>>        * gcc.dg/tree-ssa/pr30978.c: Likewise.
>>        * gcc.dg/tree-ssa/ssa-fre-6.c: Likewise.
>>        * gcc.dg/binop-xor1.c: Mark it as expected fail.
>>        * gcc.dg/binop-xor3.c: Likewise.
>>        * gcc.dg/uninit-15.c: Adjust reported message.
>>
>> Index: gcc-head/gcc/fold-const.c
>> ===
>> --- gcc-head.orig/gcc/fold-const.c
>> +++ gcc-head/gcc/fold-const.c
>> @@ -7665,11 +7665,11 @@ fold_unary_loc (location_t loc, enum tre
>>             non-integral type.
>>             Do not fold the result as that would not simplify further, 
>> also
>>             folding again results in recursions.  */
>> -         if (INTEGRAL_TYPE_P (type))
>> +         if (TREE_CODE (type) == BOOLEAN_TYPE)
>>            return build2_loc (loc, TREE_CODE (op0), type,
>>                               TREE_OPERAND (op0, 0),
>>                               TREE_OPERAND (op0, 1));
>> -         else
>> +         else if (!INTEGRAL_TYPE_P (type))
>>            return build3_loc (loc, COND_EXPR, type, op0,
>>                               fold_convert (type, boolean_true_node),
>>                               fold_convert (type, boolean_false_node));
>> Index: gcc-head/gcc/gimplify.c
>> ===
>> --- gcc-head.orig/gcc/gimplify.c
>> +++ gcc-head/gcc/gimplify.c
>> @@ -2842,18 +2842,23 @@ gimple_boolify (tree expr)
>>
>>     case TRUTH_NOT_EXPR:
>>       TREE_OPERAND (expr, 0) = gimple_boolify (TREE_OPERAND (expr, 0));
>> -      /* FALLTHRU */
>>
>> -    case EQ_EXPR: case NE_EXPR:
>> -    case LE_EXPR: case GE_EXPR: case LT_EXPR: case GT_EXPR:
>>       /* These expressions always produce boolean results.  */
>> -      TREE_TYPE (expr) = boolean_type_node;
>> +      if (TREE_CODE (type) != BOOLEAN_TYPE)
>> +       TREE_TYPE (expr) = boolean_type_node;
>>       return expr;
>>
>>     default:
>> +      if (COMPARISON_CLASS_P 

Re: [PATCH (2/7)] Widening multiplies by more than one mode

2011-07-12 Thread Andrew Stubbs

On 23/06/11 15:39, Andrew Stubbs wrote:

This patch has two effects:

1. It permits the use of widening multiply instructions that widen by
more than one mode. E.g. HImode -> DImode.

2. It enables the use of widening multiply instructions for (extended)
inputs of narrower mode than the instruction takes. E.g. QImode ->
DImode where only HI->DI or SI->DI is available.
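
As an illustration of the source patterns involved, here is a small sketch. Whether a given compiler and target actually select a widening multiply for these functions is not guaranteed, and the function names are made up for this example.

```c
#include <stdint.h>

/* 16-bit inputs, 64-bit result: the kind of expression that could map to
   a single HImode -> DImode widening multiply where the target has one.  */
int64_t
widen_hi_to_di (int16_t a, int16_t b)
{
  return (int64_t) a * (int64_t) b;
}

/* 8-bit inputs, 64-bit result: assuming there is no QImode -> DImode
   instruction, the inputs would be extended and a wider-input widening
   multiply (e.g. HI->DI or SI->DI) used instead.  */
int64_t
widen_qi_to_di (int8_t a, int8_t b)
{
  return (int64_t) a * (int64_t) b;
}

/* Multiply-accumulate form, the shape a pattern like maddhidi4 targets.  */
int64_t
widen_madd (int64_t acc, int16_t a, int16_t b)
{
  return acc + (int64_t) a * (int64_t) b;
}
```

Comparing the generated assembly for these functions with and without the patch is one way to see which cases are picked up.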

Hopefully, most of the patch is self-explanatory, but here are a few notes:

The code introduces a temporary FIXME comment; this will be removed
later in the patch series. In fact, this is not a new restriction;
previously "type1" and "type2" were implicitly identical because they
were required to be one mode smaller than "type".

I regard the ARM portion of this patch as obvious, so I don't think I
need an ARM maintainer to read this.

Is the patch OK?


I found a bug in this patch. It seems I do need to add casts for the 
inputs to widening multiplies (even though I know the registers are 
already fine), because otherwise something is insisting on truncating 
the values to the minimum width, which isn't helpful when it's actually 
an instruction with wider inputs.


The mode changing bits from patch 4 have therefore been moved here. I've 
made the changes Richard Guenther requested there, I think.


Otherwise, the patch is the same as before.

Andrew

2011-07-11  Andrew Stubbs  

	gcc/
	* config/arm/arm.md (maddhidi4): Remove '*' from name.
	* expr.c (expand_expr_real_2): Use find_widening_optab_handler.
	* optabs.c (find_widening_optab_handler_and_mode): New function.
	(expand_widen_pattern_expr): Use find_widening_optab_handler.
	(expand_binop_directly): Likewise.
	(expand_binop): Likewise.
	* optabs.h (find_widening_optab_handler): New macro define.
	(find_widening_optab_handler_and_mode): New prototype.
	* tree-cfg.c (verify_gimple_assign_binary): Adjust WIDEN_MULT_EXPR
	type precision rules.
	(verify_gimple_assign_ternary): Likewise for WIDEN_MULT_PLUS_EXPR.
	* tree-ssa-math-opts.c (build_and_insert_cast): New function.
	(is_widening_mult_rhs_p): Allow widening by more than one mode.
	Explicitly disallow mis-matched input types.
	(convert_mult_to_widen): Use find_widening_optab_handler, and cast
	input types to fit the new handler.
	(convert_plusminus_to_widen): Likewise.

--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1857,7 +1857,7 @@
(set_attr "predicable" "yes")]
 )
 
-(define_insn "*maddhidi4"
+(define_insn "maddhidi4"
   [(set (match_operand:DI 0 "s_register_operand" "=r")
 	(plus:DI
 	  (mult:DI (sign_extend:DI
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7638,19 +7638,16 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	{
 	  enum machine_mode innermode = TYPE_MODE (TREE_TYPE (treeop0));
 	  this_optab = usmul_widen_optab;
-	  if (mode == GET_MODE_2XWIDER_MODE (innermode))
+	  if (find_widening_optab_handler (this_optab, mode, innermode, 0)
+		!= CODE_FOR_nothing)
 	{
-	  if (widening_optab_handler (this_optab, mode, innermode)
-		!= CODE_FOR_nothing)
-		{
-		  if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
-		expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
- EXPAND_NORMAL);
-		  else
-		expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
- EXPAND_NORMAL);
-		  goto binop3;
-		}
+	  if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
+		expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
+ EXPAND_NORMAL);
+	  else
+		expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
+ EXPAND_NORMAL);
+	  goto binop3;
 	}
 	}
   /* Check for a multiplication with matching signedness.  */
@@ -7665,10 +7662,9 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	  optab other_optab = zextend_p ? smul_widen_optab : umul_widen_optab;
 	  this_optab = zextend_p ? umul_widen_optab : smul_widen_optab;
 
-	  if (mode == GET_MODE_2XWIDER_MODE (innermode)
-	  && TREE_CODE (treeop0) != INTEGER_CST)
+	  if (TREE_CODE (treeop0) != INTEGER_CST)
 	{
-	  if (widening_optab_handler (this_optab, mode, innermode)
+	  if (find_widening_optab_handler (this_optab, mode, innermode, 0)
 		!= CODE_FOR_nothing)
 		{
 		  expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -7677,7 +7673,7 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	   unsignedp, this_optab);
 		  return REDUCE_BIT_FIELD (temp);
 		}
-	  if (widening_optab_handler (other_optab, mode, innermode)
+	  if (find_widening_optab_handler (other_optab, mode, innermode, 0)
 		!= CODE_FOR_nothing
 		  && innermode == word_mode)
 		{
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -225,6 +225,37 @@ add_equal_note (rtx insns, rtx target, enum rtx_code code, rtx op0, rtx op1)
   return 1;
 }
 
+/* Find a widening optab even if it doesn't widen as much as we want.
+   E.g. if from_mode is HImode, and to_mode is DImode, and there is no
+   direct HI->SI insn, then return SI->DI, if that exists.
+   If PERMIT_NON_WIDENING is no

Re: ARM: Clear icache when creating a closure

2011-07-12 Thread Richard Earnshaw
On 12/07/11 10:15, Andrew Haley wrote:
> On 12/07/11 10:12, Richard Earnshaw wrote:
>> On 11/07/11 17:23, Andrew Haley wrote:
>>> On a multicore ARM, you really do have to clear both caches, not just the
>>> dcache.  This bug may exist in other ports too.
>>>
>>> Andrew.
>>>
>>>
>>> 2011-07-11  Andrew Haley  
>>>
>>> * src/arm/ffi.c (FFI_INIT_TRAMPOLINE): Clear icache.
>>>
>>> diff --git a/src/arm/ffi.c b/src/arm/ffi.c
>>> index 885a9cb..b2e7667 100644
>>> --- a/src/arm/ffi.c
>>> +++ b/src/arm/ffi.c
>>> @@ -558,12 +558,16 @@ ffi_closure_free (void *ptr)
>>>  ({ unsigned char *__tramp = (unsigned char*)(TRAMP);   \
>>> unsigned int  __fun = (unsigned int)(FUN);  \
>>> unsigned int  __ctx = (unsigned int)(CTX);  \
>>> +   unsigned char *insns = (unsigned char *)(CTX);   \
>>> *(unsigned int*) &__tramp[0] = 0xe92d000f; /* stmfd sp!, {r0-r3} */ \
>>> *(unsigned int*) &__tramp[4] = 0xe59f0000; /* ldr r0, [pc] */   \
>>> *(unsigned int*) &__tramp[8] = 0xe59ff000; /* ldr pc, [pc] */   \
>>> *(unsigned int*) &__tramp[12] = __ctx;  \
>>> *(unsigned int*) &__tramp[16] = __fun;  \
>>> -   __clear_cache((&__tramp[0]), (&__tramp[19]));   \
>>> +   __clear_cache((&__tramp[0]), (&__tramp[19])); /* Clear data mapping.  */ \
>>> +   __clear_cache(insns, insns + 3 * sizeof (unsigned int)); \
>>> + /* Clear instruction   \
>>> +mapping.  */\
>>>   })
>>>
>>>  #endif
>>>
>>>
>>
>>
>> Your patch looks sane, but I'll observe here that the poking of
>> instruction values is wrong on cores that run in BE-8 mode (where
>> instructions are always little-endian).
> 
> Oh dear.  How would one test for BE-8 mode on a Linux system?
> 
> Thanks,
> Andrew.
> 
> 

Essentially v6 or later and big-endian.  It is possible to run some v6
(but no v7) cores in be-32 mode, but you can't then have unaligned
access support.

To know the configuration for sure, you need to read the SCTLR register
(in CP15 space), but that's not available in user-mode.
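
Since SCTLR cannot be read from user mode, code that wants to guess is left with the rule of thumb above. A minimal compile-time sketch of that heuristic follows. It assumes the compiler predefines __ARMEB__ for big-endian ARM and __ARM_ARCH for the architecture level, which should be checked against the toolchain in use, and it cannot detect a v6 core deliberately run in BE-32 mode. The function name is hypothetical.

```c
/* Returns 1 when the build looks like ARMv6-or-later big-endian, which is
   then most likely BE-8; returns 0 otherwise.  This is a guess based on
   predefined macros (an assumption), not a read of SCTLR.  */
static int
probably_be8 (void)
{
#if defined (__ARMEB__) && defined (__ARM_ARCH) && __ARM_ARCH >= 6
  return 1;
#else
  return 0;
#endif
}
```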

R.




Re: [Ada] Fix --enable-build-with-cxx build

2011-07-12 Thread Eric Botcazou
> This is an updated version of Laurent's patch originally here:
>   http://gcc.gnu.org/ml/gcc/2009-06/msg00635.html

Revised version, only 33 files modified now (instead of 51).  We compile the C 
files for the compiler proper (and gnatbind) with the C++ compiler, but we 
keep compiling them with the C compiler in the other cases (library & tools).
I think that this is in keeping with the other compilers (e.g. libgcc is still 
compiled with the C compiler).

Bootstrapped/regtested on x86_64-suse-linux with --enable-build-with-cxx.

Arno, do you have any objections to me applying this?


2011-07-12  Laurent GUERBY  
Eric Botcazou  

gcc/
* prefix.h: Wrap up in extern "C" block.
ada/
* adadecode.c: Wrap up in extern "C" block.
* adadecode.h: Likewise.
* adaint.c: Likewise.  Remove 'const' keyword.
* adaint.h: Likewise.
* argv.c: Likewise.
* atree.h: Likewise.
* cio.c: Likewise.
* cstreams.c: Likewise.
* env.c: Likewise.
* exit.c: Likewise.
* fe.h: Likewise.
* final.c: Likewise.
* init.c: Likewise.
* initialize.c: Likewise.
* link.c: Likewise.
* namet.h: Likewise.
* nlists.h: Likewise.
* raise.c: Likewise.
* raise.h: Likewise.
* repinfo.h: Likewise.
* seh_init.c: Likewise.
* targext.c: Likewise.
* tracebak.c: Likewise.
* uintp.h: Likewise.
* urealp.h: Likewise.
* xeinfo.adb: Wrap up generated C code in extern "C" block.
* xsinfo.adb: Likewise.
* xsnamest.adb: Likewise.
* gcc-interface/gadaint.h: Wrap up in extern "C" block.
* gcc-interface/gigi.h: Wrap up some prototypes in extern "C" block.
* gcc-interface/misc.c: Likewise.
* gcc-interface/Make-lang.in (GCC_LINK): Use LINKER.
(GNAT1_C_OBJS): Remove ada/b_gnat1.o.  List ada/seh_init.o and
ada/targext.o here...
(GNAT_ADA_OBJS): ...and not here.
(GNAT1_ADA_OBJS): Add ada/b_gnat1.o.
(GNATBIND_OBJS): Reorder.


-- 
Eric Botcazou
Index: ada/adadecode.h
===
--- ada/adadecode.h	(revision 176072)
+++ ada/adadecode.h	(working copy)
@@ -6,7 +6,7 @@
  *  *
  *  C Header File   *
  *  *
- *   Copyright (C) 2001-2009, Free Software Foundation, Inc.*
+ *   Copyright (C) 2001-2011, Free Software Foundation, Inc.*
  *  *
  * GNAT is free software;  you can  redistribute it  and/or modify it under *
  * terms of the  GNU General Public License as published  by the Free Soft- *
@@ -29,6 +29,10 @@
  *  *
  /
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /* This function will return the Ada name from the encoded form.
The Ada coding is done in exp_dbug.ads and this is the inverse function.
see exp_dbug.ads for full encoding rules, a short description is added
@@ -51,3 +55,7 @@ extern void get_encoding (const char *,
function used in the binutils and GDB. Always consider using __gnat_decode
instead of ada_demangle. Caller must free the pointer returned.  */
 extern char *ada_demangle (const char *);
+
+#ifdef __cplusplus
+}
+#endif
Index: ada/targext.c
===
--- ada/targext.c	(revision 176072)
+++ ada/targext.c	(working copy)
@@ -29,9 +29,13 @@
  *  *
  /
 
-/*  This file contains target-specific parameters describing the file   */
-/*  extension for object and executable files. It is used by the compiler,  */
-/*  binder and tools.   */
+/*  This file contains target-specific parameters describing the file
+extension for object and executable files.  It is used by the compiler,
+binder and tools.  */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
 
 #ifdef IN_RTS
 #include "tconfig.h"
@@ -54,3 +58,7 @@
 const char *__gnat_target_object_extension = TARGET_OBJECT_SUFFIX;
 const char *__gnat_target_executable_extension = TARGET_EXECUTABLE_SUFFIX;
 const char *__gnat_target_debuggable_extension = TARGET_EXECUTABLE_SUFFIX;
+
+#ifdef __cplusplus
+}
+#endif
Index: ada/env.c
===
--- ada/env.c	(revision 176072)
+++ ada/env.c	(working copy)
@@ -6,7 +6,7 @@
  *

Re: [patch tree-optimization]: [2 of 3]: Boolify compares & more

2011-07-12 Thread Richard Guenther
On Tue, Jul 12, 2011 at 11:48 AM, Kai Tietz  wrote:
> 2011/7/12 Richard Guenther :
>> On Mon, Jul 11, 2011 at 5:37 PM, Kai Tietz  wrote:
>>> 2011/7/8 Richard Guenther :
 On Fri, Jul 8, 2011 at 1:32 PM, Kai Tietz  wrote:
> 2011/7/8 Richard Guenther :
>> On Thu, Jul 7, 2011 at 6:07 PM, Kai Tietz  
>> wrote:
>>> Hello,
>>>
>>> This patch - second of series - adds boolification of comparisons in
>>> the gimplifier.  For this, casts from/to boolean are marked as
>>> not-useless, and in fold_unary_loc casts to non-boolean integral types
>>> are preserved.
>>> The hunk in tree-ssa-forwprop.c in combine_cond_expr_cond is not strictly
>>> necessary - as long as fold-const handles 1-bit precision
>>> bitwise-expressions with truth-logic - but it has been shown to short-cut
>>> some more expensive folding.  So I kept it within this patch.
>>
>> Please split it out.  Also ...
>>
>>>
>>> The adjusted testcase gcc.dg/uninit-15.c indicates that due to
>>> optimization we lose the variable's declaration in this case.  But this
>>> might be expected.
>>>
>>> In vectorization we have a regression in the gcc.dg/vect/vect-cond-3.c
>>> test-case.  It's caused by always having boolean-type on conditions.  So
>>> the vectorizer sees different types, which aren't handled by the
>>> vectorizer right now.  Maybe this issue could be special-cased for
>>> boolean-types in tree-vect-loop, by making the operand for the used
>>> condition equal to vector-type.
>>> But this is a subject for a different patch and not addressed by this
>>> series.
>>>
>>> There is a regression in tree-ssa/vrp47.c, and the fix is addressed
>>> by the 3rd patch of this series.
>>>
>>> Bootstrapped and regression tested for all standard-languages (plus
>>> Ada and Obj-C++) on host x86_64-pc-linux-gnu.
>>>
>>> Ok for apply?
>>>
>>> Regards,
>>> Kai
>>>
>>>
>>> ChangeLog
>>>
>>> 2011-07-07  Kai Tietz  
>>>
>>>        * fold-const.c (fold_unary_loc): Preserve
>>>        non-boolean-typed casts.
>>>        * gimplify.c (gimple_boolify): Handle boolification
>>>        of comparisons.
>>>        (gimplify_expr): Boolify non aggregate-typed
>>>        comparisons.
>>>        * tree-cfg.c (verify_gimple_comparison): Check result
>>>        type of comparison expression.
>>>        * tree-ssa.c (useless_type_conversion_p): Preserve incompatible
>>>        casts from/to boolean.
>>>        * tree-ssa-forwprop.c (combine_cond_expr_cond): Add simplification
>>>        support for one-bit-precision typed X for cases X != 0 and X == 0.
>>>        (forward_propagate_comparison): Adjust test of condition
>>>        result.
>>>
>>>
>>>        * gcc.dg/tree-ssa/builtin-expect-5.c: Adjusted.
>>>        * gcc.dg/tree-ssa/pr21031.c: Likewise.
>>>        * gcc.dg/tree-ssa/pr30978.c: Likewise.
>>>        * gcc.dg/tree-ssa/ssa-fre-6.c: Likewise.
>>>        * gcc.dg/binop-xor1.c: Mark it as expected fail.
>>>        * gcc.dg/binop-xor3.c: Likewise.
>>>        * gcc.dg/uninit-15.c: Adjust reported message.
>>>
>>> Index: gcc-head/gcc/fold-const.c
>>> ===
>>> --- gcc-head.orig/gcc/fold-const.c
>>> +++ gcc-head/gcc/fold-const.c
>>> @@ -7665,11 +7665,11 @@ fold_unary_loc (location_t loc, enum tre
>>>             non-integral type.
>>>             Do not fold the result as that would not simplify further, 
>>> also
>>>             folding again results in recursions.  */
>>> -         if (INTEGRAL_TYPE_P (type))
>>> +         if (TREE_CODE (type) == BOOLEAN_TYPE)
>>>            return build2_loc (loc, TREE_CODE (op0), type,
>>>                               TREE_OPERAND (op0, 0),
>>>                               TREE_OPERAND (op0, 1));
>>> -         else
>>> +         else if (!INTEGRAL_TYPE_P (type))
>>>            return build3_loc (loc, COND_EXPR, type, op0,
>>>                               fold_convert (type, boolean_true_node),
>>>                               fold_convert (type, boolean_false_node));
>>> Index: gcc-head/gcc/gimplify.c
>>> ===
>>> --- gcc-head.orig/gcc/gimplify.c
>>> +++ gcc-head/gcc/gimplify.c
>>> @@ -2842,18 +2842,23 @@ gimple_boolify (tree expr)
>>>
>>>     case TRUTH_NOT_EXPR:
>>>       TREE_OPERAND (expr, 0) = gimple_boolify (TREE_OPERAND (expr, 0));
>>> -      /* FALLTHRU */
>>>
>>> -    case EQ_EXPR: case NE_EXPR:
>>> -    case LE_EXPR: case GE_EXPR: case LT_EXPR: case GT_EXPR:
>>>       /* These expressions always produce boolean results.  */
>>> -      TREE_TYPE (expr) = boolean_type_no

Re: [Ada] Fix --enable-build-with-cxx build

2011-07-12 Thread Arnaud Charlet
> Revised version, only 33 files modified now (instead of 51).  We compile the C
> files for the compiler proper (and gnatbind) with the C++ compiler, but we
> keep compiling them with the C compiler in the other cases (library & tools).
> I think that this is in keeping with the other compilers (e.g. libgcc is still
> compiled with the C compiler).
> 
> Bootstrapped/regtested on x86_64-suse-linux with
> --enable-build-with-cxx.
> 
> Arno, do you have any objections to me applying this?

Certainly looks better to me.

Arno


[Patch, AVR]: Fix PR49687: Better widening mul 16=8*8

2011-07-12 Thread Georg-Johann Lay
For widening multiply there is room for optimization, e.g.:

* (mult:HI (extend:HI(QI)) HI) is better than
  (extend:HI(QI)) and (mult:HI HI HI)

* For mult with power of 2 sometimes a mult is
  better than a shift left.

* Support MULSU instruction, i.e.
  (mult:HI (sign_extend:HI(QI))
   (zero_extend:HI(QI)))

* (mult:HI (HI small_const)) can be optimized.
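
As an illustration (not part of the patch), the three 16=8*8 widening
products can be written at the source level roughly as follows. This is a
minimal sketch: the function names are hypothetical, and whether a given
build really selects MUL, MULS or MULSU for them is an assumption that
depends on the device (AVR_HAVE_MUL) and on what the optimizers make of
the code.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned * unsigned: both operands are zero-extended to 16 bits.  */
uint16_t mul_uu (uint8_t a, uint8_t b)
{
  return (uint16_t) ((uint16_t) a * (uint16_t) b);
}

/* Signed * signed: both operands are sign-extended to 16 bits.  */
int16_t mul_ss (int8_t a, int8_t b)
{
  return (int16_t) ((int16_t) a * (int16_t) b);
}

/* Signed * unsigned: the mixed case, one sign-extend and one zero-extend.
   The product always fits in 16 bits: -128*255 .. 127*255.  */
int16_t mul_su (int8_t a, uint8_t b)
{
  return (int16_t) ((int16_t) a * (int16_t) b);
}

/* Hypothetical self-check exercising the extreme operand values.  */
void widening_mul_selfcheck (void)
{
  assert (mul_uu (255, 255) == 65025);
  assert (mul_ss (-128, -128) == 16384);
  assert (mul_su (-128, 255) == -32640);
}
```

In every case the exact product fits in 16 bits, which is why a single
8x8->16 multiply can replace the extend-then-multiply sequence.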

Some insns are expanded in mulhi3 expander, others
are synthesized in combine and then split in split1.

This requires the function avr_gate_split1 to prevent IRA/reload
from recombining the insn.  This is needed to have
constants CSEd out; see the discussion in
   http://gcc.gnu.org/ml/gcc/2011-07/msg00136.html

I prefer this over clobber regs (because no CSE) and over
combine-split (because it is not clear that combine will
come up with a spare reg and the mode of the spare reg HI
is suboptimal).

FYI, I attached output of a test case compiled with
-Os -dp for an ATmega8 with .0. the original output
without the patch.

Some cases like the qmul8_xy test case are not optimized
(combine flaw), and there are superfluous move instructions
because of early-clobber (IRA/reload flaw).

Tested without regressions.

Ok to commit?

Johann

PR target/49687
* config/avr/avr.md (mulhi3): Use register_or_s8_u8_operand for
operand2 and expand appropriately if there is a CONST_INT in
operand2.
(*mulsu,*mulus): New insns.
(mulsqihi3): New insn.
(muluqihi3): New insn.
(*muluqihi3.uconst): New insn_and_split.
(*muluqihi3.sconst): New insn_and_split.
(*mulsqihi3.sconst): New insn_and_split.
(*mulsqihi3.uconst): New insn_and_split.
(*ashifthi3.signx.const): New insn_and_split.
(*ashifthi3.signx.const7): New insn_and_split.
(*ashifthi3.zerox.const): New insn_and_split.
* config/avr/avr.c (avr_rtx_costs): Report costs of above insns.
(avr_gate_split1): New function.
* config/avr/avr-protos.h (avr_gate_split1): New prototype.
* config/avr/predicates.md (const_2_to_7_operand): New.
(const_2_to_6_operand): New.
(u8_operand): New.
(s8_operand): New.
(register_or_s8_u8_operand): New.

Index: config/avr/predicates.md
===
--- config/avr/predicates.md	(revision 176136)
+++ config/avr/predicates.md	(working copy)
@@ -73,6 +73,16 @@ (define_predicate "const_0_to_7_operand"
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 7)")))
 
+;; Return 1 if OP is constant integer 2..7 for MODE.
+(define_predicate "const_2_to_7_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 2, 7)")))
+
+;; Return 1 if OP is constant integer 2..6 for MODE.
+(define_predicate "const_2_to_6_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 2, 6)")))
+
 ;; Returns true if OP is either the constant zero or a register.
 (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "register_operand")
@@ -156,3 +166,17 @@ (define_predicate "const_8_16_24_operand
   (and (match_code "const_int")
(match_test "8 == INTVAL(op) || 16 == INTVAL(op) || 24 == INTVAL(op)")))
 
+;; Unsigned CONST_INT that fits in 8 bits, i.e. 0..255.
+(define_predicate "u8_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 0, 255)")))
+
+;; Signed CONST_INT that fits in 8 bits, i.e. -128..127.
+(define_predicate "s8_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), -128, 127)")))
+
+(define_predicate "register_or_s8_u8_operand"
+  (ior (match_operand 0 "register_operand")
+   (match_operand 0 "u8_operand")
+   (match_operand 0 "s8_operand")))
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 176136)
+++ config/avr/avr.md	(working copy)
@@ -1017,19 +1017,245 @@ (define_insn "umulqihi3"
   [(set_attr "length" "3")
(set_attr "cc" "clobber")])
 
+(define_insn "*mulsu"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a"))
+                 (zero_extend:HI (match_operand:QI 2 "register_operand" "a"))))]
+  "AVR_HAVE_MUL"
+  "mulsu %1,%2
+	movw %0,r0
+	clr __zero_reg__"
+  [(set_attr "length" "3")
+   (set_attr "cc" "clobber")])
+
+(define_insn "*mulus"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (mult:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "a"))
+                 (sign_extend:HI (match_operand:QI 2 "register_operand" "a"))))]
+  "AVR_HAVE_MUL"
+  "mulsu %2,%1
+	movw %0,r0
+	clr __zero_reg__"
+  [(set_attr "length" "3")
+   (set_attr "cc" "clobber")])
+
+;**
+; mul HI: $1 = sign/zero-extend, $2 = small constant
+

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-07-12 Thread Richard Guenther
On Mon, Jul 11, 2011 at 6:55 PM, Andrew Stubbs  wrote:
> On 07/07/11 10:58, Richard Guenther wrote:
>>
>> I think you should assume that series of widenings,
>> (int)(short)char_variable
>> are already combined.  Thus I believe you only need to consider a single
>> conversion in valid_types_for_madd_p.
>
> Ok, here's my new patch.
>
> This version only allows one conversion between the multiply and addition,
> so assumes that VRP has eliminated any needless ones.
>
> That one conversion may either be a truncate, if the mode was too large for
> the meaningful data, or an extend, which must be of the right flavour.
>
> This means that this patch now has the same effect as the last patch, for
> all valid cases (following your VRP patch), but rejects the cases where the C
> language (unhelpfully) requires an intermediate temporary to be of the
> 'wrong' signedness.
>
> Hopefully the output will now be the same between both -O0 and -O2, and
> programmers will continue to have to be careful about casting unsigned
> variables whenever they expect purely unsigned math. :(
>
> Is this one ok?

Ok.

Thanks,
Richard.

> Andrew
>


Re: [PATCH (2/7)] Widening multiplies by more than one mode

2011-07-12 Thread Richard Guenther
On Tue, Jul 12, 2011 at 11:50 AM, Andrew Stubbs  wrote:
> On 23/06/11 15:39, Andrew Stubbs wrote:
>>
>> This patch has two effects:
>>
>> 1. It permits the use of widening multiply instructions that widen by
>> more than one mode. E.g. HImode -> DImode.
>>
>> 2. It enables the use of widening multiply instructions for (extended)
>> inputs of narrower mode than the instruction takes. E.g. QImode ->
>> DImode where only HI->DI or SI->DI is available.
>>
>> Hopefully, most of the patch is self-explanatory, but here are a few notes:
>>
>> The code introduces a temporary FIXME comment; this will be removed
>> later in the patch series. In fact, this is not a new restriction;
>> previously "type1" and "type2" were implicitly identical because they
>> were required to be one mode smaller than "type".
>>
>> I regard the ARM portion of this patch as obvious, so I don't think I
>> need an ARM maintainer to read this.
>>
>> Is the patch OK?
>
> I found a bug in this patch. It seems I do need to add casts for the inputs
> to widening multiplies (even though I know the registers are already fine),
> because otherwise something is insisting on truncating the values to the
> minimum width, which isn't helpful when it's actually an instruction with
> wider inputs.
>
> The mode changing bits from patch 4 have therefore been moved here. I've
> made the changes Richard Guenther requested there, I think.
>
> Otherwise, the patch is the same as before.

I wonder if we want to restrict the WIDEN_* operations to operate
on types that have matching type/mode precision(**).  Consider

struct {
  int a : 7;
  int b : 7;
} x;

short c = x.a * x.b;

which will be represented as (short)((int)<7-bit-type-with-QImode> *
(int)<7-bit-type-with-QImode>).

I wonder if you can do some experiments with bitfield types and see
if your patch series handles them correctly.
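
A minimal, self-contained version of such an experiment might look like
the sketch below. The struct tag, the function name and the explicit
`signed` on the bitfields are assumptions made for illustration (the
signedness of a plain `int` bitfield is implementation-defined), and the
sketch only checks the arithmetic result, not which instruction is
selected.

```c
#include <assert.h>

/* Two signed 7-bit bitfields, each with a value range of -64..63.  */
struct bits7
{
  signed int a : 7;
  signed int b : 7;
};

/* Both operands are promoted to int before the multiply, so the product
   is computed in int and only then converted to short.  */
short mul_bits7 (struct bits7 x)
{
  return (short) (x.a * x.b);
}

/* Hypothetical self-check using operands of mixed sign.  */
void bits7_selfcheck (void)
{
  struct bits7 x = { -64, 63 };
  assert (mul_bits7 (x) == -4032);
}
```

The largest magnitude, (-64) * (-64) = 4096, still fits in a short, so any
widening-multiply transformation has to give exactly these values.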

As for the patch, please update tree.def with the new requirements
for the WIDEN_* codes.

As for the bitfield precisions, we probably want to reject types that
do not have TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE
(type)).  Or maybe we can allow them if we generate
correct and good code for them?

+      tmp2 = TYPE_UNSIGNED (type1) == TYPE_UNSIGNED (type2)
+            ? tmp1
+            : create_tmp_var (
+                 build_nonstandard_integer_type (
+                   GET_MODE_PRECISION (from_mode), TYPE_UNSIGNED (type1)),
+                 NULL);

please use an if () stmt to avoid gross formatting.

+  if (TYPE_MODE (type1) != from_mode)

these kinds of checks are unsafe if type1 does not have a TYPE_PRECISION
equal to its mode precision.

Thanks,
Richard.


> Andrew
>
>


Re: [Patch, AVR]: Fix PR49687: Better widening mul 16=8*8

2011-07-12 Thread Andrew Stubbs

On 12/07/11 11:35, Georg-Johann Lay wrote:

> +(define_insn "*mulsu"
> +  [(set (match_operand:HI 0 "register_operand" "=r")
> +        (mult:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "a"))
> +                 (zero_extend:HI (match_operand:QI 2 "register_operand" "a"))))]
> +  "AVR_HAVE_MUL"
> +  "mulsu %1,%2
> +   movw %0,r0
> +   clr __zero_reg__"
> +  [(set_attr "length" "3")
> +   (set_attr "cc" "clobber")])
> +
> +(define_insn "*mulus"
> +  [(set (match_operand:HI 0 "register_operand" "=r")
> +        (mult:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "a"))
> +                 (sign_extend:HI (match_operand:QI 2 "register_operand" "a"))))]
> +  "AVR_HAVE_MUL"
> +  "mulsu %2,%1
> +   movw %0,r0
> +   clr __zero_reg__"
> +  [(set_attr "length" "3")
> +   (set_attr "cc" "clobber")])


1. You should name that "usmulqihi3" (no star), so the optimizers can 
see it.


2. There's no need to define both of these. For one thing, putting a '%' 
at the start of the constraint list  for operand 1 does precisely this, 
but more importantly, I'm pretty sure one form or the other is the 
canonical form and the other should never occur. If it's not, maybe it 
should be?


Andrew


Re: [PATCH (2/7)] Widening multiplies by more than one mode

2011-07-12 Thread Richard Guenther
On Tue, Jul 12, 2011 at 1:04 PM, Richard Guenther
 wrote:
> On Tue, Jul 12, 2011 at 11:50 AM, Andrew Stubbs  wrote:
>> On 23/06/11 15:39, Andrew Stubbs wrote:
>>>
>>> This patch has two effects:
>>>
>>> 1. It permits the use of widening multiply instructions that widen by
>>> more than one mode. E.g. HImode -> DImode.
>>>
>>> 2. It enables the use of widening multiply instructions for (extended)
>>> inputs of narrower mode than the instruction takes. E.g. QImode ->
>>> DImode where only HI->DI or SI->DI is available.
>>>
>>> Hopefully, most of the patch is self-explanatory, but here are a few notes:
>>>
>>> The code introduces a temporary FIXME comment; this will be removed
>>> later in the patch series. In fact, this is not a new restriction;
>>> previously "type1" and "type2" were implicitly identical because they
>>> were required to be one mode smaller than "type".
>>>
>>> I regard the ARM portion of this patch as obvious, so I don't think I
>>> need an ARM maintainer to read this.
>>>
>>> Is the patch OK?
>>
>> I found a bug in this patch. It seems I do need to add casts for the inputs
>> to widening multiplies (even though I know the registers are already fine),
>> because otherwise something is insisting on truncating the values to the
>> minimum width, which isn't helpful when it's actually an instruction with
>> wider inputs.
>>
>> The mode changing bits from patch 4 have therefore been moved here. I've
>> made the changes Richard Guenther requested there, I think.
>>
>> Otherwise, the patch is the same as before.
>
> I wonder if we want to restrict the WIDEN_* operations to operate
> on types that have matching type/mode precision(**).  Consider
>
> struct {
>  int a : 7;
>  int b : 7;
> } x;
>
> short c = x.a * x.b;
>
> which will be represented as (short)((int)<7-bit-type-with-QImode> *
> (int)<7-bit-type-with-QImode>).
>
> I wonder if you can do some experiments with bitfield types and see
> if your patch series handles them correctly.
>
> As for the patch, please update tree.def with the new requirements
> for the WIDEN_* codes.
>
> As for the bitfield precisions, we probably want to reject types that
> do not have TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE
> (type)).  Or maybe we can allow them if we generate
> correct and good code for them?
>
> +      tmp2 = TYPE_UNSIGNED (type1) == TYPE_UNSIGNED (type2)
> +            ? tmp1
> +            : create_tmp_var (
> +                 build_nonstandard_integer_type (
> +                   GET_MODE_PRECISION (from_mode), TYPE_UNSIGNED (type1)),
> +                 NULL);
>
> please use an if () stmt to avoid gross formatting.
>
> +  if (TYPE_MODE (type1) != from_mode)
>
> these kinds of checks are unsafe if type1 does not have a TYPE_PRECISION
> equal to its mode precision.

(**) We really ought to forbid any arithmetic on types that have non-mode
precision and only allow conversions to/from such types.


Re: [Patch, AVR]: Fix PR49687: Better widening mul 16=8*8

2011-07-12 Thread Bernd Schmidt
On 07/12/11 13:04, Andrew Stubbs wrote:
> On 12/07/11 11:35, Georg-Johann Lay wrote:
>> +(define_insn "*mulsu"
>> +  [(set (match_operand:HI 0
>> "register_operand" "=r")
>> +(mult:HI (sign_extend:HI (match_operand:QI 1
>> "register_operand" "a"))
>> + (zero_extend:HI (match_operand:QI 2
>> "register_operand" "a"))))]
>> +  "AVR_HAVE_MUL"
>> +  "mulsu %1,%2
>> +movw %0,r0
>> +clr __zero_reg__"
>> +  [(set_attr "length" "3")
>> +   (set_attr "cc" "clobber")])
>> +
>> +(define_insn "*mulus"
>> +  [(set (match_operand:HI 0
>> "register_operand" "=r")
>> +(mult:HI (zero_extend:HI (match_operand:QI 1
>> "register_operand" "a"))
>> + (sign_extend:HI (match_operand:QI 2
>> "register_operand" "a"))))]
>> +  "AVR_HAVE_MUL"
>> +  "mulsu %2,%1
>> +movw %0,r0
>> +clr __zero_reg__"
>> +  [(set_attr "length" "3")
>> +   (set_attr "cc" "clobber")])
> 
> 1. You should name that "usmulqihi3" (no star), so the optimizers can
> see it.
> 
> 2. There's no need to define both of these. For one thing, putting a '%'
> at the start of the constraint list  for operand 1 does precisely this,

Unfortunately it doesn't. It won't swap the sign/zero-extend.


Bernd


Re: [Patch, AVR]: Fix PR49687: Better widening mul 16=8*8

2011-07-12 Thread Andrew Stubbs

On 12/07/11 12:11, Bernd Schmidt wrote:

>> 2. There's no need to define both of these. For one thing, putting a '%'
>> at the start of the constraint list  for operand 1 does precisely this,
>
> Unfortunately it doesn't. It won't swap the sign/zero-extend.


Ah, my mistake. I still think that one form should be canonical, if it 
isn't already.


Andrew


Re: [PATCH (2/7)] Widening multiplies by more than one mode

2011-07-12 Thread Andrew Stubbs

On 12/07/11 12:05, Richard Guenther wrote:

> (**) We really ought to forbid any arithmetic on types that have non-mode
> precision and only allow conversions to/from such types.


Hmmm, presumably the problem is that we might have a compatible 
precision, but the backends actually work with purely mode-sized types?


That does sound problematic. :(

Does the recent bitfield lowering activity have any effect on this? I.e. 
does it make it a moot point by the time we get to the widen_mult pass?


Andrew


Re: [PATCH (2/7)] Widening multiplies by more than one mode

2011-07-12 Thread Richard Guenther
On Tue, Jul 12, 2011 at 1:26 PM, Andrew Stubbs  wrote:
> On 12/07/11 12:05, Richard Guenther wrote:
>>
>> (**) We really ought to forbid any arithmetic on types that have non-mode
>> precision and only allow conversions to/from such types.
>
> Hmmm, presumably the problem is that we might have a compatible precision,
> but the backends actually work with purely mode-sized types?
>
> That does sound problematic. :(
>
> Does the recent bitfield lowering activity have any effect on this? I.e.
> does it make it a moot point by the time we get to the widen_mult pass?

No, the bitfield lowering will only change the types of memory loads,
not the types of the quantities we eventually see in the IL.  Thus for
my example we'd still see the casts from 7-bit types.

Richard.

> Andrew
>


Re: [Patch, AVR]: Fix PR49687: Better widening mul 16=8*8

2011-07-12 Thread Georg-Johann Lay
Andrew Stubbs wrote:
> On 12/07/11 11:35, Georg-Johann Lay wrote:
>> +(define_insn "*mulsu"
>> +  [(set (match_operand:HI 0
>> "register_operand" "=r")
>> +(mult:HI (sign_extend:HI (match_operand:QI 1
>> "register_operand" "a"))
>> + (zero_extend:HI (match_operand:QI 2
>> "register_operand" "a"))))]
>> +  "AVR_HAVE_MUL"
>> +  "mulsu %1,%2
>> +movw %0,r0
>> +clr __zero_reg__"
>> +  [(set_attr "length" "3")
>> +   (set_attr "cc" "clobber")])
>> +
>> +(define_insn "*mulus"
>> +  [(set (match_operand:HI 0
>> "register_operand" "=r")
>> +(mult:HI (zero_extend:HI (match_operand:QI 1
>> "register_operand" "a"))
>> + (sign_extend:HI (match_operand:QI 2
>> "register_operand" "a"))))]
>> +  "AVR_HAVE_MUL"
>> +  "mulsu %2,%1
>> +movw %0,r0
>> +clr __zero_reg__"
>> +  [(set_attr "length" "3")
>> +   (set_attr "cc" "clobber")])
> 
> 1. You should name that "usmulqihi3" (no star), so the optimizers can
> see it.

Thanks for pointing this out!  I confused "us" with unsigned saturate.

> 2. There's no need to define both of these. For one thing, putting a '%'
> at the start of the constraint list  for operand 1 does precisely this,
> but more importantly, I'm pretty sure one form or the other is the
> canonical form and the other should never occur. If it's not, maybe it
> should be?

I don't know if combine does any canonicalization for that.
The % is not correct because the insn is not commutative.
AFAIK for an abelian operation there's no advantage in using '%' if the
constraints are the same.
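
To illustrate why the mixed product is not commutative in its raw
operands, here is a small C model. It is only a sketch under stated
assumptions: the helper names are hypothetical, and the model describes
the arithmetic of a signed-times-unsigned multiply rather than anything
in the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret an 8-bit pattern as a signed value (sign-extend).  */
static int sign_extend8 (uint8_t bits)
{
  return (bits & 0x80) ? (int) bits - 256 : (int) bits;
}

/* Interpret an 8-bit pattern as an unsigned value (zero-extend).  */
static int zero_extend8 (uint8_t bits)
{
  return (int) bits;
}

/* Model of the mixed product: first operand signed, second unsigned.  */
int mul_signed_unsigned (uint8_t x, uint8_t y)
{
  return sign_extend8 (x) * zero_extend8 (y);
}

/* Hypothetical self-check: swapping the raw operands changes the result,
   because the extension kind stays with the operand position.  */
void mixed_mul_selfcheck (void)
{
  assert (mul_signed_unsigned (0xFF, 0x02) == -2);   /* (-1) * 2   */
  assert (mul_signed_unsigned (0x02, 0xFF) == 510);  /*   2  * 255 */
}
```

Swapping the operands only gives the same value if the sign_extend and
zero_extend are swapped along with them, which a '%' constraint modifier
does not do.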

> 
> Andrew

Attached revised patch.

Ok to commit?

Johann
PR target/49687
* config/avr/avr.md (mulhi3): Use register_or_s8_u8_operand for
operand2 and expand appropriately if there is a CONST_INT in
operand2.
(usmulqihi3): New insn.
(mulsqihi3): New insn.
(muluqihi3): New insn.
(*muluqihi3.uconst): New insn_and_split.
(*muluqihi3.sconst): New insn_and_split.
(*mulsqihi3.sconst): New insn_and_split.
(*mulsqihi3.uconst): New insn_and_split.
(*ashifthi3.signx.const): New insn_and_split.
(*ashifthi3.signx.const7): New insn_and_split.
(*ashifthi3.zerox.const): New insn_and_split.
* config/avr/avr.c (avr_rtx_costs): Report costs of above insns.
(avr_gate_split1): New function.
* config/avr/avr-protos.h (avr_gate_split1): New prototype.
* config/avr/predicates.md (const_2_to_7_operand): New.
(const_2_to_6_operand): New.
(u8_operand): New.
(s8_operand): New.
(register_or_s8_u8_operand): New.



Index: config/avr/predicates.md
===
--- config/avr/predicates.md	(revision 176136)
+++ config/avr/predicates.md	(working copy)
@@ -73,6 +73,16 @@ (define_predicate "const_0_to_7_operand"
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 7)")))
 
+;; Return 1 if OP is constant integer 2..7 for MODE.
+(define_predicate "const_2_to_7_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 2, 7)")))
+
+;; Return 1 if OP is constant integer 2..6 for MODE.
+(define_predicate "const_2_to_6_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 2, 6)")))
+
 ;; Returns true if OP is either the constant zero or a register.
 (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "register_operand")
@@ -156,3 +166,17 @@ (define_predicate "const_8_16_24_operand
   (and (match_code "const_int")
(match_test "8 == INTVAL(op) || 16 == INTVAL(op) || 24 == INTVAL(op)")))
 
+;; Unsigned CONST_INT that fits in 8 bits, i.e. 0..255.
+(define_predicate "u8_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 0, 255)")))
+
+;; Signed CONST_INT that fits in 8 bits, i.e. -128..127.
+(define_predicate "s8_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), -128, 127)")))
+
+(define_predicate "register_or_s8_u8_operand"
+  (ior (match_operand 0 "register_operand")
+   (match_operand 0 "u8_operand")
+   (match_operand 0 "s8_operand")))
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 176136)
+++ config/avr/avr.md	(working copy)
@@ -1017,19 +1017,235 @@ (define_insn "umulqihi3"
   [(set_attr "length" "3")
(set_attr "cc" "clobber")])
 
+(define_insn "usmulqihi3"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (mult:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "a"))
+                 (sign_extend:HI (match_operand:QI 2 "register_operand" "a"))))]
+  "AVR_HAVE_MUL"
+  "mulsu %2,%1
+	movw %0,r0
+	clr __zero_reg__"
+  [(set_attr "length" "3")
+   (set_attr "cc" "clobber")])
+
+;***

Re: [Patch, AVR]: Fix PR49687: Better widening mul 16=8*8

2011-07-12 Thread Georg-Johann Lay
Bernd Schmidt wrote:
> On 07/12/11 13:04, Andrew Stubbs wrote:
>> On 12/07/11 11:35, Georg-Johann Lay wrote:
>>> +(define_insn "*mulsu"
>>> +  [(set (match_operand:HI 0
>>> "register_operand" "=r")
>>> +(mult:HI (sign_extend:HI (match_operand:QI 1
>>> "register_operand" "a"))
>>> + (zero_extend:HI (match_operand:QI 2
>>> "register_operand" "a"))))]
>>> +  "AVR_HAVE_MUL"
>>> +  "mulsu %1,%2
>>> +movw %0,r0
>>> +clr __zero_reg__"
>>> +  [(set_attr "length" "3")
>>> +   (set_attr "cc" "clobber")])
>>> +
>>> +(define_insn "*mulus"
>>> +  [(set (match_operand:HI 0
>>> "register_operand" "=r")
>>> +(mult:HI (zero_extend:HI (match_operand:QI 1
>>> "register_operand" "a"))
>>> + (sign_extend:HI (match_operand:QI 2
>>> "register_operand" "a"))))]
>>> +  "AVR_HAVE_MUL"
>>> +  "mulsu %2,%1
>>> +movw %0,r0
>>> +clr __zero_reg__"
>>> +  [(set_attr "length" "3")
>>> +   (set_attr "cc" "clobber")])
>> 1. You should name that "usmulqihi3" (no star), so the optimizers can
>> see it.
>>
>> 2. There's no need to define both of these. For one thing, putting a '%'
>> at the start of the constraint list  for operand 1 does precisely this,
> 
> Unfortunately it doesn't. It won't swap the sign/zero-extend.
> 
> 
> Bernd

Thanks for the clarification.

Here is version #3 of the patch with an additional insn similar to
usmulqihi3 but with operands swapped ("*sumulqihi3").

Ok to commit?

Johann


PR target/49687
* config/avr/avr.md (mulhi3): Use register_or_s8_u8_operand for
operand2 and expand appropriately if there is a CONST_INT in
operand2.
(usmulqihi3): New insn.
(*sumulqihi3): New insn.
(mulsqihi3): New insn.
(muluqihi3): New insn.
(*muluqihi3.uconst): New insn_and_split.
(*muluqihi3.sconst): New insn_and_split.
(*mulsqihi3.sconst): New insn_and_split.
(*mulsqihi3.uconst): New insn_and_split.
(*ashifthi3.signx.const): New insn_and_split.
(*ashifthi3.signx.const7): New insn_and_split.
(*ashifthi3.zerox.const): New insn_and_split.
* config/avr/avr.c (avr_rtx_costs): Report costs of above insns.
(avr_gate_split1): New function.
* config/avr/avr-protos.h (avr_gate_split1): New prototype.
* config/avr/predicates.md (const_2_to_7_operand): New.
(const_2_to_6_operand): New.
(u8_operand): New.
(s8_operand): New.
(register_or_s8_u8_operand): New.


Index: config/avr/predicates.md
===
--- config/avr/predicates.md	(revision 176136)
+++ config/avr/predicates.md	(working copy)
@@ -73,6 +73,16 @@ (define_predicate "const_0_to_7_operand"
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 7)")))
 
+;; Return 1 if OP is constant integer 2..7 for MODE.
+(define_predicate "const_2_to_7_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 2, 7)")))
+
+;; Return 1 if OP is constant integer 2..6 for MODE.
+(define_predicate "const_2_to_6_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 2, 6)")))
+
 ;; Returns true if OP is either the constant zero or a register.
 (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "register_operand")
@@ -156,3 +166,17 @@ (define_predicate "const_8_16_24_operand
   (and (match_code "const_int")
(match_test "8 == INTVAL(op) || 16 == INTVAL(op) || 24 == INTVAL(op)")))
 
+;; Unsigned CONST_INT that fits in 8 bits, i.e. 0..255.
+(define_predicate "u8_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 0, 255)")))
+
+;; Signed CONST_INT that fits in 8 bits, i.e. -128..127.
+(define_predicate "s8_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), -128, 127)")))
+
+(define_predicate "register_or_s8_u8_operand"
+  (ior (match_operand 0 "register_operand")
+   (match_operand 0 "u8_operand")
+   (match_operand 0 "s8_operand")))
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 176136)
+++ config/avr/avr.md	(working copy)
@@ -1017,19 +1017,249 @@ (define_insn "umulqihi3"
   [(set_attr "length" "3")
(set_attr "cc" "clobber")])
 
+(define_insn "usmulqihi3"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (mult:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "a"))
+                 (sign_extend:HI (match_operand:QI 2 "register_operand" "a"))))]
+  "AVR_HAVE_MUL"
+  "mulsu %2,%1
+	movw %0,r0
+	clr __zero_reg__"
+  [(set_attr "length" "3")
+   (set_attr "cc" "clobber")])
+
+;; Above insn is not canonicalized by insn combine, so here is a version with
+;; operands swapped.
+
+(define_insn "*sumulqihi3"
+  [(set (match_operand:HI 0 "register_operand"  

Re: Cgraph alias reorg 8/14 (ipa-cp and ipa-prop update)

2011-07-12 Thread Maxim Kuvyrkov
On Jun 10, 2011, at 6:55 PM, Jan Hubicka wrote:

> Hi,
> this patch updates ipa-cp and ipa-prop for aliases.  It is basically easy -
> we don't analyze nodes representing aliases, and when propagating we skip
> them, like everywhere else.
...
> @@ -759,7 +768,8 @@ ipcp_propagate_stage (void)
> 
>   for (cs = node->callees; cs; cs = cs->next_callee)
>   {
> -   struct ipa_node_params *callee_info = IPA_NODE_REF (cs->callee);
> +   struct cgraph_node *callee = cgraph_function_or_thunk_node (cs->callee, NULL);
> +   struct ipa_node_params *callee_info = IPA_NODE_REF (callee);
> struct ipa_edge_args *args = IPA_EDGE_REF (cs);
> 
> if (ipa_is_called_with_var_arguments (callee_info)
> @@ -778,11 +788,11 @@ ipcp_propagate_stage (void)
>   {
> dest_lat->type = new_lat.type;
> dest_lat->constant = new_lat.constant;
> -   ipa_push_func_to_list (&wl, cs->callee);
> +   ipa_push_func_to_list (&wl, callee);
>   }
> 
> if (ipcp_propagate_types (info, callee_info, jump_func, i))
> - ipa_push_func_to_list (&wl, cs->callee);
> + ipa_push_func_to_list (&wl, callee);
>   }
>   }
> }

Jan,

I have a question about the above hunk.  With this hunk you replace all uses of 
'cs->callee' with 'callee' in ipcp_propagate_stage() except for in the check:

  if (ipa_is_called_with_var_arguments (callee_info)
  || !cs->callee->analyzed
  || ipa_is_called_with_var_arguments (callee_info))
continue;

Is there a reason why you left 'cs->callee' intact in this case?

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics



Re: [patch tree-optimization]: [2 of 3]: Boolify compares & more

2011-07-12 Thread Kai Tietz
2011/7/12 Richard Guenther :
> On Tue, Jul 12, 2011 at 11:48 AM, Kai Tietz  wrote:
>> 2011/7/12 Richard Guenther :
>>> On Mon, Jul 11, 2011 at 5:37 PM, Kai Tietz  wrote:
>>>> 2011/7/8 Richard Guenther :
>>>>> On Fri, Jul 8, 2011 at 1:32 PM, Kai Tietz  wrote:
>>>>>> 2011/7/8 Richard Guenther :
>>>>>>> On Thu, Jul 7, 2011 at 6:07 PM, Kai Tietz  wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> This patch - second of series - adds boolification of comparisons in
>>>>>>>> the gimplifier.  For this, casts from/to boolean are marked as
>>>>>>>> not-useless, and in fold_unary_loc casts to non-boolean integral
>>>>>>>> types are preserved.
>>>>>>>> The hunk in tree-ssa-forwprop.c in combine_cond_expr_cond is not
>>>>>>>> strictly necessary - as long as fold-const handles 1-bit precision
>>>>>>>> bitwise-expressions with truth-logic - but it has been shown to
>>>>>>>> short-cut some more expensive folding.  So I kept it within this patch.
>>>>>>>
>>>>>>> Please split it out.  Also ...
>>>>>>>
>>>>>>>>
>>>>>>>> The adjusted testcase gcc.dg/uninit-15.c indicates that due to
>>>>>>>> optimization we lose the variable's declaration in this case.  But
>>>>>>>> this might be expected.
>>>>>>>>
>>>>>>>> In vectorization we have a regression in the gcc.dg/vect/vect-cond-3.c
>>>>>>>> test-case.  It's caused by always having boolean-type on conditions.
>>>>>>>> So the vectorizer sees different types, which it does not handle
>>>>>>>> right now.  Maybe this issue could be special-cased for boolean-types
>>>>>>>> in tree-vect-loop, by making the operand for the used condition equal
>>>>>>>> to the vector-type.  But this is a subject for a different patch and
>>>>>>>> is not addressed by this series.
>>>>>>>>
>>>>>>>> There is a regression in tree-ssa/vrp47.c, and the fix is addressed
>>>>>>>> by the 3rd patch of this series.
>>>>>>>>
>>>>>>>> Bootstrapped and regression tested for all standard-languages (plus
>>>>>>>> Ada and Obj-C++) on host x86_64-pc-linux-gnu.
>>>>>>>>
>>>>>>>> Ok for apply?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Kai
>>>>>>>>
>>>>>>>>
>>>>>>>> ChangeLog
>>>>>>>>
>>>>>>>> 2011-07-07  Kai Tietz  
>>>>>>>>
>>>>>>>>        * fold-const.c (fold_unary_loc): Preserve
>>>>>>>>        non-boolean-typed casts.
>>>>>>>>        * gimplify.c (gimple_boolify): Handle boolification
>>>>>>>>        of comparisons.
>>>>>>>>        (gimplify_expr): Boolify non aggregate-typed
>>>>>>>>        comparisons.
>>>>>>>>        * tree-cfg.c (verify_gimple_comparison): Check result
>>>>>>>>        type of comparison expression.
>>>>>>>>        * tree-ssa.c (useless_type_conversion_p): Preserve incompatible
>>>>>>>>        casts from/to boolean.
>>>>>>>>        * tree-ssa-forwprop.c (combine_cond_expr_cond): Add simplification
>>>>>>>>        support for one-bit-precision typed X for cases X != 0 and X == 0.
>>>>>>>>        (forward_propagate_comparison): Adjust test of condition
>>>>>>>>        result.
>>>>>>>>
>>>>>>>>
>>>>>>>>        * gcc.dg/tree-ssa/builtin-expect-5.c: Adjusted.
>>>>>>>>        * gcc.dg/tree-ssa/pr21031.c: Likewise.
>>>>>>>>        * gcc.dg/tree-ssa/pr30978.c: Likewise.
>>>>>>>>        * gcc.dg/tree-ssa/ssa-fre-6.c: Likewise.
>>>>>>>>        * gcc.dg/binop-xor1.c: Mark it as expected fail.
>>>>>>>>        * gcc.dg/binop-xor3.c: Likewise.
>>>>>>>>        * gcc.dg/uninit-15.c: Adjust reported message.
>>>>>>>>
>>>>>>>> Index: gcc-head/gcc/fold-const.c
>>>>>>>> ===
>>>>>>>> --- gcc-head.orig/gcc/fold-const.c
>>>>>>>> +++ gcc-head/gcc/fold-const.c
>>>>>>>> @@ -7665,11 +7665,11 @@ fold_unary_loc (location_t loc, enum tre
>>>>>>>>             non-integral type.
>>>>>>>>             Do not fold the result as that would not simplify further, also
>>>>>>>>             folding again results in recursions.  */
>>>>>>>> -         if (INTEGRAL_TYPE_P (type))
>>>>>>>> +         if (TREE_CODE (type) == BOOLEAN_TYPE)
>>>>>>>>            return build2_loc (loc, TREE_CODE (op0), type,
>>>>>>>>                               TREE_OPERAND (op0, 0),
>>>>>>>>                               TREE_OPERAND (op0, 1));
>>>>>>>> -         else
>>>>>>>> +         else if (!INTEGRAL_TYPE_P (type))
>>>>>>>>            return build3_loc (loc, COND_EXPR, type, op0,
>>>>>>>>                               fold_convert (type, boolean_true_node),
>>>>>>>>                               fold_convert (type, boolean_false_node));
>>>>>>>> Index: gcc-head/gcc/gimplify.c
>>>>>>>> ===
>>>>>>>> --- gcc-head.orig/gcc/gimplify.c
>>>>>>>> +++ gcc-head/gcc/gimplify.c
>>>>>>>> @@ -2842,18 +2842,23 @@ gimple_boolify (tree expr)
>>>>>>>>
>>>>>>>>     case TRUTH_NOT_EXPR:
>>>>>>>>       TREE_OPERAND (expr, 0) = gimple_boolify (TREE_OPERAND (expr, 0));
>>>>>>>> -      /* FALLTHRU */
>>>>>>>>
>>>>>>>> -    case EQ_EXPR: case NE_EXPR:
>>>>>>>> -    case LE_EXPR: case GE_EXPR: case
Re: [PATCH][1/N][C][C++][Fortran][Java] Change POINTER_PLUS_EXPR offset type requirements

2011-07-12 Thread Tobias Burnus

On 07/12/2011 12:33 PM, Richard Guenther wrote:

> Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages.
>
> Are the frontend parts ok?


The Fortran bits are OK.

Tobias


2011-07-12  Richard Guenther

fortran/
* trans-expr.c (fill_with_spaces): Use fold_build_pointer_plus.
(gfc_trans_string_copy): Likewise.
* trans-intrinsic.c (gfc_conv_intrinsic_repeat): Likewise.
* trans-types.c (gfc_get_array_descr_info): Likewise.
* trans.c (gfc_build_array_ref): Likewise.


Re: [Patch, libfortran] PR 49296 List read of file without EOR

2011-07-12 Thread Jerry DeLisle

On 07/12/2011 01:05 AM, Janne Blomqvist wrote:

PING

On Mon, Jul 4, 2011 at 00:57, Janne Blomqvist  wrote:

Hi,

the attached patch fixes the remaining cases of handling input that
ends in EOF instead of a normal separator for list formatted read of
the primitive types. Ok for trunk and 4.6?



Yes, OK.  I have been on vacation, so I missed this until now.

Thanks for the patch.

Jerry


[PATCH Atom][PR middle-end/44382] Tree reassociation improvement

2011-07-12 Thread Илья Энкович
Hello,

Here is a patch that addresses a missed optimization opportunity in the
tree reassoc phase.

Currently the tree reassoc phase always generates a linear form, which
requires the fewest registers but has the greatest tree height and
does not allow computation to be performed in parallel. This may be
critical for performance if the required operations have high latency
but can be pipelined (i.e. few execution units or low throughput). The
problem is important on current Atom processors, which are in-order and
have many such instructions: IMUL and scalar SSE FP instructions.

This patch introduces a new feature in the tree reassoc phase that
generates a computation tree of reduced height, allowing a few
long-latency instructions to execute in parallel. It changes only one
part of reassociation: rewrite_expr_tree. The level of parallelism is
controlled via a target hook and/or a command line option.
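
The difference between the linear form and a reduced-height form can be sketched in plain C. This is only an illustration of the idea, not code from the patch: the function names are hypothetical, and unsigned arithmetic is used so that both orderings give exactly the same result (for floating point, reassociation can change rounding).

```c
#include <stddef.h>

/* Linear form: a single accumulator.  Every addition depends on the
   previous one, so the dependency chain grows with n.  */
unsigned long
sum_linear (const unsigned long *a, size_t n)
{
  unsigned long acc = 0;
  for (size_t i = 0; i < n; i++)
    acc += a[i];
  return acc;
}

/* Width-2 form: two independent accumulators.  The two chains do not
   depend on each other, so an in-order core can overlap their
   long-latency operations; the cost is one extra live register.  */
unsigned long
sum_width2 (const unsigned long *a, size_t n)
{
  unsigned long acc0 = 0, acc1 = 0;
  size_t i;
  for (i = 0; i + 1 < n; i += 2)
    {
      acc0 += a[i];
      acc1 += a[i + 1];
    }
  if (i < n)
    acc0 += a[i];		/* odd element left over */
  return acc0 + acc1;
}

/* Height of a fully balanced tree over n operands: ceil (log2 (n)).  */
unsigned
balanced_height (size_t n)
{
  unsigned h = 0;
  for (size_t span = 1; span < n; span *= 2)
    h++;
  return h;
}
```

For eight operands the linear chain needs seven dependent additions, while a balanced tree has height three; a width of two sits between those extremes.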

The new feature is enabled by default for Atom only. The patch mostly
boosts the CFP2000 geomean on Atom: +3.04% for 32 bit and +0.32% for 64 bit.

Bootstrapped and checked on x86_64-linux.

Thanks,
Ilya
--
gcc/

2011-07-12  Enkovich Ilya  

* target.def (reassociation_width): New hook.

* doc/tm.texi.in (reassociation_width): New hook documentation.

* doc/tm.texi (reassociation_width): Likewise.

* hooks.h (hook_int_const_gimple_1): New default hook.

* hooks.c (hook_int_const_gimple_1): Likewise.

* config/i386/i386.h (ix86_tune_indices): Add
X86_TUNE_REASSOC_INT_TO_PARALLEL and
X86_TUNE_REASSOC_FP_TO_PARALLEL.

(TARGET_REASSOC_INT_TO_PARALLEL): New.
(TARGET_REASSOC_FP_TO_PARALLEL): Likewise.

* config/i386/i386.c (initial_ix86_tune_features): Add
X86_TUNE_REASSOC_INT_TO_PARALLEL and
X86_TUNE_REASSOC_FP_TO_PARALLEL.

(ix86_reassociation_width): New function implementing the
new hook for the i386 target.

* common.opt (ftree-reassoc-width): New option added.

* tree-ssa-reassoc.c (get_required_cycles): New function.
(get_reassociation_width): Likewise.
(rewrite_expr_tree_parallel): Likewise.

(reassociate_bb): Check the reassociation width to be used
and call rewrite_expr_tree_parallel instead of rewrite_expr_tree
if needed.

(pass_reassoc): TODO_remove_unused_locals flag added.

gcc/testsuite/

2011-07-12  Enkovich Ilya  

* gcc.dg/tree-ssa/pr38533.c (dg-options): Added option
-ftree-reassoc-width=1.

* gcc.dg/tree-ssa/reassoc-24.c: New test.
* gcc.dg/tree-ssa/reassoc-25.c: Likewise.


PR44382.diff
Description: Binary data


Re: [pph] Remove protection for NAMESPACE_LEVEL being null when adding namespaces (issue4675069)

2011-07-12 Thread Diego Novillo
On Fri, Jul 8, 2011 at 21:29, Gabriel Charette  wrote:

> 2011-07-08  Gabriel Charette  
>
>        * gcc/cp/pph-streamer-in.c (pph_add_bindings_to_namespace):
>        NAMESPACE_LEVEL should never be null for a namespace, removed check.

OK.  I committed it to the branch.  I need to adjust the other two
patches you sent as I'm starting to fix something in the same area.


Diego.


Re: [Patch 0/3] ARM 64 bit atomic operations

2011-07-12 Thread Ramana Radhakrishnan
>
> It's been tested cross to ARM from x86 and also a native x86 build & test.

Cross on qemu? You do mean a native ARM build and test :) You are
missing ChangeLog entries in each of your patches. Could you please
reply to each of your patches with the appropriate ChangeLog entries?

Thanks,
Ramana


Re: [PATCH][1/N][C][C++][Fortran][Java] Change POINTER_PLUS_EXPR offset type requirements

2011-07-12 Thread Jason Merrill

The C++ changes are OK.

Jason


[pph] Mark c4pr36533.cc fixed (issue4708041)

2011-07-12 Thread Diego Novillo
Not quite sure what patch fixed this one and I'm too lazy to figure it out
now.

Committed to branch.


Diego.

* g++.dg/pph/c4pr36533.cc: Mark fixed.

diff --git a/gcc/testsuite/g++.dg/pph/c4pr36533.cc b/gcc/testsuite/g++.dg/pph/c4pr36533.cc
index 0093db1..ce2bf1f 100644
--- a/gcc/testsuite/g++.dg/pph/c4pr36533.cc
+++ b/gcc/testsuite/g++.dg/pph/c4pr36533.cc
@@ -1,3 +1,2 @@
 /* { dg-options "-w -fpermissive" } */
-// pph asm xdiff
 #include "c4pr36533.h"

--
This patch is available for review at http://codereview.appspot.com/4708041


[gomp-3.1] Some #pragma omp atomic capture changes

2011-07-12 Thread Jakub Jelinek
Hi!

This patch fixes atomic capture with structure block and
pre/post inc/decrement and adds tests for it (these
forms weren't in the 3.1 draft and were parsed by mistake
before and for x++; v = x; and x--; v = x; it actually ICEd).
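
As a rough illustration of the two structured-block forms mentioned above, the following sketch shows which value ends up in v. It assumes a compiler with OpenMP 3.1 support; the variable x and the function names are hypothetical and are not taken from the testsuite. Without OpenMP enabled the pragmas are ignored and the functions still return the same values in a single thread.

```c
/* Shared counter; hypothetical name used only for this sketch.  */
static int x;

void
set_x (int value)
{
  x = value;
}

/* { v = x; x++; }: v captures the value before the update.  */
int
capture_then_increment (void)
{
  int v;
#pragma omp atomic capture
  { v = x; x++; }
  return v;
}

/* { x++; v = x; }: v captures the value after the update, even though
   the update is written as a post-increment.  */
int
increment_then_capture (void)
{
  int v;
#pragma omp atomic capture
  { x++; v = x; }
  return v;
}
```

Starting from x == 62, capture_then_increment returns 62 and leaves x at 63; a following increment_then_capture returns 64.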

I haven't added the x = x binop expr forms yet for either
atomic update or atomic capture, because the new standard is fuzzy
about those late additions and I'm waiting for clarifications
on openmp.org/forum/.

2011-07-12  Jakub Jelinek  

* c-parser.c (c_parser_omp_atomic): Fix handling of
#pragma omp atomic capture { x++; v = x; } and
#pragma omp atomic capture { x--; v = x; }.

* parser.c (cp_parser_omp_atomic): Fix handling of
#pragma omp atomic capture { x++; v = x; } and
#pragma omp atomic capture { x--; v = x; }.

* testsuite/libgomp.c/atomic-11.c: Add new tests.
* testsuite/libgomp.c/atomic-12.c: Likewise.
* testsuite/libgomp.c++/atomic-2.C: Likewise.
* testsuite/libgomp.c++/atomic-3.C: Likewise.
* testsuite/libgomp.c++/atomic-4.C: Likewise.
* testsuite/libgomp.c++/atomic-5.C: Likewise.

--- gcc/c-parser.c.jj   2011-07-11 17:48:18.0 +0200
+++ gcc/c-parser.c  2011-07-12 13:10:57.0 +0200
@@ -9157,7 +9157,7 @@ c_parser_omp_structured_block (c_parser 
capture-stmt:
  v = x binop= expr | v = x++ | v = ++x | v = x-- | v = --x
capture-block:
- { v = x; x binop= expr; } | { x binop= expr; v = x; }
+ { v = x; expression-stmt; } | { expression-stmt; v = x; }
 
   where x and v are lvalue expressions with scalar type.
 
@@ -9253,7 +9253,7 @@ restart:
   return;
 
 case POSTINCREMENT_EXPR:
-  if (code == OMP_ATOMIC_CAPTURE_NEW)
+  if (code == OMP_ATOMIC_CAPTURE_NEW && !structured_block)
code = OMP_ATOMIC_CAPTURE_OLD;
   /* FALLTHROUGH */
 case PREINCREMENT_EXPR:
@@ -9263,7 +9263,7 @@ restart:
   break;
 
 case POSTDECREMENT_EXPR:
-  if (code == OMP_ATOMIC_CAPTURE_NEW)
+  if (code == OMP_ATOMIC_CAPTURE_NEW && !structured_block)
code = OMP_ATOMIC_CAPTURE_OLD;
   /* FALLTHROUGH */
 case PREDECREMENT_EXPR:
@@ -9295,6 +9295,7 @@ restart:
  lhs = TREE_OPERAND (lhs, 0);
  opcode = NOP_EXPR;
  if (code == OMP_ATOMIC_CAPTURE_NEW
+ && !structured_block
  && TREE_CODE (orig_lhs) == COMPOUND_EXPR)
code = OMP_ATOMIC_CAPTURE_OLD;
  break;
@@ -9308,6 +9309,7 @@ restart:
  lhs = TREE_OPERAND (lhs, 0);
  opcode = NOP_EXPR;
  if (code == OMP_ATOMIC_CAPTURE_NEW
+ && !structured_block
  && TREE_CODE (orig_lhs) == COMPOUND_EXPR)
code = OMP_ATOMIC_CAPTURE_OLD;
  break;
--- gcc/cp/parser.c.jj  2011-07-11 17:43:43.0 +0200
+++ gcc/cp/parser.c 2011-07-12 13:48:18.0 +0200
@@ -24219,7 +24219,7 @@ cp_parser_omp_structured_block (cp_parse
capture-stmt:
  v = x binop= expr | v = x++ | v = ++x | v = x-- | v = --x
capture-block:
- { v = x; x binop= expr; } | { x binop= expr; v = x; }
+ { v = x; expression-stmt; } | { expression-stmt; v = x; }
 
   where x and v are lvalue expressions with scalar type.  */
 
@@ -24307,7 +24307,7 @@ restart:
   goto saw_error;
 
 case POSTINCREMENT_EXPR:
-  if (code == OMP_ATOMIC_CAPTURE_NEW)
+  if (code == OMP_ATOMIC_CAPTURE_NEW && !structured_block)
code = OMP_ATOMIC_CAPTURE_OLD;
   /* FALLTHROUGH */
 case PREINCREMENT_EXPR:
@@ -24317,7 +24317,7 @@ restart:
   break;
 
 case POSTDECREMENT_EXPR:
-  if (code == OMP_ATOMIC_CAPTURE_NEW)
+  if (code == OMP_ATOMIC_CAPTURE_NEW && !structured_block)
code = OMP_ATOMIC_CAPTURE_OLD;
   /* FALLTHROUGH */
 case PREDECREMENT_EXPR:
@@ -24349,6 +24349,7 @@ restart:
  lhs = TREE_OPERAND (lhs, 0);
  opcode = NOP_EXPR;
  if (code == OMP_ATOMIC_CAPTURE_NEW
+ && !structured_block
  && TREE_CODE (orig_lhs) == COMPOUND_EXPR)
code = OMP_ATOMIC_CAPTURE_OLD;
  break;
--- libgomp/testsuite/libgomp.c/atomic-11.c.jj  2011-04-20 18:31:25.0 +0200
+++ libgomp/testsuite/libgomp.c/atomic-11.c 2011-07-12 14:05:32.0 +0200
@@ -60,6 +60,50 @@ main (void)
 v = x;
   if (v != 62)
 abort ();
+  #pragma omp atomic capture
+{ v = x; x++; }
+  if (v != 62)
+abort ();
+  #pragma omp atomic capture
+{ v = x; ++x; }
+  if (v != 63)
+abort ();
+  #pragma omp atomic capture
+{
+  ++x;
+  v = x;
+}
+  if (v != 65)
+abort ();
+  #pragma omp atomic capture
+{ x++; v = x; }
+  if (v != 66)
+abort ();
+  #pragma omp atomic read
+v = x;
+  if (v != 66)
+abort ();
+  #pragma omp atomic capture
+{ v = x; x--; }
+  if (v != 66)
+abort ();
+  #pragma omp atomic capture
+{ v = x; --x; }
+  if (v != 65)
+abort ();
+  #pragma 

[gomp-3.1] Add a testcase for copyin of unallocated allocatable

2011-07-12 Thread Jakub Jelinek
Hi!

The final standard now explicitly lists how copyin should copy allocatables.
Here is a testcase for something that hasn't been covered by the testsuite
yet.

2011-07-12  Jakub Jelinek  

* testsuite/libgomp.fortran/allocatable8.f90: New test.

--- libgomp/testsuite/libgomp.fortran/allocatable8.f90.jj   2011-07-12 15:42:56.0 +0200
+++ libgomp/testsuite/libgomp.fortran/allocatable8.f90  2011-07-12 15:45:00.0 +0200
@@ -0,0 +1,14 @@
+! { dg-do run }
+! { dg-require-effective-target tls_runtime }
+!$ use omp_lib
+
+  integer, save, allocatable :: a(:, :)
+  logical :: l
+!$omp threadprivate (a)
+  if (allocated (a)) call abort
+  l = .false.
+!$omp parallel copyin (a) num_threads (4) reduction(.or.:l)
+  l = l.or.allocated (a)
+!$omp end parallel
+  if (l.or.allocated (a)) call abort
+end

Jakub


Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-07-12 Thread Andrew Stubbs

On 04/07/11 15:26, Andrew Stubbs wrote:

On 28/06/11 15:14, Andrew Stubbs wrote:

On 28/06/11 13:33, Andrew Stubbs wrote:

On 23/06/11 15:41, Andrew Stubbs wrote:

If one or both of the inputs to a widening multiply are of unsigned
type then the compiler will attempt to use usmul_widen_optab or
umul_widen_optab, respectively.

That works fine, but only if the target supports those operations
directly. Otherwise, it just bombs out and reverts to the normal
inefficient non-widening multiply.

This patch attempts to catch these cases and use an alternative signed
widening multiply instruction, if one of those is available.

I believe this should be legal as long as the top bit of both inputs is
guaranteed to be zero. The code achieves this guarantee by
zero-extending the inputs to a wider mode (which must still be narrower
than the output mode).
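
A small C sketch of the argument, under the assumption of an 8-bit unsigned operand, an 8-bit signed operand and a target that only offers a signed 16x16->32 widening multiply. The helper names are hypothetical and the code only models the arithmetic, not the GIMPLE transformation itself.

```c
#include <stdint.h>

/* Stand-in for a signed widening multiply instruction: 16x16 -> 32.  */
static int32_t
smul_widen_16 (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}

/* Unsigned 8-bit times signed 8-bit, done with the signed multiply.
   Zero-extending the unsigned input to 16 bits guarantees that its
   top bit is clear, so reading it as a signed value does not change
   what it means.  The intermediate mode (16 bits) is still narrower
   than the output mode (32 bits).  */
int32_t
mul_u8_s8 (uint8_t a, int8_t b)
{
  int16_t wide_a = (int16_t) a;	/* zero-extension, always in 0..255 */
  int16_t wide_b = (int16_t) b;	/* sign-extension */
  return smul_widen_16 (wide_a, wide_b);
}
```

Treating the 8-bit unsigned input directly as signed would turn 255 into -1; widening it first is what keeps the product correct.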

OK?


This update fixes the testsuite issue Janis pointed out.


And this one fixes up the wmul-5.c testcase also. The patch has changed
the correct result.


Here's an update for the context changed by the update to patch 3.

The content of the patch has not changed.


This update does the same thing as before, but updated for the changes 
earlier in the patch series. In particular, the build_and_insert_cast 
function and find_widening_optab_handler_and_mode changes have been 
moved up to patch 2.


OK?

Andrew
2011-07-12  Andrew Stubbs  

	gcc/
	* tree-ssa-math-opts.c (convert_mult_to_widen): Convert
	unsupported unsigned multiplies to signed.
	(convert_plusminus_to_widen): Likewise.

	gcc/testsuite/
	* gcc.target/arm/wmul-6.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+  return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2071,6 +2071,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
   enum insn_code handler;
   enum machine_mode to_mode, from_mode;
   optab op;
+  bool do_cast = false;
 
   lhs = gimple_assign_lhs (stmt);
   type = TREE_TYPE (lhs);
@@ -2094,9 +2095,32 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
 		  0, &from_mode);
 
   if (handler == CODE_FOR_nothing)
-return false;
+{
+  if (op != smul_widen_optab)
+	{
+	  from_mode = GET_MODE_WIDER_MODE (from_mode);
+	  if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+	return false;
+
+	  op = smul_widen_optab;
+	  handler = find_widening_optab_handler_and_mode (op, to_mode,
+			  from_mode, 0,
+			  &from_mode);
 
-  if (from_mode != TYPE_MODE (type1))
+	  if (handler == CODE_FOR_nothing)
+	return false;
+
+	  type1 = build_nonstandard_integer_type (
+	GET_MODE_PRECISION (from_mode),
+	0);
+	  type2 = type1;
+	  do_cast = true;
+	}
+  else
+	return false;
+}
+
+  if (from_mode != TYPE_MODE (type1) || do_cast)
 {
   location_t loc = gimple_location (stmt);
   tree tmp1, tmp2;
@@ -2143,6 +2167,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   enum tree_code wmult_code;
   enum insn_code handler;
   enum machine_mode from_mode;
+  bool do_cast = false;
 
   lhs = gimple_assign_lhs (stmt);
   type = TREE_TYPE (lhs);
@@ -2234,8 +2259,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   else
 return false;
 
+  /* We don't support usmadd yet, so try a wider signed mode.  */
   if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
-return false;
+{
+  enum machine_mode mode = TYPE_MODE (type1);
+  mode = GET_MODE_WIDER_MODE (mode);
+  if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (TYPE_MODE (type)))
+	{
+	  type1 = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
+		  0);
+	  type2 = type1;
+	  do_cast = true;
+	}
+  else
+	return false;
+}
 
   /* If there was a conversion between the multiply and addition
  then we need to make sure it fits a multiply-and-accumulate.
@@ -2276,7 +2314,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   if (handler == CODE_FOR_nothing)
 return false;
 
-  if (TYPE_MODE (type1) != from_mode)
+  if (TYPE_MODE (type1) != from_mode || do_cast)
 {
   location_t loc = gimple_location (stmt);
   tree tmp;


Re: [PATCH, PR43864] Gimple level duplicate block cleanup.

2011-07-12 Thread Richard Guenther
On Tue, Jul 12, 2011 at 2:12 PM, Tom de Vries  wrote:
> Hi Richard,
>
> here's a new version of the pass. I attempted to address your comments as
> much as possible. The pass was bootstrapped and reg-tested on x86_64.
>
> On 06/14/2011 05:01 PM, Richard Guenther wrote:
>> On Fri, Jun 10, 2011 at 6:54 PM, Tom de Vries  wrote:
>>> Hi Richard,
>>>
>>> thanks for the review.
>>>
>>> On 06/08/2011 11:55 AM, Richard Guenther wrote:
 On Wed, Jun 8, 2011 at 11:42 AM, Tom de Vries  
 wrote:
> Hi Richard,
>
> I have a patch for PR43864. The patch adds a gimple level duplicate block
> cleanup. The patch has been bootstrapped and reg-tested on x86_64, and
> reg-tested on ARM. The size impact on ARM for spec2000 is shown in the
> following table (%, lower is better).
>
>                     none            pic
>                thumb1  thumb2  thumb1 thumb2
> spec2000         99.9    99.9    99.8   99.8
>
> PR43864 is currently marked as a duplicate of PR20070, but I'm not sure
> that the optimizations proposed in PR20070 would fix this PR.
>
> The problem in this PR is that when compiling with -O2, the example below
> should only have one call to free. The original problem is formulated in
> terms of -Os, but currently we generate one call to free with -Os, although
> still not the smallest code possible. I'll show here the -O2 case, since
> that's similar to the original PR.
>
>>>
>>> Example A. (naming it for reference below)
>>>
> #include 
> void foo (char*, FILE*);
> char* hprofStartupp(char *outputFileName, char *ctx)
> {
>    char fileName[1000];
>    FILE *fp;
>    sprintf(fileName, outputFileName);
>    if (access(fileName, 1) == 0) {
>        free(ctx);
>        return 0;
>    }
>
>    fp = fopen(fileName, 0);
>    if (fp == 0) {
>        free(ctx);
>        return 0;
>    }
>
>    foo(outputFileName, fp);
>
>    return ctx;
> }
>
> AFAIU, there are 2 complementary methods of rtl optimizations proposed in
> PR20070.
> - Merging 2 blocks which are identical except for input registers, by using
>   a conditional move to choose between the different input registers.
> - Merging 2 blocks which have different local registers, by ignoring those
>   differences.
>
> Blocks .L6 and .L7 have no difference in local registers, but they have a
> difference in input registers: r3 and r1. Replacing the move to r5 by a
> conditional move would probably be beneficial in terms of size, but it's
> not clear what condition the conditional move should be using. Calculating
> such a condition would add to the size and increase the execution path.
>
> gcc -O2 -march=armv7-a -mthumb pr43864.c -S:
> ...
>        push    {r4, r5, lr}
>        mov     r4, r0
>        sub     sp, sp, #1004
>        mov     r5, r1
>        mov     r0, sp
>        mov     r1, r4
>        bl      sprintf
>        mov     r0, sp
>        movs    r1, #1
>        bl      access
>        mov     r3, r0
>        cbz     r0, .L6
>        movs    r1, #0
>        mov     r0, sp
>        bl      fopen
>        mov     r1, r0
>        cbz     r0, .L7
>        mov     r0, r4
>        bl      foo
> .L3:
>        mov     r0, r5
>        add     sp, sp, #1004
>        pop     {r4, r5, pc}
> .L6:
>        mov     r0, r5
>        mov     r5, r3
>        bl      free
>        b       .L3
> .L7:
>        mov     r0, r5
>        mov     r5, r1
>        bl      free
>        b       .L3
> ...
>
> The proposed patch solves the problem by dealing with the 2 blocks at a
> level where they are still identical: at gimple level. It detects that the
> 2 blocks are identical, and removes one of them.
>
> The following table shows the impact of the patch on the example in terms
> of size for -march=armv7-a:
>
>          without     with    delta
> Os      :     108      104       -4
> O2      :     120      104      -16
> Os thumb:      68       64       -4
> O2 thumb:      76       64      -12
>
> The gain in size for -O2 is that of removing the entire block, plus the
> replacement of 2 moves by a constant set, which also decreases the
> execution path. The patch ensures optimal code for both -O2 and -Os.
>
>
> By keeping track of equivalent definitions in the 2 blocks, we can ignore
> those differences in comparison. Without this feature, we would only match
> blocks with resultless operations, due to the SSA nature of gimples.
> For example, with this feature, we reduce the following function to its
> minimum at gim
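
The effect described for Example A in the quoted message can be pictured at source level. The sketch below is only an analogy for merging two identical blocks, not the pass itself: release stands in for free, the function names are hypothetical, and the conditions are passed in as plain parameters so the code is self-contained.

```c
#include <stddef.h>

static int release_calls;	/* counts how often cleanup ran */

static void
release (void *ctx)
{
  (void) ctx;
  release_calls++;
}

int
release_count (void)
{
  return release_calls;
}

/* Shape of Example A: two textually identical cleanup blocks.  */
void *
startup_duplicated (int access_result, const void *fp, void *ctx)
{
  if (access_result == 0)
    {
      release (ctx);
      return NULL;
    }
  if (fp == NULL)
    {
      release (ctx);
      return NULL;
    }
  return ctx;
}

/* After merging: both conditions branch to a single cleanup block,
   so only one call to release remains in the code.  */
void *
startup_merged (int access_result, const void *fp, void *ctx)
{
  if (access_result == 0)
    goto fail;
  if (fp == NULL)
    goto fail;
  return ctx;

fail:
  release (ctx);
  return NULL;
}
```

Both functions behave identically for every input; the merged one simply contains the cleanup sequence once.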

Re: [patch tree-optimization]: [2 of 3]: Boolify compares & more

2011-07-12 Thread Richard Guenther
On Tue, Jul 12, 2011 at 2:21 PM, Kai Tietz  wrote:
> 2011/7/12 Richard Guenther :
>> On Tue, Jul 12, 2011 at 11:48 AM, Kai Tietz  wrote:
>>> 2011/7/12 Richard Guenther :
 On Mon, Jul 11, 2011 at 5:37 PM, Kai Tietz  wrote:
> 2011/7/8 Richard Guenther :
>> On Fri, Jul 8, 2011 at 1:32 PM, Kai Tietz  
>> wrote:
>>> 2011/7/8 Richard Guenther :
 On Thu, Jul 7, 2011 at 6:07 PM, Kai Tietz  
 wrote:
> Hello,
>
> This patch, the second of the series, adds boolification of comparisons
> in the gimplifier.  For this, casts from/to boolean are marked as
> not useless, and in fold_unary_loc casts to non-boolean integral types
> are preserved.
> The hunk in tree-ssa-forwprop.c in combine_cond_expr_cond is not strictly
> necessary, as long as fold-const handles 1-bit precision
> bitwise expressions with truth logic, but it has been shown to short-cut
> some more expensive folding, so I kept it within this patch.

 Please split it out.  Also ...

>
> The adjusted testcase gcc.dg/uninit-15.c indicates that, due to
> optimization, we lose the variable's declaration in this case.  But
> this might be expected.
>
> In vectorization we have a regression in the gcc.dg/vect/vect-cond-3.c
> test case.  It is caused by conditions always having boolean type, so
> the vectorizer sees different types, which it does not handle right
> now.  Maybe this issue could be special-cased for boolean types in
> tree-vect-loop, by making the operand of the used condition equal to
> the vector type.  But this is a subject for a different patch and is
> not addressed by this series.
>
> There is a regression in tree-ssa/vrp47.c; the fix is addressed by
> the 3rd patch of this series.
>
> Bootstrapped and regression tested for all standard-languages (plus
> Ada and Obj-C++) on host x86_64-pc-linux-gnu.
>
> Ok for apply?
>
> Regards,
> Kai
>
>
> ChangeLog
>
> 2011-07-07  Kai Tietz  
>
>        * fold-const.c (fold_unary_loc): Preserve
>        non-boolean-typed casts.
>        * gimplify.c (gimple_boolify): Handle boolification
>        of comparisons.
>        (gimplify_expr): Boolify non aggregate-typed
>        comparisons.
>        * tree-cfg.c (verify_gimple_comparison): Check result
>        type of comparison expression.
>        * tree-ssa.c (useless_type_conversion_p): Preserve incompatible
>        casts from/to boolean.
>        * tree-ssa-forwprop.c (combine_cond_expr_cond): Add simplification
>        support for one-bit-precision typed X for cases X != 0 and X == 0.
>        (forward_propagate_comparison): Adjust test of condition
>        result.
>
>
>        * gcc.dg/tree-ssa/builtin-expect-5.c: Adjusted.
>        * gcc.dg/tree-ssa/pr21031.c: Likewise.
>        * gcc.dg/tree-ssa/pr30978.c: Likewise.
>        * gcc.dg/tree-ssa/ssa-fre-6.c: Likewise.
>        * gcc.dg/binop-xor1.c: Mark it as expected fail.
>        * gcc.dg/binop-xor3.c: Likewise.
>        * gcc.dg/uninit-15.c: Adjust reported message.
>
> Index: gcc-head/gcc/fold-const.c
> ===
> --- gcc-head.orig/gcc/fold-const.c
> +++ gcc-head/gcc/fold-const.c
> @@ -7665,11 +7665,11 @@ fold_unary_loc (location_t loc, enum tre
>             non-integral type.
>             Do not fold the result as that would not simplify further, also
>             folding again results in recursions.  */
> -         if (INTEGRAL_TYPE_P (type))
> +         if (TREE_CODE (type) == BOOLEAN_TYPE)
>            return build2_loc (loc, TREE_CODE (op0), type,
>                               TREE_OPERAND (op0, 0),
>                               TREE_OPERAND (op0, 1));
> -         else
> +         else if (!INTEGRAL_TYPE_P (type))
>            return build3_loc (loc, COND_EXPR, type, op0,
>                               fold_convert (type, boolean_true_node),
>                               fold_convert (type, boolean_false_node));
> Index: gcc-head/gcc/gimplify.c
> ===
> --- gcc-head.orig/gcc/gimplify.c
> +++ gcc-head/gcc/gimplify.c
> @@ -2842,18 +2842,23 @@ gimple_boolify (tree expr)
>
>     case TRUTH_NOT_EXPR:
>       TREE_OPERAND

[PATCH] Fixup copyrename statistics

2011-07-12 Thread Richard Guenther

I noticed we forget to clear the stats and that we do 1:1
replacements.
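
The two problems can be shown with a toy pass, assuming a file-scope statistics struct that survives between invocations. All names here are hypothetical; the arrays simply model "current variable" and "partition representative" per SSA name.

```c
#include <stddef.h>
#include <string.h>

struct pass_stats
{
  int coalesced;
};

/* File-scope statistics, shared by every run of the pass.  */
static struct pass_stats stats;

/* Count real renames only.  Without the memset the counter would keep
   growing from one invocation to the next; without the equality check
   a rename of a variable to itself would be counted as work.  */
int
run_rename_pass (const int *var, const int *part_var, size_t n)
{
  memset (&stats, 0, sizeof (stats));

  for (size_t i = 0; i < n; i++)
    {
      if (var[i] == part_var[i])
	continue;		/* 1:1 replacement, nothing to do */
      stats.coalesced++;
    }
  return stats.coalesced;
}
```

Running the pass twice on the same input reports the same count both times, which is what per-invocation statistics should do.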

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2011-07-12  Richard Guenther  

* tree-ssa-copyrename.c (rename_ssa_copies): Zero statistics.
Do not perform no-op changes.

Index: gcc/tree-ssa-copyrename.c
===
--- gcc/tree-ssa-copyrename.c   (revision 176196)
+++ gcc/tree-ssa-copyrename.c   (working copy)
@@ -296,6 +296,8 @@ rename_ssa_copies (void)
   FILE *debug;
   bool updated = false;
 
+  memset (&stats, 0, sizeof (stats));
+
   if (dump_file && (dump_flags & TDF_DETAILS))
 debug = dump_file;
   else
@@ -355,16 +357,15 @@ rename_ssa_copies (void)
   if (!part_var)
 continue;
   var = ssa_name (x);
+  if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var))
+   continue;
   if (debug)
 {
- if (SSA_NAME_VAR (var) != SSA_NAME_VAR (part_var))
-   {
- fprintf (debug, "Coalesced ");
- print_generic_expr (debug, var, TDF_SLIM);
- fprintf (debug, " to ");
- print_generic_expr (debug, part_var, TDF_SLIM);
- fprintf (debug, "\n");
-   }
+ fprintf (debug, "Coalesced ");
+ print_generic_expr (debug, var, TDF_SLIM);
+ fprintf (debug, " to ");
+ print_generic_expr (debug, part_var, TDF_SLIM);
+ fprintf (debug, "\n");
}
   stats.coalesced++;
   replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var));


Re: [build] Move dfp-bit support to toplevel libgcc

2011-07-12 Thread Rainer Orth
Hello Thomas,

> On Fri, 08 Jul 2011 13:25:13 +0200, Rainer Orth wrote:
>> 2011-06-22  Rainer Orth  
>> 
>>  gcc:
>>  * config/dfp-bit.c, config/dfp-bit.h: Move to ../libgcc.
>>  * config/t-dfprules: Move to ../libgcc/config.
>
> Seems that you forgot to remove the files from gcc/config/.

Right: I svn mv'd them, but forgot to include them in the commit.

Fixed.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH][1/N][C][C++][Fortran]Change POINTER_PLUS_EXPR offset type requirements

2011-07-12 Thread Tom Tromey
> "Richard" == Richard Guenther  writes:

Richard>java/
Richard>* builtins.c (static): Use fold_build_pointer_plus.
Richard>* class.c (make_class_data): Likewise.
Richard>(build_symbol_entry): Likewise.
Richard>* except.c (build_exception_object_ref): Likewise.
Richard>* expr.c (build_java_arrayaccess): Likewise.
Richard>(build_field_ref): Likewise.
Richard>(build_known_method_ref): Likewise.
Richard>(build_invokevirtual): Likewise.

Ok.

Tom


Re: [Patch 3/3] ARM 64 bit atomic operations

2011-07-12 Thread Ramana Radhakrishnan
On 1 July 2011 16:57, Dr. David Alan Gilbert  wrote:
>
> As per pr/48126 Michael Edwards spotted that in the case where
> the compare fails in the cmpxchg, the barrier at the end wasn't taken
> theoretically allowing a following load to float up above the load
> value compared.

Please resubmit with a proper ChangeLog entry. Can you add a comment
in the code to explain that this is to prevent speculative loads
before the barrier?

cheers
Ramana

>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 057f9ba..39057d2 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -23531,8 +23626,8 @@ arm_output_sync_loop (emit_f emit,
>        }
>     }
>
> -  arm_process_output_memory_barrier (emit, NULL);
>   arm_output_asm_insn (emit, 1, operands, "%sLSYB%%=:", LOCAL_LABEL_PREFIX);
> +  arm_process_output_memory_barrier (emit, NULL);
>  }
>
>  static rtx
>
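
The ordering concern has a close analogue in C11 atomics, where a compare-and-swap takes separate memory orders for the success and the failure path. The sketch below is that analogue only, not the ARM back end code; cas_full_barrier is a hypothetical name. Requesting sequential consistency for the failure case as well means a failed compare cannot let later loads move ahead of it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Compare-and-swap that orders subsequent accesses whether or not the
   comparison succeeds: both memory orders are seq_cst.  */
bool
cas_full_barrier (atomic_int *obj, int expected, int desired)
{
  return atomic_compare_exchange_strong_explicit (obj, &expected, desired,
						  memory_order_seq_cst,
						  memory_order_seq_cst);
}
```

A weaker failure order (for example memory_order_relaxed) would correspond to skipping the trailing barrier when the compare fails.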


Fwd: Re: The TI C6X port

2011-07-12 Thread Bernd Schmidt
I noticed that this approval had not been sent to the mailing list. I'm
retesting and checking in the remaining preliminary patches for C6X now.


Bernd
 Original Message 
Subject: Re: The TI C6X port
Date: Wed, 25 May 2011 17:11:53 -0400
From: Vladimir Makarov 
To: Bernd Schmidt 

On 05/24/2011 08:29 PM, Vladimir Makarov wrote:
> On 11-05-23 5:45 AM, Bernd Schmidt wrote:
>
>> http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00753.html
> It is a really big patch.  I need more time to look at it.
I finished this.  It is ok for me.  Only a comment for the new function
estimate_insn_tick (with 3 args) is missing.  I also found a typo in
changelog entry 'estiamte'.

Thanks for the patch.



Re: [dwarf2cfi] Cleanup interpretation of cfa.reg

2011-07-12 Thread Richard Henderson
On 07/12/2011 02:05 AM, Andreas Schwab wrote:
> Richard Henderson  writes:
> 
>> @@ -261,6 +262,15 @@ extern void dwarf2out_set_demangle_name_func (const char *(*) (const char *));
>>  extern void dwarf2out_vms_debug_main_pointer (void);
>>  #endif
>>  
>> +/* Unfortunately, DWARF_FRAME_REGNUM is not universally defined in such a
>> +   way as to force an unsigned return type.  Do that via inline wrapper.  */
>> +
>> +static inline unsigned
>> +dwarf_frame_regnum (unsigned regnum)
>> +{
>> +  return DWARF_FRAME_REGNUM (regnum);
>> +}
>> +  
> 
> I think this has caused the bootstrap failure on ia64:
> 
> In file included from ../../gcc/dwarf2cfi.c:31:0:
> ../../gcc/dwarf2out.h: In function 'dwarf_frame_regnum':
> ../../gcc/dwarf2out.h:271:3: error: implicit declaration of function 'ia64_dbx_register_number' [-Werror=implicit-function-declaration]

Dang it.  And the whole point of adding that inline was to 
enhance portability across targets.
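
The wrapper idea, and the way it can backfire, can be sketched with a made-up macro. TARGET_FRAME_REGNUM below is hypothetical and only mimics a target macro whose result type is a signed int.

```c
/* Hypothetical target macro: yields an int, with -1 meaning
   "no mapping".  Real targets define their own version, sometimes in
   terms of a function call.  */
#define TARGET_FRAME_REGNUM(r) ((r) < 16 ? (int) (r) : -1)

/* Inline wrapper that forces an unsigned result regardless of how the
   target wrote the macro.  */
static inline unsigned
frame_regnum (unsigned regnum)
{
  return TARGET_FRAME_REGNUM (regnum);
}
```

Unlike a bare macro, which is only expanded where it is used, the body of an inline function is compiled in every file that includes the header. If a target's macro expands to a function call, that function has to be declared wherever the header is included, which is consistent with the implicit-declaration error quoted above.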

Ok, I'll fix it in a minute.


r~


[build] Move sync, mips16.S to toplevel libgcc

2011-07-12 Thread Rainer Orth
Another easy part in the toplevel libgcc move was sync.c and related
stuff.  While doing this, it turned out to be easier to move the rest of
gcc/config/mips/t-libgcc-mips16 rather than leave it behind.

The patch is untested except for including it in a mips-sgi-irix6.5
build to make sure it is syntactically correct, since I don't have a
mips16 system to actually test on.

Richard, could you give it a whirl?

Ok for mainline?

Thanks.
Rainer


2011-07-10  Rainer Orth  

gcc:
* config/sync.c: Move to ../libgcc.
* Makefile.in (libgcc.mvars): Remove LIBGCC_SYNC,
LIBGCC_SYNC_CFLAGS.

* config/mips/t-libgcc-mips16: Move to ../libgcc/config/mips.
* config/mips/libgcc-mips16.ver: Likewise.
* config/mips/mips16.S: Likewise.
* config.gcc (mips64*-*-linux*): Remove mips/t-libgcc-mips16 from
tmake_file.
(mips*-*-linux*): Likewise.
(mips*-sde-elf*): Likewise.
(mipsisa32-*-elf*): Likewise.
(mipsisa64sb1-*-elf*): Likewise.
(mips-*-elf*): Likewise.
(mips64-*-elf*): Likewise.
(mips64orion-*-elf*): Likewise.
(mips*-*-rtems*): Likewise.
(mipstx39-*-elf*): Likewise.

libgcc:
* sync.c: New file.
* config/mips/t-mips16: New file.
* config.host (mips64*-*-linux*): Add mips/t-mips16 to tmake_file.
(mips*-*-linux*): Likewise.
(mips*-sde-elf*): Likewise.
(mipsisa32-*-elf*): Join with mipsisa32r2-*-elf*,
mipsisa64-*-elf*, mipsisa64r2-*-elf*.
Add mips/t-mips16 to tmake_file.
(mipsisa64sb1-*-elf*): Add mips/t-mips16 to tmake_file.
(mips-*-elf*): Likewise.
(mips64-*-elf*): Likewise.
(mips64orion-*-elf*): Likewise.
(mips*-*-rtems*): Likewise.
(mipstx39-*-elf*): Likewise.
* Makefile.in: Use SYNC instead of LIBGCC_SYNC.
($(libgcc-sync-size-funcs-o)): Use SYNC_CFLAGS instead of
LIBGCC_SYNC_CFLAGS.
Use $(srcdir) to refer to sync.c.
Use $<.
($(libgcc-sync-funcs-o)): Likewise.
($(libgcc-sync-size-funcs-s-o)): Likewise.
($(libgcc-sync-funcs-s-o)): Likewise.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1882,8 +1882,6 @@ libgcc.mvars: config.status Makefile $(L
echo SHLIB_NM_FLAGS = '$(SHLIB_NM_FLAGS)' >> tmp-libgcc.mvars
echo LIBGCC2_CFLAGS = '$(LIBGCC2_CFLAGS)' >> tmp-libgcc.mvars
echo TARGET_LIBGCC2_CFLAGS = '$(TARGET_LIBGCC2_CFLAGS)' >> tmp-libgcc.mvars
-   echo LIBGCC_SYNC = '$(LIBGCC_SYNC)' >> tmp-libgcc.mvars
-   echo LIBGCC_SYNC_CFLAGS = '$(LIBGCC_SYNC_CFLAGS)' >> tmp-libgcc.mvars
echo CRTSTUFF_CFLAGS = '$(CRTSTUFF_CFLAGS)' >> tmp-libgcc.mvars
echo CRTSTUFF_T_CFLAGS = '$(CRTSTUFF_T_CFLAGS)' >> tmp-libgcc.mvars
echo CRTSTUFF_T_CFLAGS_S = '$(CRTSTUFF_T_CFLAGS_S)' >> tmp-libgcc.mvars
diff --git a/gcc/config.gcc b/gcc/config.gcc
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1829,7 +1829,7 @@ mips*-*-netbsd*)  # NetBSD/mips, either
;;
 mips64*-*-linux* | mipsisa64*-*-linux*)
tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h ${tm_file} mips/linux.h mips/linux64.h"
-   tmake_file="${tmake_file} mips/t-linux64 mips/t-libgcc-mips16"
+   tmake_file="${tmake_file} mips/t-linux64"
tm_defines="${tm_defines} MIPS_ABI_DEFAULT=ABI_N32"
case ${target} in
mips64el-st-linux-gnu)
@@ -1850,7 +1850,6 @@ mips64*-*-linux* | mipsisa64*-*-linux*)
;;
 mips*-*-linux*)# Linux MIPS, either endian.
 tm_file="dbxelf.h elfos.h gnu-user.h linux.h glibc-stdint.h ${tm_file} mips/linux.h"
-   tmake_file="${tmake_file} mips/t-libgcc-mips16"
case ${target} in
 mipsisa32r2*)
tm_defines="${tm_defines} MIPS_ISA_DEFAULT=33"
@@ -1873,7 +1872,7 @@ mips*-*-openbsd*)
;;
 mips*-sde-elf*)
tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h mips/sde.h"
-   tmake_file="mips/t-sde mips/t-libgcc-mips16"
+   tmake_file="mips/t-sde"
extra_options="${extra_options} mips/sde.opt"
case "${with_newlib}" in
  yes)
@@ -1910,7 +1909,7 @@ mipsisa32r2-*-elf* | mipsisa32r2el-*-elf
 mipsisa64-*-elf* | mipsisa64el-*-elf* | \
 mipsisa64r2-*-elf* | mipsisa64r2el-*-elf*)
tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
-   tmake_file="mips/t-isa3264 mips/t-libgcc-mips16"
+   tmake_file="mips/t-isa3264"
case ${target} in
  mipsisa32r2*)
tm_defines="${tm_defines} MIPS_ISA_DEFAULT=33"
@@ -1947,17 +1946,17 @@ mipsisa64sr71k-*-elf*)
 ;;
 mipsisa64sb1-*-elf* | mipsisa64sb1el-*-elf*)
tm_file="elfos.h newlib-stdint.h ${tm_file} mips/elf.h"
-   tmake_file="mips/t-elf mips/t-libgcc-mips16 mips/t-sb1"
+   tmake_file="mips/t-elf mips/t-sb1"
target_cpu_default="MASK_64BIT|MASK_FLOAT64"
  

[build] Move darwin-crt[23].c to toplevel libgcc

2011-07-12 Thread Rainer Orth
As a prerequisite to moving i386/crtprec.c to toplevel libgcc, it turned
out that I need to move darwin-crt[23].c first to avoid the problem with
inconsistent versions of extra_parts in gcc and libgcc:

http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00831.html

The following pretty mechanical patch does this.

Bootstraps on i386-apple-darwin9.8.0 and powerpc-apple-darwin9.8.0 are
well beyond stage1 now, so libgcc and thus crt?.o build correctly.

Ok for mainline if they pass?

Thanks.
Rainer


2011-07-12  Rainer Orth  

gcc:
* config/darwin-crt2.c: Move to ../libgcc/config/rs6000.
* config/darwin-crt3.c: Move to ../libgcc/config.
* config/t-darwin (EXTRA_MULTILIB_PARTS): Remove.
($(T)crt3$(objext)): Remove.
* config/rs6000/t-darwin (DARWIN_EXTRA_CRT_BUILD_CFLAGS): Remove.
($(T)crt2$(objext)): Remove.
* config.gcc (powerpc-*-darwin*): Remove extra_parts.
(powerpc64-*-darwin*): Likewise.

gcc/po:
* EXCLUDES (config/darwin-crt2.c): Remove.

libgcc:
* config/darwin-crt3.c: New file.
* config/rs6000/darwin-crt2.c: New file.
* config/t-darwin (crt3.o): New rule.
* config/rs6000/t-darwin (DARWIN_EXTRA_CRT_BUILD_CFLAGS): New variable.
(crt2.o): New rule.
* config.host (*-*-darwin*): Add crt3.o to extra_parts.
(powerpc-*-darwin*): Add crt2.o to extra_parts.
(powerpc64-*-darwin*): Likewise.

diff --git a/gcc/config.gcc b/gcc/config.gcc
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2019,7 +2019,6 @@ picochip-*)
 #  ;;
 powerpc-*-darwin*)
extra_options="${extra_options} rs6000/darwin.opt"
-   extra_parts="crt2.o"
case ${target} in
  *-darwin1[0-9]* | *-darwin[8-9]*)
tmake_file="${tmake_file} rs6000/t-darwin8"
@@ -2036,7 +2035,6 @@ powerpc-*-darwin*)
;;
 powerpc64-*-darwin*)
extra_options="${extra_options} ${cpu_type}/darwin.opt"
-   extra_parts="crt2.o"
tmake_file="${tmake_file} ${cpu_type}/t-darwin64 t-slibgcc-dummy"
tm_file="${tm_file} ${cpu_type}/darwin8.h ${cpu_type}/darwin64.h"
extra_headers=altivec.h
diff --git a/gcc/config/rs6000/t-darwin b/gcc/config/rs6000/t-darwin
--- a/gcc/config/rs6000/t-darwin
+++ b/gcc/config/rs6000/t-darwin
@@ -25,8 +25,6 @@ LIB2FUNCS_STATIC_EXTRA = \
$(srcdir)/config/rs6000/darwin-fpsave.asm  \
$(srcdir)/config/rs6000/darwin-vecsave.asm
 
-DARWIN_EXTRA_CRT_BUILD_CFLAGS = -mlongcall -mmacosx-version-min=10.4
-
 # The .asm files above are designed to run on all processors,
 # even though they use AltiVec instructions.  -Wa is used because
 # -force_cpusubtype_ALL doesn't work with -dynamiclib.
@@ -39,10 +37,3 @@ TARGET_LIBGCC2_CFLAGS = -Wa,-force_cpusu
 
 darwin-fpsave.o:   $(srcdir)/config/rs6000/darwin-asm.h
 darwin-tramp.o:$(srcdir)/config/rs6000/darwin-asm.h
-
-# Explain how to build crt2.o
-$(T)crt2$(objext): $(srcdir)/config/darwin-crt2.c $(GCC_PASSES) \
-   $(TCONFIG_H) stmp-int-hdrs tsystem.h
-   $(GCC_FOR_TARGET) $(GCC_CFLAGS) $(INCLUDES) $(MULTILIB_CFLAGS) \
- $(DARWIN_EXTRA_CRT_BUILD_CFLAGS) \
- -c $(srcdir)/config/darwin-crt2.c -o $(T)crt2$(objext)
diff --git a/gcc/config/t-darwin b/gcc/config/t-darwin
--- a/gcc/config/t-darwin
+++ b/gcc/config/t-darwin
@@ -42,15 +42,6 @@ darwin-driver.o: $(srcdir)/config/darwin
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
  $(srcdir)/config/darwin-driver.c
 
-# How to build crt3.o
-EXTRA_MULTILIB_PARTS=crt3.o
-# Pass -fno-tree-dominator-opts to work around bug 26840.
-$(T)crt3$(objext): $(srcdir)/config/darwin-crt3.c $(GCC_PASSES) \
-   $(TCONFIG_H) stmp-int-hdrs tsystem.h
-   $(GCC_FOR_TARGET) $(GCC_CFLAGS) $(INCLUDES) $(MULTILIB_CFLAGS) \
- -fno-tree-dominator-opts $(DARWIN_EXTRA_CRT_BUILD_CFLAGS) \
- -c $(srcdir)/config/darwin-crt3.c -o $(T)crt3$(objext)
-
 # -pipe because there's an assembler bug, 4077127, which causes
 # it to not properly process the first # directive, causing temporary
 # file names to appear in stabs, causing the bootstrap to fail.  Using -pipe
diff --git a/gcc/po/EXCLUDES b/gcc/po/EXCLUDES
--- a/gcc/po/EXCLUDES
+++ b/gcc/po/EXCLUDES
@@ -22,7 +22,6 @@
 # .def are examined to begin with.
 
 #   These files are part of libgcc, or target headers provided by gcc.
-config/darwin-crt2.c
 config/vxlib.c
 crtstuff.c
 gbl-ctors.h
diff --git a/libgcc/config.host b/libgcc/config.host
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -161,6 +161,7 @@ case ${host} in
 *-*-darwin*)
   asm_hidden_op=.private_extern
   tmake_file="t-darwin ${cpu_type}/t-darwin t-slibgcc-darwin"
+  extra_parts=crt3.o
   ;;
 *-*-freebsd[12] | *-*-freebsd[12].* | *-*-freebsd*aout*)
   # This is the place-holder for the generic a.out configuration
@@ -616,9 +617,11 @@ powerpc-*-darwin*)
  ;;
esac
tmake_file="$tmake_file rs6000/t-ibm-ldouble"
+   e

RFA: Avoid unnecessary clearing in union initialisers

2011-07-12 Thread Richard Sandiford
PR 48183 is caused by the fact that we don't really support integers
(or at least integer constants) wider than 2*HOST_BITS_PER_WIDE_INT:

   http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01220.html

However, such constants shouldn't be needed in normal use.
They came from an unnecessary zero-initialisation of a union such as:

   union { a f1; b f2; } u = { init_f1 };

where f1 and f2 are the full width of the union.  The zero-initialisation
gets optimised away for "real" insns, but persists in debug insns:

   http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01585.html

This patch takes up Richard's idea here:

   http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01987.html

categorize_ctor_elements currently tries to work out how many scalars a
constructor initialises (IE) and how many of those scalars are zero (ZE).
Callers can then call count_type_elements to find out how many scalars (TE)
ought to be initialised if the constructor is "complete" (i.e. if it
explicitly initialises every meaningful byte, rather than relying on
default zero-initialisation).  The constructor is complete if TE == IE,
except as noted in [A] below.
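
To make the three counts concrete, here is a toy model in C.  It is only
a sketch: the names (toy_elt, toy_counts, toy_categorize, toy_complete_p)
are hypothetical and are not the GCC functions, and it only handles a flat
constructor for an array-like type, on the assumption that "complete"
means every scalar the type requires is explicitly initialised.

```c
#include <stdbool.h>
#include <stddef.h>

/* One explicitly initialised scalar in a flat, toy constructor.  */
struct toy_elt { size_t index; long value; };

/* IE and ZE as described above: initialised and zero scalars.  */
struct toy_counts { size_t init_elts; size_t zero_elts; };

/* Count how many scalars the constructor initialises and how many
   of those are zero.  */
struct toy_counts toy_categorize(const struct toy_elt *elts, size_t n)
{
  struct toy_counts c = { 0, 0 };
  for (size_t i = 0; i < n; i++) {
    c.init_elts++;
    if (elts[i].value == 0)
      c.zero_elts++;
  }
  return c;
}

/* The constructor is complete when it initialises as many scalars
   as the type contains (TE).  */
bool toy_complete_p(struct toy_counts c, size_t type_elts)
{
  return c.init_elts == type_elts;
}
```

For a four-element array initialised at indices 0, 1 and 3, this gives
IE = 3 against TE = 4, so the constructor is incomplete and the remaining
element relies on default zero-initialisation.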

However, count_type_elements can't return the required TE for unions,
because it would need to know which of the union's fields was initialised
by the constructor (if any).  This choice of field is reflected in IE and
ZE, so would need to be reflected in TE as well.

count_type_elements therefore punts on unions.  However, the caller
can't easily tell whether it punts because of that, because of overflow,
or because of variable-sized types.

[A] One particular case of interest is when a union constructor initialises
a field that is shorter than the union.  In this case, the rest of the
union must be zeroed in order to ensure that the other fields have
predictable values.  categorize_ctor_elements has a special out-parameter
to record this situation.
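
The distinction in [A] comes down to a size comparison, which can be
sketched in plain C.  The union types and the FIELD_COVERS_UNION macro
below are made up for illustration and are not part of the patch; the
point is just that a field as wide as the union leaves nothing to clear,
while a shorter field does.

```c
#include <stdint.h>

/* Both fields span the whole union, so initialising either one
   touches every byte.  */
union full_width {
  uint64_t f1;
  uint32_t f2[2];
};

/* The first field is shorter than the union, so an initialiser for
   f1 leaves the remaining bytes to be zeroed.  */
union short_first {
  uint8_t f1;
  uint64_t f2;
};

/* True if FIELD occupies the full size of union TYPE.  sizeof does
   not evaluate its operand, so the null pointer is never read.  */
#define FIELD_COVERS_UNION(type, field) \
  (sizeof(((type *)0)->field) == sizeof(type))
```

With these definitions, FIELD_COVERS_UNION (union full_width, f1) holds
and FIELD_COVERS_UNION (union short_first, f1) does not, which is the
case where the rest of the union has to be cleared.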

This leads to quite a complicated interface.  The patch tries to
simplify it by making categorize_ctor_elements keep track of whether
a constructor is complete.  This also has the minor advantage of
avoiding double recursion: first through the constructor,
then through its type tree.

After this change, ZE and IE are only needed when deciding how best to
implement "complete" initialisers (such as whether to do a bulk zero
initialisation anyway, and just write the nonzero elements individually).
For cases where a "leaf" constructor element is itself an aggregate with
a union, we can therefore estimate the number of scalars in the union,
and hopefully make the heuristic a bit more accurate than the current 1:

HOST_WIDE_INT tc = count_type_elements (TREE_TYPE (value), true);
if (tc < 1)
  tc = 1;

cp/typeck2.c also wants to check whether the variable parts of a
constructor are complete.  The patch uses the same approach to
completeness there.  This should make it a bit more general than the
current code, which only deals with non-nested constructors.

Tested on x86_64-linux-gnu (all languages, including Ada), and on
arm-linux-gnueabi.  OK to install?

Richard


gcc/
* tree.h (categorize_ctor_elements): Remove comment.  Fix long line.
(count_type_elements): Delete.
(complete_ctor_at_level_p): Declare.
* expr.c (flexible_array_member_p): New function, split out from...
(count_type_elements): ...here.  Make static.  Replace allow_flexarr
parameter with for_ctor_p.  When for_ctor_p is true, return the
number of elements that should appear in the top-level constructor,
otherwise return an estimate of the number of scalars.
(categorize_ctor_elements): Replace p_must_clear with p_complete.
(categorize_ctor_elements_1): Likewise.  Use complete_ctor_at_level_p.
(complete_ctor_at_level_p): New function, borrowing union logic
from old categorize_ctor_elements_1.
(mostly_zeros_p): Return true if the constructor is not complete.
(all_zeros_p): Update call to categorize_ctor_elements.
* gimplify.c (gimplify_init_constructor): Update call to
categorize_ctor_elements.  Don't call count_type_elements.
Unconditionally prevent clearing for variable-sized types,
otherwise rely on categorize_ctor_elements to detect
incomplete initializers.

gcc/cp/
* typeck2.c (split_nonconstant_init_1): Pass the initializer directly,
rather than a pointer to it.  Return true if the whole of the value
was initialized by the generated statements.  Use
complete_ctor_at_level_p instead of count_type_elements.

gcc/testsuite/
2011-07-12  Chung-Lin Tang  

* gcc.target/arm/pr48183.c: New test.

Index: gcc/tree.h
===
--- gcc/tree.h  2011-07-12 15:30:05.0 +0100
+++ gcc/tree.h  2011-07-12 15:32:34.0 +0100
@@ -4804,21 +4804,10 @@ extern bool initializer_zerop (const_tre
 
 extern VEC(tree,gc) *ct

[build] Move i386/crtprec to toplevel libgcc

2011-07-12 Thread Rainer Orth
The next easy step in toplevel libgcc migration is moving
i386/crtprec.c.  I noticed that -mpc{32, 64, 80} wasn't supported on
Solaris/x86 yet and corrected that.  The only testcase using the switch
was adapted to also do so on Darwin/x86 (which already has the support,
but didn't exercise it).
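
For readers unfamiliar with what the crtprec objects are for: the
-mpc{32,64,80} options select the x87 precision-control setting, which
lives in bits 8-9 of the x87 control word.  The following is only a
portable sketch of that bit manipulation, not the contents of crtprec.c
(which is not shown in this message); the function name is hypothetical
and nothing here touches the real FPU.

```c
#include <assert.h>
#include <stdint.h>

/* Precision-control field of the x87 control word (bits 8-9).  */
#define X87CW_PCMASK 0x0300u

/* Return CW with its precision-control field set for -mpc32, -mpc64
   or -mpc80: 24-bit, 53-bit or 64-bit significand respectively.  */
uint16_t x87_cw_with_precision(uint16_t cw, int mpc)
{
  uint16_t pc;

  switch (mpc) {
  case 32: pc = 0x0000u; break;   /* single precision */
  case 64: pc = 0x0200u; break;   /* double precision */
  case 80: pc = 0x0300u; break;   /* extended precision */
  default:
    assert(!"mpc must be 32, 64 or 80");
    return cw;
  }
  return (uint16_t)((cw & ~X87CW_PCMASK) | pc);
}
```

Starting from the usual power-on control word 0x037f, -mpc64 would
correspond to 0x027f and -mpc32 to 0x007f, while -mpc80 leaves it
unchanged.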

For the reasons already described, I'm not yet removing crtprec??.o from
gcc/config/i386/t-linux64 (EXTRA_MULTILIB_PARTS).

Bootstrapped without regressions on i386-pc-solaris2.11,
x86_64-unknown-linux-gnu.  Bootstrap on i386-apple-darwin9.8.0 is
currently running.

Ok for mainline?

Thanks.
Rainer


2011-07-10  Rainer Orth  

gcc:
* config/i386/crtprec.c: Move to ../libgcc/config/i386.
* config/i386/t-crtpc: Remove.
* config/t-darwin (EXTRA_MULTILIB_PARTS): Remove.
* config.gcc (i[34567]86-*-darwin*): Remove i386/t-crtpc from
tmake_file.
(x86_64-*-darwin*): Likewise.
(i[34567]86-*-linux*): Likewise.
(x86_64-*-linux*): Likewise.

* config/i386/sol2.h (ENDFILE_SPEC): Redefine.
Handle -mpc32, -mpc64, -mpc80.

libgcc:
* config/i386/crtprec.c: New file.
* config/i386/t-crtpc: Use $(srcdir) to refer to crtprec.c.
* config.host (i[34567]86-*-darwin*): Add i386/t-crtpc to tmake_file.
Add crtprec32.o, crtprec64.o, crtprec80.o to extra_parts.
(x86_64-*-darwin*): Likewise.
(i[34567]86-*-solaris2*): Likewise.

gcc/testsuite:
* gcc.c-torture/execute/990127-2.x: Use -mpc64 on i?86-*-darwin*,
i?86-*-solaris2*, x86_64-*-darwin*, x86_64-*-solaris2*.

diff --git a/gcc/config.gcc b/gcc/config.gcc
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1208,12 +1208,12 @@ i[34567]86-*-darwin*)
need_64bit_isa=yes
# Baseline choice for a machine that allows m64 support.
with_cpu=${with_cpu:-core2}
-   tmake_file="${tmake_file} t-slibgcc-dummy i386/t-crtpc"
+   tmake_file="${tmake_file} t-slibgcc-dummy"
libgcc_tm_file="$libgcc_tm_file i386/darwin-lib.h"
;;
 x86_64-*-darwin*)
with_cpu=${with_cpu:-core2}
-   tmake_file="${tmake_file} ${cpu_type}/t-darwin64 t-slibgcc-dummy i386/t-crtpc"
+   tmake_file="${tmake_file} ${cpu_type}/t-darwin64 t-slibgcc-dummy"
tm_file="${tm_file} ${cpu_type}/darwin64.h"
libgcc_tm_file="$libgcc_tm_file i386/darwin-lib.h"
;;
@@ -1311,7 +1311,7 @@ i[34567]86-*-linux* | i[34567]86-*-kfree
i[34567]86-*-kopensolaris*-gnu) tm_file="${tm_file} i386/gnu-user.h kopensolaris-gnu.h i386/kopensolaris-gnu.h" ;;
i[34567]86-*-gnu*) tm_file="$tm_file i386/gnu-user.h gnu.h i386/gnu.h";;
esac
-   tmake_file="${tmake_file} i386/t-crtstuff i386/t-crtpc"
+   tmake_file="${tmake_file} i386/t-crtstuff"
;;
 x86_64-*-linux* | x86_64-*-kfreebsd*-gnu | x86_64-*-knetbsd*-gnu)
tm_file="${tm_file} i386/unix.h i386/att.h dbxelf.h elfos.h gnu-user.h glibc-stdint.h \
@@ -1323,7 +1323,7 @@ x86_64-*-linux* | x86_64-*-kfreebsd*-gnu
x86_64-*-kfreebsd*-gnu) tm_file="${tm_file} kfreebsd-gnu.h i386/kfreebsd-gnu64.h" ;;
x86_64-*-knetbsd*-gnu) tm_file="${tm_file} knetbsd-gnu.h" ;;
esac
-   tmake_file="${tmake_file} i386/t-linux64 i386/t-crtstuff i386/t-crtpc"
+   tmake_file="${tmake_file} i386/t-linux64 i386/t-crtstuff"
x86_multilibs="${with_multilib_list}"
if test "$x86_multilibs" = "default"; then
x86_multilibs="m64,m32"
diff --git a/gcc/config/i386/sol2.h b/gcc/config/i386/sol2.h
--- a/gcc/config/i386/sol2.h
+++ b/gcc/config/i386/sol2.h
@@ -70,6 +70,14 @@ along with GCC; see the file COPYING3.  
 #undef ASM_SPEC
 #define ASM_SPEC ASM_SPEC_BASE
 
+#undef  ENDFILE_SPEC
+#define ENDFILE_SPEC \
+  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
+   %{mpc32:crtprec32.o%s} \
+   %{mpc64:crtprec64.o%s} \
+   %{mpc80:crtprec80.o%s} \
+   crtend.o%s crtn.o%s"
+
 #define SUBTARGET_CPU_EXTRA_SPECS \
   { "cpp_subtarget",CPP_SUBTARGET_SPEC },  \
   { "asm_cpu",  ASM_CPU_SPEC },\
diff --git a/gcc/config/i386/t-crtpc b/gcc/config/i386/t-crtpc
deleted file mode 100644
diff --git a/gcc/testsuite/gcc.c-torture/execute/990127-2.x b/gcc/testsuite/gcc.c-torture/execute/990127-2.x
--- a/gcc/testsuite/gcc.c-torture/execute/990127-2.x
+++ b/gcc/testsuite/gcc.c-torture/execute/990127-2.x
@@ -3,12 +3,16 @@
 # Use -mpc64 to force 80387 floating-point precision to 64 bits.  This option
 # has no effect on SSE, but it is needed in case of -m32 on x86_64 targets.
 
-if { [istarget i?86-*-linux*]
+if { [istarget i?86-*-darwin*]
+ || [istarget i?86-*-linux*]
  || [istarget i?86-*-kfreebsd*-gnu]
  || [istarget i?86-*-knetbsd*-gnu]
+ || [istarget i?86-*-solaris2*]
+ || [istarget x86_64-*-darwin*]
  || [istarget x86_64-*-linux*]
  || [istarget x86_64-*-kfreebsd*-gnu]
- || [istarget x86_64-*-knetbsd*-gnu] } {
+ || [istarget x86_64-*-

[build] Remove crt0, mcrt0 support

2011-07-12 Thread Rainer Orth
gcc/Makefile.in currently has some support for crt0.o and mcrt0.o, but
it is only used by the i?86-*-netware* target, which admits in a comment
to abuse it.

So this patch removes the related support and moves the NetWare files
over to libgcc.  I've decided to rename libgcc/config/i386/t-nwld to
t-slibgcc-nwld since that's what it is and put the related stuff from
gcc/config/i386/t-nwld into a new file in libgcc.  This isn't presented
in the patch below in a readable manner, though ;-(

This patch has only been included in bootstraps (e.g. on
i386-pc-solaris2.11) to ensure syntactic correctness.

On the other hand, maybe it's time to obsolete or even immediately
remove the netware port: there is no listed maintainer, no testsuite
results at least back to 2007 (if any were ever posted), and the only
netware-related change that hasn't been part of general cleanup was
almost two years ago.

Thoughts?

Rainer


2011-07-10  Rainer Orth  

gcc:
* Makefile.in (CRT0STUFF_T_CFLAGS): Remove.
($(T)crt0.o, $(T)mcrt0.o, s-crt0): Remove.
* config/i386/netware-crt0.c: Move to ../libgcc/config/i386.
* config/i386/t-nwld (CRTSTUFF_T_CFLAGS, CRT0STUFF_T_CFLAGS): Remove.
(CRT0_S, MCRT0_S): Remove.
($(T)libgcc.def, $(T)libc.def, $(T)libcpre.def, $(T)posixpre.def):
Remove.
(s-crt0): Remove.
* config.gcc (i[3456x]86-*-netware*): Remove extra_parts.

libgcc:
* config/i386/netware-crt0.c: New file.
* config/i386/t-nwld: Rename to ...
* config/i386/t-slibgcc-nwld: ... this.
* config/i386/t-nwld: New file.
* config.host (i[3456x]86-*-netware*): Add i386/t-slibgcc-nwld to
tmake_file.
Add crt0.o, libgcc.def, libc.def, libcpre.def, posixpre.def to
extra_parts.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -796,9 +796,6 @@ RUNTESTFLAGS =
 # Extra flags to use when compiling crt{begin,end}.o.
 CRTSTUFF_T_CFLAGS =
 
-# Extra flags to use when compiling [m]crt0.o.
-CRT0STUFF_T_CFLAGS =
-
 # "t" or nothing, for building multilibbed versions of, say, crtbegin.o.
 T =
 
@@ -1947,18 +1944,6 @@ s-mlib: $(srcdir)/genmultilib Makefile
$(GCC_FOR_TARGET) $(CRTSTUFF_CFLAGS) $(CRTSTUFF_T_CFLAGS) \
  -c $(srcdir)/crtstuff.c -DCRT_BEGIN -DCRTSTUFFT_O \
  -o $(T)crtbeginT$(objext)
-
-# Compile the start modules crt0.o and mcrt0.o that are linked with
-# every program
-$(T)crt0.o: s-crt0 ; @true
-$(T)mcrt0.o: s-crt0; @true
-
-s-crt0:$(CRT0_S) $(MCRT0_S) $(GCC_PASSES) $(CONFIG_H)
-   $(GCC_FOR_TARGET) $(GCC_CFLAGS) $(CRT0STUFF_T_CFLAGS) \
- -o $(T)crt0.o -c $(CRT0_S)
-   $(GCC_FOR_TARGET) $(GCC_CFLAGS) $(CRT0STUFF_T_CFLAGS) \
- -o $(T)mcrt0.o -c $(MCRT0_S)
-   $(STAMP) s-crt0
 #
 # Compiling object files from source files.
 
diff --git a/gcc/config.gcc b/gcc/config.gcc
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1370,7 +1370,6 @@ i[3456x]86-*-netware*)
extra_objs="$extra_objs nwld.o"
tm_file="${tm_file} i386/nwld.h"
tmake_file="${tmake_file} i386/t-nwld t-slibgcc-dummy"
-   extra_parts="crt0.o libgcc.def libc.def libcpre.def posixpre.def"
;;
esac
case x${enable_threads} in
diff --git a/gcc/config/i386/t-nwld b/gcc/config/i386/t-nwld
--- a/gcc/config/i386/t-nwld
+++ b/gcc/config/i386/t-nwld
@@ -1,4 +1,4 @@
-# Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009
+# Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2011
 # Free Software Foundation, Inc.
 #
 # This file is part of GCC.
@@ -17,31 +17,6 @@
 # along with GCC; see the file COPYING3.  If not see
 # .
 
-CRTSTUFF_T_CFLAGS = -mpreferred-stack-boundary=2
-CRT0STUFF_T_CFLAGS = -mpreferred-stack-boundary=2 $(INCLUDES)
-# this is a slight misuse (it's not an assembler file)
-CRT0_S = $(srcdir)/config/i386/netware-crt0.c
-MCRT0_S = $(srcdir)/config/i386/netware-crt0.c
-
-$(T)libgcc.def: $(srcdir)/config/i386/t-nwld
-   echo "module libgcc_s" >$@
-
-$(T)libc.def: $(srcdir)/config/i386/t-nwld
-   echo "module libc" >$@
-
-$(T)libcpre.def: $(srcdir)/config/i386/t-nwld
-   echo "start _LibCPrelude" >$@
-   echo "exit _LibCPostlude" >>$@
-   echo "check _LibCCheckUnload" >>$@
-
-$(T)posixpre.def: $(srcdir)/config/i386/t-nwld
-   echo "start POSIX_Start" >$@
-   echo "exit POSIX_Stop" >>$@
-   echo "check POSIX_CheckUnload" >>$@
-
 nwld.o: $(srcdir)/config/i386/nwld.c $(RTL_H) $(TREE_H) $(CONFIG_H) $(TM_P_H)
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
$(srcdir)/config/i386/nwld.c
-
-
-s-crt0: $(srcdir)/unwind-dw2-fde.h
diff --git a/libgcc/config.host b/libgcc/config.host
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -431,7 +431,8 @@ i[34567]86-*-lynxos*)
 i[3456x]86-*-netware*)
case /${with_ld} in
*/nwld)
-   tmake_file=

Re: [PATCH Atom][PR middle-end/44382] Tree reassociation improvement

2011-07-12 Thread William J. Schmidt
Ilya, thanks for posting this!  This patch is useful also on powerpc64.
Applying it solved a performance degradation with bwaves due to loss of
reassociation somewhere between 4.5 and 4.6 (still tracking it down).
When we apply -ftree-reassoc-width=2 to bwaves, the more optimal code
generation returns.

Bill

On Tue, 2011-07-12 at 16:30 +0400, Илья Энкович wrote:
> Hello,
> 
> Here is a patch related to missed optimization opportunity in tree
> reassoc phase.
> 
> Currently tree reassoc phase always generates a linear form which
> requires the minimum registers but has the highest tree height and
> does not allow computation to be performed in parallel. It may be
> critical for performance if required operations have high latency but
> can be pipelined (i.e. few execution units or low throughput). This
> problem becomes important on current Atom processors which are
> in-order and have many such instructions: IMUL and scalar SSE FP
> instructions.
> 
> This patch introduces a new feature to tree reassoc phase to generate
> computation tree with reduced height allowing to perform few
> long-latency instructions in parallel. It changes only one part of
> reassociation - rewrite_expr_tree. A level of parallelism is
> controlled via target hook and/or command line option.
> 
> New feature is enabled for Atom only by default. Patch boosts mostly
> CFP2000 geomean on Atom: +3.04% for 32 bit and +0.32% for 64 bit.
> 
> Bootstrapped and checked on x86_64-linux.
> 
> Thanks,
> Ilya
> --
> gcc/
> 
> 2011-07-12  Enkovich Ilya  
> 
>   * target.def (reassociation_width): New hook.
> 
>   * doc/tm.texi.in (reassociation_width): New hook documentation.
> 
>   * doc/tm.texi (reassociation_width): Likewise.
> 
>   * hooks.h (hook_int_const_gimple_1): New default hook.
> 
>   * hooks.c (hook_int_const_gimple_1): Likewise.
> 
>   * config/i386/i386.h (ix86_tune_indices): Add
>   X86_TUNE_REASSOC_INT_TO_PARALLEL and
>   X86_TUNE_REASSOC_FP_TO_PARALLEL.
> 
>   (TARGET_REASSOC_INT_TO_PARALLEL): New.
>   (TARGET_REASSOC_FP_TO_PARALLEL): Likewise.
> 
>   * config/i386/i386.c (initial_ix86_tune_features): Add
>   X86_TUNE_REASSOC_INT_TO_PARALLEL and
>   X86_TUNE_REASSOC_FP_TO_PARALLEL.
> 
>   (ix86_reassociation_width) implementation of
>   new hook for i386 target.
> 
>   * common.opt (ftree-reassoc-width): New option added.
> 
>   * tree-ssa-reassoc.c (get_required_cycles): New function.
>   (get_reassociation_width): Likewise.
>   (rewrite_expr_tree_parallel): Likewise.
> 
>   (reassociate_bb): Now checks reassociation width to be used
>   and call rewrite_expr_tree_parallel instead of rewrite_expr_tree
>   if needed.
> 
>   (pass_reassoc): TODO_remove_unused_locals flag added.
> 
> gcc/testsuite/
> 
> 2011-07-12  Enkovich Ilya  
> 
>   * gcc.dg/tree-ssa/pr38533.c (dg-options): Added option
>   -ftree-reassoc-width=1.
> 
>   * gcc.dg/tree-ssa/reassoc-24.c: New test.
>   * gcc.dg/tree-ssa/reassoc-25.c: Likewise.




Re: [patch tree-optimization]: [3 of 3]: Boolify compares & more

2011-07-12 Thread Kai Tietz
Hello,

As discussed on IRC, I reuse here the do_dce flag to choose folding
direction within BB.
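
As a rough picture of what "choosing the direction" means, here is a toy
walk over an array standing in for the statements of one basic block.
This is a sketch only: walk_block and the integer "statements" are
hypothetical and have nothing to do with the real gimple iterators; the
flag name do_dce is the only thing borrowed from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Visit the N statements of a toy block, recording the order in
   VISITED.  With DO_DCE set the walk goes from last to first;
   otherwise it goes from first to last.  */
void walk_block(const int *stmts, size_t n, bool do_dce, int *visited)
{
  for (size_t k = 0; k < n; k++) {
    size_t i = do_dce ? n - 1 - k : k;
    visited[k] = stmts[i];
  }
}
```

A backward walk reaches the uses of a value before its definition, which
is convenient for spotting statements that have become dead; a forward
walk reaches the definition first, so a later statement can be folded
using an operand that has already been simplified.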

Bootstrapped and regression tested for all standard-languages (plus
Ada and Obj-C++) on host x86_64-pc-linux-gnu.

OK to apply?

Regards,
Kai

ChangeLog gcc/

2011-07-12  Kai Tietz  

* tree-ssa-propagate.c (substitute_and_fold):
Only use last to first scanning direction if do_dce
is true.
* tree-vrp.c (extract_range_from_binary_expr): Add
handling for BIT_IOR_EXPR, BIT_AND_EXPR, and BIT_NOT_EXPR.
(register_edge_assert_for_1): Add handling for 1-bit
BIT_IOR_EXPR and BIT_NOT_EXPR.
(register_edge_assert_for): Add handling for 1-bit
BIT_IOR_EXPR.
(ssa_name_get_inner_ssa_name_p): New helper function.
(ssa_name_get_cast_to_p): New helper function.
(simplify_truth_ops_using_ranges): Handle prefixed
cast instruction for result, and add support for one
bit precision BIT_IOR_EXPR, BIT_AND_EXPR, BIT_XOR_EXPR,
and BIT_NOT_EXPR.
(simplify_stmt_using_ranges): Add handling for one bit
precision BIT_IOR_EXPR, BIT_AND_EXPR, BIT_XOR_EXPR,
and BIT_NOT_EXPR.

ChangeLog gcc/testsuite

2011-07-08  Kai Tietz  

* gcc.dg/tree-ssa/vrp47.c: Remove dom-output
and adjust testcase for vrp output analysis.
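
The reason one-bit precision bitwise operations can be treated like truth
operations is that, on operands restricted to 0 and 1, they compute the
same results.  The small check below is only an illustration of that
identity (the function name is made up); it says nothing about how
tree-vrp.c implements the simplification.

```c
/* Return 1 if, for all operands in {0, 1}, the bitwise operators
   agree with the corresponding logical operators.  */
int one_bit_ops_match_truth_ops(void)
{
  for (unsigned x = 0; x <= 1; x++) {
    /* One-bit NOT is logical negation.  */
    if ((~x & 1u) != (unsigned)!x)
      return 0;
    for (unsigned y = 0; y <= 1; y++) {
      if ((x | y) != (unsigned)(x || y))
        return 0;
      if ((x & y) != (unsigned)(x && y))
        return 0;
      if ((x ^ y) != (unsigned)(x != y))
        return 0;
    }
  }
  return 1;
}
```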

Index: gcc/gcc/testsuite/gcc.dg/tree-ssa/vrp47.c
===
--- gcc.orig/gcc/testsuite/gcc.dg/tree-ssa/vrp47.c  2011-07-12 15:21:23.793440400 +0200
+++ gcc/gcc/testsuite/gcc.dg/tree-ssa/vrp47.c   2011-07-12 15:27:11.892259100 +0200
@@ -4,7 +4,7 @@
jumps when evaluating an && condition.  VRP is not able to optimize
this.  */
 /* { dg-do compile { target { ! "mips*-*-* s390*-*-*  avr-*-* mn10300-*-*" } } } */
-/* { dg-options "-O2 -fdump-tree-vrp -fdump-tree-dom" } */
+/* { dg-options "-O2 -fdump-tree-vrp" } */
 /* { dg-options "-O2 -fdump-tree-vrp -fdump-tree-dom -march=i586" { target { i?86-*-* && ilp32 } } } */

 int h(int x, int y)
@@ -36,13 +36,10 @@ int f(int x)
0 or 1.  */
 /* { dg-final { scan-tree-dump-times "\[xy\]\[^ \]* !=" 0 "vrp1" } } */

-/* This one needs more copy propagation that only happens in dom1.  */
-/* { dg-final { scan-tree-dump-times "x\[^ \]* & y" 1 "dom1" } } */
-/* { dg-final { scan-tree-dump-times "x\[^ \]* & y" 1 "vrp1" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "x\[^ \]* & y" 1 "vrp1" } } */

 /* These two are fully simplified by VRP.  */
 /* { dg-final { scan-tree-dump-times "x\[^ \]* \[|\] y" 1 "vrp1" } } */
 /* { dg-final { scan-tree-dump-times "x\[^ \]* \\^ 1" 1 "vrp1" } } */

 /* { dg-final { cleanup-tree-dump "vrp\[0-9\]" } } */
-/* { dg-final { cleanup-tree-dump "dom\[0-9\]" } } */
Index: gcc/gcc/tree-ssa-propagate.c
===
--- gcc.orig/gcc/tree-ssa-propagate.c   2011-07-12 15:21:23.804440400 +0200
+++ gcc/gcc/tree-ssa-propagate.c2011-07-12 15:28:22.83100 +0200
@@ -979,6 +979,9 @@ replace_phi_args_in (gimple phi, ssa_pro

DO_DCE is true if trivially dead stmts can be removed.

+   If DO_DCE is true, the statements within a BB are walked from
+   last to first element.  Otherwise we scan from first to last element.
+
Return TRUE when something changed.  */

 bool
@@ -1059,9 +1062,10 @@ substitute_and_fold (ssa_prop_get_value_
for (i = gsi_start_phis (bb); !gsi_end_p (i); gsi_next (&i))
  replace_phi_args_in (gsi_stmt (i), get_value_fn);

-  /* Propagate known values into stmts.  Do a backward walk to expose
-more trivially deletable stmts.  */
-  for (i = gsi_last_bb (bb); !gsi_end_p (i);)
+  /* Propagate known values into stmts.  Do a backward walk if
+ do_dce is true. In some case it exposes
+more trivially deletable stmts to walk backward.  */
+  for (i = (do_dce ? gsi_last_bb (bb) : gsi_start_bb (bb)); !gsi_end_p (i);)
{
   bool did_replace;
  gimple stmt = gsi_stmt (i);
@@ -1070,7 +1074,10 @@ substitute_and_fold (ssa_prop_get_value_
  gimple_stmt_iterator oldi;

  oldi = i;
- gsi_prev (&i);
+ if (do_dce)
+   gsi_prev (&i);
+ else
+   gsi_next (&i);

  /* Ignore ASSERT_EXPRs.  They are used by VRP to generate
 range information for names and they are discarded
Index: gcc/gcc/tree-vrp.c
===
--- gcc.orig/gcc/tree-vrp.c 2011-07-12 15:21:23.838440400 +0200
+++ gcc/gcc/tree-vrp.c  2011-07-12 15:27:11.976269800 +0200
@@ -2232,6 +2232,7 @@ extract_range_from_binary_expr (value_ra
  some cases.  */
   if (code != BIT_AND_EXPR
   && code != TRUTH_AND_EXPR
+  && code != BIT_IOR_EXPR
   && code != TRUTH_OR_EXPR
   && code != TRUNC_DIV_EXPR
   && code != FLOOR_DIV_EXPR
@@ -2291,6 +2292,8 @@ extract_range_from_binary_expr (value_ra
  

Re: [PATCH Atom][PR middle-end/44382] Tree reassociation improvement

2011-07-12 Thread H.J. Lu
On Tue, Jul 12, 2011 at 9:50 AM, William J. Schmidt
 wrote:
> Ilya, thanks for posting this!  This patch is useful also on powerpc64.
> Applying it solved a performance degradation with bwaves due to loss of
> reassociation somewhere between 4.5 and 4.6 (still tracking it down).
> When we apply -ftree-reassoc-width=2 to bwaves, the more optimal code
> generation returns.


It is good to know.  Ilya, please mention PR middle-end/44382
in ChangeLog.

Thanks.

H.J.
--


Re: CFT: Move unwinder to toplevel libgcc

2011-07-12 Thread Steve Ellcey

Rainer,

I did another GCC build with your libgcc patch and with fixes for the
two problems I already mentioned to you and got another failure:

libtool: compile:  
/wsp/sje/gcc_git/build-ia64-hp-hpux11.23-gcc/obj_gcc/./gcc/xgcc -shared-libgcc 
-B/wsp/sje/gcc_git/build-ia64-hp-hpux11.23-gcc/obj_gcc/./gcc -nostdinc++ 
-L/wsp/sje/gcc_git/build-ia64-hp-hpux11.23-gcc/obj_gcc/ia64-hp-hpux11.23/libstdc++-v3/src
 
-L/wsp/sje/gcc_git/build-ia64-hp-hpux11.23-gcc/obj_gcc/ia64-hp-hpux11.23/libstdc++-v3/src/.libs
 -B/wsp/sje/gcc_git/gcc-ia64-hp-hpux11.23-gcc/ia64-hp-hpux11.23/bin/ 
-B/wsp/sje/gcc_git/gcc-ia64-hp-hpux11.23-gcc/ia64-hp-hpux11.23/lib/ -isystem 
/wsp/sje/gcc_git/gcc-ia64-hp-hpux11.23-gcc/ia64-hp-hpux11.23/include -isystem 
/wsp/sje/gcc_git/gcc-ia64-hp-hpux11.23-gcc/ia64-hp-hpux11.23/sys-include 
-I/wsp/sje/gcc_git/src/gcc/libstdc++-v3/../gcc 
-I/wsp/sje/gcc_git/build-ia64-hp-hpux11.23-gcc/obj_gcc/ia64-hp-hpux11.23/libstdc++-v3/include/ia64-hp-hpux11.23
 
-I/wsp/sje/gcc_git/build-ia64-hp-hpux11.23-gcc/obj_gcc/ia64-hp-hpux11.23/libstdc++-v3/include
 -I/wsp/sje/gcc_git/src/gcc/libstdc++-v3/libsupc++ -fno-implicit-templates 
-Wall -Wextra -Wwrite-strings -Wcast-qual -fdiagnostics-show-location=once 
-ffunction-sections -fdata-sections -g -O2 -c 
/wsp/sje/gcc_git/src/gcc/libstdc++-v3/libsupc++/eh_call.cc  -fPIC -DPIC -o 
eh_call.o
/wsp/sje/gcc_git/src/gcc/libstdc++-v3/libsupc++/eh_call.cc:34:23: fatal error: 
unwind-pe.h: No such file or directory
compilation terminated.
make[4]: *** [eh_call.lo] Error 1
make[4]: Leaving directory 
`/wsp/sje/gcc_git/build-ia64-hp-hpux11.23-gcc/obj_gcc/ia64-hp-hpux11.23/libstdc++-v3/libsupc++'
make[3]: *** [all-recursive] Error 1

I think in my earlier build I was building C only (or maybe the compiler
only) and that is why I didn't see this problem.

I think the libstdc++ Makefile needs to add

-I/wsp/sje/gcc_git/src/gcc/libstdc++-v3/../libgcc

in addition to (or instead of)

-I/wsp/sje/gcc_git/src/gcc/libstdc++-v3/../gcc

Steve Ellcey
s...@cup.hp.com




Re: [PATCH Atom][PR middle-end/44382] Tree reassociation improvement

2011-07-12 Thread William J. Schmidt
On Tue, 2011-07-12 at 11:50 -0500, William J. Schmidt wrote:
> Ilya, thanks for posting this!  This patch is useful also on powerpc64.
> Applying it solved a performance degradation with bwaves due to loss of
> reassociation somewhere between 4.5 and 4.6 (still tracking it down).
> When we apply -ftree-reassoc-width=2 to bwaves, the more optimal code
> generation returns.
> 
> Bill
> 

However, it does not fix http://gcc.gnu.org/PR45671, which surprises me
as it was marked as a duplicate of this one.  Any thoughts on why this
isn't sufficient to reassociate the linear chain of adds?

Test case:

int myfunction (int a, int b, int c, int d, int e, int f, int g, int h)
{
  int ret;

  ret = a + b + c + d + e + f + g + h;
  return ret;

}
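
For reference, the two shapes in question can be written out by hand.
This is just a sketch with made-up function names: sum_linear spells out
the left-to-right chain the test case parses to, and sum_balanced is one
fully balanced form.  I am not claiming the latter is exactly what the
patch would emit for any particular width.

```c
/* Linear chain: every addition depends on the previous one.  */
int sum_linear(int a, int b, int c, int d, int e, int f, int g, int h)
{
  return ((((((a + b) + c) + d) + e) + f) + g) + h;
}

/* Balanced tree: independent additions can be issued in parallel.  */
int sum_balanced(int a, int b, int c, int d, int e, int f, int g, int h)
{
  return ((a + b) + (c + d)) + ((e + f) + (g + h));
}

/* Dependency depth of a linear chain over N operands.  */
int depth_linear(int n)
{
  return n - 1;
}

/* Dependency depth of a balanced tree over N operands, i.e. the
   smallest d with 2^d >= n.  */
int depth_balanced(int n)
{
  int d = 0;
  while ((1 << d) < n)
    d++;
  return d;
}
```

For the eight operands here, the linear form has a dependency depth of
seven additions and the balanced form a depth of three, at the cost of
keeping more intermediate values live.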




Re: [Ada] Fix --enable-build-with-cxx build

2011-07-12 Thread Eric Botcazou
> Certainly looks better to me.

Thanks, applied.  In any case, it's only a quantitative issue: since gigi will 
very likely use C++ features at some point, it needs to be compiled with the 
C++ compiler, which means that all the FE headers must be extern "C".  So half 
of the 33 files must be modified and I think that it's simpler to modify them 
all and use a single compilation scheme than maintaining two such schemes.
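
For anyone who has not seen the pattern, wrapping a header in an
extern "C" block looks roughly like this.  The declaration and its name
are invented for the example and do not come from any of the 33 files;
the guard keeps the header valid when it is included from C as well.

```c
/* Would live in a front-end header (hypothetical fe_example.h).  */
#ifdef __cplusplus
extern "C" {
#endif

/* Declared with C linkage so that C and C++ translation units agree
   on the symbol name.  */
int fe_example_add(int a, int b);

#ifdef __cplusplus
}
#endif

/* C implementation of the declaration above.  */
int fe_example_add(int a, int b)
{
  return a + b;
}
```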

-- 
Eric Botcazou


Re: CFT: Move unwinder to toplevel libgcc

2011-07-12 Thread Rainer Orth
Steve,

> /wsp/sje/gcc_git/src/gcc/libstdc++-v3/libsupc++/eh_call.cc:34:23: fatal 
> error: unwind-pe.h: No such file or directory
> compilation terminated.
> make[4]: *** [eh_call.lo] Error 1
> make[4]: Leaving directory 
> `/wsp/sje/gcc_git/build-ia64-hp-hpux11.23-gcc/obj_gcc/ia64-hp-hpux11.23/libstdc++-v3/libsupc++'
> make[3]: *** [all-recursive] Error 1
>
> I think in my earlier build I was building C only (or maybe the compiler 
> only) and that is why I didn't
> see this problem.
>
> I think the libstdc++ Makefile needs to add
>
> -I/wsp/sje/gcc_git/src/gcc/libstdc++-v3/../libgcc
>
> in addition (or instead of)
>
> -I/wsp/sje/gcc_git/src/gcc/libstdc++-v3/../gcc

this is strange: my patch already includes this snippet:

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -685,9 +685,9 @@ AC_DEFUN([GLIBCXX_EXPORT_INCLUDES], [
   fi
 
   # Stuff in the actual top level.  Currently only used by libsupc++ to
-  # get unwind* headers from the gcc dir.
-  #TOPLEVEL_INCLUDES='-I$(toplevel_srcdir)/gcc -I$(toplevel_srcdir)/include'
-  TOPLEVEL_INCLUDES='-I$(toplevel_srcdir)/gcc'
+  # get unwind* headers from the libgcc dir.
+  #TOPLEVEL_INCLUDES='-I$(toplevel_srcdir)/libgcc -I$(toplevel_srcdir)/include'
+  TOPLEVEL_INCLUDES='-I$(toplevel_srcdir)/libgcc'
 
   # Now, export this to all the little Makefiles
   AC_SUBST(GLIBCXX_INCLUDES)

After rebuilding libstdc++-v3/configure, you should be fine.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: CFT: Move unwinder to toplevel libgcc

2011-07-12 Thread Steve Ellcey
On Tue, 2011-07-12 at 19:25 +0200, Rainer Orth wrote:

> After rebuilding libstdc++-v3/configure, you should be fine.
> 
>   Rainer

Ah, I may have forgotten to run autoconf.  My original testing was with
a subversion workarea; I am now trying to do it in a git area that I
created, and I may have forgotten to run autoconf in the new git area
after applying your patch.

Steve Ellcey
s...@cup.hp.com



Re: [C++-0x] User defined literals.

2011-07-12 Thread Jason Merrill
Hmm, sorry I didn't notice this patch until now.  Please CC me on C++ 
patches, and feel free to ping me if I don't respond promptly.  Thanks 
for working on this!



> 2. Templates don't really work.  This is proving difficult for my little brain.
>
> I need to parse this:
> 123_abc;
> and replace it with a call to
>
> operator"" _abc<'1', '2', '3'>();
>
> I can call
> operator"" _abc<>();
>
> In other words I can't figure out how to get the template args in.
>
> There's stuff in there getting the template function out of OVERLOAD.
> Then I try tsubst.  No dice.

Can you build up a TEMPLATE_ID_EXPR and then let the usual overload 
resolution code do the rest?

> 3. I can't get friend to work.  The operator definition doesn't even get
> written.
> I'm deferring this until after I get templates to work.

If you're talking about your udlit-friend.C test, I wouldn't expect that 
to work; friends are only found by argument-dependent lookup, and there 
is no argument of type Foo in the call

 operator""_Bar('x')

> +#define UDLIT_OP_MANGLED_PREFIX "__udlit"


Discussion on the ABI list seems to have settled on

  li <source-name>

as the mangling of a literal operator.

Rather than check lang == CXX0X or GNUCXX0X in libcpp, let's add a C++0x 
flag.


There doesn't seem to be any need for USERDEF_LITERAL to be in the 
language-independent code; bits can go in c-common.h and c-common.def 
rather than tree.h and tree.def, etc.


Thanks,
Jason


Build failure (Re: [PATCH] Make VRP optimize useless conversions)

2011-07-12 Thread Ulrich Weigand
Richard Guenther wrote:

> 2011-07-11  Richard Guenther  
> 
>   * tree-vrp.c (simplify_conversion_using_ranges): Manually
>   translate the source value-range through the conversion chain.

This causes a build failure in cachemgr.c on spu-elf.  A slightly
modified simplified test case also fails on i386-linux:

void *
test (unsigned long long x, unsigned long long y)
{
  return (void *) (unsigned int) (x / y);
}

compiled with -O2 results in:

test.i: In function 'test':
test.i:3:1: error: invalid types in nop conversion
void *
long long unsigned int
D.1962_5 = (void *) D.1963_3;

test.i:3:1: internal compiler error: verify_gimple failed

Any thoughts?

Thanks,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: CFT: [build] Move fp-bit support to toplevel libgcc

2011-07-12 Thread Richard Henderson
On 07/12/2011 10:07 AM, Rainer Orth wrote:
> Only a couple of special defines (like FLOAT_WORD_ORDER_MISMATCH,
> QUIET_NAN_NEGATED) are moved to special t-* files in libgcc/config with
> [FDT]PBIT_CFLAGS similar to e.g. LIBGCC_SYNC_CFLAGS.  If it were
> possible to have gcc define some __LIBGCC_* macro corresponding to them,
> that would allow for further simplification.  Only if this mechanism
> couldn't handle the requirements have I resorted to introducing
> libgcc_tm_file snippets to handle them.

Re QUIET_NAN_NEGATED, it seems like we should be able to make use
of the __builtin_nan("") function.

Perhaps

  if (isnan (src))
    {
      FLO_type ret = __builtin_nan("");
      if (sign)
        ret = -ret;
      return ret;
    }

... assuming __builtin_nan gets re-defined in fp-bit for the type
as appropriate.
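
A standalone sketch of the same idea, for experimentation outside fp-bit.  It assumes FLO_type is double (fp-bit selects the real type itself) and a compiler providing the GCC builtin; toy_make_nan is a made-up name.

```c
#include <math.h>

/* Assumption for this sketch only: fp-bit picks FLO_type per build.  */
typedef double FLO_type;

/* Return a quiet NaN, negated when SIGN is nonzero.  Negation only
   flips the sign bit, so the result is still a NaN.  */
static FLO_type
toy_make_nan (int sign)
{
  FLO_type ret = __builtin_nan ("");
  if (sign)
    ret = -ret;
  return ret;
}
```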


r~


Re: [pph] Mark c4pr36533.cc fixed (issue4708041)

2011-07-12 Thread Gabriel Charette
My patch at http://codereview.appspot.com/4657092/ was fixing this.

Did you apply that already? I didn't see it as part of your commits to date?

Gab

On Tue, Jul 12, 2011 at 5:55 AM, Diego Novillo  wrote:
>
> Not quite sure what patch fixed this one and I'm too lazy to figure it out
> now.
>
> Committed to branch.
>
>
> Diego.
>
>        * g++.dg/pph/c4pr36533.cc: Mark fixed.
>
> diff --git a/gcc/testsuite/g++.dg/pph/c4pr36533.cc 
> b/gcc/testsuite/g++.dg/pph/c4pr36533.cc
> index 0093db1..ce2bf1f 100644
> --- a/gcc/testsuite/g++.dg/pph/c4pr36533.cc
> +++ b/gcc/testsuite/g++.dg/pph/c4pr36533.cc
> @@ -1,3 +1,2 @@
>  /* { dg-options "-w -fpermissive" } */
> -// pph asm xdiff
>  #include "c4pr36533.h"
>
> --
> This patch is available for review at http://codereview.appspot.com/4708041


Re: [pph] Add alternate addresses to register in the cache (issue4685054)

2011-07-12 Thread Gabriel Charette
Re-adding gcc-patches (forgot to send plain text last time...sigh!)

On Tue, Jul 12, 2011 at 10:56 AM, Gabriel Charette  wrote:
> I like this implementation!
> Only one thing, if we ACTUALLY want "to_register" NULL instead of the read
> value we can't as in your current implementation NULL means don't do the
> alternate registration.
> Could that be a problem or do we expect the structures we pass to this to
> always be non-null? (i.e. for scope_chain->bindings this is definitely ok,
> but what about other potential uses?), if we keep it this way we should at
> least make it clear in the function's comment that NULL means don't do
> alternate registration I think.
> Gab
>
> On Mon, Jul 11, 2011 at 2:42 PM, Diego Novillo  wrote:
>>
>> This patch adapts an idea from Gab that allows us to register alternate
>> addresses in the cache.  The problem here is making sure that symbols
>> read from a PPH file reference the right bindings.
>>
>> If a symbol is in the global namespace when compiling a header file,
>> its bindings will point to NAMESPACE_LEVEL(global_namespace)->bindings,
>> but that global_namespace is the global_namespace instantiated for the
>> header file.  When reading that PPH image from a translation unit, we
>> need to refer to the bindings of the *current* global_namespace.
>>
>> In general we solve this by inserting the pointer in the streamer
>> cache.  For instance, to avoid instantiating a second global_namespace
>> decl, the initialization code of both the writer and the reader store
>> global_namespace into the streaming cache.  This way, all the
>> references to global_namespace point to the current global_namespace
>> as known by the writer and the reader.
>>
>> However, we cannot use the same trick on the bindings for
>> global_namespace.  If we simply inserted it into the cache then
>> writing out NAMESPACE_LEVEL(global_namespace)->bindings would simply
>> write a reference to the current one and on the reader side, it would
>> simply restore a pointer to the current translation unit's bindings.
>> Without ever actually writing or reading anything (since it was
>> satisfied from the cache).
>>
>> Therefore, we want a mechanism that allows the reader to: (a) read all
>> the symbols in the global bindings, and (b) make references to the
>> global binding made by the symbols point to the global bindings
>> of the current translation unit (instead of the one in the PPH image).
>>
>> That's where ALLOC_AND_REGISTER_ALTERNATE comes in.  When called, it
>> allocates the data structure but registers another pointer in the
>> cache.  We use this trick when calling pph_in_binding_level from the
>> toplevel:
>>
>> +  new_bindings = pph_in_binding_level (stream, scope_chain->bindings);
>>
>> This way, when pph_in_binding_level tries to allocate the binding
>> structure read from STREAM, it registers scope_chain->bindings in the
>> cache.  This way, references to the original file's global binding are
>> automatically redirected to the current translation unit's global
>> bindings.
>>
>> Gab, I modified your original implementation to move all the logic to
>> the place where we need to make this decision.  This way, it is easier
>> to tell which functions need this alternate registration, instead of
>> relying on some status flag squirreled away in the STREAM data
>> structure.
>>
>>
>> Tested on x86_64.  Applied to branch.
>>
>> 2011-07-11   Diego Novillo  
>>             Gabriel Charette  
>>
>>        * pph-streamer-in.c (ALLOC_AND_REGISTER_ALTERNATE): Define.
>>        (pph_in_binding_level): Add argument TO_REGISTER.  Call
>>        ALLOC_AND_REGISTER_ALTERNATE if set.
>>        Update all users.
>>        (pph_register_decls_in_symtab): Call varpool_finalize_decl
>>        on all file-local symbols.
>>        (pph_in_scope_chain): Call pph_in_binding_level with
>>        scope_chain->bindings as the alternate pointer to
>>        register in the streaming cache.
>>
>> diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
>> index 1011902..f18c2f4 100644
>> --- a/gcc/cp/ChangeLog.pph
>> +++ b/gcc/cp/ChangeLog.pph
>> @@ -1,3 +1,16 @@
>> +2011-07-11   Diego Novillo  
>> +            Gabriel Charette  
>> +
>> +       * pph-streamer-in.c (ALLOC_AND_REGISTER_ALTERNATE): Define.
>> +       (pph_in_binding_level): Add argument TO_REGISTER.  Call
>> +       ALLOC_AND_REGISTER_ALTERNATE if set.
>> +       Update all users.
>> +       (pph_register_decls_in_symtab): Call varpool_finalize_decl
>> +       on all file-local symbols.
>> +       (pph_in_scope_chain): Call pph_in_binding_level with
>> +       scope_chain->bindings as the alternate pointer to
>> +       register in the streaming cache.
>> +
>>  2011-07-07   Diego Novillo  
>>
>>        * pph-streamer-in.c (pph_register_decls_in_symtab): Rename
>> diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
>> index 571ebf5..903cd94 100644
>> --- a/gcc/cp/pph-streamer-in.c
>> +++ b/gcc/cp/pph-streamer-in.c
>> @@ -42,6 +42,18 @@ alon

Re: [build] Move darwin-crt[23].c to toplevel libgcc

2011-07-12 Thread Mike Stump
On Jul 12, 2011, at 9:31 AM, Rainer Orth wrote:
> As a prerequisite to moving i386/crtprec.c to toplevel libgcc, it turned
> out that I need to move darwin-crt[23].c first to avoid the problem with
> inconsistent versions of extra_parts in gcc and libgcc:
> 
>   http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00831.html
> 
> The following pretty mechanical patch does this.
> 
> Bootstraps on i386-apple-darwin9.8.0 and powerpc-apple-darwin9.8.0 are
> well beyond stage1 now, so libgcc and thus crt?.o build correctly.
> 
> Ok for mainline if they pass?

Ok.


Re: [build] Move i386/crtprec to toplevel libgcc

2011-07-12 Thread Mike Stump
On Jul 12, 2011, at 9:37 AM, Rainer Orth wrote:
> The next easy step in toplevel libgcc migration is moving
> i386/crtprec.c.  I noticed that -mpc{32, 64, 80} wasn't supported on
> Solaris/x86 yet and corrected that.  The only testcase using the switch
> was adapted to also do so on Darwin/x86 (which already has the support,
> but didn't exercise it).

> Ok for mainline?

Darwin bits Ok.



Re: [pph] Mark c4pr36533.cc fixed (issue4708041)

2011-07-12 Thread Diego Novillo
On Tue, Jul 12, 2011 at 14:02, Gabriel Charette  wrote:
> My patch at http://codereview.appspot.com/4657092/ was fixing this.
>
> Did you apply that already? I didn't see it as part of your commits to date?

Ah, so that was it.  No, it wasn't 4657092.  I think we both fixed
this independently.  I now have a variant of 4657092 in my tree, but
it's still causing other grief.  I should have it ready today.


Diego.


Re: [pph] Add alternate addresses to register in the cache (issue4685054)

2011-07-12 Thread Diego Novillo
On Tue, Jul 12, 2011 at 13:56, Gabriel Charette  wrote:
> I like this implementation!
> Only one thing, if we ACTUALLY want "to_register" NULL instead of the read
> value we can't as in your current implementation NULL means don't do the
> alternate registration.

I don't think that's a problem.  Note too that the original
implementation also treated NULL to mean "don't do alternate
registration".


Diego.
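
To make the NULL convention concrete, here is a toy model of the behaviour discussed in this thread.  It is not the pph-streamer code: the cache layout and every name in it are invented for illustration.  The helper allocates the structure being read, but records the alternate pointer in the cache when one is given; NULL means "no alternate", so the fresh allocation is recorded instead.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the streamer cache: slot index -> pointer.  */
struct toy_cache
{
  void *slots[8];
};

/* Allocate SIZE zeroed bytes for the data being read and record a
   pointer in slot IX (the caller keeps IX below 8).  When TO_REGISTER
   is non-NULL it is recorded instead of the new allocation.  A NULL
   TO_REGISTER means "no alternate registration", so NULL itself can
   never be the registered value.  */
static void *
toy_alloc_and_register (struct toy_cache *cache, unsigned ix,
                        size_t size, void *to_register)
{
  void *data = calloc (1, size);
  cache->slots[ix] = to_register ? to_register : data;
  return data;
}
```

With this shape, later references resolved through slot IX see the alternate pointer, while the caller still gets the freshly read structure back.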


Re: [pph] Add alternate addresses to register in the cache (issue4685054)

2011-07-12 Thread Gabriel Charette
Right, I remember my original implementation had the same behaviour,
but I'm pretty sure I had a comment mentioning that in the function
usage comment. I'm just saying it should be mentioned what passing
NULL means (especially since we do it all over the place).

On Tue, Jul 12, 2011 at 11:14 AM, Diego Novillo  wrote:
> On Tue, Jul 12, 2011 at 13:56, Gabriel Charette  wrote:
>> I like this implementation!
>> Only one thing, if we ACTUALLY want "to_register" NULL instead of the read
>> value we can't as in your current implementation NULL means don't do the
>> alternate registration.
>
> I don't think that's a problem.  Note too that the original
> implementation also treated NULL to mean "don't do alternate
> registration".
>
>
> Diego.
>


[i386, darwin] Fix pr/49714

2011-07-12 Thread Richard Henderson
It *appears* as if the references we were generating before
switching the thunk to rtl weren't valid.  Certainly those
same references don't pass legitimate_address_p, which is
the direct cause of the assertion failure.

Fixed by using the same address transformation that 
ix86_expand_call would have used, rather than doing
something by hand.

Thanks to Dominiq for testing.  Committed.


r~
PR target/49714
* config/i386/i386.c (x86_output_mi_thunk): Use
machopic_indirect_call_target instead of machopic_indirection_name 
directly.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 4ca95ab..9f63bf7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -29409,12 +29409,8 @@ x86_output_mi_thunk (FILE *file,
 #if TARGET_MACHO
   else if (TARGET_MACHO)
{
- rtx sym_ref = XEXP (DECL_RTL (function), 0);
- if (TARGET_MACHO_BRANCH_ISLANDS)
-   sym_ref = (gen_rtx_SYMBOL_REF
-  (Pmode,
-   machopic_indirection_name (sym_ref, /*stub_p=*/true)));
- fnaddr = gen_rtx_MEM (Pmode, sym_ref);
+ fnaddr = machopic_indirect_call_target (DECL_RTL (function));
+ fnaddr = XEXP (fnaddr, 0);
}
 #endif /* TARGET_MACHO */
   else


Re: [pph] Add alternate addresses to register in the cache (issue4685054)

2011-07-12 Thread Diego Novillo
On Tue, Jul 12, 2011 at 14:23, Gabriel Charette  wrote:
> Right, I remember my original implementation had the same behaviour,
> but I'm pretty sure I had a comment mentioning that in the function
> usage comment. I'm just saying it should be mentioned what passing
> NULL means (especially since we do it all over the place).

Oh, absolutely.


Diego.


[Committed/Obvious] Fix PR 49474: ICE on ppc-linux with -O3 in cprop.c

2011-07-12 Thread Andrew Pinski
Hi,
  The problem here is the code reads:
   /* Check for more than one successor.  */
 if (! EDGE_COUNT (bb->succs) > 1)
But that expression is always false as ! has a higher precedence than
> does.  So the obvious thing is to rewrite this statement as:
  if (EDGE_COUNT (bb->succs) <= 1)
And that fixes the problem.  The only situation where this matters is
with __builtin_unreachable, as fis_get_condition returns null in all
other cases where there is only one successor edge, and we check that
result.

Committed as obvious after a bootstrap/test on x86_64-linux-gnu.
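
The precedence issue can be seen in isolation with a small sketch (function names are made up; N stands in for the successor count).  The first function spells out how the old condition is parsed, the second is the corrected test.

```c
/* The old condition `! n > 1' is parsed as `(! n) > 1'.  Written
   out in two steps: the negation yields 0 or 1, which is never
   greater than 1, so the result is always 0.  */
static int
old_condition (unsigned n)
{
  int negated = ! n;
  return negated > 1;
}

/* The corrected condition: true when there is at most one successor.  */
static int
new_condition (unsigned n)
{
  return n <= 1;
}
```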

Thanks,
Andrew Pinski

ChangeLog:


Re: [Committed/Obvious] Fix PR 49474: ICE on ppc-linux with -O3 in cprop.c

2011-07-12 Thread Andrew Pinski
On Tue, Jul 12, 2011 at 11:42 AM, Andrew Pinski  wrote:
> Hi,
>  The problem here is the code reads:
>       /* Check for more than one successor.  */
>     if (! EDGE_COUNT (bb->succs) > 1)
> But that expression is always false as ! has a higher precedence than
>> does.  So the obvious thing is to rewrite this statement as:
>  if (EDGE_COUNT (bb->succs) <= 1)
> And that fixes the problem.  The only case where this case matters is
> with __builtin_unreachable as fis_get_condition will return null for
> all other cases where there are only one successor edge and we check
> that result.
>
> Committed as obvious after a bootstrap/test on x86_64-linux-gnu.
>
> Thanks,
> Andrew Pinski
>
> ChangeLog:

* cprop.c (find_implicit_sets): Correct the condition.
Index: testsuite/gcc.c-torture/compile/pr49474.c
===
--- testsuite/gcc.c-torture/compile/pr49474.c   (revision 0)
+++ testsuite/gcc.c-torture/compile/pr49474.c   (revision 0)
@@ -0,0 +1,16 @@
+typedef struct gfc_formal_arglist
+{
+  int next;
+}
+gfc_actual_arglist;
+update_arglist_pass (gfc_actual_arglist* lst, int po, unsigned argpos,
+   const char *name)
+{
+  ((void)(__builtin_expect(!(argpos > 0), 0) ? __builtin_unreachable(), 0 : 
0));
+  if (argpos == 1)
+  return 0;
+  if (lst)
+lst->next = update_arglist_pass (lst->next, po, argpos - 1, name);
+  else
+lst = update_arglist_pass (((void *)0), po, argpos - 1, name);
+}
Index: cprop.c
===
--- cprop.c (revision 176187)
+++ cprop.c (working copy)
@@ -1332,7 +1332,7 @@ find_implicit_sets (void)
   FOR_EACH_BB (bb)
 {
   /* Check for more than one successor.  */
-  if (! EDGE_COUNT (bb->succs) > 1)
+  if (EDGE_COUNT (bb->succs) <= 1)
continue;
 
   cond = fis_get_condition (BB_END (bb));


[dwarf2cfi] Fix pr/49713

2011-07-12 Thread Richard Henderson
On 07/12/2011 02:29 AM, Richard Earnshaw wrote:
> On 12/07/11 10:05, Andreas Schwab wrote:
>> I think this has caused the bootstrap failure on ia64:
>>
>> In file included from ../../gcc/dwarf2cfi.c:31:0:
>> ../../gcc/dwarf2out.h: In function 'dwarf_frame_regnum':
>> ../../gcc/dwarf2out.h:271:3: error: implicit declaration of function 
>> 'ia64_dbx_register_number' [-Werror=implicit-function-declaration]
>>
>> Andreas.
>>
> 
> And on ARM (PR49713)

Ok, I've removed the inline from the header.

In the process I found a place in dwarf2out that was using dbx 
register numbers instead of dwarf2 register numbers.  Tsk Tsk.
Of course, on most targets they're the same thing.  But still...

Tested on x86_64-linux, arm-eabi (arm-sim).
Cross-compiled ia64-linux, mips-sgi-irix6.5.


r~
PR target/49713
* dwarf2out.h (dwarf_frame_regnum): Remove.
* dwarf2out.c (based_loc_descr): Revert last change.  Initialize regno
earlier from DWARF_FRAME_REGNUM.  Never use dbx_reg_number.
* dwarf2cfi.c (dw_stack_pointer_regnum, dw_frame_pointer_regnum): New.
(execute_dwarf2_frame): Initialize them.
(DW_STACK_POINTER_REGNUM, DW_FRAME_POINTER_REGNUM): Remove; replace
users of the macros with the variables.
(expand_builtin_dwarf_sp_column): Revert last change.
(expand_builtin_init_dwarf_reg_sizes): Likewise.  Compute the
result of DWARF_FRAME_REGNUM into a local variable.
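
A toy model of the macro-to-variable change described in the ChangeLog above.  All names and numbers here are hypothetical; the point is only that the target mapping is evaluated once, inside a file that can see the target's declarations, and the unsigned result is cached for later comparisons.

```c
/* Hypothetical target hook: may be a function call with a signed
   result, and may only be declared in some translation units.  */
static int
toy_map_regnum (int regno)
{
  return regno + 10;
}

#define TOY_FRAME_REGNUM(R) toy_map_regnum (R)
#define TOY_STACK_POINTER_REGNUM 7

/* Cached, unsigned copy of the mapped stack pointer number.  */
static unsigned toy_stack_pointer_regnum;

/* Called once at the start of the pass.  */
static void
toy_init_regnums (void)
{
  toy_stack_pointer_regnum = TOY_FRAME_REGNUM (TOY_STACK_POINTER_REGNUM);
}

/* Users compare against the variable rather than expanding the macro.  */
static int
toy_cfa_is_sp (unsigned cfa_reg)
{
  return cfa_reg == toy_stack_pointer_regnum;
}
```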



diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index 1c76b3f..4e648ae 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -57,10 +57,6 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Maximum size (in bytes) of an artificially generated label.  */
 #define MAX_ARTIFICIAL_LABEL_BYTES 30
-
-/* Short-hand for commonly used register numbers.  */
-#define DW_STACK_POINTER_REGNUM  dwarf_frame_regnum (STACK_POINTER_REGNUM)
-#define DW_FRAME_POINTER_REGNUM  dwarf_frame_regnum (HARD_FRAME_POINTER_REGNUM)
 
 /* A vector of call frame insns for the CIE.  */
 cfi_vec cie_cfi_vec;
@@ -78,6 +74,10 @@ static bool emit_cfa_remember;
 
 /* True if any CFI directives were emitted at the current insn.  */
 static bool any_cfis_emitted;
+
+/* Short-hand for commonly used register numbers.  */
+static unsigned dw_stack_pointer_regnum;
+static unsigned dw_frame_pointer_regnum;
 
 
 static void dwarf2out_cfi_begin_epilogue (rtx insn);
@@ -89,7 +89,7 @@ static void dwarf2out_frame_debug_restore_state (void);
 rtx
 expand_builtin_dwarf_sp_column (void)
 {
-  unsigned int dwarf_regnum = DW_STACK_POINTER_REGNUM;
+  unsigned int dwarf_regnum = DWARF_FRAME_REGNUM (STACK_POINTER_REGNUM);
   return GEN_INT (DWARF2_FRAME_REG_OUT (dwarf_regnum, 1));
 }
 
@@ -117,7 +117,8 @@ expand_builtin_init_dwarf_reg_sizes (tree address)
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
 {
-  unsigned int rnum = DWARF2_FRAME_REG_OUT (dwarf_frame_regnum (i), 1);
+  unsigned int dnum = DWARF_FRAME_REGNUM (i);
+  unsigned int rnum = DWARF2_FRAME_REG_OUT (dnum, 1);
 
   if (rnum < DWARF_FRAME_REGISTERS)
{
@@ -127,7 +128,7 @@ expand_builtin_init_dwarf_reg_sizes (tree address)
 
  if (HARD_REGNO_CALL_PART_CLOBBERED (i, save_mode))
save_mode = choose_hard_reg_mode (i, 1, true);
- if (dwarf_frame_regnum (i) == DWARF_FRAME_RETURN_COLUMN)
+ if (dnum == DWARF_FRAME_RETURN_COLUMN)
{
  if (save_mode == VOIDmode)
continue;
@@ -812,10 +813,10 @@ dwarf2out_args_size (HOST_WIDE_INT size)
 static void
 dwarf2out_stack_adjust (HOST_WIDE_INT offset)
 {
-  if (cfa.reg == DW_STACK_POINTER_REGNUM)
+  if (cfa.reg == dw_stack_pointer_regnum)
 cfa.offset += offset;
 
-  if (cfa_store.reg == DW_STACK_POINTER_REGNUM)
+  if (cfa_store.reg == dw_stack_pointer_regnum)
 cfa_store.offset += offset;
 
   if (ACCUMULATE_OUTGOING_ARGS)
@@ -861,7 +862,7 @@ dwarf2out_notice_stack_adjust (rtx insn, bool after_p)
 
   /* If only calls can throw, and we have a frame pointer,
  save up adjustments until we see the CALL_INSN.  */
-  if (!flag_asynchronous_unwind_tables && cfa.reg != DW_STACK_POINTER_REGNUM)
+  if (!flag_asynchronous_unwind_tables && cfa.reg != dw_stack_pointer_regnum)
 {
   if (CALL_P (insn) && !after_p)
{
@@ -955,13 +956,13 @@ static GTY(()) VEC(reg_saved_in_data, gc) 
*regs_saved_in_regs;
 static GTY(()) reg_saved_in_data *cie_return_save;
 
 /* Short-hand inline for the very common D_F_R (REGNO (x)) operation.  */
-/* ??? This ought to go into dwarf2out.h alongside dwarf_frame_regnum,
-   except that dwarf2out.h is used in places where rtl is prohibited.  */
+/* ??? This ought to go into dwarf2out.h, except that dwarf2out.h is
+   used in places where rtl is prohibited.  */
 
 static inline unsigned
 dwf_regno (const_rtx reg)
 {
-  return dwarf_frame_regnum (REGNO (reg));
+  return DWARF_FRAME_REGNUM (REGNO (reg));
 }
 
 /* Compare X and Y for equivalence.  The inputs may be REGs or PC_RTX.  */
@@ -1

[pph] Fix 3 asm differences (issue4695048)

2011-07-12 Thread Diego Novillo

This patch is a slight adaptation of Gab's fix to the order in which
we stream chains (http://codereview.appspot.com/4657092).  I mostly
just changed how we keep the temporary list to reverse (it now uses a
VEC instead of a custom-built linked list).

The other change is in the reader.  We were not registering symbols in
scope_chain->static_aggregates as they come from the PPH file (which
would cause an ICE in x1hardlookup.cc).

This fixes 3 tests, but we still have some asm differences that are
similar in nature: when reinstantiating PPH images, the compiler emits
some symbols in different order, causing different numbering and
naming in the assembler output (we need to generate identical output
from a pph or from text).
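
As a rough illustration of why the writer emits chains in reverse: if a chain is grown by prepending, a reader that also prepends will restore the original order only when the elements arrive back to front.  The sketch below uses a plain int list and an array as the "stream"; none of these names exist in the pph code.

```c
#include <stdlib.h>

struct toy_node
{
  int value;
  struct toy_node *next;
};

/* Prepend VALUE to LIST, the way a chain is grown from the front.  */
static struct toy_node *
toy_push_front (struct toy_node *list, int value)
{
  struct toy_node *n = malloc (sizeof *n);
  if (!n)
    abort ();
  n->value = value;
  n->next = list;
  return n;
}

/* "Write" up to CAP elements of LIST into OUT in reverse order.
   Returns the number of elements written.  */
static size_t
toy_write_reversed (const struct toy_node *list, int *out, size_t cap)
{
  const struct toy_node *p;
  size_t n = 0, i;

  for (p = list; p && n < cap; p = p->next)
    n++;
  i = n;
  for (p = list; p && i > 0; p = p->next)
    out[--i] = p->value;
  return n;
}

/* "Read" COUNT values, prepending each one as it arrives.  */
static struct toy_node *
toy_read_chain (const int *in, size_t count)
{
  struct toy_node *list = NULL;
  size_t i;

  for (i = 0; i < count; i++)
    list = toy_push_front (list, in[i]);
  return list;
}
```

Writing back to front and reading with prepend cancel out, so the reader needs no reversal pass of its own.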

Tested on x86_64.  Committed to branch.


Diego.

2011-07-12   Diego Novillo  

* pph-streamer-in.c (pph_register_decl_in_symtab): New.
(pph_register_binding_in_symtab): Rename from
pph_register_decls_in_symtab.  Update all users.
Do not call nreverse on bl->names and bl->namespaces.
Call pph_register_decl_in_symtab.
(pph_read_file_contents): Register decls in
FILE_STATIC_AGGREGATES.

2011-07-12  Gabriel Charette  
Diego Novillo  

* pph-streamer-out.c (pph_out_chained_tree): New.
(pph_out_chain_filtered): Add REVERSE_P parameter.
If REVERSE_P is set, write the list in reverse order.
Update all users.
(pph_out_binding_level): Write out lists bl->names,
bl->namespaces, bl->usings and bl->using_directives in
reverse.


testsuite/ChangeLog.pph
2011-07-12  Gabriel Charette  

* g++.dg/pph/c1pr44948-1a.cc: Mark fixed.
* g++.dg/pph/c2pr36533.cc: Mark fixed.
* g++.dg/pph/x2functions.cc: Mark fixed.

diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 63f8965..e7d1d00 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -1175,29 +1175,33 @@ pph_in_lang_type (pph_stream *stream)
 }
 
 
+/* Register DECL with the middle end.  */
+
+static void
+pph_register_decl_in_symtab (tree decl)
+{
+  if (TREE_CODE (decl) == VAR_DECL
+  && TREE_STATIC (decl)
+  && !DECL_EXTERNAL (decl))
+varpool_finalize_decl (decl);
+}
+
+
 /* Register all the symbols in binding level BL in the callgraph symbol
table.  */
 
 static void
-pph_register_decls_in_symtab (struct cp_binding_level *bl)
+pph_register_binding_in_symtab (struct cp_binding_level *bl)
 {
   tree t;
 
-  /* The chains are built backwards (ref: add_decl_to_level),
- reverse them before putting them back in.  */
-  bl->names = nreverse (bl->names);
-  bl->namespaces = nreverse (bl->namespaces);
-
+  /* Add file-local symbols to the varpool.  */
   for (t = bl->names; t; t = DECL_CHAIN (t))
-{
-  /* Add file-local symbols to the varpool.  */
-  if (TREE_CODE (t) == VAR_DECL && TREE_STATIC (t) && !DECL_EXTERNAL (t))
-   varpool_finalize_decl (t);
-}
+pph_register_decl_in_symtab (t);
 
   /* Recurse into the namespaces contained in BL.  */
   for (t = bl->namespaces; t; t = DECL_CHAIN (t))
-pph_register_decls_in_symtab (NAMESPACE_LEVEL (t));
+pph_register_binding_in_symtab (NAMESPACE_LEVEL (t));
 }
 
 
@@ -1220,7 +1224,7 @@ pph_in_scope_chain (pph_stream *stream)
   new_bindings = pph_in_binding_level (stream, scope_chain->bindings);
 
   /* Register all the symbols in STREAM with the call graph.  */
-  pph_register_decls_in_symtab (new_bindings);
+  pph_register_binding_in_symtab (new_bindings);
 
   /* Merge the bindings from STREAM into saved_scope->bindings.  */
   chainon (cur_bindings->names, new_bindings->names);
@@ -1413,6 +1417,16 @@ pph_read_file_contents (pph_stream *stream)
   file_static_aggregates = pph_in_tree (stream);
   static_aggregates = chainon (file_static_aggregates, static_aggregates);
 
+  /* Register all symbols in FILE_STATIC_AGGREGATES with the middle end.
+ Each element of this list is an INIT_EXPR expression.  */
+  for (t = file_static_aggregates; t; t = TREE_CHAIN (t))
+{
+  tree lhs = TREE_OPERAND (TREE_PURPOSE (t), 0);
+  tree rhs = TREE_OPERAND (TREE_PURPOSE (t), 1);
+  pph_register_decl_in_symtab (lhs);
+  pph_register_decl_in_symtab (rhs);
+}
+
   /* Expand all the functions with bodies that we read from STREAM.  */
   FOR_EACH_VEC_ELT (tree, stream->fns_to_expand, i, fndecl)
 {
diff --git a/gcc/cp/pph-streamer-out.c b/gcc/cp/pph-streamer-out.c
index f7bf739..9c9a1f8 100644
--- a/gcc/cp/pph-streamer-out.c
+++ b/gcc/cp/pph-streamer-out.c
@@ -584,21 +584,44 @@ pph_out_label_binding (pph_stream *stream, 
cp_label_binding *lb, bool ref_p)
 }
 
 
+/* Outputs chained tree T by nulling out its chain first and restoring it
+   after the streaming is done. STREAM and REF_P are as in
+   pph_out_chain_filtered.  */
+
+static inline void
+pph_out_chained_tree (pph_stream *stream, tree t, bool ref_p)
+{
+  tree saved_chain;
+
+  saved_chain = TREE_CHAIN (t);
+  TREE_CH

Re: [pph] Stream DECL_CHAIN only for VAR/FUNCTION_DECLs that are part of a RECORD_OR_UNION_TYPE (issue4672055)

2011-07-12 Thread Diego Novillo
On Fri, Jul 8, 2011 at 21:20, Gabriel Charette  wrote:

> 2011-07-08  Gabriel Charette  
>
>        * pph-streamer-in.c (pph_in_function_decl): Stream in
>        DECL_CHAIN of FUNCTION_DECL only if it's part of a RECORD_OR_UNION_TYPE
>        (pph_read_tree): Stream in DECL_CHAIN of VAR_DECL only if it's part
>        of a RECORD_OR_UNION_TYPE.
>        * pph-streamer-out.c (pph_out_function_decl): Stream out
>        DECL_CHAIN of FUNCTION_DECL only if it's part of a RECORD_OR_UNION_TYPE
>        (pph_write_tree): Stream out DECL_CHAIN of VAR_DECL only if it's part
>        of a RECORD_OR_UNION_TYPE.

Gab, do you still need this patch?  In principle, it doesn't make a
lot of sense to restrict when we save the DECL_CHAIN in this way.
It's not obvious what this would fix or help with.


Diego.


[PATCH, i386]: Tidy processor feature bitmasks

2011-07-12 Thread Uros Bizjak
Hello!

No functional change.

2011-07-12  Uros Bizjak  

* config/i386/i386.c: Tidy processor feature bitmasks.
(m_P4_NOCONA): New.

Tested on x86_64-pc-linux-gnu {,-m32}. Committed to mainline SVN.

Uros.
Index: i386.c
===
--- i386.c  (revision 176213)
+++ i386.c  (working copy)
@@ -1880,30 +1880,31 @@
 #define m_486 (1<

Replace cxx_scope with cp_binding_level (issue4702044)

2011-07-12 Thread Diego Novillo

We were using cxx_scope and cp_binding_level interchangeably in
confusing ways.  This patch implements Jason's suggestion of making
cp_binding_level the typedef for struct cp_binding_level.
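
In miniature, and with invented names rather than the real name-lookup.h definitions, the pattern is a single typedef that shares its name with the struct tag, so users can write the type without the struct keyword and no second alias is needed:

```c
/* Hypothetical miniature of a binding-level structure.  */
struct toy_binding_level
{
  int depth;
  struct toy_binding_level *level_chain;
};

/* One typedef with the same name as the tag.  */
typedef struct toy_binding_level toy_binding_level;

/* Count the levels reachable from B through level_chain.  */
static int
toy_count_levels (const toy_binding_level *b)
{
  int n = 0;

  for (; b; b = b->level_chain)
    n++;
  return n;
}
```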

Tests currently running on x86_64.

OK for mainline if they pass?


Diego.

* name-lookup.h (cp_binding_level): Rename from cxx_scope.
Update all users.
(struct cp_binding_level): Fix indentation.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index cc08640..96d9fa8 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -950,7 +950,7 @@ struct GTY(()) saved_scope {
   VEC(tree,gc) *lang_base;
   tree lang_name;
   tree template_parms;
-  struct cp_binding_level *x_previous_class_level;
+  cp_binding_level *x_previous_class_level;
   tree x_saved_tree;
 
   /* Only used for uses of this in trailing return type.  */
@@ -967,8 +967,8 @@ struct GTY(()) saved_scope {
 
   struct stmt_tree_s x_stmt_tree;
 
-  struct cp_binding_level *class_bindings;
-  struct cp_binding_level *bindings;
+  cp_binding_level *class_bindings;
+  cp_binding_level *bindings;
 
   struct saved_scope *prev;
 };
@@ -1054,7 +1054,7 @@ struct GTY(()) language_function {
   BOOL_BITFIELD can_throw : 1;
 
   htab_t GTY((param_is(struct named_label_entry))) x_named_labels;
-  struct cp_binding_level *bindings;
+  cp_binding_level *bindings;
   VEC(tree,gc) *x_local_names;
   htab_t GTY((param_is (struct cxx_int_tree_map))) extern_decl_map;
 };
@@ -1944,7 +1944,7 @@ struct GTY(()) lang_decl_fn {
 
 struct GTY(()) lang_decl_ns {
   struct lang_decl_base base;
-  struct cp_binding_level *level;
+  cp_binding_level *level;
 };
 
 /* DECL_LANG_SPECIFIC for parameters.  */
@@ -4860,7 +4860,7 @@ extern tree make_anon_name(void);
 extern tree pushdecl_top_level_maybe_friend(tree, bool);
 extern tree pushdecl_top_level_and_finish  (tree, tree);
 extern tree check_for_out_of_scope_variable(tree);
-extern void print_other_binding_stack  (struct cp_binding_level *);
+extern void print_other_binding_stack  (cp_binding_level *);
 extern tree maybe_push_decl(tree);
 extern tree current_decl_namespace (void);
 
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 266d049..2742af5 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -184,7 +184,7 @@ struct GTY((chain_next ("%h.next"))) named_label_use_entry {
   /* The binding level to which this entry is *currently* attached.
  This is initially the binding level in which the goto appeared,
  but is modified as scopes are closed.  */
-  struct cp_binding_level *binding_level;
+  cp_binding_level *binding_level;
   /* The head of the names list that was current when the goto appeared,
  or the inner scope popped.  These are the decls that will *not* be
  skipped when jumping to the label.  */
@@ -208,7 +208,7 @@ struct GTY(()) named_label_entry {
   /* The binding level to which the label is *currently* attached.
  This is initially set to the binding level in which the label
  is defined, but is modified as scopes are closed.  */
-  struct cp_binding_level *binding_level;
+  cp_binding_level *binding_level;
   /* The head of the names list that was current when the label was
  defined, or the inner scope popped.  These are the decls that will
  be skipped when jumping to the label.  */
@@ -270,7 +270,7 @@ current_tmpl_spec_kind (int n_class_scopes)
   int n_template_parm_scopes = 0;
   int seen_specialization_p = 0;
   int innermost_specialization_p = 0;
-  struct cp_binding_level *b;
+  cp_binding_level *b;
 
   /* Scan through the template parameter scopes.  */
   for (b = current_binding_level;
@@ -447,7 +447,7 @@ objc_get_current_scope (void)
 void
 objc_mark_locals_volatile (void *enclosing_blk)
 {
-  struct cp_binding_level *scope;
+  cp_binding_level *scope;
 
   for (scope = current_binding_level;
scope && scope != enclosing_blk;
@@ -470,8 +470,8 @@ static int
 poplevel_named_label_1 (void **slot, void *data)
 {
   struct named_label_entry *ent = (struct named_label_entry *) *slot;
-  struct cp_binding_level *bl = (struct cp_binding_level *) data;
-  struct cp_binding_level *obl = bl->level_chain;
+  cp_binding_level *bl = (cp_binding_level *) data;
+  cp_binding_level *obl = bl->level_chain;
 
   if (ent->binding_level == bl)
 {
@@ -853,7 +853,7 @@ walk_namespaces (walk_namespaces_fn f, void* data)
 int
 wrapup_globals_for_namespace (tree name_space, void* data)
 {
-  struct cp_binding_level *level = NAMESPACE_LEVEL (name_space);
+  cp_binding_level *level = NAMESPACE_LEVEL (name_space);
   VEC(tree,gc) *statics = level->static_decls;
   tree *vec = VEC_address (tree, statics);
   int len = VEC_length (tree, statics);
@@ -2644,10 +2644,10 @@ identify_goto (tree decl, const location_t *locus)
true if all is well.  */
 
 static bool
-check_previous_goto_1 (tree decl, struct cp_binding_level* level, tree names,
+check_previous_goto_1 (tree decl, cp_binding_level*

Re: Replace cxx_scope with cp_binding_level (issue4702044)

2011-07-12 Thread Jason Merrill

OK.

Jason


Fix warnings in build with G++

2011-07-12 Thread Eric Botcazou
G++ kindly suggested making the following changes during a build so here we go.

Bootstrapped/regtested on x86_64-suse-linux, applied as obvious.
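
For readers who have not met the suggestion: a semicolon sitting at the end of
the loop header is easy to misread as a loop that controls the next statement.
A minimal sketch of the style adopted here, assuming a made-up list type
(struct node and last_node are illustrative names only, not GCC code):

```c
#include <assert.h>
#include <stddef.h>

/* Toy singly linked node; it merely stands in for an insn chain.  */
struct node
{
  struct node *next;
  int value;
};

/* Return the last node of the list starting at FIRST, or NULL if the
   list is empty.  All the work happens in the loop header, so the
   body is empty; the semicolon goes on its own line to make that
   obvious to the reader.  */
static struct node *
last_node (struct node *first)
{
  struct node *last;

  for (last = first; last && last->next; last = last->next)
    ;

  /* Either the list was empty or we stopped on the final node.  */
  assert (!last || !last->next);
  return last;
}
```

The loop has the same shape as the push_to_sequence hunk in emit-rtl.c below.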


2011-07-12  Eric Botcazou  

* cse.c (insert_with_costs): Put semi-colon after empty loop body
on the next line.
* emit-rtl.c (push_to_sequence): Likewise.
* haifa-sched.c (max_issue): Likewise.
* matrix-reorg.c (add_allocation_site): Likewise.
* postreload-gcse.c (eliminate_partially_redundant_load): Likewise.
* reload.c (alternative_allows_const_pool_ref): Likewise.
* sched-rgn.c (rgn_add_block): Likewise.
(rgn_fix_recovery_cfg): Likewise.
* tree.c (attribute_list_contained): Likewise.
c-family/
* c-ada-spec.c (dump_nested_types): Put semi-colon after empty loop
body on the next line.


-- 
Eric Botcazou
Index: c-family/c-ada-spec.c
===
--- c-family/c-ada-spec.c	(revision 176072)
+++ c-family/c-ada-spec.c	(working copy)
@@ -2333,7 +2333,8 @@ dump_nested_types (pretty_printer *buffe
 		if (TREE_CODE (decl) == FUNCTION_TYPE)
 		  for (decl = TREE_TYPE (decl);
 		   decl && TREE_CODE (decl) == POINTER_TYPE;
-		   decl = TREE_TYPE (decl));
+		   decl = TREE_TYPE (decl))
+		;
 
 		decl = get_underlying_decl (decl);
 
Index: postreload-gcse.c
===
--- postreload-gcse.c	(revision 176072)
+++ postreload-gcse.c	(working copy)
@@ -1131,7 +1131,8 @@ eliminate_partially_redundant_load (basi
  discover additional redundancies, so mark it for later deletion.  */
   for (a_occr = get_bb_avail_insn (bb, expr->avail_occr);
a_occr && (a_occr->insn != insn);
-   a_occr = get_bb_avail_insn (bb, a_occr->next));
+   a_occr = get_bb_avail_insn (bb, a_occr->next))
+;
 
   if (!a_occr)
 {
Index: tree.c
===
--- tree.c	(revision 176072)
+++ tree.c	(working copy)
@@ -6350,7 +6350,8 @@ attribute_list_contained (const_tree l1,
t1 != 0 && t2 != 0
 && TREE_PURPOSE (t1) == TREE_PURPOSE (t2)
 && TREE_VALUE (t1) == TREE_VALUE (t2);
-   t1 = TREE_CHAIN (t1), t2 = TREE_CHAIN (t2));
+   t1 = TREE_CHAIN (t1), t2 = TREE_CHAIN (t2))
+;
 
   /* Maybe the lists are equal.  */
   if (t1 == 0 && t2 == 0)
Index: reload.c
===
--- reload.c	(revision 176072)
+++ reload.c	(working copy)
@@ -4591,7 +4591,8 @@ alternative_allows_const_pool_ref (rtx m
   /* Skip alternatives before the one requested.  */
   while (altnum > 0)
 {
-  while (*constraint++ != ',');
+  while (*constraint++ != ',')
+	;
   altnum--;
 }
   /* Scan the requested alternative for TARGET_MEM_CONSTRAINT or 'o'.
Index: haifa-sched.c
===
--- haifa-sched.c	(revision 176072)
+++ haifa-sched.c	(working copy)
@@ -2568,7 +2568,8 @@ max_issue (struct ready_list *ready, int
 		{
 		  n = privileged_n;
 		  /* Try to find issued privileged insn.  */
-		  while (n && !ready_try[--n]);
+		  while (n && !ready_try[--n])
+		;
 		}
 
 	  if (/* If all insns are equally good...  */
Index: cse.c
===
--- cse.c	(revision 176072)
+++ cse.c	(working copy)
@@ -1637,8 +1637,10 @@ insert_with_costs (rtx x, struct table_e
 	  /* Put it after the last element cheaper than X.  */
 	  struct table_elt *p, *next;
 
-	  for (p = classp; (next = p->next_same_value) && CHEAPER (next, elt);
-	   p = next);
+	  for (p = classp;
+	   (next = p->next_same_value) && CHEAPER (next, elt);
+	   p = next)
+	;
 
 	  /* Put it after P and before NEXT.  */
 	  elt->next_same_value = next;
Index: matrix-reorg.c
===
--- matrix-reorg.c	(revision 176072)
+++ matrix-reorg.c	(working copy)
@@ -719,7 +719,8 @@ add_allocation_site (struct matrix_info
  must be set accordingly.  */
   for (min_malloc_level = 0;
 	   min_malloc_level < mi->max_malloced_level
-	   && mi->malloc_for_level[min_malloc_level]; min_malloc_level++);
+	   && mi->malloc_for_level[min_malloc_level]; min_malloc_level++)
+	;
   if (level < min_malloc_level)
 	{
 	  mi->allocation_function_decl = current_function_decl;
Index: emit-rtl.c
===
--- emit-rtl.c	(revision 176072)
+++ emit-rtl.c	(working copy)
@@ -5043,7 +5043,8 @@ push_to_sequence (rtx first)
 
   start_sequence ();
 
-  for (last = first; last && NEXT_INSN (last); last = NEXT_INSN (last));
+  for (last = first; last && NEXT_INSN (last); last = NEXT_INSN (last))
+;
 
   set_first_insn (first);
   set_last_insn (last);
Index: sched-rgn.c
===
--- sched-rgn.c	(revisi

Re: [pph] Fix 3 asm differences (issue4695048)

2011-07-12 Thread Gabriel Charette
I like the modified implementation with VEC.

We probably want pph_register_decl_in_symtab to be inline as it does
so little now.

Now that you simply chainon bindings, you probably want to nreverse
them before you chain them on (this way we will stream them in from
first->last as this patch does (to alloc stuff in order), but then in
the chain we want them to be last->first as they should be if they had
been pushed as they are in the original parser).
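
The reverse-then-chain idea can be sketched on a toy list as follows. This is
only an illustration under stated assumptions: struct decl, toy_nreverse and
toy_chainon are hypothetical stand-ins for trees, nreverse and chainon, not
the GCC implementations.

```c
#include <assert.h>
#include <stddef.h>

/* Toy declaration node linked through CHAIN.  */
struct decl
{
  struct decl *chain;
  int id;
};

/* Reverse LIST in place and return the new head.  */
static struct decl *
toy_nreverse (struct decl *list)
{
  struct decl *prev = NULL;

  while (list)
    {
      struct decl *next = list->chain;
      list->chain = prev;
      prev = list;
      list = next;
    }
  return prev;
}

/* Append list B to the end of list A and return the head of the
   combined list.  A and B must be distinct lists.  */
static struct decl *
toy_chainon (struct decl *a, struct decl *b)
{
  struct decl *last;

  if (!a)
    return b;
  assert (a != b);
  for (last = a; last->chain; last = last->chain)
    continue;
  last->chain = b;
  return a;
}
```

With this, a list read in first->last order can be flipped to last->first by
toy_nreverse before it is handed to toy_chainon.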

Gab

On Tue, Jul 12, 2011 at 12:19 PM, Diego Novillo  wrote:
>
> This patch is a slight adaptation of Gab's fix to the order in which
> we stream chains (http://codereview.appspot.com/4657092).  I mostly
> just changed how we keep the temporary list to reverse (it now uses a
> VEC instead of a custom-built linked list).
>
> The other change is in the reader.  We were not registering symbols in
> scope_chain->static_aggregates as they come from the PPH file (which
> would cause an ICE in x1hardlookup.cc).
>
> This fixes 3 tests, but we still have some asm differences that are
> similar in nature: when reinstantiating PPH images, the compiler emits
> some symbols in different order, causing different numbering and
> naming in the assembler output (we need to generate identical output
> from a pph or from text).
>
> Tested on x86_64.  Committed to branch.
>
>
> Diego.
>
> 2011-07-12   Diego Novillo  
>
>        * pph-streamer-in.c (pph_register_decl_in_symtab): New.
>        (pph_register_binding_in_symtab): Rename from
>        pph_register_decls_in_symtab.  Update all users.
>        Do not call nreverse on bl->names and bl->namespaces.
>        Call pph_register_decl_in_symtab.
>        (pph_read_file_contents): Register decls in
>        FILE_STATIC_AGGREGATES.
>
> 2011-07-12  Gabriel Charette  
>            Diego Novillo  
>
>        * pph-streamer-out.c (pph_out_chained_tree): New.
>        (pph_out_chain_filtered): Add REVERSE_P parameter.
>        If REVERSE_P is set, write the list in reverse order.
>        Update all users.
>        (pph_out_binding_level): Write out lists bl->names,
>        bl->namespaces, bl->usings and bl->using_directives in
>        reverse.
>
>
> testsuite/ChangeLog.pph
> 2011-07-12  Gabriel Charette  
>
>        * g++.dg/pph/c1pr44948-1a.cc: Mark fixed.
>        * g++.dg/pph/c2pr36533.cc: Mark fixed.
>        * g++.dg/pph/x2functions.cc: Mark fixed.
>
> diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
> index 63f8965..e7d1d00 100644
> --- a/gcc/cp/pph-streamer-in.c
> +++ b/gcc/cp/pph-streamer-in.c
> @@ -1175,29 +1175,33 @@ pph_in_lang_type (pph_stream *stream)
>  }
>
>
> +/* Register DECL with the middle end.  */
> +
> +static void
> +pph_register_decl_in_symtab (tree decl)
> +{
> +  if (TREE_CODE (decl) == VAR_DECL
> +      && TREE_STATIC (decl)
> +      && !DECL_EXTERNAL (decl))
> +    varpool_finalize_decl (decl);
> +}
> +
> +
>  /* Register all the symbols in binding level BL in the callgraph symbol
>    table.  */
>
>  static void
> -pph_register_decls_in_symtab (struct cp_binding_level *bl)
> +pph_register_binding_in_symtab (struct cp_binding_level *bl)
>  {
>   tree t;
>
> -  /* The chains are built backwards (ref: add_decl_to_level),
> -     reverse them before putting them back in.  */
> -  bl->names = nreverse (bl->names);
> -  bl->namespaces = nreverse (bl->namespaces);
> -
> +  /* Add file-local symbols to the varpool.  */
>   for (t = bl->names; t; t = DECL_CHAIN (t))
> -    {
> -      /* Add file-local symbols to the varpool.  */
> -      if (TREE_CODE (t) == VAR_DECL && TREE_STATIC (t) && !DECL_EXTERNAL (t))
> -       varpool_finalize_decl (t);
> -    }
> +    pph_register_decl_in_symtab (t);
>
>   /* Recurse into the namespaces contained in BL.  */
>   for (t = bl->namespaces; t; t = DECL_CHAIN (t))
> -    pph_register_decls_in_symtab (NAMESPACE_LEVEL (t));
> +    pph_register_binding_in_symtab (NAMESPACE_LEVEL (t));
>  }
>
>
> @@ -1220,7 +1224,7 @@ pph_in_scope_chain (pph_stream *stream)
>   new_bindings = pph_in_binding_level (stream, scope_chain->bindings);
>
>   /* Register all the symbols in STREAM with the call graph.  */
> -  pph_register_decls_in_symtab (new_bindings);
> +  pph_register_binding_in_symtab (new_bindings);
>
>   /* Merge the bindings from STREAM into saved_scope->bindings.  */
>   chainon (cur_bindings->names, new_bindings->names);
> @@ -1413,6 +1417,16 @@ pph_read_file_contents (pph_stream *stream)
>   file_static_aggregates = pph_in_tree (stream);
>   static_aggregates = chainon (file_static_aggregates, static_aggregates);
>
> +  /* Register all symbols in FILE_STATIC_AGGREGATES with the middle end.
> +     Each element of this list is an INIT_EXPR expression.  */
> +  for (t = file_static_aggregates; t; t = TREE_CHAIN (t))
> +    {
> +      tree lhs = TREE_OPERAND (TREE_PURPOSE (t), 0);
> +      tree rhs = TREE_OPERAND (TREE_PURPOSE (t), 1);
> +      pph_register_decl_in_symtab (lhs);
> +      pph_register_decl_in_symtab (rhs);
> +    }
> +
>   /* Expand all the functi

Re: Fix warnings in build with G++

2011-07-12 Thread Richard Henderson
On 07/12/2011 01:07 PM, Eric Botcazou wrote:
> -  while (*constraint++ != ',');
> +  while (*constraint++ != ',')
> + ;

FWIW, elsewhere in gcc we use "continue;" for empty loop bodies.
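
For comparison, the "continue;" spelling would look roughly like this.
skip_alternatives is a made-up helper modelled on the reload.c hunk quoted
above; it is a sketch, not the GCC function.

```c
#include <assert.h>

/* Advance CONSTRAINT past ALTNUM comma-separated alternatives and
   return a pointer to the start of the requested one.  The caller
   must guarantee that the string holds at least ALTNUM commas.  */
static const char *
skip_alternatives (const char *constraint, int altnum)
{
  assert (altnum >= 0);

  while (altnum > 0)
    {
      /* Empty loop body, spelled with an explicit "continue".  */
      while (*constraint++ != ',')
        continue;
      altnum--;
    }

  return constraint;
}
```

Both spellings compile to the same thing; the choice is purely about how
visible the empty body is.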


r~


Re: [pph] Stream DECL_CHAIN only for VAR/FUNCTION_DECLs that are part of a RECORD_OR_UNION_TYPE (issue4672055)

2011-07-12 Thread Gabriel Charette
The reason I put that patch out is that sometimes, when we stream an
actual chain, lto_input_chain is going to rebuild the new chain it's
meant to be, but then pph_read_tree (which is called after by the
name_hook to finish reading special parts of the tree) overwrites the
DECL_CHAIN that was introduced by lto_input_chain.

The only time we need to actually stream in/out the DECL_CHAIN is when
streaming unions/structs because from what I looked at in lto it looks
like we are not doing lto_output_chain, but lto_output_tree on the
first member of the fields' chain (not sure how that even works in
lto... but in pph we used to only get the first member of structs
streamed and streaming DECL_CHAIN was the fix for it...)

I introduced this fix because it broke my patch trying to stream out
the chains backwards: it would overwrite the chain I was trying to
create backwards on input. I think this didn't show up before because
the chain being built on input was the same as the one existing on
output (thus overwriting with the same value...).

Even if this doesn't break tests anymore, we probably still want this,
no point adding stuff to the pph image that is not needed...

Any idea why lto doesn't call lto_output_chain, but simply
lto_output_tree to output the chains for struct/union?

Gab

On Tue, Jul 12, 2011 at 12:21 PM, Diego Novillo  wrote:
> On Fri, Jul 8, 2011 at 21:20, Gabriel Charette  wrote:
>
>> 2011-07-08  Gabriel Charette  
>>
>>        * pph-streamer-in.c (pph_in_function_decl): Stream in
>>        DECL_CHAIN of FUNCTION_DECL only if it's part of a 
>> RECORD_OR_UNION_TYPE
>>        (pph_read_tree): Stream in DECL_CHAIN of VAR_DECL only if it's part
>>        of a RECORD_OR_UNION_TYPE.
>>        * pph-streamer-out.c (pph_out_function_decl): Stream out
>>        DECL_CHAIN of FUNCTION_DECL only if it's part of a 
>> RECORD_OR_UNION_TYPE
>>        (pph_write_tree): Stream out DECL_CHAIN of VAR_DECL only if it's part
>>        of a RECORD_OR_UNION_TYPE.
>
> Gab, do you still need this patch?  In principle, it doesn't make a
> lot of sense to restrict when we save the DECL_CHAIN in this way.
> It's not obvious what this would fix or help with.
>
>
> Diego.
>


[PATCH] Hookize TARGET_CLASS_MAX_NREGS

2011-07-12 Thread Anatoly Sokolov
Hello.

  This patch turns the CLASS_MAX_NREGS macro into the TARGET_CLASS_MAX_NREGS
target hook.

  The patch has been bootstrapped and regression tested on
x86_64-unknown-linux-gnu and v850-unknown-elf for C.

  The changes for the other platforms are obvious and similar to the changes
in the v850 target.

  This patch is pre-approved and should be committed within a week if there
are no objections.
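
As a rough illustration of what hookizing means here, a macro becomes a
function pointer with a default implementation that targets may override.
This is a simplified sketch, not the patch itself: struct toy_target,
toy_class_max_nregs and the 4-byte UNITS_PER_WORD are assumptions, and the
hook takes a size in bytes where the real one takes a machine mode.

```c
#include <assert.h>

#define UNITS_PER_WORD 4   /* Assumed word size in bytes, sketch only.  */

/* Hook type: maximum number of consecutive registers of class RCLASS
   needed to hold a value of SIZE bytes.  */
typedef unsigned char (*class_max_nregs_fn) (int rclass, unsigned size);

/* Default hook: the size of the value in words, rounded up.  */
static unsigned char
default_class_max_nregs (int rclass, unsigned size)
{
  unsigned nregs = (size + UNITS_PER_WORD - 1) / UNITS_PER_WORD;

  (void) rclass;
  /* The result must fit the unsigned char return type.  */
  assert (nregs <= 255);
  return (unsigned char) nregs;
}

/* Hypothetical target override: class 1 has 8-byte registers, every
   other class falls back to the default.  */
static unsigned char
toy_class_max_nregs (int rclass, unsigned size)
{
  if (rclass == 1)
    return (unsigned char) ((size + 7) / 8);
  return default_class_max_nregs (rclass, size);
}

/* Toy target vector holding the hook.  */
struct toy_target
{
  class_max_nregs_fn class_max_nregs;
};
```

A target that is happy with the word-based default simply leaves the pointer
at default_class_max_nregs, which is why most ports in the ChangeLog below
only lose a macro definition.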

* target.def (class_max_nregs): New hook.
* doc/tm.texi.in (TARGET_CLASS_MAX_NREGS): Document.
* doc/tm.texi: Regenerate.
* targhooks.c (default_class_max_nregs): New function.
* targhooks.h (default_class_max_nregs): Declare.
* ira.h (target_ira): Change type x_ira_reg_class_max_nregs and
x_ira_reg_class_min_nregs arrays to unsigned char.
* ira.c (setup_reg_class_nregs): Use TARGET_CLASS_MAX_NREGS target
hook instead of CLASS_MAX_NREGS macro.
* reginfo.c (restore_register_info): Ditto.
* ira-conflicts.c (process_regs_for_copy): Use
ira_reg_class_max_nregs array instead of CLASS_MAX_NREGS macro.
Change type rclass and aclass vars to reg_class_t.
* ira-costs.c (record_reg_classes): Use ira_reg_class_max_nregs
array instead of CLASS_MAX_NREGS macro. Change type rclass var to
reg_class_t.
* reload.c (combine_reloads, find_reloads, find_reloads_address_1):
Use ira_reg_class_max_nregs array instead of CLASS_MAX_NREGS macro.

* config/i386/i386.h (CLASS_MAX_NREGS): Remove.
* config/i386/i386.c (ix86_class_max_nregs): New function.
(ix86_register_move_cost): Use TARGET_CLASS_MAX_NREGS target hook
instead of CLASS_MAX_NREGS macro.
(TARGET_CLASS_MAX_NREGS): Define.
* config/avr/avr.h (CLASS_MAX_NREGS): Remove.
* config/avr/avr-protos.h (class_max_nregs): Remove declaration.
* config/avr/avr.c (class_max_nregs): Remove function.
* config/alpha/alpha.h (CLASS_MAX_NREGS): Remove.
* config/spu/spu.h (CLASS_MAX_NREGS): Remove.
* config/mep/mep.h (CLASS_MAX_NREGS): Remove.
* config/m32r/m32r.h (CLASS_MAX_NREGS): Remove.
* config/microblaze/microblaze.h (CLASS_MAX_NREGS): Remove.
* config/xtensa/xtensa.h (CLASS_MAX_NREGS): Remove.
* config/stormy16/stormy16.h (CLASS_MAX_NREGS): Remove.
* config/lm32/lm32.h (CLASS_MAX_NREGS): Remove.
* config/moxie/moxie.h (CLASS_MAX_NREGS): Remove.
* config/iq2000/iq2000.h (CLASS_MAX_NREGS): Remove.
* config/mn10300/mn10300.h (CLASS_MAX_NREGS): Remove.
* config/score/score.h (CLASS_MAX_NREGS): Remove.
* config/vax/vax.h (CLASS_MAX_NREGS): Remove.
* config/h8300/h8300.h (CLASS_MAX_NREGS): Remove.
* config/v850/v850.h (CLASS_MAX_NREGS): Remove.


Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi (revision 176209)
+++ gcc/doc/tm.texi (working copy)
@@ -2846,6 +2846,23 @@
 allocation.
 @end deftypefn
 
+@deftypefn {Target Hook} {unsigned char} TARGET_CLASS_MAX_NREGS (reg_class_t 
@var{rclass}, enum machine_mode @var{mode})
+A target hook returns the maximum number of consecutive registers
+of class @var{rclass} needed to hold a value of mode @var{mode}.
+
+This is closely related to the macro @code{HARD_REGNO_NREGS}.  In fact,
+the value returned by @code{TARGET_CLASS_MAX_NREGS (@var{rclass},
+@var{mode})} target hook should be the maximum value of
+@code{HARD_REGNO_NREGS (@var{regno}, @var{mode})} for all @var{regno}
+values in the class @var{rclass}.
+
+This target hook helps control the handling of multiple-word values
+in the reload pass.
+
+The default version of this target hook returns the size of @var{mode}
+in words.
+@end deftypefn
+
 @defmac CLASS_MAX_NREGS (@var{class}, @var{mode})
 A C expression for the maximum number of consecutive registers
 of class @var{class} needed to hold a value of mode @var{mode}.
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in  (revision 176209)
+++ gcc/doc/tm.texi.in  (working copy)
@@ -2832,6 +2832,23 @@
 allocation.
 @end deftypefn
 
+@hook TARGET_CLASS_MAX_NREGS
+A target hook returns the maximum number of consecutive registers
+of class @var{rclass} needed to hold a value of mode @var{mode}.
+
+This is closely related to the macro @code{HARD_REGNO_NREGS}.  In fact,
+the value returned by @code{TARGET_CLASS_MAX_NREGS (@var{rclass},
+@var{mode})} target hook should be the maximum value of
+@code{HARD_REGNO_NREGS (@var{regno}, @var{mode})} for all @var{regno}
+values in the class @var{rclass}.
+
+This target hook helps control the handling of multiple-word values
+in the reload pass.
+
+The default version of this target hook returns the size of @var{mode}
+in words.
+@end deftypefn
+
 @defmac CLASS_MAX_NREGS (@var{class}, @var{mode})
 A C expression for the maximum number of consecutive registers
 of class @var{class} needed to hold a value of mode

Re: Fix warnings in build with G++

2011-07-12 Thread Eric Botcazou
> FWIW, elsewhere in gcc we use "continue;" for empty loop bodies.

I think I've never run into this idiom in about a decade of work on GCC. :-)
Sometimes there is a comment after the ; on the line, but this is somewhat 
redundant IMO.  Maybe we should simply ban loops with empty bodies.

-- 
Eric Botcazou


Re: [C++-0x] User defined literals.

2011-07-12 Thread Jason Merrill

A few more notes:


+  if (DECL_NAMESPACE_SCOPE_P (decl))
+   {
+ if (!check_literal_operator_args(decl,
+&long_long_unsigned_p, &long_double_p))
+   {
+ error ("%qD has illegal argument list", decl);
+ return NULL_TREE;
+   }
+
+ if (CP_DECL_CONTEXT (decl) == global_namespace)
+   {
+ const char *suffix = UDLIT_OP_SUFFIX (DECL_NAME (decl));
+ if (long_long_unsigned_p)
+   {
+ if (cpp_interpret_int_suffix (suffix, strlen (suffix)))
+   warning (0, "integer suffix shadowed by implementation");
+   }
+ else if (long_double_p)
+   {
+ if (cpp_interpret_float_suffix (suffix, strlen (suffix)))
+   warning (0, "floating point suffix"
+   " shadowed by implementation");
+   }
+   }
+   }


Doesn't the shadowing apply everywhere, not just at file scope?


+  if (cpp_userdef_string_p (tok->type))
+{
+  string_tree = USERDEF_LITERAL_VALUE (tok->u.value);
+  tok->type = cpp_userdef_string_remove_type (tok->type);
+  curr_tok_is_userdef_p = true;
+}


It seems like a mistake to change tok->type without changing the value.
Why not just set the 'type' local variable appropriately?



+ const char *curr_suffix = IDENTIFIER_POINTER (suffix_id);
+ if (have_suffix_p == 0)
+   {
+ suffix = xstrdup (curr_suffix);
+ have_suffix_p = 1;
+   }
+ else if (have_suffix_p == 1 && strcmp (suffix, curr_suffix) != 0)

...

+ USERDEF_LITERAL_SUFFIX_ID (literal) = get_identifier (suffix);


Just remember the identifier and compare it with ==.  Identifiers are 
unique.
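
A small sketch of why pointer comparison is enough once names are interned.
The table and the names toy_get_identifier and suffixes_agree are
hypothetical; they only imitate the property that each spelling is stored
exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_IDENTS 64
#define MAX_LEN 32

/* Toy identifier table: each distinct spelling is stored once.  */
static char ident_table[MAX_IDENTS][MAX_LEN];
static size_t n_idents;

/* Return the unique stored copy of NAME, adding it on first use.  */
static const char *
toy_get_identifier (const char *name)
{
  size_t i;

  for (i = 0; i < n_idents; i++)
    if (strcmp (ident_table[i], name) == 0)
      return ident_table[i];

  assert (n_idents < MAX_IDENTS && strlen (name) < MAX_LEN);
  strcpy (ident_table[n_idents], name);
  return ident_table[n_idents++];
}

/* Return nonzero if all N suffix identifiers in IDS are the same.
   Because identifiers are unique, comparing pointers suffices and no
   string copy or strcmp is needed.  */
static int
suffixes_agree (const char *const *ids, size_t n)
{
  size_t i;

  for (i = 1; i < n; i++)
    if (ids[i] != ids[0])
      return 0;
  return 1;
}
```

This also removes the need for the xstrdup copy and for converting the string
back into an identifier afterwards.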



+  /* Lookup the name we got back from the id-expression.  */
+  decl = cp_parser_lookup_name (parser, name,


Maybe use lookup_function_nonclass?

Jason



Re: [Patch 1/3] ARM 64 bit atomic operations

2011-07-12 Thread Ramana Radhakrishnan
Hi Dave,

Could you split this further into a patch that deals with the
case for disabling MCR memory barriers for Thumb1 so that it
may be backported to the release branches? I have commented inline
as well.

Could you also provide a proper changelog entry for this that will
also help with review of the patch?

I've not yet managed to fully review all the bits in this patch but
here are some initial comments that should be looked at.

On 1 July 2011 16:54, Dr. David Alan Gilbert  wrote:
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 057f9ba..39057d2 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c

>
>  /* Emit a strex{b,h,d, } instruction appropriate for the specified
> @@ -23374,14 +23388,29 @@ arm_output_strex (emit_f emit,
>                  rtx value,
>                  rtx memory)
>  {
> -  const char *suffix = arm_ldrex_suffix (mode);
> -  rtx operands[3];
> +  rtx operands[4];
>
>   operands[0] = result;
>   operands[1] = value;
> -  operands[2] = memory;
> -  arm_output_asm_insn (emit, 0, operands, "strex%s%s\t%%0, %%1, %%C2", 
> suffix,
> -                      cc);
> +  if (mode != DImode)
> +    {
> +      const char *suffix = arm_ldrex_suffix (mode);
> +      operands[2] = memory;
> +      arm_output_asm_insn (emit, 0, operands, "strex%s%s\t%%0, %%1, %%C2",
> +                         suffix, cc);
> +    }
> +  else
> +    {
> +      /* The restrictions on target registers in ARM mode are that the two
> +        registers are consecutive and the first one is even; Thumb is
> +        actually more flexible, but DI should give us this anyway.
> +        Note that the 1st register always gets the lowest word in memory.  */
> +      gcc_assert ((REGNO (value) & 1) == 0);
> +      operands[2] = gen_rtx_REG (SImode, REGNO (value) + 1);
> +      operands[3] = memory;
> +      arm_output_asm_insn (emit, 0, operands, "strexd%s\t%%0, %%1, %%2, 
> %%C3",
> +                          cc);
> +    }
>  }
>
>  /* Helper to emit a two operand instruction.  */
> @@ -23423,7 +23452,7 @@ arm_output_op3 (emit_f emit, const char *mnemonic, 
> rtx d, rtx a, rtx b)
>
>    required_value:
>
> -   RTX register or const_int representing the required old_value for
> +   RTX register representing the required old_value for
>    the modify to continue, if NULL no comparsion is performed.  */
>  static void
>  arm_output_sync_loop (emit_f emit,
> @@ -23437,7 +23466,13 @@ arm_output_sync_loop (emit_f emit,
>                      enum attr_sync_op sync_op,
>                      int early_barrier_required)
>  {
> -  rtx operands[1];
> +  rtx operands[2];
> +  /* We'll use the lo for the normal rtx in the none-DI case
> +     as well as the least-sig word in the DI case.  */
> +  rtx old_value_lo, required_value_lo, new_value_lo, t1_lo;
> +  rtx old_value_hi, required_value_hi, new_value_hi, t1_hi;
> +
> +  bool is_di = mode == DImode;
>
>   gcc_assert (t1 != t2);
>
> @@ -23448,82 +23483,142 @@ arm_output_sync_loop (emit_f emit,
>
>   arm_output_ldrex (emit, mode, old_value, memory);
>
> +  if (is_di)
> +    {
> +      old_value_lo = gen_lowpart (SImode, old_value);
> +      old_value_hi = gen_highpart (SImode, old_value);
> +      if (required_value)
> +       {
> +         required_value_lo = gen_lowpart (SImode, required_value);
> +         required_value_hi = gen_highpart (SImode, required_value);
> +       }
> +      else
> +       {
> +         /* Silence false potentially unused warning */
> +         required_value_lo = NULL;
> +         required_value_hi = NULL;
> +       }
> +      new_value_lo = gen_lowpart (SImode, new_value);
> +      new_value_hi = gen_highpart (SImode, new_value);
> +      t1_lo = gen_lowpart (SImode, t1);
> +      t1_hi = gen_highpart (SImode, t1);
> +    }
> +  else
> +    {
> +      old_value_lo = old_value;
> +      new_value_lo = new_value;
> +      required_value_lo = required_value;
> +      t1_lo = t1;
> +
> +      /* Silence false potentially unused warning */
> +      t1_hi = NULL;
> +      new_value_hi = NULL;
> +      required_value_hi = NULL;
> +      old_value_hi = NULL;
> +    }
> +
>   if (required_value)
>     {
> -      rtx operands[2];
> +      operands[0] = old_value_lo;
> +      operands[1] = required_value_lo;
>
> -      operands[0] = old_value;
> -      operands[1] = required_value;
>       arm_output_asm_insn (emit, 0, operands, "cmp\t%%0, %%1");
> +      if (is_di)
> +        {
> +          arm_output_asm_insn (emit, 0, operands, "it\teq");

This should be guarded with an if (TARGET_THUMB2) - there's no point in
accounting for the length of this instruction in the compiler and then
having the assembler fold it away in ARM state.
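
Roughly what I have in mind, as a self-contained toy rather than a change to
arm.c. The is_thumb2 flag stands in for TARGET_THUMB2, and the register names
and the trailing cmpeq are assumptions made up for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Append INSN to the NUL-terminated text in BUF, which has room for
   SIZE bytes in total.  */
static void
append_insn (char *buf, size_t size, const char *insn)
{
  size_t used = strlen (buf);

  assert (used + strlen (insn) < size);
  strcpy (buf + used, insn);
}

/* Write a two-word equality comparison into BUF and return the
   number of instructions emitted.  The IT instruction is only
   emitted, and only counted, when IS_THUMB2 is nonzero.  */
static int
emit_di_compare (char *buf, size_t size, int is_thumb2)
{
  int n_insns = 0;

  assert (size > 0);
  buf[0] = '\0';

  append_insn (buf, size, "cmp\tr0, r2\n");
  n_insns++;

  if (is_thumb2)
    {
      append_insn (buf, size, "it\teq\n");
      n_insns++;
    }

  append_insn (buf, size, "cmpeq\tr1, r3\n");
  n_insns++;

  return n_insns;
}
```

The point is that the instruction count, and hence the length the compiler
accounts for, matches what ends up in the object file in both states.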

>
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index c32ef1a..3fdd22f 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -282,7 +282,8 @@ extern void 
> (*arm_lang_output_object_attributes_hook)(void);
> -#define TARGET_HAVE_DMB_MCR    (arm_arch6k && ! TARGET_HAVE_DMB)
> +#def

Re: RFA: Avoid unnecessary clearing in union initialisers

2011-07-12 Thread Jason Merrill

On 07/12/2011 12:34 PM, Richard Sandiford wrote:

-  HOST_WIDE_INT num_type_elements, num_initialized_elements;
+  bool complete_p = true;
+  HOST_WIDE_INT num_elts = 0;


Let's use num_split_elts so that it's clearer that we're counting the 
number of elements that have been initialized outside the CONSTRUCTOR. 
OK with that change.


Jason


AVX generic mode tuning discussion.

2011-07-12 Thread harsha.jagasia
We would like to propose changing AVX generic mode tuning to generate 128-bit
AVX instead of 256-bit AVX. As per H.J's suggestion, we have reviewed the
various tuning choices made for generic mode with respect to AMD's upcoming
Bulldozer processor. At this moment, this is the most significant change we
have to propose. While we are willing to re-engineer generic mode, this
feature needs immediate discussion since the performance impact on Bulldozer
is significant.

Here is the relative CPU2006 performance data we have gathered using gcc on AMD
Bulldozer (BD) and Intel Sandybridge (SB) machines with "-Ofast -mtune=generic
-mavx".

%gain/loss avx256 vs avx128
(negative % indicates loss
positive % indicates gain)

                 AMD BD  Intel SB
410.bwaves        -2.34     -1.52
416.gamess        -1.11     -0.30
433.milc           0.47     -1.75
434.zeusmp        -3.61      0.68
435.gromacs       -0.54     -0.38
436.cactusADM    -23.56     21.49
437.leslie3d      -0.44      1.56
444.namd           0.00      0.00
447.dealII        -0.36     -0.23
450.soplex        -0.43     -0.29
453.povray         0.50      3.63
454.calculix      -8.29      1.38
459.GemsFDTD       2.37     -1.54
465.tonto          0.00      0.00
470.lbm            0.00      0.21
481.wrf           -4.80      0.00
482.sphinx3      -10.20     -3.65
SpecFP            -3.29      1.01

400.perlbench      0.93      1.47
401.bzip2          0.60      0.00
403.gcc            0.00      0.00
429.mcf            0.00     -0.36
445.gobmk         -1.03      0.37
456.hmmer         -0.64      0.38
458.sjeng          1.74      0.00
462.libquantum     0.31      0.00
464.h264ref        0.00      0.00
471.omnetpp       -1.27      0.00
473.astar          0.00      0.46
483.xalancbmk      0.51      0.00
SpecINT            0.09      0.19

As per the data, the 1% performance gain for Intel Sandybridge on SpecFP is
eclipsed by a 3% degradation for AMD Bulldozer.

For the data above, generic mode splits both 256-bit misaligned loads and
stores, as is currently the case in trunk. 

Even if we disable 256-bit misaligned load splitting, AVX 256-bit performance
improves by only ~1.4% on SpecFP for AMD Bulldozer, while it drops by 0.12% on
Intel Sandybridge. With AVX 256 load splitting disabled, comparing AVX 256 to
AVX 128 therefore gives a cumulative gain of about 0.9% for Intel Sandybridge
(1.01 - 0.12) versus a loss of about 1.9% for AMD Bulldozer (-3.29 + 1.4), so
AVX 256 is still not a fair choice for generic mode.

Please provide thoughts. It would be great if HJ can verify Intel Sandybridge
data.

Thanks,
Harsha




Re: AVX generic mode tuning discussion.

2011-07-12 Thread Richard Henderson
On 07/12/2011 02:22 PM, harsha.jaga...@amd.com wrote:
> We would like to propose changing AVX generic mode tuning to generate 128-bit
> AVX instead of 256-bit AVX.

You indicate a 3% reduction on bulldozer with avx256.
How does avx128 compare to -mno-avx -msse4.2?
Will the next AMD generation have a useable avx256?

I'm not keen on the idea of generic mode being tuned
for a single processor revision that maybe shouldn't
actually be using avx at all.


r~


Re: [pph] Fix 3 asm differences (issue4695048)

2011-07-12 Thread Diego Novillo

On 11-07-12 16:34 , Gabriel Charette wrote:


We probably want pph_register_decl_in_symtab to be inline as it does
so little now.


It doesn't really matter all that much.  Given that it's a static 
function, the compiler will inline it (or not) as an optimization.  The 
'inline' keyword is more and more just a suggestion than an actual 
guarantee.



Now that you simply chainon bindings, you probably want to nreverse
them before you chain them on (this way we will stream them in from
first->last as this patch does (to alloc stuff in order), but then in
the chain we want them to be last->first as they should be if they had
been pushed as they are in the original parser).


Perhaps, but first I want to make sure we really want to reverse them 
all.  Not every list is processed from back to front.



Diego.


Re: [PATCH, ARM, iWMMXt][1/5]: ARM code generic change

2011-07-12 Thread Ramana Radhakrishnan
On 06/07/11 11:11, Xinyu Qi wrote:
> Hi,
>
> It is the first part of iWMMXt maintenance.
>
> *config/arm/arm.c (arm_option_override):
>   Enable iWMMXt with VFP. iWMMXt and NEON are incompatible. iWMMXt 
> unsupported under Thumb-2 mode.
>   (arm_expand_binop_builtin): Accept immediate op (with mode VOID)
> *config/arm/arm.md:
>   Resettle include location of iwmmxt.md so that *arm_movdi and 
> *arm_movsi_insn could be used when iWMMXt is enabled.

With the current work in trunk to handle enabled attributes and
per-alternative predicable attributes (Thanks Bernd) we should be able
to get rid of "*cond_iwmmxt_movsi_insn" in the iwmmxt.md file. It's not
a matter for this patch but for a follow-up patch.

Actually we should probably do the same for the various insns that
are dotted around all over the place with final conditions that prevent
matching - at least it makes the backend description slightly smaller :).

>   Add pipeline description file include.

It is enough to say

 (): Include.

in the changelog entry.

The include for the pipeline description file should go with the patch
in which you add the file, i.e. patch #5. Please add it to MD_INCLUDES in
t-arm as well.

Also as a general note, please provide a correct Changelog entry.

This is not the format that we expect Changelog entries to be in.
Please look at the coding standards on the website for this or at
other patches submitted with respect to Changelog entries. Please fix
this for each patch in the patch stack.


cheers
Ramana


Re: [pph] Stream DECL_CHAIN only for VAR/FUNCTION_DECLs that are part of a RECORD_OR_UNION_TYPE (issue4672055)

2011-07-12 Thread Diego Novillo

On 11-07-12 16:43 , Gabriel Charette wrote:


Even if this doesn't break tests anymore, we probably still want this,
no point adding stuff to the pph image that is not needed...


Actually, the reverse is true.  We want to write out the IL exactly as 
the original parser emitted it.  There are things we decide not to write 
because they are better re-generated when the pph image is being read 
(e.g., function numbers, DECL_RTL), but



Any idea why lto doesn't call lto_output_chain, but simply
lto_output_tree to output the chains for struct/union?


LTO did not need those chains because once in the middle-end they are 
not used.  We are working at the parser level, so we need them.  Perhaps 
we won't need to write these chains, but first I'd like to understand why.


Since we are streaming the chains backwards without new breakage, let's 
leave it out for now.



Diego.


PATCH RFA: Disable -Wstrict-overflow for unevaluated expressions

2011-07-12 Thread Ian Lance Taylor
This patch to the C frontend disables warnings about -Wstrict-overflow
when handling expressions which will not be evaluated.  This fixes PR
49705.
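
To illustrate the pairing this patch relies on, here is a minimal standalone sketch of a nested "inhibit" counter. It is not the GCC code: inhibit_depth, disable_warnings, enable_warnings and warnings_active are hypothetical names, and the real helpers in the patch additionally defer and then discard fold's overflow warnings.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the front end's inhibit counter.  */
static int inhibit_depth = 0;

/* If DISABLE is true, enter a region whose code will not be evaluated.  */
void
disable_warnings (bool disable)
{
  if (disable)
    ++inhibit_depth;
}

/* If ENABLE is true, leave such a region.  ENABLE must have the same
   value as the matching earlier DISABLE, so the calls nest like a stack.  */
void
enable_warnings (bool enable)
{
  if (enable)
    {
      assert (inhibit_depth > 0);
      --inhibit_depth;
    }
}

/* Warnings are issued only when no enclosing region inhibits them.  */
bool
warnings_active (void)
{
  return inhibit_depth == 0;
}
```

Because both calls take the same condition, a caller can wrap the folding of an operand unconditionally and the counter only moves when the operand is known to be unevaluated.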

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK for mainline?

Ian


gcc/c-family:
2011-07-12  Ian Lance Taylor  

PR middle-end/49705
* c-common.c (c_disable_warnings): New static function.
(c_enable_warnings): New static function.
(c_fully_fold_internal): Change local unused_p to bool.  Call
c_disable_warnings and c_enable_warnings rather than change
c_inhibit_evaluation_warnings.

gcc/testsuite:

2011-07-12  Ian Lance Taylor  

PR middle-end/49705
* gcc.dg/pr49705.c: New test.


Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c	(revision 175665)
+++ gcc/c-family/c-common.c	(working copy)
@@ -963,6 +963,32 @@ fix_string_type (tree value)
   return value;
 }
 
+/* If DISABLE is true, stop issuing warnings.  This is used when
+   parsing code that we know will not be executed.  This function may
+   be called multiple times, and works as a stack.  */
+
+static void
+c_disable_warnings (bool disable)
+{
+  if (disable)
+{
+  ++c_inhibit_evaluation_warnings;
+  fold_defer_overflow_warnings ();
+}
+}
+
+/* If ENABLE is true, reenable issuing warnings.  */
+
+static void
+c_enable_warnings (bool enable)
+{
+  if (enable)
+{
+  --c_inhibit_evaluation_warnings;
+  fold_undefer_and_ignore_overflow_warnings ();
+}
+}
+
 /* Fully fold EXPR, an expression that was not folded (beyond integer
constant expressions and null pointer constants) when being built
up.  If IN_INIT, this is in a static initializer and certain
@@ -1029,7 +1055,7 @@ c_fully_fold_internal (tree expr, bool i
   bool op0_const = true, op1_const = true, op2_const = true;
   bool op0_const_self = true, op1_const_self = true, op2_const_self = true;
   bool nowarning = TREE_NO_WARNING (expr);
-  int unused_p;
+  bool unused_p;
 
   /* This function is not relevant to C++ because C++ folds while
  parsing, and may need changes to be correct for C++ when C++
@@ -1278,10 +1304,10 @@ c_fully_fold_internal (tree expr, bool i
   unused_p = (op0 == (code == TRUTH_ANDIF_EXPR
 			  ? truthvalue_false_node
 			  : truthvalue_true_node));
-  c_inhibit_evaluation_warnings += unused_p;
+  c_disable_warnings (unused_p);
   op1 = c_fully_fold_internal (op1, in_init, &op1_const, &op1_const_self);
   STRIP_TYPE_NOPS (op1);
-  c_inhibit_evaluation_warnings -= unused_p;
+  c_enable_warnings (unused_p);
 
   if (op0 != orig_op0 || op1 != orig_op1 || in_init)
 	ret = in_init
@@ -1313,15 +1339,15 @@ c_fully_fold_internal (tree expr, bool i
   op0 = c_fully_fold_internal (op0, in_init, &op0_const, &op0_const_self);
 
   STRIP_TYPE_NOPS (op0);
-  c_inhibit_evaluation_warnings += (op0 == truthvalue_false_node);
+  c_disable_warnings (op0 == truthvalue_false_node);
   op1 = c_fully_fold_internal (op1, in_init, &op1_const, &op1_const_self);
   STRIP_TYPE_NOPS (op1);
-  c_inhibit_evaluation_warnings -= (op0 == truthvalue_false_node);
+  c_enable_warnings (op0 == truthvalue_false_node);
 
-  c_inhibit_evaluation_warnings += (op0 == truthvalue_true_node);
+  c_disable_warnings (op0 == truthvalue_true_node);
   op2 = c_fully_fold_internal (op2, in_init, &op2_const, &op2_const_self);
   STRIP_TYPE_NOPS (op2);
-  c_inhibit_evaluation_warnings -= (op0 == truthvalue_true_node);
+  c_enable_warnings (op0 == truthvalue_true_node);
 
   if (op0 != orig_op0 || op1 != orig_op1 || op2 != orig_op2)
 	ret = fold_build3_loc (loc, code, TREE_TYPE (expr), op0, op1, op2);
Index: gcc/testsuite/gcc.dg/pr49705.c
===
--- gcc/testsuite/gcc.dg/pr49705.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr49705.c	(revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wstrict-overflow" } */
+
+struct glyph
+{
+  long foo, bar, baz;
+};
+
+extern int fatal (char const *, int, int);
+
+int
+check_image_width (int width, int height)
+{
+  if ((0 * (0 * 2 + width) - 1) < 0) ? - (~ (0 * (0 * 2 + width) + 0) == -1) - 0 * (0 * 2 + width) + 1) << (sizeof ((0 * 2 + width) + 0) * 8 - 2)) - 1) * 2 + 1) : (0 * (0 * 2 + width) + 0))) < 0 ? (2 < 0 ? width < 0 * (0 * 2 + width) - 1) < 0) ? - (~ (0 * (0 * 2 + width) + 0) == -1) - 0 * (0 * 2 + width) + 1) << (sizeof ((0 * 2 + width) + 0) * 8 - 2)) - 1) * 2 + 1) : (0 * (0 * 2 + width) + 0))) - 2 : 0 * (0 * 2 + width) - 1) < 0) ? 0 * (0 * 2 + width) + 1) << (sizeof ((0 * 2 + width) + 0) * 8 - 2)) - 1) * 2 + 1) : (0 * (0 * 2 + width) - 1))) - 2 < width) : width < 0 ? 2 <= width + 2 : 2 < 0 ? width <= width + 2 : width + 2 < 2)
+  || ((0 * (0 * height + (width + 2)) - 1) < 0) ? - (~ (0 * (0 * height + (width + 2)) + 0) == -1) - 0 * (0 * hei

Re: [PATCH] [Annotalysis] Fix to get_canonical_lock_expr

2011-07-12 Thread Diego Novillo

On 11-07-11 18:53 , Delesley Hutchins wrote:

This patch fixes get_canonical_lock_expr so that it works on lock
expressions that involve a MEM_REF.  Gimple code can use either
MEM_REF or INDIRECT_REF in many expressions, and the choice of which
to use is somewhat arbitrary.  The canonical form of a lock expression
must rewrite all MEM_REFs to INDIRECT_REFs to accurately compare
expressions.  The surrounding "if" block prevented this rewrite from
happening in certain cases.
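
The idea of canonicalizing before comparing can be sketched on a toy expression type. This is only an illustration under assumed names (toy_expr, toy_canonicalize, toy_equal); it borrows the names of the GCC tree codes but does not use GCC's tree API or reflect how get_canonical_lock_expr is actually written.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy node kinds; these only borrow the names of GCC tree codes.  */
enum toy_code { TOY_VAR, TOY_MEM_REF, TOY_INDIRECT_REF, TOY_FIELD };

struct toy_expr
{
  enum toy_code code;
  const char *name;          /* variable or field name, else NULL */
  struct toy_expr *operand;  /* inner expression, else NULL */
};

/* Rewrite every TOY_MEM_REF in EXPR to TOY_INDIRECT_REF, in place,
   without skipping any level of the expression.  */
void
toy_canonicalize (struct toy_expr *expr)
{
  for (; expr != NULL; expr = expr->operand)
    if (expr->code == TOY_MEM_REF)
      expr->code = TOY_INDIRECT_REF;
}

/* Structural equality; only meaningful on canonical expressions.  */
bool
toy_equal (const struct toy_expr *a, const struct toy_expr *b)
{
  while (a != NULL && b != NULL)
    {
      if (a->code != b->code)
        return false;
      if ((a->name == NULL) != (b->name == NULL))
        return false;
      if (a->name != NULL && strcmp (a->name, b->name) != 0)
        return false;
      a = a->operand;
      b = b->operand;
    }
  return a == NULL && b == NULL;
}
```

Two expressions that differ only in which dereference form they use compare unequal before toy_canonicalize and equal afterwards, which is the property a canonical lock expression needs.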

Bootstrapped and passed GCC regression testsuite on x86_64-unknown-linux-gnu.

Okay for branches/annotalysis and google/main?


Remember


  -DeLesley


2011-07-06   DeLesley Hutchins
   * cp_get_virtual_function_decl.c (handle_call_gs): Changes
   function to return null if the method cannot be found.


>   2011-07-11   DeLesley Hutchins
> * tree-threadsafe-analyze.c (tree-threadsafe-analyze.c)
>  Changed to force rewrite on MEM_REF

You keep forgetting a blank line after the date line :)
Needs a ':' after ')', and the sentence should end in '.'.

The change description does not seem to be related to the change in the 
code.


OK, otherwise.


Diego.


[RFC PATCH] -grecord-gcc-switches (PR other/32998)

2011-07-12 Thread Jakub Jelinek
Hi!

As discussed in the PR, this patch implements IMHO a better alternative
to the current -frecord-gcc-switches option which isn't very usable unless
inspected on relocatable files only.  As each switch is zero terminated and
the section is mergeable, for whole binary or shared library we end up with
a set of all options which ever showed up on the command line when compiling
some of the CUs, but it is impossible to narrow it down back to which CU has
been compiled with what options.

This patch instead appends the options as one long string separated by
spaces to DW_AT_producer attribute (which is normally just a reference
to a string in .debug_str).  Thus, ideally if all or most of the CUs
are compiled with the same options there is just one or just small number of
different DW_AT_producer strings, but it is still possible to find out
what has been compiled with what options.

The aim is to include just (or primarily) code generation affecting options
explicitly passed on the command line.  So that the merging actually works,
options or arguments which include filenames or paths shouldn't be added,
on Roland's request -D*/-U* options aren't added either (that should be
covered by .debug_macinfo), similarly -I/-i* options (the interesting
stuff what headers have been actually used is recorded in .debug_line)
and warning options shouldn't affect code generation either.
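
As a rough standalone illustration of the filtering and joining described above (not the code in the patch, which walks save_decoded_options), the sketch below skips options by textual prefix and appends the rest to a producer string. The names skip_option and build_producer and the prefix list are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if OPT should be left out of the producer string.  The
   prefixes mirror the description: macro definitions, include paths
   and warning options.  */
bool
skip_option (const char *opt)
{
  static const char *const prefixes[] = { "-D", "-U", "-I", "-i", "-W" };
  size_t i;

  for (i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++)
    if (strncmp (opt, prefixes[i], strlen (prefixes[i])) == 0)
      return true;
  return false;
}

/* Write PRODUCER followed by the kept options, space separated, into
   BUF of size BUFSZ.  Return false if the result would not fit.  */
bool
build_producer (char *buf, size_t bufsz, const char *producer,
                const char *const *opts, size_t nopts)
{
  size_t used, i;
  int n;

  n = snprintf (buf, bufsz, "%s", producer);
  if (n < 0 || (size_t) n >= bufsz)
    return false;
  used = (size_t) n;

  for (i = 0; i < nopts; i++)
    {
      if (skip_option (opts[i]))
        continue;
      n = snprintf (buf + used, bufsz - used, " %s", opts[i]);
      if (n < 0 || (size_t) n >= bufsz - used)
        return false;
      used += (size_t) n;
    }
  return true;
}
```

Matching on text prefixes is a simplification; a real implementation would more likely decide per decoded option, so that an option's separate argument (for example the path after -isystem) is dropped together with it.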

Ideally we'd just include explicitly passed options from command line that
haven't been overridden by other command line options, and would sort them,
so that there are higher chances of DW_AT_producer strings being merged
(e.g. -O2 -ffast-math vs. -ffast-math -O2 are now different strings, and
similarly -O2 vs. -O3 -O2 vs. -O0 -O1 -Ofast -O2), but I'm not sure if it is
easily possible using current option handling framework.
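
For what the sorting part might look like in isolation, here is a small sketch that orders an option array and drops exact duplicates, so that -O2 -ffast-math and -ffast-math -O2 yield the same sequence. normalize_options is a hypothetical helper, the duplicate removal is an addition of this example, and it deliberately does not model one option overriding another (such as -O3 followed by -O2).

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings.  */
static int
compare_options (const void *a, const void *b)
{
  return strcmp (*(const char *const *) a, *(const char *const *) b);
}

/* Sort OPTS and drop exact duplicates in place; return the new count.
   Overriding between different options is not handled here.  */
size_t
normalize_options (const char **opts, size_t nopts)
{
  size_t in, out = 0;

  qsort (opts, nopts, sizeof opts[0], compare_options);
  for (in = 0; in < nopts; in++)
    if (out == 0 || strcmp (opts[out - 1], opts[in]) != 0)
      opts[out++] = opts[in];
  return out;
}
```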

Bootstrapped/regtested on x86_64-linux, e.g. cc1plus has just a few
different DW_AT_producers (sort|uniq -c|sort -n -r output):
322  (indirect string, offset: 0x4fb3): GNU C 4.7.0 20110712 (experimental) -mtune=generic -march=x86-64 -g -O2 -pedantic -fno-common
 48  (indirect string, offset: 0xf41f1): GNU C 4.7.0 20110712 (experimental) -mtune=generic -march=x86-64 -g -O2 -pedantic
 16  (indirect string, offset: 0xebf8): GNU C 4.7.0 20110712 (experimental) -mtune=generic -march=x86-64 -g -O2 -fno-common
  9  (indirect string, offset: 0xfa115): GNU C 4.7.0 20110712 (experimental) -mtune=generic -march=x86-64 -g -O2
  2  (indirect string, offset: 0xfa974): GNU C 4.7.0 20110712 (experimental) -mtune=generic -march=x86-64 -g -g -g -O2 -O2 -O2 -fPIC -fbuilding-libgcc -fno-stack-protector -fvisibility=hidden

2011-07-13  Jakub Jelinek  

PR other/32998
* common.opt (grecord-gcc-switches, gno-record-gcc-switches): New
options.
* dwarf2out.c: Include opts.h.
(dchar_p): New typedef.  Define heap VEC for it.
(gen_compile_unit_die): Handle dwarf_record_gcc_switches.
* Makefile.in (dwarf2out.o): Depend on $(OPTS_H).

* lib/dg-pch.exp (dg-flags-pch): Compile tests for assembly comparison
with -gno-record-gcc-switches.

--- gcc/common.opt.jj   2011-06-28 19:09:09.936421970 +0200
+++ gcc/common.opt  2011-07-12 21:17:13.809402161 +0200
@@ -2184,6 +2184,14 @@ ggdb
 Common JoinedOrMissing
 Generate debug information in default extended format
 
+gno-record-gcc-switches
+Common RejectNegative Var(dwarf_record_gcc_switches,0) Init(0)
+Don't record gcc command line switches in DWARF DW_AT_producer.
+
+grecord-gcc-switches
+Common RejectNegative Var(dwarf_record_gcc_switches,1)
+Record gcc command line switches in DWARF DW_AT_producer.
+
 gstabs
 Common JoinedOrMissing Negative(gstabs+)
 Generate debug information in STABS format
--- gcc/dwarf2out.c.jj  2011-07-12 21:15:46.987389471 +0200
+++ gcc/dwarf2out.c 2011-07-13 01:18:21.864451800 +0200
@@ -94,6 +94,7 @@ along with GCC; see the file COPYING3.  
 #include "tree-pass.h"
 #include "tree-flow.h"
 #include "cfglayout.h"
+#include "opts.h"
 
 static void dwarf2out_source_line (unsigned int, const char *, int, bool);
 static rtx last_var_location_insn;
@@ -18110,6 +18111,10 @@ gen_ptr_to_mbr_type_die (tree type, dw_d
 
 /* Generate the DIE for the compilation unit.  */
 
+typedef const char *dchar_p; /* For DEF_VEC_P.  */
+DEF_VEC_P(dchar_p);
+DEF_VEC_ALLOC_P(dchar_p,heap);
+
 static dw_die_ref
 gen_compile_unit_die (const char *filename)
 {
@@ -18130,18 +18135,104 @@ gen_compile_unit_die (const char *filena
 
   sprintf (producer, "%s %s", language_string, version_string);
 
+  if (dwarf_record_gcc_switches)
+{
+  size_t j;
+  VEC(dchar_p, heap) *switches = NULL;
+  char *producer_and_switches, *tail;
+  const char *p;
+  size_t len = 0, plen = strlen (producer);
+  for (j = 1; j < save_decoded_options_count; j++)
+   switch (save_de

Re: [pph] Fix 3 asm differences (issue4695048)

2011-07-12 Thread Gabriel Charette
On Tue, Jul 12, 2011 at 3:25 PM, Diego Novillo  wrote:
> On 11-07-12 16:34 , Gabriel Charette wrote:
>
>> We probably want pph_register_decl_in_symtab to be inline as it does
>> so little now.
>
> It doesn't really matter all that much.  Given that it's a static function,
> the compiler will inline it (or not) as an optimization.  The 'inline'
> keyword is more and more just a suggestion than an actual guarantee.
>

OK

>> Now that you simply chainon bindings, you probably want to nreverse
>> them before you chain them on (this way we will stream them in from
>> first->last as this patch does (to alloc stuff in order), but then in
>> the chain we want them to be last->first as they should be if they had
>> been pushed as they are in the original parser).
>
> Perhaps, but first I want to make sure we really want to reverse them all.
>  Not every list is processed from back to front.
>

OK. For now, though, the code inserts them into current_bindings in
the reverse order (names and namespaces for sure; usings I'm not sure
about) from the order they had in the parser when originally written
out (I don't know whether this causes problems...?)
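
To make the ordering question concrete, here is a toy version of the two list operations under discussion. toy_node, toy_nreverse and toy_chainon are stand-ins written for this example; they mimic the behaviour of GCC's nreverse and chainon on a plain singly linked list rather than on trees.

```c
#include <stddef.h>

/* Toy singly linked node; GCC chains trees through TREE_CHAIN instead.  */
struct toy_node
{
  int id;
  struct toy_node *next;
};

/* Reverse LIST in place and return the new head.  */
struct toy_node *
toy_nreverse (struct toy_node *list)
{
  struct toy_node *prev = NULL;

  while (list != NULL)
    {
      struct toy_node *next = list->next;
      list->next = prev;
      prev = list;
      list = next;
    }
  return prev;
}

/* Append TAIL to the end of HEAD and return the combined list.  */
struct toy_node *
toy_chainon (struct toy_node *head, struct toy_node *tail)
{
  struct toy_node *last;

  if (head == NULL)
    return tail;
  for (last = head; last->next != NULL; last = last->next)
    continue;
  last->next = tail;
  return head;
}
```

Items read in first-to-last order as 1, 2, 3 and then passed through toy_nreverse before toy_chainon end up as 3, 2, 1 ahead of the existing entries, which is the order they would have had if each had been pushed onto the front one at a time.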

Gab


Re: [pph] Stream DECL_CHAIN only for VAR/FUNCTION_DECLs that are part of a RECORD_OR_UNION_TYPE (issue4672055)

2011-07-12 Thread Gabriel Charette
On Tue, Jul 12, 2011 at 3:32 PM, Diego Novillo  wrote:
> On 11-07-12 16:43 , Gabriel Charette wrote:
>
>> Even if this doesn't break tests anymore, we probably still want this,
>> no point adding stuff to the pph image that is not needed...
>
> Actually, the reverse is true.  We want to write out the IL exactly as the
> original parser emitted it.  There are things we decide not to write because
> they are better re-generated when the pph image is being read (e.g.,
> function numbers, DECL_RTL), but
>

Well so lto_input_chain, called from pph_in_chain for every single
chain, already reconstructs the DECL_CHAIN on input (DECL_CHAIN is
actually set to NULL before streaming out each element in the chain
anyways).

The only case where that wasn't true was when we output structs (and
unions I think: we need to add a test for unions), because structs
don't seem to use output chain (I don't have the code in front of me,
but I know they would only output the tree (i.e. the tree was, in that
case, responsible for streaming its DECL_CHAIN)).

I'm surprised it no longer breaks... I'll have a look when I come back
on Friday if it's still a debated issue then.

Gab


PATCH: Remove -mfused-madd and add -mfma

2011-07-12 Thread H.J. Lu
Hi,

-mfused-madd is deprecated and -mfma is undocumented.  This patch
removes -mfused-madd and documents -mfma.  OK for trunk?

Thanks.

H.J.
---
2011-07-12  H.J. Lu  

* doc/invoke.texi (x86): Remove -mfused-madd and add -mfma.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f146cc5..3429b31 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -600,7 +600,7 @@ Objective-C and Objective-C++ Dialects}.
 -mincoming-stack-boundary=@var{num} @gol
 -mcld -mcx16 -msahf -mmovbe -mcrc32 -mrecip -mvzeroupper @gol
 -mmmx  -msse  -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
--maes -mpclmul -mfsgsbase -mrdrnd -mf16c -mfused-madd @gol
+-maes -mpclmul -mfsgsbase -mrdrnd -mf16c -mfma @gol
 -msse4a -m3dnow -mpopcnt -mabm -mbmi -mtbm -mfma4 -mxop -mlwp @gol
 -mthreads  -mno-align-stringops  -minline-all-stringops @gol
 -minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol
@@ -12587,6 +12587,8 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
 @itemx -mno-rdrnd
 @itemx -mf16c
 @itemx -mno-f16c
+@itemx -mfma
+@itemx -mno-fma
 @itemx -msse4a
 @itemx -mno-sse4a
 @itemx -mfma4
@@ -12612,9 +12614,9 @@ preferred alignment to 
@option{-mpreferred-stack-boundary=2}.
 @opindex mno-sse
 @opindex m3dnow
 @opindex mno-3dnow
-These switches enable or disable the use of instructions in the MMX,
-SSE, SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, FSGSBASE, RDRND,
-F16C, SSE4A, FMA4, XOP, LWP, ABM, BMI, or 3DNow!@: extended instruction sets.
+These switches enable or disable the use of instructions in the MMX, SSE,
+SSE2, SSE3, SSSE3, SSE4.1, AVX, AES, PCLMUL, FSGSBASE, RDRND, F16C, FMA,
+SSE4A, FMA4, XOP, LWP, ABM, BMI, or 3DNow!@: extended instruction sets.
 These extensions are also available as built-in functions: see
 @ref{X86 Built-in Functions}, for details of the functions enabled and
 disabled by these switches.
@@ -12633,13 +12635,6 @@ supported architecture, using the appropriate flags.  In particular,
 the file containing the CPU detection code should be compiled without
 these options.
 
-@item -mfused-madd
-@itemx -mno-fused-madd
-@opindex mfused-madd
-@opindex mno-fused-madd
-Do (don't) generate code that uses the fused multiply/add or multiply/subtract
-instructions.  The default is to use these instructions.
-
 @item -mcld
 @opindex mcld
 This option instructs GCC to emit a @code{cld} instruction in the prologue


libgo patch committed: Run tests in source file order

2011-07-12 Thread Ian Lance Taylor
The libgo tests expect to be run in the order in which they appear in
the source file.  This patch uses -fno-toplevel-reorder to make sure
that happens.  Bootstrapped and ran libgo tests on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 8020981d5461 libgo/testsuite/gotest
--- a/libgo/testsuite/gotest	Mon Jul 11 13:25:57 2011 -0700
+++ b/libgo/testsuite/gotest	Tue Jul 12 17:56:52 2011 -0700
@@ -288,11 +288,11 @@
 	prefixarg="-fgo-prefix=$prefix"
 fi
 
-$GC -g $prefixarg -c -I . -o _gotest_.o $gofiles $pkgbasefiles
+$GC -g $prefixarg -c -I . -fno-toplevel-reorder -o _gotest_.o $gofiles $pkgbasefiles
 if $havex; then
 	mkdir -p `dirname $package`
 	cp _gotest_.o `dirname $package`/lib`basename $package`.a
-	$GC -g -c -I . -o $xofile $xgofiles
+	$GC -g -c -I . -fno-toplevel-reorder -o $xofile $xgofiles
 fi
 
 # They all compile; now generate the code to call them.

