Re: [PATCH v3] Re: avoid useless if-before-free tests

2011-04-15 Thread Janne Blomqvist
On Thu, Mar 24, 2011 at 18:51, Jim Meyering  wrote:
> Janne Blomqvist wrote:
>> On Tue, Mar 8, 2011 at 19:53, Jim Meyering  wrote:
>>> Relative to v2, I've added libgo/ to the list of exempt directories and 
>>> added
>>> this recently discussed gfc_free patch, at the request of Tobias Burnus.
>>> Also, I corrected an error in fortran's ChangeLog and removed all
>>> whitespace changes from all ChangeLog files.
>>
>> The libgfortran changes are Ok for 4.7.
>>
>> For the gfortran frontend (gcc/fortran/*) I'd prefer if you'd
>>
>> - Replace all calls to "gfc_free (x)" with "free (x)".
>> - Remove the gfc_free() function and prototype.
>> - Remove the free() macro which currently prevents calling free() directly.
>
> Following up, I've refreshed the series but hit a minor snag
> while converting new uses of gfc_free, removing new tests-before-free
> and merging/reordering changes.
>
> Applying this fix first makes my problem go away:

[snip]

So, what's the plan here? Do you plan to get a GCC account, do you
already have one, or what? Now that 4.7 is open for development, it's
perhaps the right time to poke the maintainers to get this patch in.

If you don't have a GCC account, as one of the Fortran maintainers I
can commit the Fortran and libgfortran parts, but someone else will
have to do the rest (were they ever approved, BTW?) as I have only
write after approval privileges for the rest of GCC.


-- 
Janne Blomqvist


[PATCH] Improve combining of conditionals

2011-04-15 Thread Maxim Kuvyrkov
This patch fixes the problem with substituting expressions into IF_THEN_ELSE 
during combining.  Without this patch combining of conditionals inside 
IF_THEN_ELSE is seriously inhibited.

The problem this patch fixes is that combine_simplify_rtx() prefers to return 
an expression (say, ) even when a comparison is prefered (say, 
).  Expressions are not recognized as valid conditions of 
if_then_else for most targets, so combiner misses a potential optimization.  
This patch makes combine_simplify_rtx() aware of the context it was invoked in, 
and, when appropriate, does not discourage it from returning a conditional.

The motivating example for this fix was crcu8() routine from CoreMark.  
Compiling this routine for MIPS32R2 at -O3 produces there are several instances 
of sequence

andi$2,$2,0x1
xori$2,$2,0x1
movn$3,$5,$2  ; $2 dies here

which can be optimized into

andi$2,$2,0x1
movz$3,$5,$2  ; $2 dies here
.

The patch was successfully tested on {i686, arm, mips}-linux, both GCC 
testsuites and SPEC2000 runs.  For all targets there was no observable code 
difference in SPEC2000 benchmarks, so the example does not trigger very often.  
Still, it speeds up CoreMark by about 1%.

OK for trunk?

--
Maxim Kuvyrkov
Mentor Graphics / CodeSourcery




gcc-combine-if_then_else.ChangeLog
Description: Binary data


gcc-combine-if_then_else.patch
Description: Binary data


Re: [PATCH v3] Re: avoid useless if-before-free tests

2011-04-15 Thread Jim Meyering
Janne Blomqvist wrote:

> On Thu, Mar 24, 2011 at 18:51, Jim Meyering  wrote:
>> Janne Blomqvist wrote:
>>> On Tue, Mar 8, 2011 at 19:53, Jim Meyering  wrote:
 Relative to v2, I've added libgo/ to the list of exempt directories and 
 added
 this recently discussed gfc_free patch, at the request of Tobias Burnus.
 Also, I corrected an error in fortran's ChangeLog and removed all
 whitespace changes from all ChangeLog files.
>>>
>>> The libgfortran changes are Ok for 4.7.
>>>
>>> For the gfortran frontend (gcc/fortran/*) I'd prefer if you'd
>>>
>>> - Replace all calls to "gfc_free (x)" with "free (x)".
>>> - Remove the gfc_free() function and prototype.
>>> - Remove the free() macro which currently prevents calling free() directly.
>>
>> Following up, I've refreshed the series but hit a minor snag
>> while converting new uses of gfc_free, removing new tests-before-free
>> and merging/reordering changes.
>>
>> Applying this fix first makes my problem go away:
>
> [snip]
>
> So, what's the plan here? Do you plan to get a GCC account, do you
> already have one, or what? Now that 4.7 is open for development, it's
> perhaps the right time to poke the maintainers to get this patch in.
>
> If you don't have a GCC account, as one of the Fortran maintainers I
> can commit the Fortran and libgfortran parts, but someone else will
> have to do the rest (were they ever approved, BTW?) as I have only
> write after approval privileges for the rest of GCC.

Can someone add me to the gcc group?  That would help.
I already have ssh access to sourceware.org.

I've rebased the series a few times, but it's been a week or so.
More convertible uses are being added regularly.
Plus I have to reorder/split things a little to avoid a new conflict.


Re: [PATCH][RFC] Make called function type explicit, make function pointer type conversions useless

2011-04-15 Thread Eric Botcazou
> 2011-04-14  Eric Botcazou  
>
>   * cfgexpand.c (expand_call_stmt): Rematerialize the original function
>   type if this is not a builtin function.

Hum, using fold_convert seems to be more appropriate here.

Bootstrapped/regtested on i586-suse-linux, applied as obvious.


2011-04-15  Eric Botcazou  

* cfgexpand.c (expand_call_stmt): Simply convert the function type.


-- 
Eric Botcazou
Index: cfgexpand.c
===
--- cfgexpand.c	(revision 172469)
+++ cfgexpand.c	(working copy)
@@ -1851,8 +1851,8 @@ expand_call_stmt (gimple stmt)
  call is made may be different from the type of the function.  */
   if (!builtin_p)
 CALL_EXPR_FN (exp)
-  = fold_build1 (NOP_EXPR, build_pointer_type (gimple_call_fntype (stmt)),
-		 CALL_EXPR_FN (exp));
+  = fold_convert (build_pointer_type (gimple_call_fntype (stmt)),
+		  CALL_EXPR_FN (exp));
 
   TREE_TYPE (exp) = gimple_call_return_type (stmt);
   CALL_EXPR_STATIC_CHAIN (exp) = gimple_call_chain (stmt);


[PATCH] doubled words

2011-04-15 Thread Jim Meyering

Signed-off-by: Jim Meyering 
---
While most of these are in comments, the corrections
in gcc/tree-cfg.c and gcc/config/sh/constraints.md are in strings.
The former at least is marked for translation, and hence appears
in every .po file.

 gcc/config/alpha/vms-unwind.h  |4 ++--
 gcc/config/arm/unwind-arm.h|4 ++--
 gcc/config/microblaze/microblaze.c |2 +-
 gcc/config/sh/constraints.md   |4 ++--
 gcc/cp/pt.c|2 +-
 gcc/java/jcf-parse.c   |4 ++--
 gcc/tree-cfg.c |4 ++--
 7 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/config/alpha/vms-unwind.h b/gcc/config/alpha/vms-unwind.h
index ea2c3a3..71cb7b8 100644
--- a/gcc/config/alpha/vms-unwind.h
+++ b/gcc/config/alpha/vms-unwind.h
@@ -1,5 +1,5 @@
 /* Fallback frame unwinding for Alpha/VMS.
-   Copyright (C) 1996, 1997, 1998, 2000, 2001, 2002, 2003, 2009, 2010
+   Copyright (C) 1996, 1997, 1998, 2000, 2001, 2002, 2003, 2009, 2010, 2011
Free Software Foundation, Inc.

This file is part of GCC.
@@ -229,7 +229,7 @@ alpha_vms_fallback_frame_state (struct _Unwind_Context 
*context,

   /* If PV designates an exception dispatcher, we have to adjust the return
  address column to get at the signal occurrence point, and account for
- for what the CHF context contains.  */
+ what the CHF context contains.  */

   if (DENOTES_EXC_DISPATCHER (pv))
 {
diff --git a/gcc/config/arm/unwind-arm.h b/gcc/config/arm/unwind-arm.h
index a9ba126..1a51d8d 100644
--- a/gcc/config/arm/unwind-arm.h
+++ b/gcc/config/arm/unwind-arm.h
@@ -1,5 +1,5 @@
 /* Header file for the ARM EABI unwinder
-   Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009
+   Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2011
Free Software Foundation, Inc.
Contributed by Paul Brook

@@ -65,7 +65,7 @@ extern "C" {
 }
   _Unwind_State;

-  /* Provided only for for compatibility with existing code.  */
+  /* Provided only for compatibility with existing code.  */
   typedef int _Unwind_Action;
 #define _UA_SEARCH_PHASE   1
 #define _UA_CLEANUP_PHASE  2
diff --git a/gcc/config/microblaze/microblaze.c 
b/gcc/config/microblaze/microblaze.c
index 6ea5fa2..5796bc7 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -919,7 +919,7 @@ microblaze_rtx_costs (rtx x, int code, int outer_code 
ATTRIBUTE_UNUSED, int *tot
  {
/* Add 1 to make shift slightly more expensive than add.  */
*total = COSTS_N_INSNS (INTVAL (XEXP (x, 1))) + 1;
-   /* Reduce shift costs for for special circumstances.  */
+   /* Reduce shift costs for special circumstances.  */
if (optimize_size && INTVAL (XEXP (x, 1)) > 5)
  *total -= 2;
if (!optimize_size && INTVAL (XEXP (x, 1)) > 17)
diff --git a/gcc/config/sh/constraints.md b/gcc/config/sh/constraints.md
index 6b0e5d2..40d0d0b 100644
--- a/gcc/config/sh/constraints.md
+++ b/gcc/config/sh/constraints.md
@@ -1,5 +1,5 @@
 ;; Constraint definitions for Renesas / SuperH SH.
-;; Copyright (C) 2007, 2008 Free Software Foundation, Inc.
+;; Copyright (C) 2007, 2008, 2011 Free Software Foundation, Inc.
 ;;
 ;; This file is part of GCC.
 ;;
@@ -99,7 +99,7 @@
(match_test "ival >= -128 && ival <= 127")))

 (define_constraint "I10"
-  "A signed 10-bit constant, as used in in SHmedia andi, ori."
+  "A signed 10-bit constant, as used in SHmedia andi, ori."
   (and (match_code "const_int")
(match_test "ival >= -512 && ival <= 511")))

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 3356e75..fc69a0c 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13980,7 +13980,7 @@ type_unification_real (tree tparms,
   gcc_assert (ntparms > 0);

   /* Reset the number of non-defaulted template arguments contained
- in in TARGS.  */
+ in TARGS.  */
   NON_DEFAULT_TEMPLATE_ARGS_COUNT (targs) = NULL_TREE;

   switch (strict)
diff --git a/gcc/java/jcf-parse.c b/gcc/java/jcf-parse.c
index feeddad..a56c1b7 100644
--- a/gcc/java/jcf-parse.c
+++ b/gcc/java/jcf-parse.c
@@ -1,6 +1,6 @@
 /* Parser for Java(TM) .class files.
Copyright (C) 1996, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
-   2005, 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+   2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc.

 This file is part of GCC.

@@ -368,7 +368,7 @@ set_source_filename (JCF *jcf, int index)
from the input class file into the output file.  We don't decode the
data at all, merely rewriting constant indexes whenever we come
across them: this is necessary because the constant pool in the
-   output file isn't the same as the constant pool in in the input.
+   output file isn't the same as the constant pool in the input.

The main advantage of this technique is that the resulting
annotation data is pointer-free, so it doesn't have to be relocated
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
ind

Re: [PATCH v3] Re: avoid useless if-before-free tests

2011-04-15 Thread Janne Blomqvist
On Fri, Apr 15, 2011 at 10:54, Jim Meyering  wrote:
> Janne Blomqvist wrote:
>
>> On Thu, Mar 24, 2011 at 18:51, Jim Meyering  wrote:
>>> Janne Blomqvist wrote:
 On Tue, Mar 8, 2011 at 19:53, Jim Meyering  wrote:
> Relative to v2, I've added libgo/ to the list of exempt directories and 
> added
> this recently discussed gfc_free patch, at the request of Tobias Burnus.
> Also, I corrected an error in fortran's ChangeLog and removed all
> whitespace changes from all ChangeLog files.

 The libgfortran changes are Ok for 4.7.

 For the gfortran frontend (gcc/fortran/*) I'd prefer if you'd

 - Replace all calls to "gfc_free (x)" with "free (x)".
 - Remove the gfc_free() function and prototype.
 - Remove the free() macro which currently prevents calling free() directly.
>>>
>>> Following up, I've refreshed the series but hit a minor snag
>>> while converting new uses of gfc_free, removing new tests-before-free
>>> and merging/reordering changes.
>>>
>>> Applying this fix first makes my problem go away:
>>
>> [snip]
>>
>> So, what's the plan here? Do you plan to get a GCC account, do you
>> already have one, or what? Now that 4.7 is open for development, it's
>> perhaps the right time to poke the maintainers to get this patch in.
>>
>> If you don't have a GCC account, as one of the Fortran maintainers I
>> can commit the Fortran and libgfortran parts, but someone else will
>> have to do the rest (were they ever approved, BTW?) as I have only
>> write after approval privileges for the rest of GCC.
>
> Can someone add me to the gcc group?  That would help.
> I already have ssh access to sourceware.org.

According to http://gcc.gnu.org/svnwrite.html , you should send an
email to overseers(at)gcc.gnu.org . It says that you need a sponsor
and "The steering committee or a well-established GCC maintainer
(including reviewers) can approve for write access any person with GNU
copyright assignment papers in place and known to submit good
patches.". I'm not sure if I'm considered to be well-established
enough, so could someone help Jim out here, please?

Or, if nobody pipes up within a few days, just mention my name as your
sponsor and we'll see if the overseers consider me well-established or
not.. ;)


-- 
Janne Blomqvist


Re: [PATCH v3] Re: avoid useless if-before-free tests

2011-04-15 Thread Jim Meyering
Janne Blomqvist wrote:
> On Fri, Apr 15, 2011 at 10:54, Jim Meyering  wrote:
>> Janne Blomqvist wrote:
>>
>>> On Thu, Mar 24, 2011 at 18:51, Jim Meyering  wrote:
 Janne Blomqvist wrote:
> On Tue, Mar 8, 2011 at 19:53, Jim Meyering  wrote:
>> Relative to v2, I've added libgo/ to the list of exempt
>> directories and added
>> this recently discussed gfc_free patch, at the request of Tobias Burnus.
>> Also, I corrected an error in fortran's ChangeLog and removed all
>> whitespace changes from all ChangeLog files.
>
> The libgfortran changes are Ok for 4.7.
>
> For the gfortran frontend (gcc/fortran/*) I'd prefer if you'd
>
> - Replace all calls to "gfc_free (x)" with "free (x)".
> - Remove the gfc_free() function and prototype.
> - Remove the free() macro which currently prevents calling free() 
> directly.

 Following up, I've refreshed the series but hit a minor snag
 while converting new uses of gfc_free, removing new tests-before-free
 and merging/reordering changes.

 Applying this fix first makes my problem go away:
>>>
>>> [snip]
>>>
>>> So, what's the plan here? Do you plan to get a GCC account, do you
>>> already have one, or what? Now that 4.7 is open for development, it's
>>> perhaps the right time to poke the maintainers to get this patch in.
>>>
>>> If you don't have a GCC account, as one of the Fortran maintainers I
>>> can commit the Fortran and libgfortran parts, but someone else will
>>> have to do the rest (were they ever approved, BTW?) as I have only
>>> write after approval privileges for the rest of GCC.
>>
>> Can someone add me to the gcc group?  That would help.
>> I already have ssh access to sourceware.org.
>
> According to http://gcc.gnu.org/svnwrite.html , you should send an
> email to overseers(at)gcc.gnu.org . It says that you need a sponsor
> and "The steering committee or a well-established GCC maintainer
> (including reviewers) can approve for write access any person with GNU
> copyright assignment papers in place and known to submit good
> patches.". I'm not sure if I'm considered to be well-established
> enough, so could someone help Jim out here, please?
>
> Or, if nobody pipes up within a few days, just mention my name as your
> sponsor and we'll see if the overseers consider me well-established or
> not.. ;)

Thanks.
FYI, I have an "ANY" assignment on file, so no need to wait for paperwork.


Fix minor issues in gimplify.c

2011-04-15 Thread Eric Botcazou
Long lines, superfluous 's' and missing head comment.  No functional changes.

There is one function left without head comment: gimplify_adjust_omp_clauses.
Jakub, when you have a few minutes, would you mind adding one?  TIA.

Tested on i586-suse-linux, applied on the mainline as obvious.


2011-04-15  Eric Botcazou  

* gimplify.c: Fix issues in comments throughout.
(voidify_wrapper_expr): Fix long line.
(build_stack_save_restore): Likewise.
(gimplify_loop_expr): Likewise.
(gimplify_compound_lval): Likewise.
(gimplify_init_ctor_eval): Likewise.
(gimplify_modify_expr_rhs): Likewise.
(omp_notice_threadprivate_variable): Likewise.


-- 
Eric Botcazou
Index: gimplify.c
===
--- gimplify.c	(revision 172469)
+++ gimplify.c	(working copy)
@@ -91,7 +91,7 @@ static struct gimplify_ctx *gimplify_ctx
 static struct gimplify_omp_ctx *gimplify_omp_ctxp;
 
 
-/* Formal (expression) temporary table handling: Multiple occurrences of
+/* Formal (expression) temporary table handling: multiple occurrences of
the same scalar expression are evaluated into the same temporary.  */
 
 typedef struct gimple_temp_hash_elt
@@ -100,11 +100,12 @@ typedef struct gimple_temp_hash_elt
   tree temp;  /* Value */
 } elt_t;
 
-/* Forward declarations.  */
+/* Forward declaration.  */
 static enum gimplify_status gimplify_compound_expr (tree *, gimple_seq *, bool);
 
 /* Mark X addressable.  Unlike the langhook we expect X to be in gimple
form and we don't do any syntax checking.  */
+
 void
 mark_addressable (tree x)
 {
@@ -232,6 +233,8 @@ pop_gimplify_context (gimple body)
 htab_delete (c->temp_htab);
 }
 
+/* Push a GIMPLE_BIND tuple onto the stack of bindings.  */
+
 static void
 gimple_push_bind_expr (gimple gimple_bind)
 {
@@ -240,19 +243,23 @@ gimple_push_bind_expr (gimple gimple_bin
   VEC_safe_push (gimple, heap, gimplify_ctxp->bind_expr_stack, gimple_bind);
 }
 
+/* Pop the first element off the stack of bindings.  */
+
 static void
 gimple_pop_bind_expr (void)
 {
   VEC_pop (gimple, gimplify_ctxp->bind_expr_stack);
 }
 
+/* Return the first element of the stack of bindings.  */
+
 gimple
 gimple_current_bind_expr (void)
 {
   return VEC_last (gimple, gimplify_ctxp->bind_expr_stack);
 }
 
-/* Return the stack GIMPLE_BINDs created during gimplification.  */
+/* Return the stack of bindings created during gimplification.  */
 
 VEC(gimple, heap) *
 gimple_bind_expr_stack (void)
@@ -260,7 +267,7 @@ gimple_bind_expr_stack (void)
   return gimplify_ctxp->bind_expr_stack;
 }
 
-/* Returns true iff there is a COND_EXPR between us and the innermost
+/* Return true iff there is a COND_EXPR between us and the innermost
CLEANUP_POINT_EXPR.  This info is used by gimple_push_cleanup.  */
 
 static bool
@@ -392,7 +399,7 @@ remove_suffix (char *name, int len)
 }
 }
 
-/* Create a new temporary name with PREFIX.  Returns an identifier.  */
+/* Create a new temporary name with PREFIX.  Return an identifier.  */
 
 static GTY(()) unsigned int tmp_var_id_num;
 
@@ -413,9 +420,8 @@ create_tmp_var_name (const char *prefix)
   return get_identifier (tmp_name);
 }
 
-
 /* Create a new temporary variable declaration of type TYPE.
-   Does NOT push it into the current binding.  */
+   Do NOT push it into the current binding.  */
 
 tree
 create_tmp_var_raw (tree type, const char *prefix)
@@ -446,7 +452,7 @@ create_tmp_var_raw (tree type, const cha
   return tmp_var;
 }
 
-/* Create a new temporary variable declaration of type TYPE.  DOES push the
+/* Create a new temporary variable declaration of type TYPE.  DO push the
variable into the current binding.  Further, assume that this is called
only from gimplification or optimization, at which point the creation of
certain types are bugs.  */
@@ -537,7 +543,6 @@ lookup_tmp_var (tree val, bool is_formal
   return ret;
 }
 
-
 /* Return true if T is a CALL_EXPR or an expression that can be
assigned to a temporary.  Note that this predicate should only be
used during gimplification.  See the rationale for this in
@@ -605,7 +610,7 @@ internal_get_tmp_var (tree val, gimple_s
   return t;
 }
 
-/* Returns a formal temporary variable initialized with VAL.  PRE_P is as
+/* Return a formal temporary variable initialized with VAL.  PRE_P is as
in gimplify_expr.  Only use this function if:
 
1) The value of the unfactored expression represented by VAL will not
@@ -623,7 +628,7 @@ get_formal_tmp_var (tree val, gimple_seq
   return internal_get_tmp_var (val, pre_p, NULL, true);
 }
 
-/* Returns a temporary variable initialized with VAL.  PRE_P and POST_P
+/* Return a temporary variable initialized with VAL.  PRE_P and POST_P
are as in gimplify_expr.  */
 
 tree
@@ -632,8 +637,8 @@ get_initialized_tmp_var (tree val, gimpl
   return internal_get_tmp_var (val, pre_p, post_p, false);
 }
 
-/* Declares all the variables in VARS in SCOPE.  If DEBUG_IN

Re: [lto/pph] Do not pack more bits than requested (issue4415044)

2011-04-15 Thread Richard Guenther
On Thu, 14 Apr 2011, Diego Novillo wrote:

> On Thu, Apr 14, 2011 at 15:28, Jakub Jelinek  wrote:
> 
> > If bitpack_word_t has BITS_PER_BITPACK_WORD bits, then for
> > nbits = BITS_PER_BITPACK_WORD this will be undefined.
> > Use say
> > mask = ((bitpack_word_t) 2 << (nbits - 1)) - 1;
> > or something similar (assertion ensures that nbits isn't 0).
> 
> Quite right, thanks.  In the meantime, I've changed my mind with this.
>  I think it's safer if we just assert that the value we are about to
> pack fit in the number of bits the caller specified.
> 
> The only problematic user is pack_ts_type_value_fields when it tries
> to pack a -1 for the type's alias set.  I think we should just stream
> that as an integer and not go through the bitpacking overhead.
> 
> For now, I'm applying this to the pph branch.  Tested on x86_64.  No
> LTO failures.

See below
 
> 
> Diego.
> 
> * lto-streamer-out.c (pack_ts_type_value_fields): Pack all bits
> of -1 value.
> * lto-streamer.h (bitpack_create): Assert that the value to
> pack does not overflow NBITS.
> * lto-streamer-in.c (unpack_ts_type_value_fields): Unpack
> BITS_PER_BITPACK_WORD bits for TYPE_ALIAS_SET.
> 
> diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
> index 97b86ce..f04e031 100644
> --- a/gcc/lto-streamer-in.c
> +++ b/gcc/lto-streamer-in.c
> @@ -1734,7 +1734,7 @@ unpack_ts_type_value_fields (struct bitpack_d
> *bp, tree expr)
>TYPE_USER_ALIGN (expr) = (unsigned) bp_unpack_value (bp, 1);
>TYPE_READONLY (expr) = (unsigned) bp_unpack_value (bp, 1);
>TYPE_ALIGN (expr) = (unsigned) bp_unpack_value (bp, HOST_BITS_PER_INT);
> -  TYPE_ALIAS_SET (expr) = bp_unpack_value (bp, HOST_BITS_PER_INT);
> +  TYPE_ALIAS_SET (expr) = bp_unpack_value (bp, BITS_PER_BITPACK_WORD);
>  }
> 
> 
> diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
> index 3ccad8b..89ad9c5 100644
> --- a/gcc/lto-streamer-out.c
> +++ b/gcc/lto-streamer-out.c
> @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree 
> expr)
>bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1);
>bp_pack_value (bp, TYPE_READONLY (expr), 1);
>bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT);
> -  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, HOST_BITS_PER_INT);
> +  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1,
> +BITS_PER_BITPACK_WORD);

As we only want to stream alias-set zeros just change it to a single bit,
like

 bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1);

and on the reader side restore either a zero or -1.

>  }
> 
> 
> diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
> index 0d49430..73afd46 100644
> --- a/gcc/lto-streamer.h
> +++ b/gcc/lto-streamer.h
> @@ -1190,18 +1190,15 @@ bitpack_create (struct lto_output_stream *s)
>  static inline void
>  bp_pack_value (struct bitpack_d *bp, bitpack_word_t val, unsigned nbits)
>  {
> -  bitpack_word_t mask, word;
> +  bitpack_word_t word = bp->word;
>int pos = bp->pos;
> 
> -  word = bp->word;
> -
> +  /* We shouldn't try to pack more bits than can fit in a bitpack word.  */
>gcc_assert (nbits > 0 && nbits <= BITS_PER_BITPACK_WORD);

Asserts will break merging the operations and make them slow again.
Please no asserts in these core routines.  Look at the .optimized
dump of a series of bp_pack_value, they should be basically optimized to
a series of ORs.

As for the -1 case, it's simply broken use of the interface.

Richard.

> -  /* Make sure that VAL only has the lower NBITS set.  Generate a
> - mask with the lower NBITS set and use it to filter the upper
> - bits from VAL.  */
> -  mask = ((bitpack_word_t) 1 << nbits) - 1;
> -  val = val & mask;
> +  /* The value to pack should not overflow NBITS.  */
> +  gcc_assert (nbits == BITS_PER_BITPACK_WORD
> + || val <= ((bitpack_word_t) 1 << nbits));
> 
>/* If val does not fit into the current bitpack word switch to the
>   next one.  */


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Eric Botcazou
> The patch was successfully tested on {i686, arm, mips}-linux, both GCC
> testsuites and SPEC2000 runs.  For all targets there was no observable code
> difference in SPEC2000 benchmarks, so the example does not trigger very
> often.  Still, it speeds up CoreMark by about 1%.
>
> OK for trunk?

Yes, modulo the following nits:

@@ -4938,11 +4938,13 @@ find_split_point (rtx *loc, rtx insn, bool set_src)
 
IN_DEST is nonzero if we are processing the SET_DEST of a SET.
 
+   IN_COND is nonzero if we are on top level of the condition.

"...we are at the top level of a condition."


@@ -5221,10 +5225,12 @@ subst (rtx x, rtx from, rtx to, int in_dest, int 
unique_copy)
expression.
 
OP0_MODE is the original mode of XEXP (x, 0).  IN_DEST is nonzero
-   if we are inside a SET_DEST.  */
+   if we are inside a SET_DEST.  IN_COND is nonzero if we are on the top level
+   of a condition.  */

Likewise.


@@ -5717,7 +5723,16 @@ combine_simplify_rtx (rtx x, enum machine_mode op0_mode, 
int in_dest)
 ZERO_EXTRACT is indeed appropriate, it will be placed back by
 the call to make_compound_operation in the SET case.  */
 
- if (STORE_FLAG_VALUE == 1
+ if (in_cond)
+   /* Don't apply below optimizations if the caller would
+  prefer a comparison rather than a value.
+  E.g., for the condition in an IF_THEN_ELSE most targets need
+  an explicit comparison.  */
+   {
+ ;
+   }

Remove the superfluous parentheses and move the comment to a new paragraph of 
the big comment just above.

No need to retest, just make sure this still compiles, thanks in advance.

-- 
Eric Botcazou


Re: [PATCH][RFC] Make called function type explicit, make function pointer type conversions useless

2011-04-15 Thread Richard Guenther
On Thu, 14 Apr 2011, Eric Botcazou wrote:

> > I have the following, the gimple_call_chain check is to prevent
> > Ada bootstrap from failing.
> 
> On x86?

Yes, though maybe because I had get_callee_fndecl patched to not
strip conversions.

> > I suppose your patch is ok, maybe with simply not wrapping the NOP_EXPR
> > around builtins.
> 
> I can try that indeed...

Thanks for fixing this!

Richard.


Re: [Patch,testsuite,avr]: add -finline-limit=0 to pr41885.c options

2011-04-15 Thread Richard Guenther
On Thu, Apr 14, 2011 at 8:14 PM, Denis Chertykov  wrote:
> 2011/4/14 Georg-Johann Lay :
>> This patchlet adds -finline-limit=0 to dg-options in
>>
>> testsuite/gcc.target/avr/torture/pr41885.c
>>
>> because otherwise optimizers will fold all tests and actually no test
>> function is called when optimization is on. The test case still passes
>> all tests.
>>
>>
>> testsuite/
>> 2011-04-14  Georg-Johann Lay  
>>
>>        * gcc.target/avr/torture/pr41885.c (dg-options): Add
>>        -finline-limit=0

Please use -fno-inline instead.

Richard.

>> Index: testsuite/gcc.target/avr/torture/pr41885.c
>> ===
>> --- testsuite/gcc.target/avr/torture/pr41885.c  (Revision 172431)
>> +++ testsuite/gcc.target/avr/torture/pr41885.c  (Arbeitskopie)
>> @@ -1,4 +1,4 @@
>> -/* { dg-options "-w -std=c99" } */
>> +/* { dg-options "-w -std=c99 -finline-limit=0" } */
>>  /* { dg-do run } */
>>
>>  #include 
>>
>
> Approved.
>
> Denis.
>


Re: More of ipa-inline housekeeping

2011-04-15 Thread Jan Hubicka
> >
> > I fixed this on the and added sanity check that the fields are initialized.
> > This has shown problem with early inliner iteration fixed thusly and fact 
> > that
> > early inliner is attempting to compute overall growth at a time the inline
> > parameters are not computed for functions not visited by early optimizations
> > yet. We previously agreed that early inliner should not try to do that (as 
> > this
> > leads to early inliner inlining functions called once that should be 
> > deferred
> > for later consieration).  I just hope it won't cause benchmarks to
> > regress too much ;)
> 
> Yeah, we agreed to that.  And I forgot about it as it wasn't part of the
> early inliner reorg (which was supposed to be a 1:1 transform).

Today C++ results shows some regressions, but nothing earthshaking.  So I think 
it is good
idea to drop this feature of early inliner since it is not really systematic.
There is also great improvement on LTO SPEC2000, but I tend to hope it is 
unrelated change.
Perhaps your aliasing?

Honza


Re: More of ipa-inline housekeeping

2011-04-15 Thread Richard Guenther
2011/4/15 Jan Hubicka :
>> >
>> > I fixed this on the and added sanity check that the fields are initialized.
>> > This has shown problem with early inliner iteration fixed thusly and fact 
>> > that
>> > early inliner is attempting to compute overall growth at a time the inline
>> > parameters are not computed for functions not visited by early 
>> > optimizations
>> > yet. We previously agreed that early inliner should not try to do that (as 
>> > this
>> > leads to early inliner inlining functions called once that should be 
>> > deferred
>> > for later consieration).  I just hope it won't cause benchmarks to
>> > regress too much ;)
>>
>> Yeah, we agreed to that.  And I forgot about it as it wasn't part of the
>> early inliner reorg (which was supposed to be a 1:1 transform).
>
> Today C++ results shows some regressions, but nothing earthshaking.  So I 
> think it is good
> idea to drop this feature of early inliner since it is not really systematic.
> There is also great improvement on LTO SPEC2000, but I tend to hope it is 
> unrelated change.
> Perhaps your aliasing?

I doubt SPEC2k uses VLAs or alloca, does it?  Might be the DSE
improvements, but I'm not sure.

Richard.

> Honza
>


[committed] Fix pr46084.c testcase (PR target/48614)

2011-04-15 Thread Jakub Jelinek
Hi!

This testcase doesn't use avx-check.h and is dg-do run testcase,
as it is compiled with -mavx whenever it emits any AVX instructions,
it won't be able to run on any older CPUs.

Fixed thusly, committed to trunk/4.6.

2011-04-15  Jakub Jelinek  

PR target/48614
* gcc.target/i386/pr46084.c: Require avx_runtime instead of
just avx.

--- gcc/testsuite/gcc.target/i386/pr46084.c.jj  2011-04-15 08:11:00.493464611 
+0200
+++ gcc/testsuite/gcc.target/i386/pr46084.c 2011-04-15 08:08:50.282146043 
+0200
@@ -1,7 +1,7 @@
 /* This test needs to use setrlimit to set the stack size, so it can
only run on Unix.  */
 /* { dg-do run { target *-*-linux* *-*-solaris* *-*-darwin* } } */
-/* { dg-require-effective-target avx } */
+/* { dg-require-effective-target avx_runtime } */
 /* { dg-require-effective-target split_stack } */
 /* { dg-options "-fsplit-stack -O2 -mavx" } */
 

Jakub


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Steven Bosscher
On Fri, Apr 15, 2011 at 11:04 AM, Eric Botcazou  wrote:
> @@ -4938,11 +4938,13 @@ find_split_point (rtx *loc, rtx insn, bool set_src)
>
>    IN_DEST is nonzero if we are processing the SET_DEST of a SET.
>
> +   IN_COND is nonzero if we are on top level of the condition.
>
> "...we are at the top level of a condition."

And make IN_COND a bool instead of an int?

Ciao!
Steven


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Maxim Kuvyrkov
On Apr 15, 2011, at 2:38 PM, Steven Bosscher wrote:

> On Fri, Apr 15, 2011 at 11:04 AM, Eric Botcazou  wrote:
>> @@ -4938,11 +4938,13 @@ find_split_point (rtx *loc, rtx insn, bool set_src)
>> 
>>IN_DEST is nonzero if we are processing the SET_DEST of a SET.
>> 
>> +   IN_COND is nonzero if we are on top level of the condition.
>> 
>> "...we are at the top level of a condition."
> 
> And make IN_COND a bool instead of an int?

I think it's better to follow existing format of the function and make IN_COND 
to be of the same type as IN_DEST, i.e.,  an 'int'.  I agree that 'bool' should 
be preferred when adding a new function or significantly rewriting an existing 
one.

--
Maxim Kuvyrkov
Mentor Graphics / CodeSourcery



Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Eric Botcazou
> And make IN_COND a bool instead of an int?

I had the same initial reaction but then came to the same conclusion as Maxim.

-- 
Eric Botcazou


Re: [patch][ARM] Fix 16-bit -> 64-bit multiply and accumulate

2011-04-15 Thread Andrew Stubbs

Ping.

On 25/03/11 16:19, Andrew Stubbs wrote:

On 28/01/11 15:20, Richard Earnshaw wrote:


On Fri, 2011-01-28 at 15:13 +, Andrew Stubbs wrote:

On 28/01/11 14:12, Richard Earnshaw wrote:

So what happens to a variation of your testcase:

long long foolong (long long x, short *a, short *b)
{
return x + (long long)*a * (long long)*b;
}

With your patch? This should generate identical code to your original
test-case.



The patch has no effect on that testcase - it's broken in some other
way, I think, and the same with and without my patch:

ldrsh r3, [r3, #0]
ldrsh r2, [r2, #0]
push {r4, r5}
asrs r4, r3, #31
asrs r5, r2, #31
mul r4, r2, r4
mla r4, r3, r5, r4
umull r2, r3, r2, r3
adds r3, r4, r3
adds r0, r0, r2
adc r1, r1, r3
pop {r4, r5}
bx lr

Hmmm, that probably doesn't add anything useful to the discussion. :(

I'll add that one to the todo list ...

Andrew



Ouch! I though that used to work :-(



I looked at this one again, but on a second inspection I'm not sure
there's much wrong with it?

When I wrote the above I thought that there was a 64-bit multiply
instruction, but now I look more closely I see there isn't, hence the
above. It does two 16-bit loads, sign-extends the inputs to 64-bit, does
a 64-bit -> 64-bit multiply, and then adds 'x'.

Can the umull/add/add/adc be optimized using umlal? It's too late on a
Friday to workout what's going on with the carries 

Also, I don't fully understand why the compiler can't just disregard the
casts and use maddhidi4? Isn't that mathematically equivalent in this case?

When you say it used to work, what did it use to look like?

Thanks

Andrew





[ping] 3 unreviewed patches

2011-04-15 Thread Eric Botcazou
Fix annoying gcov filename handling:
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01380.html

(rs6000) Fix thinko in output_profile_hook:
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01624.html

Introduce -Wstack-usage:
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01992.html

Thanks in advance.

-- 
Eric Botcazou


Re: More of ipa-inline housekeeping

2011-04-15 Thread Jan Hubicka
> 2011/4/15 Jan Hubicka :
> >> >
> >> > I fixed this on the and added sanity check that the fields are 
> >> > initialized.
> >> > This has shown problem with early inliner iteration fixed thusly and 
> >> > fact that
> >> > early inliner is attempting to compute overall growth at a time the 
> >> > inline
> >> > parameters are not computed for functions not visited by early 
> >> > optimizations
> >> > yet. We previously agreed that early inliner should not try to do that 
> >> > (as this
> >> > leads to early inliner inlining functions called once that should be 
> >> > deferred
> >> > for later consieration).  I just hope it won't cause benchmarks to
> >> > regress too much ;)
> >>
> >> Yeah, we agreed to that.  And I forgot about it as it wasn't part of the
> >> early inliner reorg (which was supposed to be a 1:1 transform).
> >
> > Today C++ results shows some regressions, but nothing earthshaking.  So I 
> > think it is good
> > idea to drop this feature of early inliner since it is not really 
> > systematic.
> > There is also great improvement on LTO SPEC2000, but I tend to hope it is 
> > unrelated change.
> > Perhaps your aliasing?
> 
> I doubt SPEC2k uses VLAs or alloca, does it?  Might be the DSE
> improvements, but I'm not sure.

It seems to happen only with LTO, so it might be inlining & fixed call cost 
estimates. It does not
seem so likely to me however - I know that gzip is touchy about inlining, but 
vortex seems easy.

Honza


Re: [Patch,testsuite,avr]: add -finline-limit=0 to pr41885.c options

2011-04-15 Thread Georg-Johann Lay
Richard Guenther schrieb:
> On Thu, Apr 14, 2011 at 8:14 PM, Denis Chertykov  wrote:
>> 2011/4/14 Georg-Johann Lay :
>>> This patchlet adds -finline-limit=0 to dg-options in
>>>
>>> testsuite/gcc.target/avr/torture/pr41885.c
>>>
>>> because otherwise optimizers will fold all tests and actually no test
>>> function is called when optimization is on. The test case still passes
>>> all tests.
>>>
>>>
>>> testsuite/
>>> 2011-04-14  Georg-Johann Lay  
>>>
>>>* gcc.target/avr/torture/pr41885.c (dg-options): Add
>>>-finline-limit=0
> 
> Please use -fno-inline instead.
> 
> Richard.

Ok, changed it:

http://gcc.gnu.org/viewcvs?view=revision&revision=172487

Johann


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Eric Botcazou
> The problem this patch fixes is that combine_simplify_rtx() prefers to
> return an expression (say, ) even when a comparison is
> prefered (say, ).  Expressions are not recognized as
> valid conditions of if_then_else for most targets, so combiner misses a
> potential optimization.  This patch makes combine_simplify_rtx() aware of
> the context it was invoked in, and, when appropriate, does not discourage
> it from returning a conditional.

Btw, this is very likely also valid for targets with STORE_FLAG_VALUE == -1 so 
the same IN_COND short-circuit would need to be added a few lines below in 
combine_simplify_rtx.  But this would need to be tested.  Do you happen to 
have access to such a target, e.g. m68k?

-- 
Eric Botcazou


Re: PATCH: PR middle-end/48608: Alignment adjust of local variables is lost

2011-04-15 Thread Michael Matz
Hi,

On Thu, 14 Apr 2011, H.J. Lu wrote:

> >> > +  if (align > DECL_ALIGN (decl))
> >> > +    DECL_ALIGN (decl) = align;
> >>
> >> Shouldn't this unconditionally set DECL_ALIGN in case
> >> LOCAL_DECL_ALINGMENT returns something smaller?
> >
> > Decreasing alignment of DECLs points to a problem elsewhere, so perhaps an
> > assert that this doesn't happen is better.  Decreasing is a problem
> > because it's not conservative: there might have been code generated
> > already assuming the once larger alignment that then possibly breaks if it
> > turns out the alignment is actually smaller.
> 
> ia32 may decrease local variable alignment:
> 
>   /* Don't do dynamic stack realignment for long long objects with
>  -mpreferred-stack-boundary=2.  */
>   if (!TARGET_64BIT
>   && align == 64
>   && ix86_preferred_stack_boundary < 64
>   && (mode == DImode || (type && TYPE_MODE (type) == DImode))
>   && (!type || !TYPE_USER_ALIGN (type))
>   && (!decl || !DECL_USER_ALIGN (decl)))
> align = 32;

But it hopefully does so before gimplification?  I.e. before real code is 
generated that could possibly make use of the large alignment?


Ciao,
Michael.

Re: Implement stack arrays even for unknown sizes

2011-04-15 Thread Michael Matz
Hi,

On Thu, 14 Apr 2011, Dominique Dhumieres wrote:

> I have forgotten to mentionned that I have a variant of fatigue
> in which I have done the inlining manually along with few other
> optimizations and the timing for it is
> 
> [macbook] lin/test% gfc -Ofast fatigue_v8.f90
> [macbook] lin/test% time a.out > /dev/null
> 2.793u 0.002s 0:02.79 100.0%  0+0k 0+1io 0pf+0w
> [macbook] lin/test% gfc -Ofast -fwhole-program fatigue_v8.f90
> [macbook] lin/test% time a.out > /dev/null
> 2.680u 0.003s 0:02.68 100.0%  0+0k 0+2io 0pf+0w
> [macbook] lin/test% gfc -Ofast -fwhole-program -flto fatigue_v8.f90
> [macbook] lin/test% time a.out > /dev/null
> 2.671u 0.002s 0:02.67 100.0%  0+0k 0+2io 0pf+0w
> [macbook] lin/test% gfc -Ofast -fwhole-program -fstack-arrays fatigue_v8.f90
> [macbook] lin/test% time a.out > /dev/null
> 2.680u 0.003s 0:02.68 100.0%  0+0k 0+2io 0pf+0w
> [macbook] lin/test% gfc -Ofast -fwhole-program -flto -fstack-arrays 
> fatigue_v8.f90
> [macbook] lin/test% time a.out > /dev/null
> 2.677u 0.003s 0:02.68 99.6%   0+0k 0+0io 0pf+0w
> 
> So the timing of the original code with 
> -Ofast -finline-limit=600 -fwhole-program -flto -fstack-arrays
> is quite close to this lower bound.
> 
> I have also looked at the failure for gfortran.dg/elemental_dependency_1.f90
> and it seems due to a spurious integer(kind=4) A.37[4]; (and friends) in

Yes, this is due to the DECL_EXPR statement which is rendered by the 
dumper just the same as a normal decl.  The testcase looks for exactly one 
such decl, but with -fstack-arrays there are exactly two for each such 
array.

> integer(kind=4) A.37[4];

So, this is the normal decl.

> atmp.36.dim[0].ubound = 3;
> integer(kind=4) A.37[4];

And this is the DECL_EXPR statement, which then actually is transformed 
into the stack_save/alloca/stack_restore sequence.


Ciao,
Michael.


Re: [Patch, libfortran] Replace sprintf() with snprintf()

2011-04-15 Thread Jerry DeLisle

On 04/14/2011 11:53 PM, Janne Blomqvist wrote:

Hi,

as is well known, sprintf() is prone to buffer overflow, hence
snprintf(). libgfortran uses snprintf() in some places, but not
everywhere. Rather than analyzing every sprintf() call for a potential
overflow, the attached patch takes the dogmatic but simple approach of
replacing all the remaining sprintf() usage with snprintf().

For targets without snprintf(), io/list_read.c contained a fallback
macro that uses sprintf(); this is moved to libgfortran.h so that it's
available everywhere.

readelf -s libgfortran.so|grep sprintf

confirms that there is no remaining usage of sprintf().

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?



OK, thanks.

Jerry


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Maxim Kuvyrkov
On Apr 15, 2011, at 3:34 PM, Eric Botcazou wrote:

>> The problem this patch fixes is that combine_simplify_rtx() prefers to
>> return an expression (say, ) even when a comparison is
>> prefered (say, ).  Expressions are not recognized as
>> valid conditions of if_then_else for most targets, so combiner misses a
>> potential optimization.  This patch makes combine_simplify_rtx() aware of
>> the context it was invoked in, and, when appropriate, does not discourage
>> it from returning a conditional.
> 
> Btw, this is very likely also valid for targets with STORE_FLAG_VALUE == -1 
> so 
> the same IN_COND short-circuit would need to be added a few lines below in 
> combine_simplify_rtx.  But this would need to be tested.  Do you happen to 
> have access to such a target, e.g. m68k?

Hm, I didn't notice that one, thanks!  I have access to m68k (ColdFire, tbp) 
and will test this change there before committing.

--
Maxim Kuvyrkov
Mentor Graphics / CodeSourcery



Re: [7/9] Testsuite: remove vect_{extract_even_odd,strided}_wide

2011-04-15 Thread Richard Guenther
On Tue, Apr 12, 2011 at 4:14 PM, Richard Sandiford
 wrote:
> We have separate vect_extract_even_odd and vect_extract_even_odd_wide
> target selectors, and separate vect_strided and vect_strided_wide
> selectors.  The comment suggests that "wide" is for 32+ bits,
> but we often use the non-wide forms for 32-bit tests.  We also have
> tests that combine 16-bit and 32-bit strided accesses without checking
> for both widths.
>
> I'm about to split vect_strided into vect_stridedN (for each stride
> factor N).  One option was to preserve the wide distinction and have
> vect_stridedN_wide as well.  However, given the current usage,
> and given that the two selectors are the same, I think it makes sense
> to combine them until we know what distinction we need to make.
>
> Tested on x86_64-linux-gnu and arm-linux-gnueabi.  OK to install?

Ok.

Thanks,
Richard.

> Richard
>
>
> gcc/testsuite/
>        * lib/target-supports.exp
>        (check_effective_target_vect_extract_even_odd_wide): Delete.
>        (check_effective_target_vect_strided_wide): Likewise.
>        * gcc.dg/vect/O3-pr39675-2.c: Use the non-wide versions instead.
>        * gcc.dg/vect/fast-math-pr35982.c: Likewise.
>        * gcc.dg/vect/fast-math-vect-complex-3.c: Likewise.
>        * gcc.dg/vect/pr37539.c: Likewise.
>        * gcc.dg/vect/slp-11.c: Likewise.
>        * gcc.dg/vect/slp-12a.c: Likewise.
>        * gcc.dg/vect/slp-12b.c: Likewise.
>        * gcc.dg/vect/slp-19.c: Likewise.
>        * gcc.dg/vect/slp-23.c: Likewise.
>        * gcc.dg/vect/vect-1.c: Likewise.
>        * gcc.dg/vect/vect-98.c: Likewise.
>        * gcc.dg/vect/vect-107.c: Likewise.
>        * gcc.dg/vect/vect-strided-float.c: Likewise.
>
> Index: gcc/testsuite/lib/target-supports.exp
> ===
> --- gcc/testsuite/lib/target-supports.exp       2011-04-12 11:53:54.0 
> +0100
> +++ gcc/testsuite/lib/target-supports.exp       2011-04-12 11:55:11.0 
> +0100
> @@ -3121,29 +3121,6 @@ proc check_effective_target_vect_extract
>     return $et_vect_extract_even_odd_saved
>  }
>
> -# Return 1 if the target supports vector even/odd elements extraction of
> -# vectors with SImode elements or larger, 0 otherwise.
> -
> -proc check_effective_target_vect_extract_even_odd_wide { } {
> -    global et_vect_extract_even_odd_wide_saved
> -
> -    if [info exists et_vect_extract_even_odd_wide_saved] {
> -        verbose "check_effective_target_vect_extract_even_odd_wide: using 
> cached result" 2
> -    } else {
> -        set et_vect_extract_even_odd_wide_saved 0
> -        if { [istarget powerpc*-*-*]
> -             || [istarget i?86-*-*]
> -             || [istarget x86_64-*-*]
> -             || [istarget ia64-*-*]
> -             || [istarget spu-*-*] } {
> -           set et_vect_extract_even_odd_wide_saved 1
> -        }
> -    }
> -
> -    verbose "check_effective_target_vect_extract_even_wide_odd: returning 
> $et_vect_extract_even_odd_wide_saved" 2
> -    return $et_vect_extract_even_odd_wide_saved
> -}
> -
>  # Return 1 if the target supports vector interleaving, 0 otherwise.
>
>  proc check_effective_target_vect_interleave { } {
> @@ -3184,25 +3161,6 @@ proc check_effective_target_vect_strided
>     return $et_vect_strided_saved
>  }
>
> -# Return 1 if the target supports vector interleaving and extract even/odd
> -# for wide element types, 0 otherwise.
> -proc check_effective_target_vect_strided_wide { } {
> -    global et_vect_strided_wide_saved
> -
> -    if [info exists et_vect_strided_wide_saved] {
> -        verbose "check_effective_target_vect_strided_wide: using cached 
> result" 2
> -    } else {
> -        set et_vect_strided_wide_saved 0
> -        if { [check_effective_target_vect_interleave]
> -             && [check_effective_target_vect_extract_even_odd_wide] } {
> -           set et_vect_strided_wide_saved 1
> -        }
> -    }
> -
> -    verbose "check_effective_target_vect_strided_wide: returning 
> $et_vect_strided_wide_saved" 2
> -    return $et_vect_strided_wide_saved
> -}
> -
>  # Return 1 if the target supports section-anchors
>
>  proc check_effective_target_section_anchors { } {
> Index: gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c
> ===
> --- gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c    2011-04-12 11:53:54.0 
> +0100
> +++ gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c    2011-04-12 11:55:11.0 
> +0100
> @@ -26,7 +26,7 @@ foo ()
>     }
>  }
>
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target 
> vect_strided_wide } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" 
> { target vect_strided_wide } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target 
> vect_strided } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" 
> { target vect_strided } } } */
>  /* { dg-final { cleanup-t

Re: [8/9] Testsuite: split tests for strided accesses

2011-04-15 Thread Richard Guenther
On Tue, Apr 12, 2011 at 4:19 PM, Richard Sandiford
 wrote:
> The next patch introduces separate vect_stridedN target selectors
> for each tested stride factor N.  At the moment, some tests contain
> several independent loops that have different stride factors.
> It's easier to make the next change if we put these loops into
> separate tests.
>
> Tested on x86_64-linux-gnu and arm-linux-gnueabi.  OK to install?

Ok.

Thanks,
Richard.

> Richard
>
>
> gcc/testsuite/
>        * gcc.dg/vect/slp-11.c: Split into...
>        * gcc.dg/vect/slp-11a.c, gcc.dg/vect/slp-11b.c,
>        gcc.dg/vect/slp-11c.c: ...these tests.
>        * gcc.dg/vect/slp-12a.c: Split 4-stride loop into...
>        * gcc.dg/vect/slp-12c.c: ...this new test.
>        * gcc.dg/vect/slp-19.c: Split into...
>        * gcc.dg/vect/slp-19a.c, gcc.dg/vect/slp-19b.c,
>        gcc.dg/vect/slp-19c.c: ...these new tests.
>
> Index: gcc/testsuite/gcc.dg/vect/slp-11.c
> ===
> --- gcc/testsuite/gcc.dg/vect/slp-11.c  2011-04-12 15:18:24.0 +0100
> +++ /dev/null   2011-03-23 08:42:11.268792848 +
> @@ -1,113 +0,0 @@
> -/* { dg-require-effective-target vect_int } */
> -
> -#include 
> -#include "tree-vect.h"
> -
> -#define N 8
> -
> -int
> -main1 ()
> -{
> -  int i;
> -  unsigned int out[N*8], a0, a1, a2, a3, a4, a5, a6, a7, b1, b0, b2, b3, b4, 
> b5, b6, b7;
> -  unsigned int in[N*8] = 
> {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
> -  float out2[N*8];
> -
> -  /* Different operations - not SLPable.  */
> -  for (i = 0; i < N; i++)
> -    {
> -      a0 = in[i*8] + 5;
> -      a1 = in[i*8 + 1] * 6;
> -      a2 = in[i*8 + 2] + 7;
> -      a3 = in[i*8 + 3] + 8;
> -      a4 = in[i*8 + 4] + 9;
> -      a5 = in[i*8 + 5] + 10;
> -      a6 = in[i*8 + 6] + 11;
> -      a7 = in[i*8 + 7] + 12;
> -
> -      b0 = a0 * 3;
> -      b1 = a1 * 2;
> -      b2 = a2 * 12;
> -      b3 = a3 * 5;
> -      b4 = a4 * 8;
> -      b5 = a5 * 4;
> -      b6 = a6 * 3;
> -      b7 = a7 * 2;
> -
> -      out[i*8] = b0 - 2;
> -      out[i*8 + 1] = b1 - 3;
> -      out[i*8 + 2] = b2 - 2;
> -      out[i*8 + 3] = b3 - 1;
> -      out[i*8 + 4] = b4 - 8;
> -      out[i*8 + 5] = b5 - 7;
> -      out[i*8 + 6] = b6 - 3;
> -      out[i*8 + 7] = b7 - 7;
> -    }
> -
> -  /* check results:  */
> -  for (i = 0; i < N; i++)
> -    {
> -      if (out[i*8] !=  (in[i*8] + 5) * 3 - 2
> -         || out[i*8 + 1] != (in[i*8 + 1] * 6) * 2 - 3
> -         || out[i*8 + 2] != (in[i*8 + 2] + 7) * 12 - 2
> -         || out[i*8 + 3] != (in[i*8 + 3] + 8) * 5 - 1
> -         || out[i*8 + 4] != (in[i*8 + 4] + 9) * 8 - 8
> -         || out[i*8 + 5] != (in[i*8 + 5] + 10) * 4 - 7
> -         || out[i*8 + 6] != (in[i*8 + 6] + 11) * 3 - 3
> -         || out[i*8 + 7] != (in[i*8 + 7] + 12) * 2 - 7)
> -       abort ();
> -    }
> -
> -  /* Requires permutation - not SLPable.  */
> -  for (i = 0; i < N*2; i++)
> -    {
> -      out[i*4] = (in[i*4] + 2) * 3;
> -      out[i*4 + 1] = (in[i*4 + 2] + 2) * 7;
> -      out[i*4 + 2] = (in[i*4 + 1] + 7) * 3;
> -      out[i*4 + 3] = (in[i*4 + 3] + 3) * 4;
> -    }
> -
> -  /* check results:  */
> -  for (i = 0; i < N*2; i++)
> -    {
> -      if (out[i*4] !=  (in[i*4] + 2) * 3
> -         || out[i*4 + 1] != (in[i*4 + 2] + 2) * 7
> -         || out[i*4 + 2] != (in[i*4 + 1] + 7) * 3
> -         || out[i*4 + 3] != (in[i*4 + 3] + 3) * 4)
> -        abort ();
> -    }
> -
> -  /* Different operations - not SLPable.  */
> -  for (i = 0; i < N*4; i++)
> -    {
> -      out2[i*2] = ((float) in[i*2] * 2 + 6) ;
> -      out2[i*2 + 1] = (float) (in[i*2 + 1] * 3 + 7);
> -    }
> -
> -  /* check results:  */
> -  for (i = 0; i < N*4; i++)
> -    {
> -      if (out2[i*2] !=  ((float) in[i*2] * 2 + 6)
> -         || out2[i*2 + 1] != (float) (in[i*2 + 1] * 3 + 7))
> -        abort ();
> -    }
> -
> -
> -  return 0;
> -}
> -
> -int main (void)
> -{
> -  check_vect ();
> -
> -  main1 ();
> -
> -  return 0;
> -}
> -
> -/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect"  { target 
> { { vect_uintfloat_cvt && vect_strided } &&  vect_int_mult } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target 
> { { { ! vect_uintfloat_cvt } && vect_strided } &&  vect_int_mult } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect"  {target  
> { ! { vect_int_mult && vect_strided } } } } }  */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0  "vect" 
>  } } */
> -/* { dg-final { cleanup-tree-dump "vect" } } */
> -
> Index: gcc/testsuite/gcc.dg/vect/slp-11a.c
> ===
> --- /dev/null   2011-03-23 08:42:11.268792848 +
> +++ gcc/testsuite/gcc.dg/vect/slp-11a.c 2011-04-12 15:18:25.0 +0100
> @@ -0,0 +1,75 @@
> +/* { dg-require-effective-targ

Re: [10/9] Add tests for stride-3 accesses

2011-04-15 Thread Richard Guenther
On Tue, Apr 12, 2011 at 4:34 PM, Richard Sandiford
 wrote:
> This patch adds a test for stride-3 accesses.  I didn't add any
> particularly complicated cases because I think the testsuite already
> covers the interaction between the strided loads & stores and other
> operations pretty well.  Let me know if there's something I should
> add though.
>
> Tested on x86_64-linux-gnu and arm-linux-gnueabi.  OK to install?

Ok.

Thanks,
Richard.

> Richard
>
>
> gcc/testsuite/
>        * gcc.dg/vect/vect-strided-u16-i3.c: New test.
>
> Index: gcc/testsuite/gcc.dg/vect/vect-strided-u16-i3.c
> ===
> --- /dev/null   2011-03-23 08:42:11.268792848 +
> +++ gcc/testsuite/gcc.dg/vect/vect-strided-u16-i3.c     2011-04-12 
> 11:55:17.0 +0100
> @@ -0,0 +1,112 @@
> +#include 
> +#include "tree-vect.h"
> +
> +#define N 128
> +
> +typedef struct {
> +   unsigned short a;
> +   unsigned short b;
> +   unsigned short c;
> +} s;
> +
> +#define A(I) (I)
> +#define B(I) ((I) * 2)
> +#define C(I) ((unsigned short) ~((I) ^ 0x18))
> +
> +void __attribute__ ((noinline))
> +check1 (s *res)
> +{
> +  int i;
> +
> +  for (i = 0; i < N; i++)
> +    if (res[i].a != C (i)
> +       || res[i].b != A (i)
> +       || res[i].c != B (i))
> +      abort ();
> +}
> +
> +void __attribute__ ((noinline))
> +check2 (unsigned short *res)
> +{
> +  int i;
> +
> +  for (i = 0; i < N; i++)
> +    if (res[i] != (unsigned short) (A (i) + B (i) + C (i)))
> +      abort ();
> +}
> +
> +void __attribute__ ((noinline))
> +check3 (s *res)
> +{
> +  int i;
> +
> +  for (i = 0; i < N; i++)
> +    if (res[i].a != i
> +       || res[i].b != i
> +       || res[i].c != i)
> +      abort ();
> +}
> +
> +void __attribute__ ((noinline))
> +check4 (unsigned short *res)
> +{
> +  int i;
> +
> +  for (i = 0; i < N; i++)
> +    if (res[i] != (unsigned short) (A (i) + B (i)))
> +      abort ();
> +}
> +
> +void __attribute__ ((noinline))
> +main1 (s *arr)
> +{
> +  int i;
> +  s *ptr = arr;
> +  s res1[N];
> +  unsigned short res2[N];
> +
> +  for (i = 0; i < N; i++)
> +    {
> +      res1[i].a = arr[i].c;
> +      res1[i].b = arr[i].a;
> +      res1[i].c = arr[i].b;
> +    }
> +  check1 (res1);
> +
> +  for (i = 0; i < N; i++)
> +    res2[i] = arr[i].a + arr[i].b + arr[i].c;
> +  check2 (res2);
> +
> +  for (i = 0; i < N; i++)
> +    {
> +      res1[i].a = i;
> +      res1[i].b = i;
> +      res1[i].c = i;
> +    }
> +  check3 (res1);
> +
> +  for (i = 0; i < N; i++)
> +    res2[i] = arr[i].a + arr[i].b;
> +  check4 (res2);
> +}
> +
> +int main (void)
> +{
> +  int i;
> +  s arr[N];
> +
> +  check_vect ();
> +
> +  for (i = 0; i < N; i++)
> +    {
> +      arr[i].a = A (i);
> +      arr[i].b = B (i);
> +      arr[i].c = C (i);
> +    }
> +  main1 (arr);
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect"  { target 
> vect_strided3 } } } */
> +/* { dg-final { cleanup-tree-dump "vect" } } */
>


[Committed] S/390: Fix popcount expanders

2011-04-15 Thread Andreas Krebbel
Hi,

the attached patch fixes a problem with the population count expanders
on s390.  For operand 2 a scratch register is allocated in the
preparation code of the expander.  Nevertheless there has been a
match_operand for operand 2.  It should be just a match_dup instead.

Fixed with the attached patch.

Bootstrapped and regression tested on s390x.

Committed to 4.6 and mainline.

Bye,

-Andreas-


2011-04-15  Andreas Krebbel  

* config/s390/s390.md (popcountdi2, popcountsi2, popcounthi2):
Replace match_operand with match_dup for the third operand in
these expanders.


Index: gcc/config/s390/s390.md
===
*** gcc/config/s390/s390.md.orig
--- gcc/config/s390/s390.md
***
*** 9330,9336 
 (parallel [(set (match_dup 0) (plus:DI (match_dup 0) (match_dup 2)))
  (clobber (reg:CC CC_REGNUM))])
 ; sllg op2, op0, 16
!(set (match_operand:DI 2 "register_operand" "")
(ashift:DI (match_dup 0) (const_int 16)))
 ; agr op0, op2
 (parallel [(set (match_dup 0) (plus:DI (match_dup 0) (match_dup 2)))
--- 9330,9336 
 (parallel [(set (match_dup 0) (plus:DI (match_dup 0) (match_dup 2)))
  (clobber (reg:CC CC_REGNUM))])
 ; sllg op2, op0, 16
!(set (match_dup 2)
(ashift:DI (match_dup 0) (const_int 16)))
 ; agr op0, op2
 (parallel [(set (match_dup 0) (plus:DI (match_dup 0) (match_dup 2)))
***
*** 9352,9358 
  UNSPEC_POPCNT))
  (clobber (reg:CC CC_REGNUM))])
 ; sllk op2, op0, 16
!(set (match_operand:SI 2 "register_operand" "")
(ashift:SI (match_dup 0) (const_int 16)))
 ; ar op0, op2
 (parallel [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 2)))
--- 9352,9358 
  UNSPEC_POPCNT))
  (clobber (reg:CC CC_REGNUM))])
 ; sllk op2, op0, 16
!(set (match_dup 2)
(ashift:SI (match_dup 0) (const_int 16)))
 ; ar op0, op2
 (parallel [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 2)))
***
*** 9374,9380 
  UNSPEC_POPCNT))
  (clobber (reg:CC CC_REGNUM))])
 ; sllk op2, op0, 8
!(set (match_operand:SI 2 "register_operand" "")
(ashift:SI (match_dup 0) (const_int 8)))
 ; ar op0, op2
 (parallel [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 2)))
--- 9374,9380 
  UNSPEC_POPCNT))
  (clobber (reg:CC CC_REGNUM))])
 ; sllk op2, op0, 8
!(set (match_dup 2)
(ashift:SI (match_dup 0) (const_int 8)))
 ; ar op0, op2
 (parallel [(set (match_dup 0) (plus:SI (match_dup 0) (match_dup 2)))


Re: [lto/pph] Do not pack more bits than requested (issue4415044)

2011-04-15 Thread Diego Novillo
On Fri, Apr 15, 2011 at 04:56, Richard Guenther  wrote:

>> @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree 
>> expr)
>>    bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1);
>>    bp_pack_value (bp, TYPE_READONLY (expr), 1);
>>    bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT);
>> -  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, 
>> HOST_BITS_PER_INT);
>> +  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1,
>> +                BITS_PER_BITPACK_WORD);
>
> As we only want to stream alias-set zeros just change it to a single bit,
> like
>
>     bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1);
>
> and on the reader side restore either a zero or -1.

Ah, yes.  Much better.

> As for the -1 case, it's simply broken use of the interface.

Which would've been caught by the assertion.  How about this, we keep
the asserts with #ifdef ENABLE_CHECKING. This would've saved me some
ugly debugging.


Diego.


Re: [lto/pph] Do not pack more bits than requested (issue4415044)

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Diego Novillo wrote:

> On Fri, Apr 15, 2011 at 04:56, Richard Guenther  wrote:
> 
> >> @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree 
> >> expr)
> >>    bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1);
> >>    bp_pack_value (bp, TYPE_READONLY (expr), 1);
> >>    bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT);
> >> -  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, 
> >> HOST_BITS_PER_INT);
> >> +  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1,
> >> +                BITS_PER_BITPACK_WORD);
> >
> > As we only want to stream alias-set zeros just change it to a single bit,
> > like
> >
> >     bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1);
> >
> > and on the reader side restore either a zero or -1.
> 
> Ah, yes.  Much better.
> 
> > As for the -1 case, it's simply broken use of the interface.
> 
> Which would've been caught by the assertion.  How about this, we keep
> the asserts with #ifdef ENABLE_CHECKING. This would've saved me some
> ugly debugging.

I think we should rather add the masking.  The assert would only
trigger for particular values, not for bogus use of the interface
(you get sign-extension for signed arguments).

Richard.

PATCH: PR target/48612: [4.7 Regression] vminps instruction is generated with -ftree-vectorize

2011-04-15 Thread H.J. Lu
Hi,

This patch is approved in PR.  I checked it in.


H.J.
---
Index: ChangeLog
===
--- ChangeLog   (revision 172490)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2011-04-15  H.J. Lu  
+
+   PR target/48612
+   * config/i386/sse.md (*ieee_smin3): Switch mnemonics.
+   (*ieee_smax3): Likewise.
+
 2011-04-15  Andreas Krebbel  
 
* config/s390/s390.md (popcountdi2, popcountsi2, popcounthi2):
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 172490)
+++ config/i386/sse.md  (working copy)
@@ -962,8 +962,8 @@
 UNSPEC_IEEE_MIN))]
   ""
   "@
-   vmin\t{%2, %1, %0|%0, %1, %2}
-   min\t{%2, %0|%0, %2}"
+   min\t{%2, %0|%0, %2}
+   vmin\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
(set_attr "type" "sseadd")
(set_attr "prefix" "orig,vex")
@@ -977,8 +977,8 @@
 UNSPEC_IEEE_MAX))]
   ""
   "@
-   vmax\t{%2, %1, %0|%0, %1, %2}
-   max\t{%2, %0|%0, %2}"
+   max\t{%2, %0|%0, %2}
+   vmax\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
(set_attr "type" "sseadd")
(set_attr "prefix" "orig,vex")


[PATCH 0/4] Devirtualization fix and improvements

2011-04-15 Thread Martin Jambor
Hi,

the following set of patches is a bunch of early fixups and improvements
to the current devirtualization mechanism which are semi-related but are
meant to be applied on top of each other.  One nice thing about them is
that they improve SPEC 2006 473.astar by over 3% with LTO.

More comments above the individual patches.

Thanks,

Martin


[PATCH 1/4] Remove usesess and wrong code from ipa_analyze_virtual_call_uses

2011-04-15 Thread Martin Jambor
Hi,

ipa_analyze_virtual_call_uses contains code that was meant to deal
with situation where OBJ_TYPE_REF_OBJECT is a (number of)
COMPONENT_REFs on top of a dereferenced default definition SSA_NAME of
a parameter.

The code is useless because that never happens in the IL, if an
ancestor object of a parameter is being used for a virtual call, the
object in the expression is always an SSA_NAME which is assigned the
proper value in a previous statement.

Moreover, if it ever triggered, it might lead to wrong code because in
cases like this it is necessary also to store the offset within the
parameter into the indirect call graph edge information (like we do in
indirect inlining).

The above is done in the next patch in the series.  I've split this
part from it because I would like to commit it also to the 4.6 branch.
I have bootstrapped and tested this on x86-64-linux without any
problems.  OK for trunk and the 4.6 branch?

Thanks,

Martin


2011-04-08  Martin Jambor  

* ipa-prop.c (ipa_analyze_virtual_call_uses): Remove handling
of ADR_EXPRs.


Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -1383,18 +1383,6 @@ ipa_analyze_virtual_call_uses (struct cg
   if (!flag_devirtualize)
 return;
 
-  if (TREE_CODE (obj) == ADDR_EXPR)
-{
-  do
-   {
- obj = TREE_OPERAND (obj, 0);
-   }
-  while (TREE_CODE (obj) == COMPONENT_REF);
-  if (TREE_CODE (obj) != MEM_REF)
-   return;
-  obj = TREE_OPERAND (obj, 0);
-}
-
   if (TREE_CODE (obj) != SSA_NAME
   || !SSA_NAME_IS_DEFAULT_DEF (obj))
 return;



[PATCH 3/4] Simple relaxation of dynamic type change detection routine

2011-04-15 Thread Martin Jambor
Hi,

in order to speed up astar, I had to persuade the function that
decides whether a statement potentially modifies the dynamic type of
an object by storing a new value to the VMT pointer to consider the
following statement harmless (all types are integers of some sort):

MEM[(i32 *)b2arp_3(D) + 8B] = 0;

I'd like to experiment with this routine a bit more once I have some
other IPA-CP infrastructure in place but at the moment I opted for a
simple solution:  All scalar non-pointer stores are deemed safe.

VMT pointer is a compiler generated field which is a pointer so legal
user code is not able to store stuff there through some fancy type
casts and compiler generated code should have no reason whatsoever to
that either.  Therefore I believe this change is safe and useful.

I have bootstrapped and tested the patch on x886_64-linux.  OK for
trunk?

Thanks,

Martin



2011-04-14  Martin Jambor  

* ipa-prop.c (stmt_may_be_vtbl_ptr_store): Return false for scalar
non-pointer assignments.

Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -405,13 +405,18 @@ stmt_may_be_vtbl_ptr_store (gimple stmt)
 {
   tree lhs = gimple_assign_lhs (stmt);
 
-  if (TREE_CODE (lhs) == COMPONENT_REF
- && !DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1))
- && !AGGREGATE_TYPE_P (TREE_TYPE (lhs)))
+  if (!AGGREGATE_TYPE_P (TREE_TYPE (lhs)))
+   {
+ if (!POINTER_TYPE_P (TREE_TYPE (lhs)))
return false;
-  /* In the future we might want to use get_base_ref_and_offset to find
-if there is a field corresponding to the offset and if so, proceed
-almost like if it was a component ref.  */
+
+ if (TREE_CODE (lhs) == COMPONENT_REF
+ && !DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1)))
+   return false;
+ /* In the future we might want to use get_base_ref_and_offset to find
+if there is a field corresponding to the offset and if so, proceed
+almost like if it was a component ref.  */
+   }
 }
   return true;
 }



[PATCH 4/4] Devirtualization based on global objects

2011-04-15 Thread Martin Jambor
Hi,

this is the patch that actually speeds up astar (together with the
previous one).  It implements devirtualization based on global
objects.  In the past we have refrained from doing this because in
general it is difficult to say whether the object is currently being
constructed and so it might have a dynamic type of one of its
ancestors.  However, if the object's class does not have any ancestors
that obviously cannot happen.

Devirtualizing in such conditions is enough to change a virtual call
to regwayobj::isaddtobound in 473.astar to a direct one which can and
should be inlined.  That seemed a good justification to implement this
and so the patch below does so and brings about 3.1% speedup for that
benchmark with LTO.

I acknowledge that instead of discarding all classes with ancestors it
would be better to check that the called virtual method has the same
implementation in all ancestors instead.  That is perhaps something
for later.

It took me surprisingly long to realize that this technique can be
used for folding virtual calls based on local automatically allocated
objexts too and so can be used to un-XFAIL g++.dg/opt/devirt1.c that
regressed in 4.6.

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


2011-04-15  Martin Jambor  

* ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize
also according to actual contants.
* gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function.
(gimple_fold_obj_type_ref_call): New function.
(gimple_fold_call): Call gimple_fold_obj_type_ref_call on
OBJ_TYPE_REFs.
* gimple.h (gimple_extract_devirt_binfo_from_cst): Declare.

* testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL.
* testsuite/g++.dg/opt/devirt2.C: New test.
* testsuite/g++.dg/ipa/devirt-g-1.C: Likewise.


Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -1245,51 +1245,71 @@ ipcp_process_devirtualization_opportunit
 
   for (ie = node->indirect_calls; ie; ie = next_ie)
 {
-  int param_index, types_count, j;
+  int param_index;
   HOST_WIDE_INT token, anc_offset;
   tree target, delta, otr_type;
+  struct ipcp_lattice *lat;
 
   next_ie = ie->next_callee;
   if (!ie->indirect_info->polymorphic)
continue;
   param_index = ie->indirect_info->param_index;
-  if (param_index == -1
- || ipa_param_cannot_devirtualize_p (info, param_index)
- || ipa_param_types_vec_empty (info, param_index))
+  if (param_index == -1)
continue;
 
+  lat = ipcp_get_lattice (info, param_index);
   token = ie->indirect_info->otr_token;
   anc_offset = ie->indirect_info->anc_offset;
   otr_type = ie->indirect_info->otr_type;
   target = NULL_TREE;
-  types_count = VEC_length (tree, info->params[param_index].types);
-  for (j = 0; j < types_count; j++)
+  if (lat->type == IPA_CONST_VALUE)
{
- tree binfo = VEC_index (tree, info->params[param_index].types, j);
- tree d, t;
-
+ tree binfo = gimple_extract_devirt_binfo_from_cst (lat->constant);
+ if (!binfo)
+   continue;
  binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
  if (!binfo)
-   {
- target = NULL_TREE;
- break;
-   }
+   continue;
+ target = gimple_get_virt_mehtod_for_binfo (token, binfo, &delta,
+false);
+   }
+  else
+   {
+ int  types_count, j;
 
- t = gimple_get_virt_mehtod_for_binfo (token, binfo, &d, true);
- if (!t)
-   {
- target = NULL_TREE;
- break;
-   }
- else if (!target)
-   {
- target = t;
- delta = d;
-   }
- else if (target != t || !tree_int_cst_equal (delta, d))
+ if (ipa_param_cannot_devirtualize_p (info, param_index)
+ || ipa_param_types_vec_empty (info, param_index))
+   continue;
+
+ types_count = VEC_length (tree, info->params[param_index].types);
+ for (j = 0; j < types_count; j++)
{
- target = NULL_TREE;
- break;
+ tree binfo = VEC_index (tree, info->params[param_index].types, j);
+ tree d, t;
+
+ binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
+ if (!binfo)
+   {
+ target = NULL_TREE;
+ break;
+   }
+
+ t = gimple_get_virt_mehtod_for_binfo (token, binfo, &d, true);
+ if (!t)
+   {
+ target = NULL_TREE;
+ break;
+   }
+ else if (!target)
+   {
+ target = t;
+ delta = d;
+   }
+

[PATCH 2/4] Handle calls to ancestor objects in IPA-CP devirtualization

2011-04-15 Thread Martin Jambor
Hi,

early inlining can create virtual calls based on the part of an object
that represents an ancestor.  This patch makes ipa-prop analysis able
to recognize such calls and store the required offset along with such
calls (the field is already there for similar purposes of indirect
inlining).  The constant propagation is then made aware of the offset
field and takes it into account when looking up the proper BINFO.

Bootstrapped and tested on x86_64-linux. OK for trunk?

Thanks,

Martin



2011-04-13  Martin Jambor  

* ipa-cp.c (ipcp_process_devirtualization_opportunities): Take into
account anc_offset and otr_type from the indirect edge info.
* ipa-prop.c (get_ancestor_addr_info): New function.
(compute_complex_ancestor_jump_func): Assignment analysis moved to
get_ancestor_addr_info, call it.
(ipa_note_param_call): Do not initialize information about polymorphic
calls, return the indirect call graph edge.  Remove the last
parameter, adjust all callers.
(ipa_analyze_virtual_call_uses): Process also calls to ancestors of
parameters.  Initialize polymorphic information in the indirect edge.

* testsuite/g++.dg/ipa/devirt-7.C: New test.


Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -1246,8 +1246,8 @@ ipcp_process_devirtualization_opportunit
   for (ie = node->indirect_calls; ie; ie = next_ie)
 {
   int param_index, types_count, j;
-  HOST_WIDE_INT token;
-  tree target, delta;
+  HOST_WIDE_INT token, anc_offset;
+  tree target, delta, otr_type;
 
   next_ie = ie->next_callee;
   if (!ie->indirect_info->polymorphic)
@@ -1259,14 +1259,23 @@ ipcp_process_devirtualization_opportunit
continue;
 
   token = ie->indirect_info->otr_token;
+  anc_offset = ie->indirect_info->anc_offset;
+  otr_type = ie->indirect_info->otr_type;
   target = NULL_TREE;
   types_count = VEC_length (tree, info->params[param_index].types);
   for (j = 0; j < types_count; j++)
{
  tree binfo = VEC_index (tree, info->params[param_index].types, j);
- tree d;
- tree t = gimple_get_virt_mehtod_for_binfo (token, binfo, &d, true);
+ tree d, t;
 
+ binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
+ if (!binfo)
+   {
+ target = NULL_TREE;
+ break;
+   }
+
+ t = gimple_get_virt_mehtod_for_binfo (token, binfo, &d, true);
  if (!t)
{
  target = NULL_TREE;
Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -576,6 +576,49 @@ compute_complex_assign_jump_func (struct
 }
 }
 
+/* Extract the base, offset and MEM_REF expression from a statement ASSIGN if
+   it looks like:
+
+   iftmp.1_3 = &obj_2(D)->D.1762;
+
+   The base of the MEM_REF must be a default definition SSA NAME of a
+   parameter.  Return NULL_TREE if it looks otherwise.  If case of success, the
+   whole MEM_REF expression is returned and the offset calculated from any
+   handled components and the MEM_REF itself is stored into *OFFSET.  The whole
+   RHS stripped off the ADDR_EXPR is stored into *OBJ_P.  */
+
+static tree
+get_ancestor_addr_info (gimple assign, tree *obj_p, HOST_WIDE_INT *offset)
+{
+  HOST_WIDE_INT size, max_size;
+  tree expr, parm, obj;
+
+  if (!gimple_assign_single_p (assign))
+return NULL_TREE;
+  expr = gimple_assign_rhs1 (assign);
+
+  if (TREE_CODE (expr) != ADDR_EXPR)
+return NULL_TREE;
+  expr = TREE_OPERAND (expr, 0);
+  obj = expr;
+  expr = get_ref_base_and_extent (expr, offset, &size, &max_size);
+
+  if (TREE_CODE (expr) != MEM_REF
+  /* If this is a varying address, punt.  */
+  || max_size == -1
+  || max_size != size)
+return NULL_TREE;
+  parm = TREE_OPERAND (expr, 0);
+  if (TREE_CODE (parm) != SSA_NAME
+  || !SSA_NAME_IS_DEFAULT_DEF (parm)
+  || *offset < 0)
+return NULL_TREE;
+
+  *offset += mem_ref_offset (expr).low * BITS_PER_UNIT;
+  *obj_p = obj;
+  return expr;
+}
+
 
 /* Given that an actual argument is an SSA_NAME that is a result of a phi
statement PHI, try to find out whether NAME is in fact a
@@ -603,7 +646,7 @@ compute_complex_ancestor_jump_func (stru
struct ipa_jump_func *jfunc,
gimple call, gimple phi)
 {
-  HOST_WIDE_INT offset, size, max_size;
+  HOST_WIDE_INT offset;
   gimple assign, cond;
   basic_block phi_bb, assign_bb, cond_bb;
   tree tmp, parm, expr, obj;
@@ -626,29 +669,12 @@ compute_complex_ancestor_jump_func (stru
 
   assign = SSA_NAME_DEF_STMT (tmp);
   assign_bb = gimple_bb (assign);
-  if (!single_pred_p (assign_bb)
-  || !gimple_assign_single_p (assign))
+  if (!single_pred_p (assign_bb))
 return;
-  expr = gimple

Re: [lto/pph] Do not pack more bits than requested (issue4415044)

2011-04-15 Thread Diego Novillo
On Fri, Apr 15, 2011 at 08:49, Richard Guenther  wrote:
> On Fri, 15 Apr 2011, Diego Novillo wrote:
>
>> On Fri, Apr 15, 2011 at 04:56, Richard Guenther  wrote:
>>
>> >> @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree 
>> >> expr)
>> >>    bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1);
>> >>    bp_pack_value (bp, TYPE_READONLY (expr), 1);
>> >>    bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT);
>> >> -  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, 
>> >> HOST_BITS_PER_INT);
>> >> +  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1,
>> >> +                BITS_PER_BITPACK_WORD);
>> >
>> > As we only want to stream alias-set zeros just change it to a single bit,
>> > like
>> >
>> >     bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1);
>> >
>> > and on the reader side restore either a zero or -1.
>>
>> Ah, yes.  Much better.
>>
>> > As for the -1 case, it's simply broken use of the interface.
>>
>> Which would've been caught by the assertion.  How about this, we keep
>> the asserts with #ifdef ENABLE_CHECKING. This would've saved me some
>> ugly debugging.
>
> I think we should rather add the masking.  The assert would only
> trigger for particular values, not for bogus use of the interface
> (you get sign-extension for signed arguments).

OK, that works too.  I'll prepare a patch.


Diego.


Re: [6/9] NEON vec_load_lanes and vec_store_lanes patterns

2011-04-15 Thread Richard Earnshaw

On Tue, 2011-04-12 at 15:01 +0100, Richard Sandiford wrote:
> This patch adds vec_load_lanes and vec_store_lanes patterns for NEON.
> They feed directly into the corresponding intrinsic patterns.
> 
> Tested on x86_64-linux-gnu and arm-linux-gnueabi.  OK to install?
> 
> Richard
> 
> 
> gcc/
>   * config/arm/neon.md (vec_load_lanes): New expanders,
>   (vec_store_lanes): Likewise.
> 

OK.

R.





[PATCH] factor asm op chaining out from stmt.c:expand_asm_stmt

2011-04-15 Thread Nathan Froyd
There are several cut-and-pasted loops in expand_asm_stmt that could be
parameterized by the functions used to access the particular operands.
The patch below does that.

Tested on x86_64-unknown-linux-gnu.  OK to commit?

-Nathan

 * stmt.c (chain_asm_ops): New function.
(expand_asm_stmt): Call it.

diff --git a/gcc/stmt.c b/gcc/stmt.c
index 1a9f9e5..1fc09e9 100644
--- a/gcc/stmt.c
+++ b/gcc/stmt.c
@@ -1114,13 +1114,34 @@ expand_asm_operands (tree string, tree outputs, tree 
inputs,
   free_temp_slots ();
 }
 
+/* Return the operands of STMT, a GIMPLE_ASM, as described by OP_FN and
+   N_OPS connected via TREE_CHAIN.  */
+
+static tree
+chain_asm_ops (const_gimple stmt, unsigned (*n_ops) (const_gimple),
+  tree (*op_fn) (const_gimple, unsigned))
+{
+  unsigned i, n;
+  tree ret = NULL_TREE, t;
+
+  n = (*n_ops) (stmt);
+  if (n > 0)
+{
+  t = ret = (*op_fn) (stmt, 0);
+  for (i = 1; i < n; i++)
+   t = TREE_CHAIN (t) = (*op_fn) (stmt, i);
+}
+
+  return ret;
+}
+
 void
 expand_asm_stmt (gimple stmt)
 {
   int noutputs;
-  tree outputs, tail, t;
+  tree outputs, tail;
   tree *o;
-  size_t i, n;
+  size_t i;
   const char *s;
   tree str, out, in, cl, labels;
   location_t locus = gimple_location (stmt);
@@ -1128,41 +1149,10 @@ expand_asm_stmt (gimple stmt)
   /* Meh... convert the gimple asm operands into real tree lists.
  Eventually we should make all routines work on the vectors instead
  of relying on TREE_CHAIN.  */
-  out = NULL_TREE;
-  n = gimple_asm_noutputs (stmt);
-  if (n > 0)
-{
-  t = out = gimple_asm_output_op (stmt, 0);
-  for (i = 1; i < n; i++)
-   t = TREE_CHAIN (t) = gimple_asm_output_op (stmt, i);
-}
-
-  in = NULL_TREE;
-  n = gimple_asm_ninputs (stmt);
-  if (n > 0)
-{
-  t = in = gimple_asm_input_op (stmt, 0);
-  for (i = 1; i < n; i++)
-   t = TREE_CHAIN (t) = gimple_asm_input_op (stmt, i);
-}
-
-  cl = NULL_TREE;
-  n = gimple_asm_nclobbers (stmt);
-  if (n > 0)
-{
-  t = cl = gimple_asm_clobber_op (stmt, 0);
-  for (i = 1; i < n; i++)
-   t = TREE_CHAIN (t) = gimple_asm_clobber_op (stmt, i);
-}
-
-  labels = NULL_TREE;
-  n = gimple_asm_nlabels (stmt);
-  if (n > 0)
-{
-  t = labels = gimple_asm_label_op (stmt, 0);
-  for (i = 1; i < n; i++)
-   t = TREE_CHAIN (t) = gimple_asm_label_op (stmt, i);
-}
+  out = chain_asm_ops (stmt, gimple_asm_noutputs, gimple_asm_output_op);
+  in = chain_asm_ops (stmt, gimple_asm_ninputs, gimple_asm_input_op);
+  cl = chain_asm_ops (stmt, gimple_asm_nclobbers, gimple_asm_clobber_op);
+  labels = chain_asm_ops (stmt, gimple_asm_nlabels, gimple_asm_label_op);
 
   s = gimple_asm_string (stmt);
   str = build_string (strlen (s), s);


Re: [PATCH v3] Re: avoid useless if-before-free tests

2011-04-15 Thread Tom Tromey
> "Janne" == Janne Blomqvist  writes:

Jim> Can someone add me to the gcc group?  That would help.
Jim> I already have ssh access to sourceware.org.

Janne> I'm not sure if I'm considered to be well-established
Janne> enough, so could someone help Jim out here, please?

I added Jim to the gcc group.

Tom


[PATCH] refactor gimple asm memory clobber checking

2011-04-15 Thread Nathan Froyd
There are a couple places that check GIMPLE_ASMs for clobbering memory;
this patch centralizes the logic in gimple.c.

Tested on x86_64-unknown-linux-gnu.  OK to commit?

-Nathan

* gimple.h (gimple_asm_clobbers_memory_p): Declare.
* gimple.c (gimple_asm_clobbers_memory_p): Define.
* ipa-pure-const.c (check_stmt): Call it.
* tree-ssa-operands.c (get_asm_expr_operands): Likewise.

diff --git a/gcc/gimple.c b/gcc/gimple.c
index 090fc94..5dc62ea 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -5142,4 +5142,21 @@ gimple_call_builtin_p (gimple stmt, enum 
built_in_function code)
  && DECL_FUNCTION_CODE (fndecl) == code);
 }
 
+/* Return true if STMT clobbers memory.  STMT is required to be a
+   GIMPLE_ASM.  */
+
+bool
+gimple_asm_clobbers_memory_p (const_gimple stmt)
+{
+  unsigned i;
+
+  for (i = 0; i < gimple_asm_nclobbers (stmt); i++)
+{
+  tree op = gimple_asm_clobber_op (stmt, i);
+  if (strcmp (TREE_STRING_POINTER (TREE_VALUE (op)), "memory") == 0)
+   return true;
+}
+
+  return false;
+}
 #include "gt-gimple.h"
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 572cabc..840e149 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -973,6 +973,7 @@ extern bool walk_stmt_load_store_ops (gimple, void *,
  bool (*)(gimple, tree, void *));
 extern bool gimple_ior_addresses_taken (bitmap, gimple);
 extern bool gimple_call_builtin_p (gimple, enum built_in_function);
+extern bool gimple_asm_clobbers_memory_p (const_gimple);
 
 /* In gimplify.c  */
 extern tree create_tmp_var_raw (tree, const char *);
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index b7deb57..eb5b0f6 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -639,7 +639,6 @@ static void
 check_stmt (gimple_stmt_iterator *gsip, funct_state local, bool ipa)
 {
   gimple stmt = gsi_stmt (*gsip);
-  unsigned int i = 0;
 
   if (is_gimple_debug (stmt))
 return;
@@ -693,16 +692,12 @@ check_stmt (gimple_stmt_iterator *gsip, funct_state 
local, bool ipa)
}
   break;
 case GIMPLE_ASM:
-  for (i = 0; i < gimple_asm_nclobbers (stmt); i++)
+  if (gimple_asm_clobbers_memory_p (stmt))
{
- tree op = gimple_asm_clobber_op (stmt, i);
- if (strcmp (TREE_STRING_POINTER (TREE_VALUE (op)), "memory") == 0)
-   {
-  if (dump_file)
-fprintf (dump_file, "memory asm clobber is not 
const/pure");
- /* Abandon all hope, ye who enter here. */
- local->pure_const_state = IPA_NEITHER;
-   }
+ if (dump_file)
+   fprintf (dump_file, "memory asm clobber is not const/pure");
+ /* Abandon all hope, ye who enter here. */
+ local->pure_const_state = IPA_NEITHER;
}
   if (gimple_asm_volatile_p (stmt))
{
diff --git a/gcc/tree-ssa-operands.c b/gcc/tree-ssa-operands.c
index 57f443f..7f76cbf 100644
--- a/gcc/tree-ssa-operands.c
+++ b/gcc/tree-ssa-operands.c
@@ -832,15 +832,8 @@ get_asm_expr_operands (gimple stmt)
 }
 
   /* Clobber all memory and addressable symbols for asm ("" : : : "memory");  
*/
-  for (i = 0; i < gimple_asm_nclobbers (stmt); i++)
-{
-  tree link = gimple_asm_clobber_op (stmt, i);
-  if (strcmp (TREE_STRING_POINTER (TREE_VALUE (link)), "memory") == 0)
-   {
- add_virtual_operand (stmt, opf_def);
- break;
-   }
-}
+  if (gimple_asm_clobbers_memory_p (stmt))
+add_virtual_operand (stmt, opf_def);
 }
 
 


Re: [patch, ARM] Fix PR48325, support POST_INC/PRE_DEC for NEON struct modes

2011-04-15 Thread Richard Earnshaw

On Thu, 2011-03-31 at 22:57 +0800, Chung-Lin Tang wrote:
> This PR doesn't exactly trigger currently on trunk; a REG_DEAD note that
> occurs in trunk, but not in the 4.5-based compilers which this bug was
> reported for, currently blocks auto-inc-dec from doing its job, and just
> happens to avoid this ICE.
> 
> The problem is vldmia/db, which do exist, are currently not enabled for
> NEON struct modes (OI, XI, etc.)  This only needs a small patch to
> neon_struct_mem_operand() to work. Okay for trunk?
> 
> Thanks,
> Chung-Lin
> 
> 2011-03-31  Chung-Lin Tang  
> 
>   * config/arm/arm.c (neon_struct_mem_operand):
>   Support POST_INC/PRE_DEC memory operands.

OK.

R.




Re: [PATCH, ARM] PR47855 Compute attr "length" for some thumb2 insns, 2/3

2011-04-15 Thread Richard Earnshaw

On Thu, 2011-04-14 at 21:19 +0800, Carrot Wei wrote:
> On Fri, Apr 8, 2011 at 6:51 PM, Ramana Radhakrishnan
>  wrote:
> > On 08/04/11 10:57, Carrot Wei wrote:
> >>
> >> Hi
> >>
> >> This is the second part of the fixing for
> >>
> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
> >>
> >> This patch contains the length computation for insn patterns
> >> "*arm_movqi_insn"
> >> and "*arm_addsi3". Since the alternatives and encodings are much more
> >> complex,
> >> the attribute length is computed in separate C functions.
> 
> > Sorry, no. This is potentially a maintenance pain. It hardcodes alternatives
> > from a pattern elsewhere in the C file. I don't like doing this unless we
> > have to with the sync primitives or with push_multi. In this case I'm not
> > convinced we need such functions in the .c file.
> >
> > Why can't we use the "enabled" attribute here with appropriate constraints
> > for everything other than the memory cases (even there we might be able to
> > invent some new constraints) ?
> >
> > Also a note about programming style. There are the helper macros like REG_P,
> > CONST_INT_P and MEM_P which remove the necessity for checks like
> >
> > GET_CODE (x) == y where y E { REG, CONST_INT, MEM}
> 
> Hi Ramana
> 
> As you suggested I created several new constraints, and use the
> "enabled" attribute to split the current alternatives in this new
> patch. It has been tested on arm qemu without regression.
> 
> thanks
> Carrot


Sorry, I don't think this approach can work.  Certainly not with the way
the compiler currently works, and especially for mov and add insns. 

These instructions are only 2 bytes long if either:
1) They clobber the condition code register or
2) They occur inside an IT block.

We can't tell either of these from the pattern, so you're
underestimating the length of the instruction in some circumstances by
claiming that they are only 2 bytes long.  That /will/ lead to broken
code someday.

We can't add potential clobbers to mov and add patterns because that
will break reload which relies on these patterns being simple-set insns
with no added baggage.  It *might* be possible to add clobbers to other
operations, but that will then most-likely upset instruction scheduling
(I think the scheduler treats two insns that clobber the same hard reg
as being strongly ordered).  Putting in the clobber too early will
certainly affect cond-exec generation.

In short, I'm not aware of a simple way to address this problem so that
we get accurate length information, but minimal impact on other passes
in the compiler.

R.




Re: Implement stack arrays even for unknown sizes

2011-04-15 Thread Dominique Dhumieres
Michael,

> Yes, this is due to the DECL_EXPR statement which is rendered by the 
> dumper just the same as a normal decl.  The testcase looks for exactly one 
> such decl, but with -fstack-arrays there are exactly two for each such 
> array.

The testsuite is run without -fstack-arrays, so I dont' understand why
the "DECL_EXPR statement" appears.

Dominique


[PATCH] Fix PR48290

2011-04-15 Thread Richard Guenther

This fixes the remaining part of PR48290, copyprop not properly
propagating constant copies across PHI nodes.  On the way this
patch needs to fix some present issues with the code inhibiting
various kinds of propagation (but not removing those two that
look completely bogus).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2011-04-15  Richard Guenther  

PR tree-optimization/48290
* tree-ssa-copy.c (copy_prop_visit_phi_node): Propagate constants.
Properly decide inhibiting propagation based on the valueized
operand.  Do loop-closed SSA form preserving here ...
(init_copy_prop): ... not here.

Index: gcc/tree-ssa-copy.c
===
*** gcc/tree-ssa-copy.c (revision 172485)
--- gcc/tree-ssa-copy.c (working copy)
*** copy_prop_visit_phi_node (gimple phi)
*** 567,572 
--- 567,573 
for (i = 0; i < gimple_phi_num_args (phi); i++)
  {
prop_value_t *arg_val;
+   tree arg_value;
tree arg = gimple_phi_arg_def (phi, i);
edge e = gimple_phi_arg_edge (phi, i);
  
*** copy_prop_visit_phi_node (gimple phi)
*** 575,598 
if (!(e->flags & EDGE_EXECUTABLE))
continue;
  
!   /* Constants in the argument list never generate a useful copy.
!Similarly, names that flow through abnormal edges cannot be
!used to derive copies.  */
!   if (TREE_CODE (arg) != SSA_NAME || SSA_NAME_OCCURS_IN_ABNORMAL_PHI 
(arg))
!   {
! phi_val.value = lhs;
! break;
!   }
! 
!   /* Avoid copy propagation from an inner into an outer loop.
!Otherwise, this may move loop variant variables outside of
!their loops and prevent coalescing opportunities.  If the
!value was loop invariant, it will be hoisted by LICM and
!exposed for copy propagation.  Not a problem for virtual
!operands though.
!???  The value will be always loop invariant.  */
!   if (is_gimple_reg (lhs)
! && loop_depth_of_name (arg) > loop_depth_of_name (lhs))
{
  phi_val.value = lhs;
  break;
--- 576,584 
if (!(e->flags & EDGE_EXECUTABLE))
continue;
  
!   /* Names that flow through abnormal edges cannot be used to
!derive copies.  */
!   if (TREE_CODE (arg) == SSA_NAME && SSA_NAME_OCCURS_IN_ABNORMAL_PHI 
(arg))
{
  phi_val.value = lhs;
  break;
*** copy_prop_visit_phi_node (gimple phi)
*** 605,630 
  fprintf (dump_file, "\n");
}
  
!   arg_val = get_copy_of_val (arg);
  
!   /* If we didn't visit the definition of arg yet treat it as
!  UNDEFINED.  This also handles PHI arguments that are the
!same as lhs.  We'll come here again.  */
!   if (!arg_val->value)
!   continue;
  
/* If the LHS didn't have a value yet, make it a copy of the
 first argument we find.   */
if (phi_val.value == NULL_TREE)
{
! phi_val.value = arg_val->value;
  continue;
}
  
/* If PHI_VAL and ARG don't have a common copy-of chain, then
 this PHI node cannot be a copy operation.  */
!   if (phi_val.value != arg_val->value
! && !operand_equal_p (phi_val.value, arg_val->value, 0))
{
  phi_val.value = lhs;
  break;
--- 591,641 
  fprintf (dump_file, "\n");
}
  
!   if (TREE_CODE (arg) == SSA_NAME)
!   {
! arg_val = get_copy_of_val (arg);
! 
! /* If we didn't visit the definition of arg yet treat it as
!UNDEFINED.  This also handles PHI arguments that are the
!same as lhs.  We'll come here again.  */
! if (!arg_val->value)
!   continue;
  
! arg_value = arg_val->value;
!   }
!   else
!   arg_value = valueize_val (arg);
! 
!   /* Avoid copy propagation from an inner into an outer loop.
!Otherwise, this may move loop variant variables outside of
!their loops and prevent coalescing opportunities.  If the
!value was loop invariant, it will be hoisted by LICM and
!exposed for copy propagation.
!???  The value will be always loop invariant.
!In loop-closed SSA form do not copy-propagate through
!PHI nodes in blocks with a loop exit edge predecessor.  */
!   if (current_loops
! && TREE_CODE (arg_value) == SSA_NAME
! && (loop_depth_of_name (arg_value) > loop_depth_of_name (lhs)
! || (loops_state_satisfies_p (LOOP_CLOSED_SSA)
! && loop_exit_edge_p (e->src->loop_father, e
!   {
! phi_val.value = lhs;
! break;
!   }
  
/* If the LHS didn't have a value yet, make it a copy of the
 first argument we find.   */
if (phi_val.value == NULL_TREE)
{
! phi_val.value = arg_value;
  co

[PATCH] Fix PR48286

2011-04-15 Thread Richard Guenther

This makes us avoid 323.

Tested on x86_64-unknown-linux-gnu/-m32, committed.

Richard.

2011-04-15  Richard Guenther  

PR testsuite/48286
* gfortran.dg/cray_pointers_8.f90: Use -ffloat-store.

gcc/testsuite/gfortran.dg/cray_pointers_8.f90
Index: gcc/testsuite/gfortran.dg/cray_pointers_8.f90
===
--- gcc/testsuite/gfortran.dg/cray_pointers_8.f90   (revision 172493)
+++ gcc/testsuite/gfortran.dg/cray_pointers_8.f90   (working copy)
@@ -1,5 +1,5 @@
 ! { dg-do run }
-! { dg-options "-fcray-pointer" }
+! { dg-options "-fcray-pointer -ffloat-store" }
 !
 ! Test the fix for PR36528 in which the Cray pointer was not passed
 ! correctly to 'euler' so that an undefined reference to fcn was


Re: [PATCH] factor asm op chaining out from stmt.c:expand_asm_stmt

2011-04-15 Thread Richard Guenther
On Fri, Apr 15, 2011 at 3:19 PM, Nathan Froyd  wrote:
> There are several cut-and-pasted loops in expand_asm_stmt that could be
> parameterized by the functions used to access the particular operands.
> The patch below does that.
>
> Tested on x86_64-unknown-linux-gnu.  OK to commit?

Hmm, it doesn't make it easier to follow though.

Just my 2cents,

Richard.

> -Nathan
>
>         * stmt.c (chain_asm_ops): New function.
>        (expand_asm_stmt): Call it.
>
> diff --git a/gcc/stmt.c b/gcc/stmt.c
> index 1a9f9e5..1fc09e9 100644
> --- a/gcc/stmt.c
> +++ b/gcc/stmt.c
> @@ -1114,13 +1114,34 @@ expand_asm_operands (tree string, tree outputs, tree 
> inputs,
>   free_temp_slots ();
>  }
>
> +/* Return the operands of STMT, a GIMPLE_ASM, as described by OP_FN and
> +   N_OPS connected via TREE_CHAIN.  */
> +
> +static tree
> +chain_asm_ops (const_gimple stmt, unsigned (*n_ops) (const_gimple),
> +              tree (*op_fn) (const_gimple, unsigned))
> +{
> +  unsigned i, n;
> +  tree ret = NULL_TREE, t;
> +
> +  n = (*n_ops) (stmt);
> +  if (n > 0)
> +    {
> +      t = ret = (*op_fn) (stmt, 0);
> +      for (i = 1; i < n; i++)
> +       t = TREE_CHAIN (t) = (*op_fn) (stmt, i);
> +    }
> +
> +  return ret;
> +}
> +
>  void
>  expand_asm_stmt (gimple stmt)
>  {
>   int noutputs;
> -  tree outputs, tail, t;
> +  tree outputs, tail;
>   tree *o;
> -  size_t i, n;
> +  size_t i;
>   const char *s;
>   tree str, out, in, cl, labels;
>   location_t locus = gimple_location (stmt);
> @@ -1128,41 +1149,10 @@ expand_asm_stmt (gimple stmt)
>   /* Meh... convert the gimple asm operands into real tree lists.
>      Eventually we should make all routines work on the vectors instead
>      of relying on TREE_CHAIN.  */
> -  out = NULL_TREE;
> -  n = gimple_asm_noutputs (stmt);
> -  if (n > 0)
> -    {
> -      t = out = gimple_asm_output_op (stmt, 0);
> -      for (i = 1; i < n; i++)
> -       t = TREE_CHAIN (t) = gimple_asm_output_op (stmt, i);
> -    }
> -
> -  in = NULL_TREE;
> -  n = gimple_asm_ninputs (stmt);
> -  if (n > 0)
> -    {
> -      t = in = gimple_asm_input_op (stmt, 0);
> -      for (i = 1; i < n; i++)
> -       t = TREE_CHAIN (t) = gimple_asm_input_op (stmt, i);
> -    }
> -
> -  cl = NULL_TREE;
> -  n = gimple_asm_nclobbers (stmt);
> -  if (n > 0)
> -    {
> -      t = cl = gimple_asm_clobber_op (stmt, 0);
> -      for (i = 1; i < n; i++)
> -       t = TREE_CHAIN (t) = gimple_asm_clobber_op (stmt, i);
> -    }
> -
> -  labels = NULL_TREE;
> -  n = gimple_asm_nlabels (stmt);
> -  if (n > 0)
> -    {
> -      t = labels = gimple_asm_label_op (stmt, 0);
> -      for (i = 1; i < n; i++)
> -       t = TREE_CHAIN (t) = gimple_asm_label_op (stmt, i);
> -    }
> +  out = chain_asm_ops (stmt, gimple_asm_noutputs, gimple_asm_output_op);
> +  in = chain_asm_ops (stmt, gimple_asm_ninputs, gimple_asm_input_op);
> +  cl = chain_asm_ops (stmt, gimple_asm_nclobbers, gimple_asm_clobber_op);
> +  labels = chain_asm_ops (stmt, gimple_asm_nlabels, gimple_asm_label_op);
>
>   s = gimple_asm_string (stmt);
>   str = build_string (strlen (s), s);
>


Re: [PATCH] refactor gimple asm memory clobber checking

2011-04-15 Thread Richard Guenther
On Fri, Apr 15, 2011 at 3:20 PM, Nathan Froyd  wrote:
> There are a couple places that check GIMPLE_ASMs for clobbering memory;
> this patch centralizes the logic in gimple.c.
>
> Tested on x86_64-unknown-linux-gnu.  OK to commit?

Ok.

Thanks,
Richard.

> -Nathan
>
>        * gimple.h (gimple_asm_clobbers_memory_p): Declare.
>        * gimple.c (gimple_asm_clobbers_memory_p): Define.
>        * ipa-pure-const.c (check_stmt): Call it.
>        * tree-ssa-operands.c (get_asm_expr_operands): Likewise.
>
> diff --git a/gcc/gimple.c b/gcc/gimple.c
> index 090fc94..5dc62ea 100644
> --- a/gcc/gimple.c
> +++ b/gcc/gimple.c
> @@ -5142,4 +5142,21 @@ gimple_call_builtin_p (gimple stmt, enum 
> built_in_function code)
>          && DECL_FUNCTION_CODE (fndecl) == code);
>  }
>
> +/* Return true if STMT clobbers memory.  STMT is required to be a
> +   GIMPLE_ASM.  */
> +
> +bool
> +gimple_asm_clobbers_memory_p (const_gimple stmt)
> +{
> +  unsigned i;
> +
> +  for (i = 0; i < gimple_asm_nclobbers (stmt); i++)
> +    {
> +      tree op = gimple_asm_clobber_op (stmt, i);
> +      if (strcmp (TREE_STRING_POINTER (TREE_VALUE (op)), "memory") == 0)
> +       return true;
> +    }
> +
> +  return false;
> +}
>  #include "gt-gimple.h"
> diff --git a/gcc/gimple.h b/gcc/gimple.h
> index 572cabc..840e149 100644
> --- a/gcc/gimple.h
> +++ b/gcc/gimple.h
> @@ -973,6 +973,7 @@ extern bool walk_stmt_load_store_ops (gimple, void *,
>                                      bool (*)(gimple, tree, void *));
>  extern bool gimple_ior_addresses_taken (bitmap, gimple);
>  extern bool gimple_call_builtin_p (gimple, enum built_in_function);
> +extern bool gimple_asm_clobbers_memory_p (const_gimple);
>
>  /* In gimplify.c  */
>  extern tree create_tmp_var_raw (tree, const char *);
> diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
> index b7deb57..eb5b0f6 100644
> --- a/gcc/ipa-pure-const.c
> +++ b/gcc/ipa-pure-const.c
> @@ -639,7 +639,6 @@ static void
>  check_stmt (gimple_stmt_iterator *gsip, funct_state local, bool ipa)
>  {
>   gimple stmt = gsi_stmt (*gsip);
> -  unsigned int i = 0;
>
>   if (is_gimple_debug (stmt))
>     return;
> @@ -693,16 +692,12 @@ check_stmt (gimple_stmt_iterator *gsip, funct_state 
> local, bool ipa)
>        }
>       break;
>     case GIMPLE_ASM:
> -      for (i = 0; i < gimple_asm_nclobbers (stmt); i++)
> +      if (gimple_asm_clobbers_memory_p (stmt))
>        {
> -         tree op = gimple_asm_clobber_op (stmt, i);
> -         if (strcmp (TREE_STRING_POINTER (TREE_VALUE (op)), "memory") == 0)
> -           {
> -              if (dump_file)
> -                fprintf (dump_file, "    memory asm clobber is not 
> const/pure");
> -             /* Abandon all hope, ye who enter here. */
> -             local->pure_const_state = IPA_NEITHER;
> -           }
> +         if (dump_file)
> +           fprintf (dump_file, "    memory asm clobber is not const/pure");
> +         /* Abandon all hope, ye who enter here. */
> +         local->pure_const_state = IPA_NEITHER;
>        }
>       if (gimple_asm_volatile_p (stmt))
>        {
> diff --git a/gcc/tree-ssa-operands.c b/gcc/tree-ssa-operands.c
> index 57f443f..7f76cbf 100644
> --- a/gcc/tree-ssa-operands.c
> +++ b/gcc/tree-ssa-operands.c
> @@ -832,15 +832,8 @@ get_asm_expr_operands (gimple stmt)
>     }
>
>   /* Clobber all memory and addressable symbols for asm ("" : : : "memory");  
> */
> -  for (i = 0; i < gimple_asm_nclobbers (stmt); i++)
> -    {
> -      tree link = gimple_asm_clobber_op (stmt, i);
> -      if (strcmp (TREE_STRING_POINTER (TREE_VALUE (link)), "memory") == 0)
> -       {
> -         add_virtual_operand (stmt, opf_def);
> -         break;
> -       }
> -    }
> +  if (gimple_asm_clobbers_memory_p (stmt))
> +    add_virtual_operand (stmt, opf_def);
>  }
>
>
>


patch pings

2011-04-15 Thread Jeff Law
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01060.html
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02247.html
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNqFOfAAoJEBRtltQi2kC7ijYH/ibLT2HNaFXjdE9SKT5Ih1JV
dTEWPnY7QBP5Xe6FVwZ09ibPCOxJsK7yGdQBYqy5gRQor8ebifI4kenwcBcdET/m
NogdM8DPWbuhgGda7cETNkru7FifSe9mRsKQGhNzVQl48oEKWmcGkm/l3a7gndfD
cX08lFzqIH1wcPWUqQPf6gcUMRE4W/0j92E4PEoIbigMoSIRFcduVouBlld8NLBV
aicTihAC+MFMVKSkpXGjMLCbn/HkNOyV9s9T+Or1/XuCIZy9zlB1JSfR/EJqVmPm
mNmg0bUrtzXN2uImL6ohIV32i732KKcNhNm/FupZTHfbI50nDc1h8SnCgEHc2t0=
=i7zA
-END PGP SIGNATURE-


Re: Implement stack arrays even for unknown sizes

2011-04-15 Thread Michael Matz
Hi,

On Fri, 15 Apr 2011, Dominique Dhumieres wrote:

> Michael,
> 
> > Yes, this is due to the DECL_EXPR statement which is rendered by the 
> > dumper just the same as a normal decl.  The testcase looks for exactly one 
> > such decl, but with -fstack-arrays there are exactly two for each such 
> > array.
> 
> The testsuite is run without -fstack-arrays, so I dont' understand why
> the "DECL_EXPR statement" appears.

Bummer, you're right.  I unconditionally emit a DECL_EXPR for arrays even 
when they don't have a variable length.  It's harmless, but makes the 
testcase fail (I wasn't seeing the fail because I've changed the testcase 
already to make it not fail with -fstack-arrays).

I'll make the DECL_EXPR conditional on the size being variable.  As Tobias 
already okayed the patch I'm planning to check in the slightly modified 
variant as below, after a new round of testing.


Ciao,
Michael.

* trans-array.c (toplevel): Include gimple.h.
(gfc_trans_allocate_array_storage): Check flag_stack_arrays,
properly expand variable length arrays.
(gfc_trans_auto_array_allocation): If flag_stack_arrays create
variable length decls and associate them with their scope.
* gfortran.h (gfc_option_t): Add flag_stack_arrays member.
* options.c (gfc_init_options): Handle -fstack_arrays option.
* lang.opt (fstack-arrays): Add option.
* invoke.texi (Code Gen Options): Document it.
* Make-lang.in (trans-array.o): Depend on GIMPLE_H.

Index: fortran/trans-array.c
===
*** fortran/trans-array.c   (revision 172431)
--- fortran/trans-array.c   (working copy)
*** along with GCC; see the file COPYING3.
*** 81,86 
--- 81,87 
  #include "system.h"
  #include "coretypes.h"
  #include "tree.h"
+ #include "gimple.h"
  #include "diagnostic-core.h"  /* For internal_error/fatal_error.  */
  #include "flags.h"
  #include "gfortran.h"
*** gfc_trans_allocate_array_storage (stmtbl
*** 630,647 
  {
/* Allocate the temporary.  */
onstack = !dynamic && initial == NULL_TREE
!&& gfc_can_put_var_on_stack (size);
  
if (onstack)
{
  /* Make a temporary variable to hold the data.  */
  tmp = fold_build2_loc (input_location, MINUS_EXPR, TREE_TYPE (nelem),
 nelem, gfc_index_one_node);
  tmp = build_range_type (gfc_array_index_type, gfc_index_zero_node,
  tmp);
  tmp = build_array_type (gfc_get_element_type (TREE_TYPE (desc)),
  tmp);
  tmp = gfc_create_var (tmp, "A");
  tmp = gfc_build_addr_expr (NULL_TREE, tmp);
  gfc_conv_descriptor_data_set (pre, desc, tmp);
}
--- 631,657 
  {
/* Allocate the temporary.  */
onstack = !dynamic && initial == NULL_TREE
!&& (gfc_option.flag_stack_arrays
!|| gfc_can_put_var_on_stack (size));
  
if (onstack)
{
  /* Make a temporary variable to hold the data.  */
  tmp = fold_build2_loc (input_location, MINUS_EXPR, TREE_TYPE (nelem),
 nelem, gfc_index_one_node);
+ tmp = gfc_evaluate_now (tmp, pre);
  tmp = build_range_type (gfc_array_index_type, gfc_index_zero_node,
  tmp);
  tmp = build_array_type (gfc_get_element_type (TREE_TYPE (desc)),
  tmp);
  tmp = gfc_create_var (tmp, "A");
+ /* If we're here only because of -fstack-arrays we have to
+emit a DECL_EXPR to make the gimplifier emit alloca calls.  */
+ if (!gfc_can_put_var_on_stack (size))
+   gfc_add_expr_to_block (pre,
+  fold_build1_loc (input_location,
+   DECL_EXPR, TREE_TYPE (tmp),
+   tmp));
  tmp = gfc_build_addr_expr (NULL_TREE, tmp);
  gfc_conv_descriptor_data_set (pre, desc, tmp);
}
*** gfc_trans_auto_array_allocation (tree de
*** 4759,4767 
  {
stmtblock_t init;
tree type;
!   tree tmp;
tree size;
tree offset;
bool onstack;
  
gcc_assert (!(sym->attr.pointer || sym->attr.allocatable));
--- 4769,4779 
  {
stmtblock_t init;
tree type;
!   tree tmp = NULL_TREE;
tree size;
tree offset;
+   tree space;
+   tree inittree;
bool onstack;
  
gcc_assert (!(sym->attr.pointer || sym->attr.allocatable));
*** gfc_trans_auto_array_allocation (tree de
*** 4818,4832 
return;
  }
  
!   /* The size is the number of elements in the array, so multiply by the
!  size of an element to get the total size.  */
!   tmp = TYPE_SIZE_UNIT (gfc_get_element_type (type));
!

Re: [PATCH 1/4] Remove usesess and wrong code from ipa_analyze_virtual_call_uses

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Martin Jambor wrote:

> Hi,
> 
> ipa_analyze_virtual_call_uses contains code that was meant to deal
> with situation where OBJ_TYPE_REF_OBJECT is a (number of)
> COMPONENT_REFs on top of a dereferenced default definition SSA_NAME of
> a parameter.
> 
> The code is useless because that never happens in the IL, if an
> ancestor object of a parameter is being used for a virtual call, the
> object in the expression is always an SSA_NAME which is assigned the
> proper value in a previous statement.
> 
> Moreover, if it ever triggered, it might lead to wrong code because in
> cases like this it is necessary also to store the offset within the
> parameter into the indirect call graph edge information (like we do in
> indirect inlining).
> 
> The above is done in the next patch in the series.  I've split this
> part from it because I would like to commit it also to the 4.6 branch.
> I have bootstrapped and tested this on x86-64-linux without any
> problems.  OK for trunk and the 4.6 branch?

Ok for both.

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> 2011-04-08  Martin Jambor  
> 
>   * ipa-prop.c (ipa_analyze_virtual_call_uses): Remove handling
>   of ADR_EXPRs.
> 
> 
> Index: src/gcc/ipa-prop.c
> ===
> --- src.orig/gcc/ipa-prop.c
> +++ src/gcc/ipa-prop.c
> @@ -1383,18 +1383,6 @@ ipa_analyze_virtual_call_uses (struct cg
>if (!flag_devirtualize)
>  return;
>  
> -  if (TREE_CODE (obj) == ADDR_EXPR)
> -{
> -  do
> - {
> -   obj = TREE_OPERAND (obj, 0);
> - }
> -  while (TREE_CODE (obj) == COMPONENT_REF);
> -  if (TREE_CODE (obj) != MEM_REF)
> - return;
> -  obj = TREE_OPERAND (obj, 0);
> -}
> -
>if (TREE_CODE (obj) != SSA_NAME
>|| !SSA_NAME_IS_DEFAULT_DEF (obj))
>  return;
> 
> 

-- 
Richard Guenther 
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex


Re: More of ipa-inline housekeeping

2011-04-15 Thread Dominique Dhumieres
AFAICT revision 172430 fixed the original problem in pr45810:

gfc -Ofast -fwhole-program fatigue.f90   : 6.301u 0.003s 0:06.30
gfc -Ofast -fwhole-program -flto fatigue.f90 : 6.263u 0.003s 0:06.26

However if I play with --param max-inline-insns-auto=*, I get

gfc -Ofast -fwhole-program --param max-inline-insns-auto=124 -fstack-arrays 
fatigue.f90 : 4.870u 0.002s 0:04.87
gfc -Ofast -fwhole-program --param max-inline-insns-auto=125 -fstack-arrays 
fatigue.f90 : 2.872u 0.002s 0:02.87

and

gfc -Ofast -fwhole-program -flto --param max-inline-insns-auto=515 
-fstack-arrays fatigue.f90 : 4.965u 0.003s 0:04.97
gfc -Ofast -fwhole-program -flto --param max-inline-insns-auto=516 
-fstack-arrays fatigue.f90 : 2.732u 0.002s 0:02.73

while I get the same threshold=125 with/without -flto at revision 172429.
Note that I get the same thresholds without -fstack-arrays, the run times
are only larger.

Dominique


Re: Implement stack arrays even for unknown sizes

2011-04-15 Thread Jerry DeLisle

On 04/15/2011 07:28 AM, Michael Matz wrote:
--- snip ---


I'll make the DECL_EXPR conditional on the size being variable.  As Tobias
already okayed the patch I'm planning to check in the slightly modified
variant as below, after a new round of testing.



Thats A-OK

Thanks,

Jerry


Re: More of ipa-inline housekeeping

2011-04-15 Thread Dominique Dhumieres
I have forgotten to say that the 125 and 516 thresholds are for
x86_64-apple-darwin10. On powerpc-apple-darwin9 they are
repsectively 172 and 638.

Dominique


[PATCH, SMS] Avoid unfreed memory when SMS fails

2011-04-15 Thread Revital Eres
Hello,

This patch fixes the scenario where SMS fails to
schedule a loop and continue to the next one without
freeing data structures allocated while scheduling
the first loop.

Bootstrap and regtested on ppc64-redhat-linux.

OK for mainline?

Thanks,
Revital

Changelog:

* modulo-sched.c (sms_schedule): Avoid unfreed memory when SMS fails.

Index: modulo-sched.c
===
--- modulo-sched.c  (revision 170464)
+++ modulo-sched.c  (working copy)
@@ -1177,7 +1177,6 @@ sms_schedule (void)
  fprintf (dump_file, HOST_WIDEST_INT_PRINT_DEC, trip_count);
  fprintf (dump_file, ")\n");
}
- continue;
}
   else
{


[PATCH, SMS] Free sccs field

2011-04-15 Thread Revital Eres
Hello,

The attached patch adds missing free operation for storage
allocated while calculating SCCs.

Bootstrap and regtested on ppc64-redhat-linux.

OK for mainline?

Thanks,
Revital

Changelog:

* ddg.c (free_ddg_all_sccs): Free sccs field in struct ddg_all_sccs.

Index: ddg.c
===
--- ddg.c   (revision 171573)
+++ ddg.c   (working copy)
@@ -1011,6 +1082,8 @@ free_ddg_all_sccs (ddg_all_sccs_ptr all_
   for (i = 0; i < all_sccs->num_sccs; i++)
 free_scc (all_sccs->sccs[i]);

+  if (all_sccs->sccs)
+free (all_sccs->sccs);
   free (all_sccs);
 }


Re: [PATCH 2/4] Handle calls to ancestor objects in IPA-CP devirtualization

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Martin Jambor wrote:

> Hi,
> 
> early inlining can create virtual calls based on the part of an object
> that represents an ancestor.  This patch makes ipa-prop analysis able
> to recognize such calls and store the required offset along with such
> calls (the field is already there for similar purposes of indirect
> inlining).  The constant propagation is then made aware of the offset
> field and takes it into account when looking up the proper BINFO.
> 
> Bootstrapped and tested on x86_64-linux. OK for trunk?
> 
> Thanks,
> 
> Martin
> 
> 
> 
> 2011-04-13  Martin Jambor  
> 
>   * ipa-cp.c (ipcp_process_devirtualization_opportunities): Take into
>   account anc_offset and otr_type from the indirect edge info.
>   * ipa-prop.c (get_ancestor_addr_info): New function.
>   (compute_complex_ancestor_jump_func): Assignment analysis moved to
>   get_ancestor_addr_info, call it.
>   (ipa_note_param_call): Do not initialize information about polymorphic
>   calls, return the indirect call graph edge.  Remove the last
>   parameter, adjust all callers.
>   (ipa_analyze_virtual_call_uses): Process also calls to ancestors of
>   parameters.  Initialize polymorphic information in the indirect edge.
> 
>   * testsuite/g++.dg/ipa/devirt-7.C: New test.
> 
> 
> Index: src/gcc/ipa-cp.c
> ===
> --- src.orig/gcc/ipa-cp.c
> +++ src/gcc/ipa-cp.c
> @@ -1246,8 +1246,8 @@ ipcp_process_devirtualization_opportunit
>for (ie = node->indirect_calls; ie; ie = next_ie)
>  {
>int param_index, types_count, j;
> -  HOST_WIDE_INT token;
> -  tree target, delta;
> +  HOST_WIDE_INT token, anc_offset;
> +  tree target, delta, otr_type;
>  
>next_ie = ie->next_callee;
>if (!ie->indirect_info->polymorphic)
> @@ -1259,14 +1259,23 @@ ipcp_process_devirtualization_opportunit
>   continue;
>  
>token = ie->indirect_info->otr_token;
> +  anc_offset = ie->indirect_info->anc_offset;
> +  otr_type = ie->indirect_info->otr_type;
>target = NULL_TREE;
>types_count = VEC_length (tree, info->params[param_index].types);
>for (j = 0; j < types_count; j++)
>   {
> tree binfo = VEC_index (tree, info->params[param_index].types, j);
> -   tree d;
> -   tree t = gimple_get_virt_mehtod_for_binfo (token, binfo, &d, true);
> +   tree d, t;
>  
> +   binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
> +   if (!binfo)
> + {
> +   target = NULL_TREE;
> +   break;
> + }
> +
> +   t = gimple_get_virt_mehtod_for_binfo (token, binfo, &d, true);
> if (!t)
>   {
> target = NULL_TREE;
> Index: src/gcc/ipa-prop.c
> ===
> --- src.orig/gcc/ipa-prop.c
> +++ src/gcc/ipa-prop.c
> @@ -576,6 +576,49 @@ compute_complex_assign_jump_func (struct
>  }
>  }
>  
> +/* Extract the base, offset and MEM_REF expression from a statement ASSIGN if
> +   it looks like:
> +
> +   iftmp.1_3 = &obj_2(D)->D.1762;
> +
> +   The base of the MEM_REF must be a default definition SSA NAME of a
> +   parameter.  Return NULL_TREE if it looks otherwise.  If case of success, 
> the
> +   whole MEM_REF expression is returned and the offset calculated from any
> +   handled components and the MEM_REF itself is stored into *OFFSET.  The 
> whole
> +   RHS stripped off the ADDR_EXPR is stored into *OBJ_P.  */
> +
> +static tree
> +get_ancestor_addr_info (gimple assign, tree *obj_p, HOST_WIDE_INT *offset)
> +{
> +  HOST_WIDE_INT size, max_size;
> +  tree expr, parm, obj;
> +
> +  if (!gimple_assign_single_p (assign))
> +return NULL_TREE;
> +  expr = gimple_assign_rhs1 (assign);
> +
> +  if (TREE_CODE (expr) != ADDR_EXPR)
> +return NULL_TREE;
> +  expr = TREE_OPERAND (expr, 0);
> +  obj = expr;
> +  expr = get_ref_base_and_extent (expr, offset, &size, &max_size);
> +
> +  if (TREE_CODE (expr) != MEM_REF
> +  /* If this is a varying address, punt.  */
> +  || max_size == -1
> +  || max_size != size)
> +return NULL_TREE;
> +  parm = TREE_OPERAND (expr, 0);
> +  if (TREE_CODE (parm) != SSA_NAME
> +  || !SSA_NAME_IS_DEFAULT_DEF (parm)

Might be an uninitialized variable, so also check
TREE_CODE (SSA_NAME_VAR (parm)) == PARM_DECL?

> +  || *offset < 0)

Check this above where you check max_size/size.

> +return NULL_TREE;
> +
> +  *offset += mem_ref_offset (expr).low * BITS_PER_UNIT;

At some point it might be worth switching to
get_addr_base_and_unit_offsets and not use bit but unit offsets
throughout the code.

> +  *obj_p = obj;
> +  return expr;
> +}
> +
>  
>  /* Given that an actual argument is an SSA_NAME that is a result of a phi
> statement PHI, try to find out whether NAME is in fact a
> @@ -603,7 +646,7 @@ compute_complex_ancestor_jump_func (stru
>   st

[PATCH, SMS] New flag to apply SMS when SC equals 1

2011-04-15 Thread Revital Eres
Hello,

The attached patch introduces a new flag to allow applying SMS when
stage count (SC) also equals 1.
Currently, SMS is applied only when SC greater than 1 as stage count
of 1 means that there is no interleaving between iterations and the
scheduling passes do the job in this case.
The new flag is introduced for debugging purposes to apply SMS
on more loops.

Bootstrap and regtest on ppc64-redhat-linux.

OK for mainline?

Thanks,
Revital

Changelog:

* common.opt (fmodulo-sched-allow-sc-one): New flag.
* modulo-sched.c (sms_schedule): Allow SMS when stage count
equals one and -fmodulo-sched-allow-sc-one flag is set.
=== modified file 'gcc/common.opt'
--- gcc/common.opt  2011-03-06 00:38:13 +
+++ gcc/common.opt  2011-04-10 11:46:08 +
@@ -1395,6 +1395,10 @@ fmodulo-sched-allow-regmoves
 Common Report Var(flag_modulo_sched_allow_regmoves)
 Perform SMS based modulo scheduling with register moves allowed
 
+fmodulo-sched-allow-sc-one
+Common Report Var(flag_modulo_sched_allow_sc_one)
+Perform SMS based modulo scheduling also when stage count equals one
+
 fmove-loop-invariants
 Common Report Var(flag_move_loop_invariants) Init(1) Optimization
 Move loop invariant computations out of loops

=== modified file 'gcc/modulo-sched.c'
--- gcc/modulo-sched.c  2011-03-27 07:11:08 +
+++ gcc/modulo-sched.c  2011-04-10 11:45:17 +
@@ -1223,8 +1223,10 @@ sms_schedule (void)
}
   
   /* Stage count of 1 means that there is no interleaving between
- iterations, let the scheduling passes do the job.  */
-  if (stage_count <= 1
+ iterations, let the scheduling passes do the job unless 
+-fmodulo-sched-allow-sc-one flag is set.  */
+  if ((!flag_modulo_sched_allow_sc_one && (stage_count == 1))
+ || (stage_count < 1)
  || (count_init && (loop_count <= stage_count))
  || (flag_branch_probabilities && (trip_count <= stage_count)))
{



Re: [PATCH 3/4] Simple relaxation of dynamic type change detection routine

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Martin Jambor wrote:

> Hi,
> 
> in order to speed up astar, I had to persuade the function that
> decides whether a statement potentially modifies the dynamic type of
> an object by storing a new value to the VMT pointer to consider the
> following statement harmless (all types are integers of some sort):
> 
> MEM[(i32 *)b2arp_3(D) + 8B] = 0;
> 
> I'd like to experiment with this routine a bit more once I have some
> other IPA-CP infrastructure in place but at the moment I opted for a
> simple solution:  All scalar non-pointer stores are deemed safe.
> 
> VMT pointer is a compiler generated field which is a pointer so legal
> user code is not able to store stuff there through some fancy type
> casts and compiler generated code should have no reason whatsoever to
> that either.  Therefore I believe this change is safe and useful.
> 
> I have bootstrapped and tested the patch on x886_64-linux.  OK for
> trunk?

I think this should be only done for -fstrict-aliasing.

Ok with that change.
Richard.

> Thanks,
> 
> Martin
> 
> 
> 
> 2011-04-14  Martin Jambor  
> 
>   * ipa-prop.c (stmt_may_be_vtbl_ptr_store): Return false for scalar
>   non-pointer assignments.
> 
> Index: src/gcc/ipa-prop.c
> ===
> --- src.orig/gcc/ipa-prop.c
> +++ src/gcc/ipa-prop.c
> @@ -405,13 +405,18 @@ stmt_may_be_vtbl_ptr_store (gimple stmt)
>  {
>tree lhs = gimple_assign_lhs (stmt);
>  
> -  if (TREE_CODE (lhs) == COMPONENT_REF
> -   && !DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1))
> -   && !AGGREGATE_TYPE_P (TREE_TYPE (lhs)))
> +  if (!AGGREGATE_TYPE_P (TREE_TYPE (lhs)))
> + {
> +   if (!POINTER_TYPE_P (TREE_TYPE (lhs)))
>   return false;
> -  /* In the future we might want to use get_base_ref_and_offset to find
> -  if there is a field corresponding to the offset and if so, proceed
> -  almost like if it was a component ref.  */
> +
> +   if (TREE_CODE (lhs) == COMPONENT_REF
> +   && !DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1)))
> + return false;
> +   /* In the future we might want to use get_base_ref_and_offset to find
> +  if there is a field corresponding to the offset and if so, proceed
> +  almost like if it was a component ref.  */
> + }
>  }
>return true;
>  }
> 
> 

-- 
Richard Guenther 
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex


Re: [PATCH 4/4] Devirtualization based on global objects

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Martin Jambor wrote:

> Hi,
> 
> this is the patch that actually speeds up astar (together with the
> previous one).  It implements devirtualization based on global
> objects.  In the past we have refrained from doing this because in
> general it is difficult to say whether the object is currently being
> constructed and so it might have a dynamic type of one of its
> ancestors.  However, if the object's class does not have any ancestors
> that obviously cannot happen.
> 
> Devirtualizing in such conditions is enough to change a virtual call
> to regwayobj::isaddtobound in 473.astar to a direct one which can and
> should be inlined.  That seemed a good justification to implement this
> and so the patch below does so and brings about 3.1% speedup for that
> benchmark with LTO.
> 
> I acknowledge that instead of discarding all classes with ancestors it
> would be better to check that the called virtual method has the same
> implementation in all ancestors instead.  That is perhaps something
> for later.
> 
> It took me surprisingly long to realize that this technique can be
> used for folding virtual calls based on local automatically allocated
> objexts too and so can be used to un-XFAIL g++.dg/opt/devirt1.c that
> regressed in 4.6.
> 
> Bootstrapped and tested on x86_64-linux.  OK for trunk?
> 
> Thanks,
> 
> Martin
> 
> 
> 2011-04-15  Martin Jambor  
> 
>   * ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize
>   also according to actual contants.
>   * gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function.
>   (gimple_fold_obj_type_ref_call): New function.
>   (gimple_fold_call): Call gimple_fold_obj_type_ref_call on
>   OBJ_TYPE_REFs.
>   * gimple.h (gimple_extract_devirt_binfo_from_cst): Declare.
> 
>   * testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL.
>   * testsuite/g++.dg/opt/devirt2.C: New test.
>   * testsuite/g++.dg/ipa/devirt-g-1.C: Likewise.
> 
> 
> Index: src/gcc/ipa-cp.c
> ===
> --- src.orig/gcc/ipa-cp.c
> +++ src/gcc/ipa-cp.c
> @@ -1245,51 +1245,71 @@ ipcp_process_devirtualization_opportunit
>  
>for (ie = node->indirect_calls; ie; ie = next_ie)
>  {
> -  int param_index, types_count, j;
> +  int param_index;
>HOST_WIDE_INT token, anc_offset;
>tree target, delta, otr_type;
> +  struct ipcp_lattice *lat;
>  
>next_ie = ie->next_callee;
>if (!ie->indirect_info->polymorphic)
>   continue;
>param_index = ie->indirect_info->param_index;
> -  if (param_index == -1
> -   || ipa_param_cannot_devirtualize_p (info, param_index)
> -   || ipa_param_types_vec_empty (info, param_index))
> +  if (param_index == -1)
>   continue;
>  
> +  lat = ipcp_get_lattice (info, param_index);
>token = ie->indirect_info->otr_token;
>anc_offset = ie->indirect_info->anc_offset;
>otr_type = ie->indirect_info->otr_type;
>target = NULL_TREE;
> -  types_count = VEC_length (tree, info->params[param_index].types);
> -  for (j = 0; j < types_count; j++)
> +  if (lat->type == IPA_CONST_VALUE)
>   {
> -   tree binfo = VEC_index (tree, info->params[param_index].types, j);
> -   tree d, t;
> -
> +   tree binfo = gimple_extract_devirt_binfo_from_cst (lat->constant);
> +   if (!binfo)
> + continue;
> binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
> if (!binfo)
> - {
> -   target = NULL_TREE;
> -   break;
> - }
> + continue;
> +   target = gimple_get_virt_mehtod_for_binfo (token, binfo, &delta,
> +  false);
> + }
> +  else
> + {
> +   int  types_count, j;
>  
> -   t = gimple_get_virt_mehtod_for_binfo (token, binfo, &d, true);
> -   if (!t)
> - {
> -   target = NULL_TREE;
> -   break;
> - }
> -   else if (!target)
> - {
> -   target = t;
> -   delta = d;
> - }
> -   else if (target != t || !tree_int_cst_equal (delta, d))
> +   if (ipa_param_cannot_devirtualize_p (info, param_index)
> +   || ipa_param_types_vec_empty (info, param_index))
> + continue;
> +
> +   types_count = VEC_length (tree, info->params[param_index].types);
> +   for (j = 0; j < types_count; j++)
>   {
> -   target = NULL_TREE;
> -   break;
> +   tree binfo = VEC_index (tree, info->params[param_index].types, j);
> +   tree d, t;
> +
> +   binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
> +   if (!binfo)
> + {
> +   target = NULL_TREE;
> +   break;
> + }
> +
> +   t = gimple_get_virt_mehtod_for_binfo (token, binfo, &d, true);
> +   if (!t)
> + {
> +   targ

Re: [PATCH, SMS] Avoid unfreed memory when SMS fails

2011-04-15 Thread Richard Guenther
On Fri, Apr 15, 2011 at 5:26 PM, Revital Eres  wrote:
> Hello,
>
> This patch fixes the scenario where SMS fails to
> schedule a loop and continue to the next one without
> freeing data structures allocated while scheduling
> the first loop.
>
> Bootstrap and regtested on ppc64-redhat-linux.
>
> OK for mainline?

Ok.

Thanks,
Richard.

> Thanks,
> Revital
>
> Changelog:
>
> * modulo-sched.c (sms_schedule): Avoid unfreed memory when SMS fails.
>
> Index: modulo-sched.c
> ===
> --- modulo-sched.c      (revision 170464)
> +++ modulo-sched.c      (working copy)
> @@ -1177,7 +1177,6 @@ sms_schedule (void)
>              fprintf (dump_file, HOST_WIDEST_INT_PRINT_DEC, trip_count);
>              fprintf (dump_file, ")\n");
>            }
> -         continue;
>        }
>       else
>        {
>


Re: [PATCH, SMS] Free sccs field

2011-04-15 Thread Richard Guenther
On Fri, Apr 15, 2011 at 5:27 PM, Revital Eres  wrote:
> Hello,
>
> The attached patch adds missing free operation for storage
> allocated while calculating SCCs.
>
> Bootstrap and regtested on ppc64-redhat-linux.
>
> OK for mainline?

Ok.

Thanks,
Richard.

> Thanks,
> Revital
>
> Changelog:
>
>        * ddg.c (free_ddg_all_sccs): Free sccs field in struct ddg_all_sccs.
>
> Index: ddg.c
> ===
> --- ddg.c       (revision 171573)
> +++ ddg.c       (working copy)
> @@ -1011,6 +1082,8 @@ free_ddg_all_sccs (ddg_all_sccs_ptr all_
>   for (i = 0; i < all_sccs->num_sccs; i++)
>     free_scc (all_sccs->sccs[i]);
>
> +  if (all_sccs->sccs)
> +    free (all_sccs->sccs);
>   free (all_sccs);
>  }
>


Re: [PATCH, SMS] New flag to apply SMS when SC equals 1

2011-04-15 Thread Richard Guenther
On Fri, Apr 15, 2011 at 5:28 PM, Revital Eres  wrote:
> Hello,
>
> The attached patch introduces a new flag to allow applying SMS when
> stage count (SC) also equals 1.
> Currently, SMS is applied only when SC greater than 1 as stage count
> of 1 means that there is no interleaving between iterations and the
> scheduling passes do the job in this case.
> The new flag is introduced for debugging purposes to apply SMS
> on more loops.
>
> Bootstrap and regtest on ppc64-redhat-linux.
>
> OK for mainline?

If it's for debugging, can you use a --parm instead (like
modulo-sched-min-sc or similar)?

Thanks,
Richard.

> Thanks,
> Revital
>
> Changelog:
>
>        * common.opt (fmodulo-sched-allow-sc-one): New flag.
>        * modulo-sched.c (sms_schedule): Allow SMS when stage count
>        equals one and -fmodulo-sched-allow-sc-one flag is set.
>


Re: [PATCH, SMS] Free sccs field

2011-04-15 Thread Nathan Froyd
On Fri, Apr 15, 2011 at 06:27:05PM +0300, Revital Eres wrote:
> +  if (all_sccs->sccs)
> +free (all_sccs->sccs);

No need to check for non-NULL prior to free'ing.

-Nathan



Re: [Patch, Fortran] PR 18918: Build + install libcaf_single.a

2011-04-15 Thread Janne Blomqvist
On Wed, Apr 13, 2011 at 17:08, Tobias Burnus  wrote:
> Hello all, hi Ralf and Jorge,
>
> The attached patch causes libgfortran/ also to build
> libgfortran/caf/single.c, which will be installed as
> $PREFIX/.../target-triplet/gcc-version/libcaf_single.a.
>
> Build and tested on x86-64-linux.
> OK for the trunk?

Ok.

--
Janne Blomqvist


Re: [PATCH 3/6] Allow jumps in epilogues

2011-04-15 Thread Bernd Schmidt
On 04/13/2011 02:38 PM, Bernd Schmidt wrote:
> This still requires the i386 output_set_got which I think I can cope
> with [...]

Patch below, to be applied on top of all the others. Only lightly tested
so far beyond standard (fairly useless) regression tests, by comparing
generated assembly before/after, for -fpic -march=pentium and core2, for
i686-linux and i686-apple-darwin10 (for TARGET_MACHO).

I've not found or managed to create a testcase for making MI thunks
generate a call to output_set_got. I think it should work, but it's not
tested.


Bernd
* config/i386/i386.c (output_set_got): Don't call
dwarf2out_flush_queued_reg_saves.
(ix86_reorg): Split set_got patterns.
(i386_dwarf_handle_frame_unspec,
i386_dwarf_flush_queued_register_saves): New static functions.
(TARGET_DWARF_HANDLE_FRAME_UNSPEC,
TARGET_DWARF_FLUSH_QUEUED_REGISTER_SAVES): New macros.
* config/i386/i386.md (set_got_call, set_got_pop, set_got_add):
New patterns.
* dwarf2out.c (scan_until_barrier): Use the target's
dwarf_flush_queued_register_saves hook.
* target.def (dwarf_flush_queued_register_saves): New hook.
* doc/tm.texi: Regenerate.
Index: gcc/config/i386/i386.c
===
--- gcc.orig/config/i386/i386.c
+++ gcc/config/i386/i386.c
@@ -8953,6 +8953,8 @@ output_set_got (rtx dest, rtx label ATTR
output_asm_insn ("mov%z0\t{%2, %0|%0, %2}", xops);
   else
{
+ /* For normal functions, this pattern is split, but we can still
+get here for thunks.  */
  output_asm_insn ("call\t%a2", xops);
 #ifdef DWARF2_UNWIND_INFO
  /* The call to next label acts as a push.  */
@@ -9010,12 +9012,6 @@ output_set_got (rtx dest, rtx label ATTR
   get_pc_thunk_name (name, REGNO (dest));
   pic_labels_used |= 1 << REGNO (dest);
 
-#ifdef DWARF2_UNWIND_INFO
-  /* Ensure all queued register saves are flushed before the
-call.  */
-  if (dwarf2out_do_frame ())
-   dwarf2out_flush_queued_reg_saves ();
-#endif
   xops[2] = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name));
   xops[2] = gen_rtx_MEM (QImode, xops[2]);
   output_asm_insn ("call\t%X2", xops);
@@ -30458,6 +30454,56 @@ ix86_reorg (void)
  with old MDEP_REORGS that are not CFG based.  Recompute it now.  */
   compute_bb_for_insn ();
 
+  /* Split any set_got patterns so that we interact correctly with
+ dwarf2out.  */
+  if (!TARGET_64BIT && !TARGET_VXWORKS_RTP && !TARGET_DEEP_BRANCH_PREDICTION
+  && flag_pic)
+{
+  rtx insn, next;
+  for (insn = get_insns (); insn; insn = next)
+   {
+ rtx pat, label, dest, cst, gotsym, new_insn;
+ int icode;
+
+ next = NEXT_INSN (insn);
+ if (!NONDEBUG_INSN_P (insn))
+   continue;
+
+ icode = recog_memoized (insn);
+ if (icode != CODE_FOR_set_got && icode != CODE_FOR_set_got_labelled)
+   continue;
+
+ extract_insn (insn);
+ if (icode == CODE_FOR_set_got)
+   {
+ label = gen_label_rtx ();
+ cst = const0_rtx;
+   }
+ else
+   {
+ label = recog_data.operand[1];
+ cst = const1_rtx;
+   }
+
+ dest = recog_data.operand[0];
+ pat = gen_set_got_call (label, cst);
+ new_insn = emit_insn_before (pat, insn);
+ RTX_FRAME_RELATED_P (new_insn) = 1;
+ RTX_FRAME_RELATED_P (XVECEXP (PATTERN (new_insn), 0, 1)) = 1;
+ gotsym = gen_rtx_SYMBOL_REF (Pmode, GOT_SYMBOL_NAME);
+ pat = gen_set_got_pop (dest, label);
+ new_insn = emit_insn_before (pat, insn);
+ RTX_FRAME_RELATED_P (new_insn) = 1;
+ RTX_FRAME_RELATED_P (XVECEXP (PATTERN (new_insn), 0, 1)) = 1;
+ if (!TARGET_MACHO)
+   {
+ pat = gen_set_got_add (dest, gotsym, label);
+ new_insn = emit_insn_before (pat, insn);
+   }
+ delete_insn (insn);
+   }
+}
+
   if (optimize && optimize_function_for_speed_p (cfun))
 {
   if (TARGET_PAD_SHORT_FUNCTION)
@@ -30475,6 +30521,30 @@ ix86_reorg (void)
 move_or_delete_vzeroupper ();
 }
 
+/* Handle the TARGET_DWARF_HANDLE_FRAME_UNSPEC hook.
+   This is called from dwarf2out.c to emit call frame instructions
+   for frame-related insns containing UNSPECs and UNSPEC_VOLATILEs.  */
+static void
+i386_dwarf_handle_frame_unspec (rtx pattern ATTRIBUTE_UNUSED,
+   int index ATTRIBUTE_UNUSED)
+{
+  gcc_assert (index == UNSPEC_SET_GOT);
+}
+
+/* Handle the TARGET_DWARF_FLUSH_QUEUED_REGISTER_SAVES hook.
+   This is called from dwarf2out.c to decide whether all queued
+   register saves should be emitted before INSN.  */
+static bool
+i386_dwarf_flush_queued_register_saves (rtx insn)
+{
+  if (!TARGET_VXWORKS_RTP || !flag_pic)
+{
+  int icode = recog_memoized (insn);
+  return (

[patch, libgfortran] PR48589 Invalid G0/G0.d editing for NaN/infinity

2011-04-15 Thread Jerry DeLisle

Hi,

I plan to commit the following simple and obvious patch.

Regression tested on x86-64-linux-gnu.

I will apply the test case in the PR to the testsuite.

Regards,

Jerry

2011-04-15  Jerry DeLisle  

PR libgfortran/48589
* io/write_float.def (write_infnan): Set width properly for G0.

Index: write_float.def
===
--- write_float.def (revision 172467)
+++ write_float.def (working copy)
@@ -654,11 +654,11 @@ write_infnan (st_parameter_dt *dtp, const fnode *f
   mark = (sign == S_PLUS || sign == S_MINUS) ? 8 : 7;

   nb =  f->u.real.w;
-
+
   /* If the field width is zero, the processor must select a width
 not zero.  4 is chosen to allow output of '-Inf' or '+Inf' */

-  if (nb == 0)
+  if ((nb == 0) || dtp->u.p.g0_no_blanks)
{
  if (isnan_flag)
nb = 3;



Re: patch pings

2011-04-15 Thread Bernd Schmidt
On 04/15/2011 04:18 PM, Jeff Law wrote:

> http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02247.html

I don't know. I sympathize with the goal, but I'm not too happy about
the structure of this patch. Doesn't this do the scan once for every
reload in an insn? It seems to me like the loop (or rather, a function
call to it) should at least be placed at the top of choose_reload_regs
rather than in allocate_reload_reg.

In my tests (i686-linux, compiling my standard set of testcases) the
patch appears to have, in general, very little effect on code quality.
It might on average be slightly better, but I've also seen several cases
where we do worse. If we go to so much effort to do scanning of
subsequent insns it ought to be possible to do better.

AFAICT the patch ignores whether the pseudo that's being reloaded will
be reloaded again in the current ebb - if not, it should get a "bad"
spill register, where "bad" in this case only includes hard regs that
don't currently hold a spilled pseudo. Likewise, once we've reloaded the
last occurrence of a pseudo in an ebb, we should forget the
reg_reloaded_contents to make sure we consider the spill reg good again.
However, I'm not sure whether changes such as these alone will reduce
the number of cases where the patch regresses.


Bernd


Re: FDO usability patch -- cfg and lineno checksum

2011-04-15 Thread Xinliang David Li
Resent in plain text mode ..

David

On Fri, Apr 15, 2011 at 9:28 AM, Xinliang David Li  wrote:
>
> Honza, do you have a chance to take a look at it?
> Thanks,
>
> David
>
> On Wed, Apr 13, 2011 at 3:25 PM, Xinliang David Li  wrote:
>>
>> Hi,  in current FDO implementation, the source file version used in
>> profile-generate needs to strictly match the version used in
>> profile-use -- simple formating changes will invalidate the profile
>> data (use of it will lead to compiler error or ICE). There are two
>> main problems that lead to the weakness:
>>
>> 1) the function checksum is computed based on line number and number
>> of basic blocks -- this means any line number change will invalidate
>> the profile data. In fact, line number should play very minimal role
>> in profile matching decision. Another problem is that number of basic
>> blocks is also not a good indicator whether CFG matches or not.
>> 2) cgraph's pid is used as the key to find the matching profile data
>> for a function. If there is any new function added, or if the order of
>> the functions changes, the profile data is invalidated.
>>
>> The attached is a patch from google that improves this.
>> 1) function checksum is split into two parts: lineno checksum and cfg 
>> checksum
>> 2) function assembler name is used in profile hashing.
>>
>> Bootstrapped and regression tested on x86_64/linux.
>>
>> Ok for trunk?
>>
>> Thanks,
>>
>> David
>>
>>  2011-04-13  Xinliang David Li  
>>
>>        * tree.c (crc32_byte): New function.
>>        * tree.h (crc32_byte): New function.
>>        * gcov.c (function_info): Add new fields.
>>        (read_graph_file): Handle new fields.
>>        (read_count_file): Handle name.
>>        * gcov-io.c (gcov_string_length): New function.
>>        (gcov_write_words): Bug fix.
>>        (gcov_read_words): Bug fix.
>>        * gcov-io.h: New interfaces.
>>        * profile.c (get_exec_counts): New parameter.
>>        (compute_branch_probablilities): New parameter.
>>        (compute_value_histogram): New parameter.
>>        (branch_prob): Pass cfg_checksum.
>>        * coverage.c (get_const_string_type): New function.
>>        (hash_counts_entry_hash): Use string hash.
>>        (hash_counts_entry_eq): Use string compare.
>>        (htab_counts_entry_del): Delete name.
>>        (read_count_file): Add handling of cfg checksum.
>>        (get_coverage_counts): New parameter.
>>        (xstrdup_mask_random): New function.
>>        (coverage_compute_lineno_checksum): New function.
>>        (coverage_compute_cfg_checksum): New function.
>>        (coverage_begin_output): New parameter.
>>        (coverage_end_function): New parameter.
>>        (build_fn_info_type): Build new fields.
>>        (build_fn_info_value): Build new field values.
>>        * coverage.h: Interface changes.
>>        * libgcov.c (gcov_exit): Dump new fields.
>>        * gcov_dump.c (tag_function): Print new fields.
>


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Mike Stump
On Apr 15, 2011, at 3:44 AM, Eric Botcazou  wrote:
>> And make IN_COND a bool instead of an int?
> 
> I had the same initial reaction but then came to the same conclusion as Maxim.

If it can be bool, it should be bool.


Re: [PATCH, SMS] Free sccs field

2011-04-15 Thread Revital Eres
Hello,

On 15 April 2011 18:53, Nathan Froyd  wrote:
> On Fri, Apr 15, 2011 at 06:27:05PM +0300, Revital Eres wrote:
>> +  if (all_sccs->sccs)
>> +    free (all_sccs->sccs);
>
> No need to check for non-NULL prior to free'ing.

OK, I'll commit the patch without the check then.
(after re-testing)

Thanks,
Revital


>
> -Nathan
>
>


Re: [build] Allow building libobjc_gc on Tru64 UNIX, Darwin

2011-04-15 Thread Rainer Orth
Nicola,

>> Ok for mainline if both pass?
>
> Yes.
>
> [and by the way, I think you're fixing PR libobjc/32037 in the process. :-)]

indeed, thanks for checking.  Testing revealed that I'd been lazy with
quoting.  I'm including the patch I've installed, which encloses
OBJC_BOEHM_GC in configure.ac in single quotes.

>> Btw., it would be considerably easier if --enable-libobjc-gc could be
>> enabled automatically if boehm-gc is configured.
>
> Yes.

I've filed

libobjc/48626   --enable-objc-gc should be automatic

>> Besides, it seems that libobjc_gc isn't tested anywhere.
>
> Yes.

and

libobjc/48627   libobjc_gc should be tested

just in case.

Rainer


2011-04-13  Rainer Orth  

PR libobjc/32037
* Makefile.in (OBJC_GCFLAGS): Move ...
* configure.ac (enable_objc_gc): ... here.
Add $(libsuffix) to OBJC_BOEHM_GC.
* configure: Regenerate.

diff --git a/libobjc/Makefile.in b/libobjc/Makefile.in
--- a/libobjc/Makefile.in
+++ b/libobjc/Makefile.in
@@ -93,7 +93,7 @@ LIBTOOL_INSTALL = $(LIBTOOL) --mode=inst
 LIBTOOL_CLEAN   = $(LIBTOOL) --mode=clean
 #LIBTOOL_UNINSTALL = $(LIBTOOL) --mode=uninstall
 
-OBJC_GCFLAGS=-DOBJC_WITH_GC=1
+OBJC_GCFLAGS=@OBJC_GCFLAGS@
 OBJC_BOEHM_GC=@OBJC_BOEHM_GC@
 OBJC_BOEHM_GC_INCLUDES=@OBJC_BOEHM_GC_INCLUDES@
 OBJC_BOEHM_GC_LIBS=../boehm-gc/libgcjgc_convenience.la $(thread_libs_and_flags)
diff --git a/libobjc/configure.ac b/libobjc/configure.ac
--- a/libobjc/configure.ac
+++ b/libobjc/configure.ac
@@ -1,6 +1,6 @@
 # Process this file with autoconf to produce a configure script.
 #   Copyright 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002, 2003, 2004
-#   2005, 2006, 2009 Free Software Foundation, Inc.
+#   2005, 2006, 2009, 2011 Free Software Foundation, Inc.
 #   Originally contributed by Dave Love (d.l...@dl.ac.uk).
 #
 #This file is part of GCC.
@@ -63,15 +63,25 @@ AC_ARG_ENABLE(objc-gc,
   the GNU Objective-C runtime.],
 [case $enable_objc_gc in
   no)
+OBJC_GCFLAGS=''
 OBJC_BOEHM_GC=''
 OBJC_BOEHM_GC_INCLUDES=''
 ;;
   *)
-OBJC_BOEHM_GC=libobjc_gc.la
+OBJC_GCFLAGS='-DOBJC_WITH_GC=1'
+OBJC_BOEHM_GC='libobjc_gc$(libsuffix).la'
 OBJC_BOEHM_GC_INCLUDES='-I$(top_srcdir)/../boehm-gc/include 
-I../boehm-gc/include'
+case "${host}" in
+  alpha*-dec-osf*)
+# boehm-gc headers include , which needs to be compiled
+   # with -pthread on Tru64 UNIX.
+OBJC_GCFLAGS="${OBJC_GCFLAGS} -pthread"
+   ;;
+esac
 ;;
 esac],
-[OBJC_BOEHM_GC=''; OBJC_BOEHM_GC_INCLUDES=''])
+[OBJC_GCFLAGS=''; OBJC_BOEHM_GC=''; OBJC_BOEHM_GC_INCLUDES=''])
+AC_SUBST(OBJC_GCFLAGS)
 AC_SUBST(OBJC_BOEHM_GC)
 AC_SUBST(OBJC_BOEHM_GC_INCLUDES)
 

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[v3] Handle NOTY in extract_symvers.pl

2011-04-15 Thread Rainer Orth
While testing the close-to-final version of my COMDAT-group-with-Sun as
patch

[build, c++, lto] Support COMDAT group with Sun as (PR target/40483)
http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01365.html
http://gcc.gnu.org/ml/gcc-patches/2010-06/msg00600.html

which only awaits the release of a linker with the required fixes, I
noticed that extract_symvers.pl doesn't handly NOTY/NOTYPE entries, but
chokes instead, whereas the readelf based version handles them just
fine.  Although the final patch won't create them anymore, I'm
installing the following patch that makes the script more robust.  It
was tested while putting the finishing touches on the as-comdat patch.

Installed on mainline.

Rainer


2011-04-04  Rainer Orth  

* scripts/extract_symvers.pl: Handle NOTY.

diff --git a/libstdc++-v3/scripts/extract_symvers.pl 
b/libstdc++-v3/scripts/extract_symvers.pl
--- a/libstdc++-v3/scripts/extract_symvers.pl
+++ b/libstdc++-v3/scripts/extract_symvers.pl
@@ -1,6 +1,6 @@
 #!/usr/bin/perl -w
 
-# Copyright (C) 2010 Free Software Foundation, Inc.
+# Copyright (C) 2010, 2011 Free Software Foundation, Inc.
 #
 # This file is part of the GNU ISO C++ Library.  This library is free
 # software; you can redistribute it and/or modify it under the
@@ -108,6 +108,7 @@ while () {
 die "unhandled symbol:\n$_" if ($bind !~ /^(GLOB|WEAK)/ or $oth !~ /[DP]/);
 
 # Adapt to readelf type naming convention.
+$type = "NOTYPE" if ($type eq "NOTY");
 $type = "OBJECT" if ($type eq "OBJT");
 
 # Use correct symbol type.
@@ -116,7 +117,7 @@ while () {
 close ELFDUMP or die "elfdump error";
 
 foreach $symbol (keys %type) {
-if ($type{$symbol} eq "FUNC") {
+if ($type{$symbol} eq "FUNC" || $type{$symbol} eq "NOTYPE") {
push @lines, "$type{$symbol}:$symbol\@\@$version{$symbol}\n";
 } elsif ($type{$symbol} eq "OBJECT" and $size{$symbol} == 0) {
push @lines, "$type{$symbol}:$size{$symbol}:$version{$symbol}\n";


-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] doubled words

2011-04-15 Thread Mike Stump
On Apr 15, 2011, at 1:18 AM, Jim Meyering  wrote:
> While most of these are in comments, the corrections
> in gcc/tree-cfg.c and gcc/config/sh/constraints.md are in strings.
> The former at least is marked for translation, and hence appears
> in every .po file.

I think these are obvious.
> 


[libjava, testsuite] Link jni tests with -liconv on Tru64 UNIX

2011-04-15 Thread Rainer Orth
While checking Tru64 UNIX libjava testsuite results, I noticed that the
libjava.jni/invocation/PR16923.c execution test was failing like this:

277710:./PR16923: /sbin/loader: Error: libgcj.so.12: symbol "iconv" unresolved
277710:./PR16923: /sbin/loader: Fatal Error: Load of "./PR16923" failed: 
Unresolved symbol name
FAIL: PR16923 run
UNTESTED: PR16923 output

This happens because the test isn't linked with gcj (which handles this
in libgcj.spec), but with gcc, which does not.  Perhaps this could (and
should) be done, or rather libgcj.so always linked with -liconv on
targets that require that, but I chose the easy way out for now,
following the Darwin lead.

Tested with the appropriate runtest invocation (which unfortunately
means running all of jni.exp, not just the single testcase, installed on
mainline, 4.5 and 4.6 branches.

Rainer


2011-04-14  Rainer Orth  

* testsuite/libjava.jni/jni.exp (gcj_jni_get_cxxflags_invocation):
Add $libiconv to cxxflags for alpha*-dec-osf*.

diff --git a/libjava/testsuite/libjava.jni/jni.exp 
b/libjava/testsuite/libjava.jni/jni.exp
--- a/libjava/testsuite/libjava.jni/jni.exp
+++ b/libjava/testsuite/libjava.jni/jni.exp
@@ -280,6 +280,11 @@ proc gcj_jni_get_cxxflags_invocation {} 
 lappend cxxflags "-shared-libgcc"
   }
 
+  # Tru64 UNIX needs -liconv linked explicitly since gcc does the linking.
+  if { [istarget "alpha*-dec-osf*"] } {
+lappend cxxflags $libiconv
+  }
+
   return $cxxflags
 }
 

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix PR46399 - missing mode promotion for libcall args - updated

2011-04-15 Thread Bernd Schmidt
On 03/03/2011 05:32 PM, Andreas Krebbel wrote:
> [PATCH] Fix PR46399 - missing mode promotion for libcall args
> http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01595.html
> 
> after removing the FUNCTION_VALUE target macro with:
> 
> [Committed] S/390: Remove deprecated target macro FUNCTION_VALUE
> http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00127.html
> 
> And I've moved the documentation of the new target hook to target.def
> as requested by Nathan.
> 
> Bootstrapped on x86_64, s390 and s390x. No regressions.
> 
> Two testcases fixed on s390x:
> 
> < FAIL: gcc.dg/dfp/pr41049.c execution test
> < FAIL: decimal/comparison.cc execution test

The problem is quite real, it's just bitten us with some ARM changes.
The fact that we don't have types for libcall function args is very
unfortunate. Maybe that's something we need to change? It would be a
much better fix, and it might not even be _that_ much work to add an
optional type field to each optab entry, which could then be passed to
emit_library_call - if it's nonnull, we'd use the normal
promote_function_mode hook otherwise ignore the problem as before. This
would allow us to solve the s390 problem reasonably quickly and lets us
add more libcall types later.

As to the existing patch, isn't fntype NULL throughout
emit_library_call_value_1? It might make sense to lose that argument
from the hook. I also doubt that the unsignedp bit is very useful
without an argument type, but I'm at a loss to think of something
sensible to do about it.


Bernd


ObjC: rewritten checks for duplicate instance variables (for improved speed and improved error messages)

2011-04-15 Thread Nicola Pero
This patch rewrites the check for duplicate instance variables
done by the Objective-C (and Objective-C++) compiler.

The new code has many advantages:

 * it avoids copying all the instance variables into a temporary
   tree chain in the way that the all code was doing (according
   to some notes from benchmarks and profiling I did a few months
   ago, this should save about 5k allocations per compilation
   when compiling a typical ObjC GNUstep file, giving an approx 0.25%
   order of magnitude speedup with -fsyntax-only);

 * it provides better error messages by providing the informative
   message "previous declaration of ..." when a duplicate declaration
   is found;

 * it avoids producing duplicate errors if a class with duplicate
   instance variables is subclassed.  The existing code always built
   a copy of the full list of instance variables for the class (including
   superclass ones) and then would look for duplicates.  This means if
   you have a class with a duplicate, and subclass it 10 times, you'll get
   the same error message 11 times!  The new code avoids this mistake,
   which also reduces the number of checks required (for the standard case
   without a hashtable, I'll explain in a minute).

 * it removes the special, ad-hoc field duplicate check implementation for
   ObjC++ and has it use the same code as the ObjC codebase.

The new code does the checks for duplicates directly, by just iterating
and comparing, unless the number of checks to do is very large, in which
case it falls back to a hashtable.  This is similar to what the C code
does for structs, but here the check is more specific as we are only
checking the instance variables in the current class against the other
instance variables in the class and the superclass, so it's all more
specific and optimized for the situation.

Updated existing testcases (to check for the new "previous declaration of ..."
messages), and added more testcases.

Ok to commit ?

Thanks

Index: c-family/c-objc.h
===
--- c-family/c-objc.h   (revision 172444)
+++ c-family/c-objc.h   (working copy)
@@ -62,7 +62,7 @@ extern tree objc_build_string_object (tree);
 extern tree objc_get_protocol_qualified_type (tree, tree);
 extern tree objc_get_class_reference (tree);
 extern tree objc_get_class_ivars (tree);
-extern tree objc_get_interface_ivars (tree);
+extern bool objc_detect_field_duplicates (bool);
 extern void objc_start_class_interface (tree, tree, tree, tree);
 extern void objc_start_category_interface (tree, tree, tree, tree);
 extern void objc_start_protocol (tree, tree, tree);
Index: c-family/ChangeLog
===
--- c-family/ChangeLog  (revision 172444)
+++ c-family/ChangeLog  (working copy)
@@ -1,3 +1,9 @@
+2011-04-15  Nicola Pero  
+
+   * c-objc.h (objc_get_interface_ivars): Removed.
+   (objc_detect_field_duplicates): New.
+   * stub-objc.c: Likewise.
+   
 2011-04-14  Nicola Pero  
 
* stub-objc.c (objc_declare_protocols): Renamed to
Index: c-family/stub-objc.c
===
--- c-family/stub-objc.c(revision 172444)
+++ c-family/stub-objc.c(working copy)
@@ -275,10 +275,10 @@ objc_get_class_reference (tree ARG_UNUSED (name))
   return 0;
 }
 
-tree
-objc_get_interface_ivars (tree ARG_UNUSED (fieldlist))
+bool
+objc_detect_field_duplicates (bool ARG_UNUSED (check_superclasses_only))
 {
-  return 0;
+  return false;
 }
 
 tree
Index: objc/ChangeLog
===
--- objc/ChangeLog  (revision 172444)
+++ objc/ChangeLog  (working copy)
@@ -1,3 +1,10 @@
+2011-04-15  Nicola Pero  
+
+   * objc-act.c (objc_get_interface_ivars): Removed.
+   (objc_detect_field_duplicates): New.
+   (hash_instance_variable): New.
+   (eq_instance_variable): New.
+   
 2011-04-14  Nicola Pero  
 
* objc-act.c (objc_declare_protocols): Renamed to
Index: objc/objc-act.c
===
--- objc/objc-act.c (revision 172444)
+++ objc/objc-act.c (working copy)
@@ -3813,6 +3813,8 @@ lookup_interface (tree ident)
   }
 }
 
+
+
 /* Implement @defs () within struct bodies.  */
 
 tree
@@ -3829,19 +3831,242 @@ objc_get_class_ivars (tree class_name)
   return error_mark_node;
 }
 
+
+/* Functions used by the hashtable for field duplicates in
+   objc_detect_field_duplicates().  Ideally, we'd use a standard
+   key-value dictionary hashtable , and store as keys the field names,
+   and as values the actual declarations (used to print nice error
+   messages with the locations).  But, the hashtable we are using only
+   allows us to store keys in the hashtable, without values (it looks
+   more like a set).  So, we store the DECLs, but define equality as
+   DECLs having the same name, and hash as the hash of the n

[PATCH, PR 48601] cgraph_get_create_node in emultls.c

2011-04-15 Thread Martin Jambor
Hi,

lower_emutls_function_body in emultls.c should use
cgraph_get_create_node to create call graph nodes if need be because
it is introducing TLS builtin decls into IL.

I have bootstrapped and tested this on x86_64-linux (where the bug
does not happen) and in bugzilla it has been confirmed it fixes the
issue.  It is rather straight-forward and was pre-approved on IRC by
Honza so I am about to commit it now.  Testcases are already in the
testsuite.

Thanks,

Martin



2011-04-15  Martin Jambor  

PR middle-end/48601
* tree-emutls.c (lower_emutls_function_body): Call
cgraph_get_create_node instead of cgraph_get_node.  Do not assert the
result is non-NULL.

Index: src/gcc/tree-emutls.c
===
--- src.orig/gcc/tree-emutls.c
+++ src/gcc/tree-emutls.c
@@ -619,8 +619,9 @@ lower_emutls_function_body (struct cgrap
 
   d.cfun_node = node;
   d.builtin_decl = built_in_decls[BUILT_IN_EMUTLS_GET_ADDRESS];
-  d.builtin_node = cgraph_get_node (d.builtin_decl);
-  gcc_checking_assert (d.builtin_node);
+  /* This is where we introduce the declaration to the IL and so we have to
+ create a node for it.  */
+  d.builtin_node = cgraph_get_create_node (d.builtin_decl);
 
   FOR_EACH_BB (d.bb)
 {


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Eric Botcazou
> If it can be bool, it should be bool.

Rather basic principle IMO.  Consistency is of much greater value.

-- 
Eric Botcazou


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-15 Thread Georg-Johann Lay
Denis Chertykov schrieb:
> 2011/4/14 Georg-Johann Lay :
>> Denis Chertykov schrieb:
>>> 2011/4/14 Georg-Johann Lay :
 The "rotl3" expanders (mode \in {HI,SI,DI}) violates synopsis by
 using 4 operands instead of 3. This runs in ICE in top of
 optabs.c:maybe_gen_insn.

 The right way to do this is to use match_dup, not match_operand. So
 the fix is obvious.

 Regenerated avr-gcc and run against avr testsuite:

 gcc.target/avr/torture/pr41885.c generates these patterns

 Johann

 2011-04-14  Georg-Johann Lay  

* config/avr/avr.md ("rotlhi3", "rotlsi3", "rotldi3"): Fix
expanders operand 3 from match_operand to match_dup.
>>> May be better to use (clobber (match_scratch ...)) here and in
>>> `*rotw' and `*rotb'.
>> These are splitters that might split before reload, producing strange,
>> non-matching insns like
>>  (set (scratch:QI) ...
>> or crashing already in avr_rotate_bytes becase the split condition reads
>> "&& (reload_completed || mode == DImode)"
> 
> Generally I'm agree with you, change match_operand to match_dup is easy.

I already submitted the easy patch to avoid the ICE.

> But IMHO the right way is:
>  - the splitters and avr_rotate_bytes are wrong and must be fixed too.
>  - operands[3] is a scratch register and right way is to use match_scratch.
> 
> I can approve the patch.
> 
> Denis.

Ok, here is the right-way patch.

The expander generates a SCRATCH instead of a pseudo, and in case of a
pre-reload split the SCRATCH is replaced with a pseudo because it is
not known if or if not the scratch will actually be needed.

avr_rotate_bytes now skips operations on non-REG 'scratch'.
Furthermore, I added an assertion that ensures that 'scratch' is
actually a REG when it is needed.

Besides that, I fixed indentation. I know it is not optimal to fix
indentation alongside with functionality... did it anyway :-)

Finally, I exposed alternative #3 of the insns to the register
allocator, because it is not possible to distinguish between
overlapping or non-overlapping regs, and #3 does not need a scratch.

Ran C-testsuite with no regressions.

Johann

2011-04-15  Georg-Johann Lay  

* config/avr/avr.md ("rotl3"): Use SCRATCH instead
of REG in operand 3. Fix indentation. Unquote C snippet.
("*rotw","*rotbw"): Ditto. Replace scratch
with pseudo for pre-reload splits. Use const_int_operand for
operand 2. Expose alternative 3 to register allocator.
* config/avr/avr.c (avr_rotate_bytes): Handle SCRATCH in
operands[3]. Use GNU indentation style.

Index: config/avr/avr.md
===
--- config/avr/avr.md	(Revision 172493)
+++ config/avr/avr.md	(Arbeitskopie)
@@ -1519,22 +1519,22 @@ (define_mode_attr rotsmode [(DI "QI") (S
 
 (define_expand "rotl3"
   [(parallel [(set (match_operand:HIDI 0 "register_operand" "")
-		   (rotate:HIDI (match_operand:HIDI 1 "register_operand" "")
-(match_operand:VOID 2 "const_int_operand" "")))
-		(clobber (match_dup 3))])]
+   (rotate:HIDI (match_operand:HIDI 1 "register_operand" "")
+(match_operand:VOID 2 "const_int_operand" "")))
+  (clobber (match_dup 3))])]
   ""
-  "
-{
-  if (CONST_INT_P (operands[2]) && 0 == (INTVAL (operands[2]) % 8))
   {
-  if (AVR_HAVE_MOVW && 0 == INTVAL (operands[2]) % 16)
-operands[3] = gen_reg_rtx (mode);
-  else
-operands[3] = gen_reg_rtx (QImode);
-  }
-  else
-FAIL;
-}")
+if (CONST_INT_P (operands[2])
+&& 0 == INTVAL (operands[2]) % 8)
+  {
+if (AVR_HAVE_MOVW && 0 == INTVAL (operands[2]) % 16)
+  operands[3] = gen_rtx_SCRATCH (mode);
+else
+  operands[3] = gen_rtx_SCRATCH (QImode);
+  }
+else
+  FAIL;
+  })
 
 
 ;; Overlapping non-HImode registers often (but not always) need a scratch.
@@ -1545,35 +1545,49 @@ (define_expand "rotl3"
 
 ; Split word aligned rotates using scratch that is mode dependent.
 (define_insn_and_split "*rotw"
-  [(set (match_operand:HIDI 0 "register_operand" "=r,r,#&r")
-	(rotate:HIDI (match_operand:HIDI 1 "register_operand" "0,r,r")
-		 (match_operand 2 "immediate_operand" "n,n,n")))
-   (clobber (match_operand: 3 "register_operand"  "=" ))]
-  "(CONST_INT_P (operands[2]) &&
- (0 == (INTVAL (operands[2]) % 16) && AVR_HAVE_MOVW))"
+  [(set (match_operand:HIDI 0 "register_operand"  "=r,r,&r")
+(rotate:HIDI (match_operand:HIDI 1 "register_operand"  "0,r,r")
+ (match_operand 2 "const_int_operand"  "n,n,n")))
+   (clobber (match_scratch: 3   "="))]
+  "AVR_HAVE_MOVW
+   && 0 == INTVAL (operands[2]) % 16"
   "#"
   "&& (reload_completed || mode == DImode)"
   [(const_int 0)]
-  "avr_rotate_bytes (operands);
-  DONE;"
-)
+  {
+/* Split happens in .split1: Cook up a pseudo */
+if (SCRATCH == GET_CODE (operands[3])
+&& !re

[wwwdocs] Shuffle entries in the footer and remove margin at the bottom

2011-04-15 Thread Gerald Pfeifer
This concludes this mini project. :-)

Committed and the pages on gcc.gnu.org regenerated; www.gnu.org will
follow over night.

Gerald


Index: style.mhtml
===
RCS file: /cvs/gcc/wwwdocs/htdocs/style.mhtml,v
retrieving revision 1.115
diff -u -r1.115 style.mhtml
--- style.mhtml 12 Apr 2011 21:11:23 -  1.115
+++ style.mhtml 15 Apr 2011 17:22:30 -
@@ -232,14 +232,10 @@
 
 
 
-These pages are
-http://gcc.gnu.org/about.html";>maintained by the GCC team.
-Last modified http://validator.w3.org/check/referer";>.
-
-For questions related to the use of GCC, please consult these web
-pages and the http://gcc.gnu.org/onlinedocs/";>GCC manuals. If
-that fails, the mailto:gcc-h...@gcc.gnu.org";>gcc-h...@gcc.gnu.org
+For questions related to the use of GCC,
+please consult these web pages and the
+http://gcc.gnu.org/onlinedocs/";>GCC manuals. If that fails,
+the mailto:gcc-h...@gcc.gnu.org";>gcc-h...@gcc.gnu.org
 mailing list might help.
 Comments on these web pages and the development of GCC are welcome on our
 developer list at mailto:g...@gcc.gnu.org";>g...@gcc.gnu.org.
@@ -252,6 +248,11 @@
 Verbatim copying and distribution of this entire article is
 permitted in any medium, provided this notice is preserved.
 
+These pages are
+http://gcc.gnu.org/about.html";>maintained by the GCC team.
+Last modified http://validator.w3.org/check/referer";>.
+
 
 
 



Re: [PATCH v3] Re: avoid useless if-before-free tests

2011-04-15 Thread Jim Meyering
> I added Jim to the gcc group.

Thanks, Tom.


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Mike Stump
On Apr 15, 2011, at 10:22 AM, Eric Botcazou wrote:
>> If it can be bool, it should be bool.
> 
> Rather basic principle IMO.  Consistency is of much greater value.

Yes, and consistency is had by upgrading existing int uses that should be bool 
to bool as they are spotted  :-)  I prefer doing this, even if done one at 
a time.  If one discourages it from being done, they run the risk of it being 
done later.  I'd rather have it sooner in this case.


Re: ObjC: rewritten checks for duplicate instance variables (for improved speed and improved error messages)

2011-04-15 Thread Mike Stump
On Apr 15, 2011, at 10:19 AM, Nicola Pero wrote:
> This patch rewrites the check for duplicate instance variables
> done by the Objective-C (and Objective-C++) compiler.

> Ok to commit ?

Ok.


Use ASM_COMMENT_START

2011-04-15 Thread Xinliang David Li
The following is a trivial patch to remove hard coded '#' usage.

Ok for trunk?

Thanks,

David



2011-04-15  Xinliang David Li  

* final.c (dump_basic_block_info): Use ASM_COMMENT_START.
Index: final.c
===
--- final.c	(revision 172212)
+++ final.c	(working copy)
@@ -1691,14 +1691,14 @@ dump_basic_block_info (FILE *file, rtx i
   edge e;
   edge_iterator ei;
 
-  fprintf (file, "# BLOCK %d", bb->index);
+  fprintf (file, "%s BLOCK %d", ASM_COMMENT_START, bb->index);
   if (bb->frequency)
 fprintf (file, " freq:%d", bb->frequency);
   if (bb->count)
 fprintf (file, " count:" HOST_WIDEST_INT_PRINT_DEC,
  bb->count);
   fprintf (file, " seq:%d", (*bb_seqn)++);
-  fprintf (file, "\n# PRED:");
+  fprintf (file, "\n%s PRED:", ASM_COMMENT_START);
   FOR_EACH_EDGE (e, ei, bb->preds)
 {
   dump_edge_info (file, e, 0);
@@ -1711,7 +1711,7 @@ dump_basic_block_info (FILE *file, rtx i
   edge e;
   edge_iterator ei;
 
-  fprintf (asm_out_file, "# SUCC:");
+  fprintf (asm_out_file, "%s SUCC:", ASM_COMMENT_START);
   FOR_EACH_EDGE (e, ei, bb->succs)
{
  dump_edge_info (asm_out_file, e, 1);


Re: Use ASM_COMMENT_START

2011-04-15 Thread Diego Novillo
On Fri, Apr 15, 2011 at 14:11, Xinliang David Li  wrote:

> 2011-04-15  Xinliang David Li  
>
>        * final.c (dump_basic_block_info): Use ASM_COMMENT_START.

OK.


Diego.


[pph] Clean positive tests. (issue4423044)

2011-04-15 Thread Lawrence Crowl

This patch cleans up positive tests.  First, stop on first failure so
as to avoid littering the logfile.  Second, change the extensions of
the generated assembly files to .s-pph and .s+pph to be clear on the
provenence of each file.


Index: gcc/testsuite/ChangeLog.pph

2011-04-15  Lawrence Crowl  

* lib/dg-pph.exp (dg-pph-pos): Stop on first failure.  Change names of
assembly files to be clear on which is with and without PPH.


Index: gcc/testsuite/lib/dg-pph.exp
===
--- gcc/testsuite/lib/dg-pph.exp(revision 172457)
+++ gcc/testsuite/lib/dg-pph.exp(working copy)
@@ -74,13 +74,13 @@ proc dg-pph-pos { subdir test options ma
 if { $have_errs } {
verbose -log "regular compilation failed"
fail "$nshort $options, regular compilation failed"
-return
+   return
 }
 
 if { ! [ file_on_host exists "$bname.s" ] } {
-   verbose -log "assembly file '$bname.s' missing"
-   fail "$nshort $options, assembly comparison"
-return
+   verbose -log "regular assembly file '$bname.s' missing"
+   fail "$nshort $options, regular assembly missing"
+   return
 }
 
 # Rename the .s file into .s-pph to compare it after the second build.
@@ -89,19 +89,33 @@ proc dg-pph-pos { subdir test options ma
 
 # Compile a second time using the pph files.
 dg-test -keep-output $test "$options $mapflag -I." ""
-remote_upload host "$bname.s"
+
+if { $have_errs } {
+   verbose -log "PPH compilation failed"
+   fail "$nshort $options, PPH compilation failed"
+   return
+}
+
+if { ! [ file_on_host exists "$bname.s" ] } {
+   verbose -log "PPH assembly file '$bname.s' missing"
+   fail "$nshort $options, PPH assembly missing"
+   return
+}
+
+# Rename the .s file into .s+pph to compare it.
+remote_upload host "$bname.s" "$bname.s+pph"
+remote_download host "$bname.s+pph"
 
 # Compare the two assembly files.  They should be identical.
-set tmp [ diff "$bname.s" "$bname.s-pph" ]
+set tmp [ diff "$bname.s-pph" "$bname.s+pph" ]
 if { $tmp == 0 } {
-verbose -log "assembly file '$bname.s', '$bname.s-pph' comparison 
error"
-fail "$nshort $options assembly comparison"
+   verbose -log "assembly file '$bname.s-pph', '$bname.s+pph' comparison 
error"
+   fail "$nshort $options assembly comparison"
 } elseif { $tmp == 1 } {
-pass "$nshort $options assembly comparison"
+   pass "$nshort $options assembly comparison"
+   file_on_host delete "$bname.s-pph"
+   file_on_host delete "$bname.s+pph"
 } else {
-fail "$nshort $options assembly comparison"
+   fail "$nshort $options assembly comparison"
 }
-file_on_host delete "$bname.s"
-file_on_host delete "$bname.s-pph"
-
 }

--
This patch is available for review at http://codereview.appspot.com/4423044


Re: [PATCH] sel-sched: Fix erroneous re-use of pointer to last insn in a BB (PR 48235)

2011-04-15 Thread Steve Ellcey
Vladimir and Dmitry,

The new test you added, gcc.dg/pr48235.c, is failing on IA64 because
it gets this excess message:

cc1: note: -freorder-blocks-and-partition does not work on this architecture

Could you modify the test to either filter out this message or not 
use the -freorder-blocks-and-partition option in IA64 or not run the
test at all on IA64.  I notice that gcc.dg/tree-prof/pr45354.c looks
like this test but also has:

/* { dg-require-effective-target freorder } */

That test does not fail, or run, on IA64 due to this dg option.

Steve Ellcey
s...@cup.hp.com


ObjC: get rid of another unnecessary, temporary copy of instance variables

2011-04-15 Thread Nicola Pero
This patch for the Objective-C compiler gets rid of another case where
we'd create a temporary tree chain with a copy of all the instance variables
of a class just to do a simple check on them.

The check is the one to check the scope of the instance variable that is
being accessed; without this patch, the check requires copying all the
instance variables for the class on a temporary tree chain.  With this
patch, the check is done (as you'd expect) on the instance variables
themselves, without copying them.

This is obviously faster; on the other hand, the typical GNUstep-based ObjC file
has classes with few instance variables (maybe 10 or 20) and only accesses 
instance
variables up to a few hundred times per file, so the speedup is only of the 
order
of 0.1% to 0.2% even with -fsyntax-only.  Of course, in unusual cases (ie, 
classes
with many instance variables, and a big file accessing instance variables lots 
of
times) the speedup may be bigger.

Ok to commit ?

Thanks

Index: ChangeLog
===
--- ChangeLog   (revision 172511)
+++ ChangeLog   (working copy)
@@ -1,5 +1,10 @@
 2011-04-15  Nicola Pero  
 
+   * objc-act.c (ivar_of_class): New.
+   (objc_is_public): Use ivar_of_class.
+
+2011-04-15  Nicola Pero  
+
* objc-act.c (objc_get_interface_ivars): Removed.
(objc_detect_field_duplicates): New.
(hash_instance_variable): New.
Index: objc-act.c
===
--- objc-act.c  (revision 172511)
+++ objc-act.c  (working copy)
@@ -6367,6 +6367,35 @@ is_private (tree decl)
DECL_NAME (decl)));
 }
 
+/* Searches all the instance variables of 'klass' and of its
+   superclasses for an instance variable whose name (identifier) is
+   'ivar_name_ident'.  Return the declaration (DECL) of the instance
+   variable, if found, or NULL_TREE, if not found.  */
+static inline tree
+ivar_of_class (tree klass, tree ivar_name_ident)
+{
+  /* First, look up the ivar in CLASS_RAW_IVARS.  */
+  tree decl_chain = CLASS_RAW_IVARS (klass);
+
+  for ( ; decl_chain; decl_chain = DECL_CHAIN (decl_chain))
+if (DECL_NAME (decl_chain) == ivar_name_ident)
+  return decl_chain;
+
+  /* If not found, search up the class hierarchy.  */
+  while (CLASS_SUPER_NAME (klass))
+{
+  klass = lookup_interface (CLASS_SUPER_NAME (klass));
+  
+  decl_chain = CLASS_RAW_IVARS (klass);
+  
+  for ( ; decl_chain; decl_chain = DECL_CHAIN (decl_chain))
+   if (DECL_NAME (decl_chain) == ivar_name_ident)
+ return decl_chain;
+}
+
+  return NULL_TREE;
+}
+
 /* We have an instance variable reference;, check to see if it is public.  */
 
 int
@@ -6397,7 +6426,7 @@ objc_is_public (tree expr, tree identifier)
  return 0;
}
 
- if ((decl = is_ivar (get_class_ivars (klass, true), identifier)))
+ if ((decl = ivar_of_class (klass, identifier)))
{
  if (TREE_PUBLIC (decl))
return 1;




FDO usability trivial patch to fix div by zero error with insane profile

2011-04-15 Thread Xinliang David Li
This is a trivial patch to deal with bad profile data (from
multi-threaded programs) that leads to divide by zero error.

Ok for trunk?

Thanks,

David



2011-04-15  Xinliang David Li  

* ipa-inline.c (cgraph_decide_recursive_inlining): Fix
div by zero error with insane profile.
Index: ipa-inline.c
===
--- ipa-inline.c	(revision 172510)
+++ ipa-inline.c	(working copy)
@@ -779,7 +779,7 @@ cgraph_decide_recursive_inlining (struct
 		fprintf (dump_file, "   Not inlining cold call\n");
 	  continue;
 	}
-  if (curr->count * 100 / node->count < probability)
+  if (node->count == 0 || curr->count * 100 / node->count < probability)
 	{
 	  if (dump_file)
 		fprintf (dump_file,


Re: FDO usability trivial patch to fix div by zero error with insane profile

2011-04-15 Thread Diego Novillo
On Fri, Apr 15, 2011 at 15:20, Xinliang David Li  wrote:
> This is a trivial patch to deal with bad profile data (from
> multi-threaded programs) that leads to divide by zero error.
>
> Ok for trunk?
>
> Thanks,
>
> David
>
>
>
> 2011-04-15  Xinliang David Li  
>
>        * ipa-inline.c (cgraph_decide_recursive_inlining): Fix
>        div by zero error with insane profile.

Looks fine to me, but you really want Honza to review this.  Any
chance we can get a test case for this?


Diego.


Re: Remember/restore ALLOCA_FOR_VAR_P over tuples

2011-04-15 Thread Michael Matz
Hi,

On Thu, 14 Apr 2011, Michael Matz wrote:

> > Btw, I don't remember why I chose ALLOCA_FOR_VAR_P over 
> > CALL_ALLOCA_FOR_VAR_P but, given the name of the GIMPLE flag and 
> > predicate, it's probably time to change it.
> 
> Good idea, I'll rename it before checking in.

r172516, for reference also below.  I later saw that my patch causes 
cxg2001 to fail.  I've analyzed it enough to be sure that it's only 
exposed by this patch (due to inlining now happening), in fact it's a 
problem in IRA that is reproducible with a C++ testcase even without the 
patch.  I've filed PR48633 for this.


Ciao,
Michael.

* tree.h (ALLOCA_FOR_VAR_P): Rename to CALL_ALLOCA_FOR_VAR_P.
* builtins.c (expand_builtin): Use CALL_ALLOCA_FOR_VAR_P.
* function.c (gimplify_parameters): Ditto.
* gimplify.c (gimplify_vla_decl): Ditto.

* gimple.h (enum gf_mask): Add GF_CALL_ALLOCA_FOR_VAR.
(gimple_call_set_alloca_for_var): New inline function.
(gimple_call_alloca_for_var_p): Ditto.
* gimple.c (gimple_build_call_from_tree): Remember CALL_ALLOCA_FOR_VAR_P
state.
* cfgexpand.c (expand_call_stmt): Restore CALL_ALLOCA_FOR_VAR_P state.

* tree-inline.c (inline_forbidden_p_stmt): Don't reject alloca
calls if they were for VLA objects.

Index: tree.h
===
*** tree.h  (revision 172431)
--- tree.h  (working copy)
*** struct GTY(()) tree_common {
*** 574,580 
 all decls
  
 CALL_FROM_THUNK_P and
!ALLOCA_FOR_VAR_P in
 CALL_EXPR
  
 side_effects_flag:
--- 574,580 
 all decls
  
 CALL_FROM_THUNK_P and
!CALL_ALLOCA_FOR_VAR_P in
 CALL_EXPR
  
 side_effects_flag:
*** extern void omp_clause_range_check_faile
*** 1388,1394 
  
  /* In a CALL_EXPR, if the function being called is BUILT_IN_ALLOCA, means that
 it has been built for the declaration of a variable-sized object.  */
! #define ALLOCA_FOR_VAR_P(NODE) (CALL_EXPR_CHECK (NODE)->base.protected_flag)
  
  /* In a type, nonzero means that all objects of the type are guaranteed by the
 language or front-end to be properly aligned, so we can indicate that a MEM
--- 1388,1395 
  
  /* In a CALL_EXPR, if the function being called is BUILT_IN_ALLOCA, means that
 it has been built for the declaration of a variable-sized object.  */
! #define CALL_ALLOCA_FOR_VAR_P(NODE) \
!   (CALL_EXPR_CHECK (NODE)->base.protected_flag)
  
  /* In a type, nonzero means that all objects of the type are guaranteed by the
 language or front-end to be properly aligned, so we can indicate that a MEM
Index: builtins.c
===
*** builtins.c  (revision 172431)
--- builtins.c  (working copy)
*** expand_builtin (tree exp, rtx target, rt
*** 6025,6031 
  case BUILT_IN_ALLOCA:
/* If the allocation stems from the declaration of a variable-sized
 object, it cannot accumulate.  */
!   target = expand_builtin_alloca (exp, ALLOCA_FOR_VAR_P (exp));
if (target)
return target;
break;
--- 6025,6031 
  case BUILT_IN_ALLOCA:
/* If the allocation stems from the declaration of a variable-sized
 object, it cannot accumulate.  */
!   target = expand_builtin_alloca (exp, CALL_ALLOCA_FOR_VAR_P (exp));
if (target)
return target;
break;
Index: function.c
===
*** function.c  (revision 172431)
--- function.c  (working copy)
*** gimplify_parameters (void)
*** 3652,3658 
  t = built_in_decls[BUILT_IN_ALLOCA];
  t = build_call_expr (t, 1, DECL_SIZE_UNIT (parm));
  /* The call has been built for a variable-sized object.  */
! ALLOCA_FOR_VAR_P (t) = 1;
  t = fold_convert (ptr_type, t);
  t = build2 (MODIFY_EXPR, TREE_TYPE (addr), addr, t);
  gimplify_and_add (t, &stmts);
--- 3652,3658 
  t = built_in_decls[BUILT_IN_ALLOCA];
  t = build_call_expr (t, 1, DECL_SIZE_UNIT (parm));
  /* The call has been built for a variable-sized object.  */
! CALL_ALLOCA_FOR_VAR_P (t) = 1;
  t = fold_convert (ptr_type, t);
  t = build2 (MODIFY_EXPR, TREE_TYPE (addr), addr, t);
  gimplify_and_add (t, &stmts);
Index: gimplify.c
===
*** gimplify.c  (revision 172431)
--- gimplify.c  (working copy)
*** gimplify_vla_decl (tree decl, gimple_seq
*** 1329,1335 
t = built_in_decls[BUILT_IN_ALLOCA];
t = build_call_expr (t, 1, DECL_SIZE_UNIT (decl));
/* The call has been built for a variable-sized object.  */
!   ALLOCA

[PATCH] Prefer simplify_gen_* over gen_* in expand_debug_expr

2011-04-15 Thread Jakub Jelinek
Hi!

On IRC richi mentioned yesterday that when gcc.dg/guality/asm-1.c
testcase is compiled with g++ instead of gcc, it fails.
Apparently that is because of slightly different ARRAY_REF
in the code, which for g++ leads into
(sign_extend:DI (zero_extend:SI (something:HI)))
in DEBUG_INSN (as opposed to
(zero_extend:DI (something:HI))
with C FE) and var-tracking doesn't happen to match it in some
places.  The above is something simplify-rtx.c can fix up though.

This patch fixes it by using simplify_gen_* instead of gen_* in many
more places than it did until now in expand_debug_expr.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-04-15  Jakub Jelinek  

* cfgexpand.c (expand_debug_expr): Use
simplify_gen_{unary,binary,ternary} instead of gen_rtx_*.

--- gcc/cfgexpand.c.jj  2011-04-09 19:29:19.081421040 +0200
+++ gcc/cfgexpand.c 2011-04-14 18:52:51.354402038 +0200
@@ -2364,6 +2364,7 @@ expand_debug_expr (tree exp)
 {
   rtx op0 = NULL_RTX, op1 = NULL_RTX, op2 = NULL_RTX;
   enum machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum machine_mode inner_mode = VOIDmode;
   int unsignedp = TYPE_UNSIGNED (TREE_TYPE (exp));
   addr_space_t as;
 
@@ -2410,6 +2411,7 @@ expand_debug_expr (tree exp)
 
 unary:
 case tcc_unary:
+  inner_mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0)));
   op0 = expand_debug_expr (TREE_OPERAND (exp, 0));
   if (!op0)
return NULL_RTX;
@@ -2513,7 +2515,7 @@ expand_debug_expr (tree exp)
 case NOP_EXPR:
 case CONVERT_EXPR:
   {
-   enum machine_mode inner_mode = GET_MODE (op0);
+   inner_mode = GET_MODE (op0);
 
if (mode == inner_mode)
  return op0;
@@ -2560,9 +2562,9 @@ expand_debug_expr (tree exp)
else if (TREE_CODE_CLASS (TREE_CODE (exp)) == tcc_unary
 ? TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp, 0)))
 : unsignedp)
- op0 = gen_rtx_ZERO_EXTEND (mode, op0);
+ op0 = simplify_gen_unary (ZERO_EXTEND, mode, op0, inner_mode);
else
- op0 = gen_rtx_SIGN_EXTEND (mode, op0);
+ op0 = simplify_gen_unary (SIGN_EXTEND, mode, op0, inner_mode);
 
return op0;
   }
@@ -2697,7 +2699,8 @@ expand_debug_expr (tree exp)
/* Don't use offset_address here, we don't need a
   recognizable address, and we don't want to generate
   code.  */
-   op0 = gen_rtx_MEM (mode, gen_rtx_PLUS (addrmode, op0, op1));
+   op0 = gen_rtx_MEM (mode, simplify_gen_binary (PLUS, addrmode,
+ op0, op1));
  }
 
if (MEM_P (op0))
@@ -2770,25 +2773,23 @@ expand_debug_expr (tree exp)
   }
 
 case ABS_EXPR:
-  return gen_rtx_ABS (mode, op0);
+  return simplify_gen_unary (ABS, mode, op0, mode);
 
 case NEGATE_EXPR:
-  return gen_rtx_NEG (mode, op0);
+  return simplify_gen_unary (NEG, mode, op0, mode);
 
 case BIT_NOT_EXPR:
-  return gen_rtx_NOT (mode, op0);
+  return simplify_gen_unary (NOT, mode, op0, mode);
 
 case FLOAT_EXPR:
-  if (unsignedp)
-   return gen_rtx_UNSIGNED_FLOAT (mode, op0);
-  else
-   return gen_rtx_FLOAT (mode, op0);
+  return simplify_gen_unary (TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp,
+0)))
+? UNSIGNED_FLOAT : FLOAT, mode, op0,
+inner_mode);
 
 case FIX_TRUNC_EXPR:
-  if (unsignedp)
-   return gen_rtx_UNSIGNED_FIX (mode, op0);
-  else
-   return gen_rtx_FIX (mode, op0);
+  return simplify_gen_unary (unsignedp ? UNSIGNED_FIX : FIX, mode, op0,
+inner_mode);
 
 case POINTER_PLUS_EXPR:
   /* For the rare target where pointers are not the same size as
@@ -2799,161 +2800,164 @@ expand_debug_expr (tree exp)
  && GET_MODE (op0) != GET_MODE (op1))
{
  if (GET_MODE_BITSIZE (GET_MODE (op0)) < GET_MODE_BITSIZE (GET_MODE 
(op1)))
-   op1 = gen_rtx_TRUNCATE (GET_MODE (op0), op1);
+   op1 = simplify_gen_unary (TRUNCATE, GET_MODE (op0), op1,
+ GET_MODE (op1));
  else
/* We always sign-extend, regardless of the signedness of
   the operand, because the operand is always unsigned
   here even if the original C expression is signed.  */
-   op1 = gen_rtx_SIGN_EXTEND (GET_MODE (op0), op1);
+   op1 = simplify_gen_unary (SIGN_EXTEND, GET_MODE (op0), op1,
+ GET_MODE (op1));
}
   /* Fall through.  */
 case PLUS_EXPR:
-  return gen_rtx_PLUS (mode, op0, op1);
+  return simplify_gen_binary (PLUS, mode, op0, op1);
 
 case MINUS_EXPR:
-  return gen_rtx_MINUS (mode, op0, op1);
+  return simplify_gen_binary (MINUS, mode, op0, op1);
 
 case MULT_EXPR:
-

Re: FDO usability trivial patch to fix div by zero error with insane profile

2011-04-15 Thread Xinliang David Li
There is no way to get a test case that can produce the problem in a
deterministic way.  There is really not much for Honza to review
though :)

David

On Fri, Apr 15, 2011 at 12:52 PM, Diego Novillo  wrote:
> On Fri, Apr 15, 2011 at 15:20, Xinliang David Li  wrote:
>> This is a trivial patch to deal with bad profile data (from
>> multi-threaded programs) that leads to divide by zero error.
>>
>> Ok for trunk?
>>
>> Thanks,
>>
>> David
>>
>>
>>
>> 2011-04-15  Xinliang David Li  
>>
>>        * ipa-inline.c (cgraph_decide_recursive_inlining): Fix
>>        div by zero error with insane profile.
>
> Looks fine to me, but you really want Honza to review this.  Any
> chance we can get a test case for this?
>
>
> Diego.
>


Re: ObjC: get rid of another unnecessary, temporary copy of instance variables

2011-04-15 Thread Mike Stump
On Apr 15, 2011, at 11:52 AM, Nicola Pero wrote:
> This patch for the Objective-C compiler gets rid of another case where
> we'd create a temporary tree chain with a copy of all the instance variables
> of a class just to do a simple check on them.

> Ok to commit ?

Ok.


Re: [PATCH] Prefer simplify_gen_* over gen_* in expand_debug_expr

2011-04-15 Thread Richard Guenther
On Fri, Apr 15, 2011 at 10:21 PM, Jakub Jelinek  wrote:
> Hi!
>
> On IRC richi mentioned yesterday that when gcc.dg/guality/asm-1.c
> testcase is compiled with g++ instead of gcc, it fails.
> Apparently that is because of slightly different ARRAY_REF
> in the code, which for g++ leads into
> (sign_extend:DI (zero_extend:SI (something:HI)))
> in DEBUG_INSN (as opposed to
> (zero_extend:DI (something:HI))
> with C FE) and var-tracking doesn't happen to match it in some
> places.  The above is something simplify-rtx.c can fix up though.
>
> This patch fixes it by using simplify_gen_* instead of gen_* in many
> more places than it did until now in expand_debug_expr.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2011-04-15  Jakub Jelinek  
>
>        * cfgexpand.c (expand_debug_expr): Use
>        simplify_gen_{unary,binary,ternary} instead of gen_rtx_*.
>
> --- gcc/cfgexpand.c.jj  2011-04-09 19:29:19.081421040 +0200
> +++ gcc/cfgexpand.c     2011-04-14 18:52:51.354402038 +0200
> @@ -2364,6 +2364,7 @@ expand_debug_expr (tree exp)
>  {
>   rtx op0 = NULL_RTX, op1 = NULL_RTX, op2 = NULL_RTX;
>   enum machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
> +  enum machine_mode inner_mode = VOIDmode;
>   int unsignedp = TYPE_UNSIGNED (TREE_TYPE (exp));
>   addr_space_t as;
>
> @@ -2410,6 +2411,7 @@ expand_debug_expr (tree exp)
>
>     unary:
>     case tcc_unary:
> +      inner_mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0)));
>       op0 = expand_debug_expr (TREE_OPERAND (exp, 0));
>       if (!op0)
>        return NULL_RTX;
> @@ -2513,7 +2515,7 @@ expand_debug_expr (tree exp)
>     case NOP_EXPR:
>     case CONVERT_EXPR:
>       {
> -       enum machine_mode inner_mode = GET_MODE (op0);
> +       inner_mode = GET_MODE (op0);
>
>        if (mode == inner_mode)
>          return op0;
> @@ -2560,9 +2562,9 @@ expand_debug_expr (tree exp)
>        else if (TREE_CODE_CLASS (TREE_CODE (exp)) == tcc_unary
>                 ? TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp, 0)))
>                 : unsignedp)
> -         op0 = gen_rtx_ZERO_EXTEND (mode, op0);
> +         op0 = simplify_gen_unary (ZERO_EXTEND, mode, op0, inner_mode);
>        else
> -         op0 = gen_rtx_SIGN_EXTEND (mode, op0);
> +         op0 = simplify_gen_unary (SIGN_EXTEND, mode, op0, inner_mode);
>
>        return op0;
>       }
> @@ -2697,7 +2699,8 @@ expand_debug_expr (tree exp)
>            /* Don't use offset_address here, we don't need a
>               recognizable address, and we don't want to generate
>               code.  */
> -           op0 = gen_rtx_MEM (mode, gen_rtx_PLUS (addrmode, op0, op1));
> +           op0 = gen_rtx_MEM (mode, simplify_gen_binary (PLUS, addrmode,
> +                                                         op0, op1));
>          }
>
>        if (MEM_P (op0))
> @@ -2770,25 +2773,23 @@ expand_debug_expr (tree exp)
>       }
>
>     case ABS_EXPR:
> -      return gen_rtx_ABS (mode, op0);
> +      return simplify_gen_unary (ABS, mode, op0, mode);
>
>     case NEGATE_EXPR:
> -      return gen_rtx_NEG (mode, op0);
> +      return simplify_gen_unary (NEG, mode, op0, mode);
>
>     case BIT_NOT_EXPR:
> -      return gen_rtx_NOT (mode, op0);
> +      return simplify_gen_unary (NOT, mode, op0, mode);
>
>     case FLOAT_EXPR:
> -      if (unsignedp)
> -       return gen_rtx_UNSIGNED_FLOAT (mode, op0);
> -      else
> -       return gen_rtx_FLOAT (mode, op0);
> +      return simplify_gen_unary (TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp,
> +                                                                        0)))
> +                                ? UNSIGNED_FLOAT : FLOAT, mode, op0,
> +                                inner_mode);
>
>     case FIX_TRUNC_EXPR:
> -      if (unsignedp)
> -       return gen_rtx_UNSIGNED_FIX (mode, op0);
> -      else
> -       return gen_rtx_FIX (mode, op0);
> +      return simplify_gen_unary (unsignedp ? UNSIGNED_FIX : FIX, mode, op0,
> +                                inner_mode);
>
>     case POINTER_PLUS_EXPR:
>       /* For the rare target where pointers are not the same size as
> @@ -2799,161 +2800,164 @@ expand_debug_expr (tree exp)
>          && GET_MODE (op0) != GET_MODE (op1))
>        {
>          if (GET_MODE_BITSIZE (GET_MODE (op0)) < GET_MODE_BITSIZE (GET_MODE 
> (op1)))
> -           op1 = gen_rtx_TRUNCATE (GET_MODE (op0), op1);
> +           op1 = simplify_gen_unary (TRUNCATE, GET_MODE (op0), op1,
> +                                     GET_MODE (op1));
>          else
>            /* We always sign-extend, regardless of the signedness of
>               the operand, because the operand is always unsigned
>               here even if the original C expression is signed.  */
> -           op1 = gen_rtx_SIGN_EXTEND (GET_MODE (op0), op1);
> +           op1 = simplify_gen_unary (SIGN_EXTEND, GET_MODE (op0), op1,
> +                                     GET_MODE (op1));
>        }
>       /* Fall through.  */
>     case PLUS_EXPR:
> -

Re: FDO usability trivial patch to fix div by zero error with insane profile

2011-04-15 Thread Diego Novillo
On Fri, Apr 15, 2011 at 16:21, Xinliang David Li  wrote:
> There is no way to get a test case that can produce the problem in a
> deterministic way.  There is really not much for Honza to review
> though :)

The check may be needed somewhere else, you may be papering over the
real issue.  I don't think you are but I don't know the code well
enough to say for sure.


Diego.


  1   2   >