Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup

2011-06-01 Thread Uros Bizjak
On Tue, May 31, 2011 at 8:08 PM, Ian Lance Taylor  wrote:
> Uros Bizjak  writes:
>
>> (BTW: Original calculation of Ctime_ns has a cut'n'paste error,
>> stat.Ctime.Nsec should be used instead of stat.Atime.Nsec).
>
> Thanks.  Fixed like so.  Bootstrapped and ran Go testsuite on
> x86_64-unknown-linux-gnu.  Committed to mainline.

Using your latest fixes, I was able to compile libgo on
alphaev68-pc-linux-gnu out of the box, without any additional patches.

One problem remains in the libgo testsuite: certain tests have to be
compiled with -mieee, otherwise FPE is generated for unordered values.
Any suggestions as to where -mieee should be placed?

Thanks,
Uros.
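
For context, a comparison is "unordered" when at least one operand is a NaN. The following is a minimal C sketch of the kind of comparison involved. It is not taken from the libgo testsuite, and the function names are illustrative only. Whether the plain `<` comparison traps depends on the target and on the floating-point flags in use; the report above is that on Alpha it raises an FPE unless the code is built with -mieee.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Plain relational comparison.  Under IEEE 754 semantics an unordered
   pair (either operand is a NaN) simply compares false.  */
bool less_than (double a, double b)
{
  return a < b;
}

/* Quiet variant: the C99 isless() macro is specified not to raise the
   "invalid" floating-point exception on unordered operands.  */
bool quiet_less_than (double a, double b)
{
  return isless (a, b);
}

/* True when the two values cannot be ordered, i.e. at least one of
   them is a NaN.  */
bool is_unordered (double a, double b)
{
  return isunordered (a, b);
}
```

Calling `less_than (NAN, 1.0)` is the interesting case: on a fully IEEE-conforming configuration it returns false rather than trapping.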


Re: New options to disable/enable any pass for any functions (issue4550056)

2011-06-01 Thread Jan Hubicka
> Please discard the previous one. This is the right one:
> 
> David
> Index: tree-pretty-print.c
> ===
> --- tree-pretty-print.c   (revision 174424)
> +++ tree-pretty-print.c   (working copy)
> @@ -3013,3 +3013,36 @@ pp_base_tree_identifier (pretty_printer 
>  pp_append_text (pp, IDENTIFIER_POINTER (id),
>   IDENTIFIER_POINTER (id) + IDENTIFIER_LENGTH (id));
>  }
> +
> +/* A helper function that is used to dump function information before the
> +   function dump.  */
> +
> +void
> +dump_function_header (FILE *dump_file, tree fdecl, struct function *fun)

You can get to FUN via DECL_STRUCT_FUNCTION (fdecl), which would save you a
parameter...
> Index: tree-pretty-print.h
> ===
> --- tree-pretty-print.h   (revision 174422)
> +++ tree-pretty-print.h   (working copy)
> @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  
>  #define GCC_TREE_PRETTY_PRINT_H
>  
>  #include "pretty-print.h"
> +#include "coretypes.h"

... and the need for this
> Index: coretypes.h
> ===
> --- coretypes.h   (revision 174422)
> +++ coretypes.h   (working copy)
> @@ -75,6 +75,7 @@ typedef struct diagnostic_context diagno
>  struct gimple_seq_d;
>  typedef struct gimple_seq_d *gimple_seq;
>  typedef const struct gimple_seq_d *const_gimple_seq;
> +struct function;

... and this.

Otherwise the patch seems OK.  You need to update Makefile.in for new 
dependencies.

Honza


Re: C6X port 5/11: Track predication conditions more accurately

2011-06-01 Thread Andrey Belevantsev

On 31.05.2011 23:59, Andrey Belevantsev wrote:


On 31.05.2011 22:24, Steve Ellcey wrote:

Bernd,

This patch (r174336) is causing me many testsuite failures on IA64.
Tests like gcc.c-torture/compile/20010408-1.c are dying with a
seg fault in vinsn_detach.

I will look at it tomorrow. Bernd, Steve, please let us know about any
issues with sel-sched code so we can help.
I cannot reproduce this with today's trunk using a cross compiler to either
ia64-linux or ia64-hpux. Can you give me a test case with compiler options,
etc.?


Andrey



Andrey



#0 0x55b8760:0 in vinsn_detach (vi=0xf)
#1 0x55bda30:0 in clear_expr (expr=0x7fffeef0)
#2 0x562ea50:0 in schedule_expr_on_boundary (bnd=0x4040cf14,
expr_vliw=0x4040c4ac, seqno=-1)
#3 0x562fab0:0 in fill_insns (fence=0x4040bdec, seqno=-1,
scheduled_insns_tailpp=0x7fffeff0)
#4 0x563d3b0:0 in schedule_on_fences (fences=0x4040bde8, max_seqno=1,
scheduled_insns_tailpp=0x7fffeff0)
#5 0x563eb70:0 in sel_sched_region_2 (orig_max_seqno=22)
#6 0x563f1b0:0 in sel_sched_region_1 ()
#7 0x56403f0:0 in sel_sched_region (rgn=9)
#8 0x5640980:0 in run_selective_scheduling ()
#9 0x613d0f0:0 in ia64_reorg ()

I am still trying to get more information but I was wondering if you
have already seen this problem or if anyone else has reported it?

Steve Ellcey
s...@cup.hp.com





Re: -fdump-passes -fenable-xxx=func_name_list

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 1:34 AM, Xinliang David Li  wrote:
> The following patch implements a new option that dumps the gcc PASS
> configuration. The sample output is attached.  There is one
> limitation: some placeholder passes that are named with '*xxx' are
> not registered, thus they are not listed. They are not important as
> they cannot be turned on/off anyway.
>
> The patch also enhances -fenable-xxx and -fdisable-xxx to allow a list
> of function assembler names to be specified.
>
> Ok for trunk?

Please split the patch.

I'm not too happy with how you dump the pass configuration.  Why not simply,
at a _single_ place, walk the pass tree?  Instead of doing pieces of it
at pass execution time when it's not already dumped - that really looks
gross.

The documentation should also link this option to the -fenable/disable
options as obviously the pass names in that dump are those to be
used for those flags (and not readily available anywhere else).

I also think that it would be way more useful to note in the individual
dump files the functions (at the place they would usually appear) that
have the pass explicitly enabled/disabled.

Richard.
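
The "walk the pass tree at a single place" suggestion can be sketched roughly as below. This is a self-contained illustration, not GCC code: `struct mock_pass` and `dump_pass_tree` are hypothetical names, and only the `sub` (first child) and `next` (next sibling) links are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a pass descriptor: only the fields needed
   to walk the tree are modelled.  */
struct mock_pass
{
  const char *name;
  struct mock_pass *sub;   /* First child pass, or NULL.  */
  struct mock_pass *next;  /* Next sibling pass, or NULL.  */
};

/* Print every pass reachable from PASS to OUT, indenting sub-passes by
   two spaces per nesting level.  Returns the number of passes visited.  */
int dump_pass_tree (FILE *out, const struct mock_pass *pass, int depth)
{
  int count = 0;

  assert (out != NULL && depth >= 0);
  for (; pass != NULL; pass = pass->next)
    {
      fprintf (out, "%*s%s\n", depth * 2, "", pass->name);
      count += 1 + dump_pass_tree (out, pass->sub, depth + 1);
    }
  return count;
}
```

Because the whole listing comes from one recursive walk, the output does not depend on which passes happened to execute.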

> Thanks,
>
> David
>


Re: approved but not committed? - [PATCH, ARM] Testcases incorrectly run in Thumb/Xscale

2011-06-01 Thread Richard Earnshaw

On Tue, 2011-05-31 at 12:49 -0700, Jing Yu wrote:
> Since this patch has been properly approved, if there is no objection
> in 24 hours, I will commit this patch to trunk.
> 

Once a patch has been approved by an appropriate maintainer, anybody
with an account for gcc can commit the patch.

R.

> Thanks,
> Jing
> 
> On Fri, May 27, 2011 at 3:55 PM, Jing Yu  wrote:
> > Hi Sofiane,
> >
> > I find your following patch has been approved by Richard in Oct last
> > year, but it is not in trunk.
> > Is there any problem with it?
> > http://gcc.gnu.org/ml/gcc-patches/2010-10/msg00266.html
> >
> > If you don't mind, I can help to commit the patch.
> >
> > Thanks,
> > Jing
> >
> 




Re: [PATCH, ARM] Thumb-2 12-bit immediates in ADD and SUB instructions

2011-06-01 Thread Andrew Stubbs

On 31/05/11 16:27, Dmitry Plotnikov wrote:

Would you include this in your patch? Or should we submit it as a
separate patch?


I'm not sure I *can* commit your patches, legally speaking, although 
this one is small enough that probably it's ok ... probably.


Perhaps you should submit it yourself, once mine has gone in (assuming 
it ever gets reviewed ... Richard) and then you can take all the glory 
you're due! ;)


Andrew


Re: PATCH: adding invoking_program to plugin_gcc_version

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 8:05 AM, Basile Starynkevitch
 wrote:
> On Wed, 1 Jun 2011 07:52:48 +0200
> Basile Starynkevitch  wrote:
>
>>
>> Hello All,
>>
>> The attached patch to trunk 174518 adds a field invoking_program to the
>> plugin_gcc_version structure. It tells the plugin which program
>> ("cc1", "cc1plus" or "lto1") is using it.
>
> Wrong patch, here is a better one
>
> # gcc/ChangeLog entry ##
> 2011-06-01  Basile Starynkevitch  
>
>        * gcc-plugin.h (struct plugin_gcc_version): Add invoking_program field.
>
>        * configure.ac: Ditto.
>
>        * configure: Regenerate.
>
>        * plugin.c (initialize_plugins): Set invoking_program.
>
> 
>
> Ok if it bootstraps?

It's redundant information with lang_hooks.name.  So, NO!  (yes,
again, no!)

Richard.

> Cheers
> --
> Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
> email: basilestarynkevitchnet mobile: +33 6 8501 2359
> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
> *** opinions {are only mine, sont seulement les miennes} ***
>


Re: New options to disable/enable any pass for any functions (issue4550056)

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 2:03 AM, Xinliang David Li  wrote:
> Please discard the previous one. This is the right one:

See also Honzas comments (on the wrong patch presumably ;)).

+  if (node)
+fprintf (dump_file, "\n;; Function %s (%s, funcdef_no=%d,
decl_uid = %d, cgraph_uid=%d)",
+ dname, aname, fun->funcdef_no, DECL_UID(fdecl), node->uid);
+  else
+fprintf (dump_file, "\n;; Function %s (%s, funcdef_no=%d, decl_uid = %d)",
+ dname, aname, fun->funcdef_no, DECL_UID(fdecl));
+
+  fprintf (dump_file, "%s\n\n",
+   node->frequency == NODE_FREQUENCY_HOT

you also need to check for node == NULL here.

Ok with this (and honzas suggested change for not passing struct
function *).

Thanks,
Richard.
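
The NULL check requested above could look roughly like the following sketch. It is a mock, not the actual patch: `struct mock_node`, `enum mock_frequency` and `mock_dump_header` are hypothetical stand-ins, and the output goes to a buffer instead of a dump file so the result is easy to inspect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the call-graph node and its frequency.  */
enum mock_frequency { MOCK_FREQ_NORMAL, MOCK_FREQ_HOT };

struct mock_node
{
  int uid;
  enum mock_frequency frequency;
};

/* Format a function header into BUF.  NODE may be NULL when the
   function has no call-graph node, so every dereference is guarded,
   including the one used for the frequency suffix.  Returns the value
   of snprintf, i.e. the length of the formatted header.  */
int mock_dump_header (char *buf, size_t size, const char *name,
                      int decl_uid, const struct mock_node *node)
{
  const char *suffix
    = (node != NULL && node->frequency == MOCK_FREQ_HOT) ? " (hot)" : "";

  assert (buf != NULL && name != NULL);
  if (node != NULL)
    return snprintf (buf, size, ";; Function %s (decl_uid=%d, cgraph_uid=%d)%s",
                     name, decl_uid, node->uid, suffix);
  return snprintf (buf, size, ";; Function %s (decl_uid=%d)%s",
                   name, decl_uid, suffix);
}
```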

> David
>
> On Tue, May 31, 2011 at 5:01 PM, Xinliang David Li  wrote:
>> The new patch is attached. The test (c,c++,fortran, java, ada) is ongoing.
>>
>> Thanks,
>>
>> David
>>
>> On Tue, May 31, 2011 at 9:06 AM, Xinliang David Li  
>> wrote:
>>> On Tue, May 31, 2011 at 2:05 AM, Richard Guenther
>>>  wrote:
 On Mon, May 30, 2011 at 10:16 PM, Xinliang David Li  
 wrote:
> The attached are two simple follow up patches
>
> 1) the first patch does some refactorization on function header
> dumping (with more information printed)
>
> 2) the second patch cleans up some pass names. Part of the cleanup
> results from a previous discussion with Honza -- a) rename
> 'tree_profile_ipa' into 'profile', and make 'ipa_profile' and
> 'profile' into 'profile_estimate'. The rest of cleanups are needed to
> make sure pass names are unique.
>
> Ok for trunk?

 +
 +void
 +pass_dump_function_header (FILE *dump_file, tree fdecl, struct function 
 *fun)

 This function needs documentation, the ChangeLog entry misses
 the tree-pretty-print.h change.

 +struct function;

 instead of this please include coretypes.h from tree-pretty-print.h
 and add the struct function forward declaration there if it isn't already
 present.

 You change the output of the header, so please make sure you
 have bootstrapped and tested with _all_ languages included
 (and also watch for bugreports for target specific bugs).

>>>
>>> Ok.
>>>
 +  fprintf (dump_file, "\n;; Function %s (%s, funcdef_no=%d, uid=%d)",
 +           dname, aname, fun->funcdef_no, node->uid);

 I see no point in dumping funcdef_no - it wasn't dumped before in
 any place.  Instead I miss dumping of the DECL_UID and thus
 a more specific 'uid', like 'cgraph-uid'.
>>>
>>> Ok will add decl_uid. Funcdef_no is very useful for debugging FDO
>>> coverage mismatch related problems as it is the id used in profile
>>> hashing.
>>>

 +  aname = (IDENTIFIER_POINTER
 +          (DECL_ASSEMBLER_NAME (fdecl)));

 using DECL_ASSEMBLER_NAME is bad - it might trigger computation
 of DECL_ASSEMBLER_NAME which certainly shouldn't be done
 only for dumping purposes.  Instead do sth like

   if (DECL_ASSEMBLER_NAME_SET_P (fdecl))
     aname = DECL_ASSEMBLER_NAME (fdecl);
   else
     aname = '';
>>>
>>> Ok.
>>>

 and please also watch for cgraph_get_node returning NULL.

 Also please call the function dump_function_header instead of
 pass_dump_function_header.

>>>
>>> Ok.
>>>
>>> Thanks,
>>>
>>> David
 Please re-post with appropriate changes.

 Thanks,
 Richard.

> Thanks,
>
> David
>
> On Fri, May 27, 2011 at 2:58 AM, Richard Guenther
>  wrote:
>> On Fri, May 27, 2011 at 12:02 AM, Xinliang David Li  
>> wrote:
>>> The latest version that only exports 2 functions from passes.c.
>>
>> Ok with ...
>>
>> @@ -637,4 +637,8 @@ extern bool first_pass_instance;
>>  /* Declare for plugins.  */
>>  extern void do_per_function_toporder (void (*) (void *), void *);
>>
>> +extern void disable_pass (const char *);
>> +extern void enable_pass (const char *);
>> +struct function;
>> +
>>
>> struct function forward decl removed.
>>
>> +  explicitly_enabled = is_pass_explicitly_enabled (pass, func);
>> +  explicitly_disabled = is_pass_explicitly_disabled (pass, func);
>>
>> both functions inlined here and removed.
>>
>> +#define MAX_PASS_ID 512
>>
>> this removed and instead a VEC_safe_grow_cleared () or VEC_length ()
>> before the accesses.
>>
>> +-fenable-ipa-@var{pass} @gol
>> +-fenable-rtl-@var{pass} @gol
>> +-fenable-rtl-@var{pass}=@var{range-list} @gol
>> +-fenable-tree-@var{pass} @gol
>> +-fenable-tree-@var{pass}=@var{range-list} @gol
>>
>> -fenable-@var{kind}-@var{pass}, etc.
>>
>> +@item -fdisable-@var{ipa|tree|rtl}-@var{pass}
>> +@itemx -fenable-@var{ipa|tree|rtl}-@var{pass}
>> +@itemx -fdisable-@var{tree|rtl}-@var{pass}=@var{range-list}
>> +@itemx -fenable-@var

[patch] Improve detection of widening multiplication in the vectorizer

2011-06-01 Thread Ira Rosen
Hi,

The vectorizer expects widening multiplication pattern to be:

 type a_t, b_t;
 TYPE a_T, b_T, prod_T;

 a_T = (TYPE) a_t;
 b_T = (TYPE) b_t;
 prod_T = a_T * b_T;

where type 'TYPE' is double the size of type 'type'. This works fine
when the types are signed. For the unsigned types the code looks like:

 unsigned type a_t, b_t;
 unsigned TYPE u_prod_T;
 TYPE a_T, b_T, prod_T;

  a_T = (TYPE) a_t;
  b_T = (TYPE) b_t;
  prod_T = a_T * b_T;
  u_prod_T = (unsigned TYPE) prod_T;

i.e., the multiplication is done on signed, followed by a cast to unsigned.
This patch adds support for such patterns and generates
WIDEN_MULT_EXPR for the unsigned type.

Another unsupported case is multiplication by a constant (e.g., b_T is
a constant). This patch checks that the constant fits the smaller type
'type' and recognizes such cases as widening multiplication.

Bootstrapped and tested on powerpc64-suse-linux. Tested the
vectorization testsuite on arm-linux-gnueabi.
I'll commit the patch shortly if there are no comments/objections.

Ira
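
As an illustration of the three source shapes described above, here is a small C sketch. The function names and the constant 2333 are made up for the example, and whether a given GCC version and target actually vectorizes these loops with a widening multiply is not something the sketch demonstrates. In the unsigned variant, C integer promotion makes the multiplication happen in a signed type, with the conversion to unsigned applied to the product, which resembles the second pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signed case: both 16-bit operands are widened to 32 bits before the
   multiplication.  */
void widen_mult_s16 (const int16_t *a, const int16_t *b, int32_t *prod,
                     size_t n)
{
  assert (a != NULL && b != NULL && prod != NULL);
  for (size_t i = 0; i < n; i++)
    prod[i] = (int32_t) a[i] * (int32_t) b[i];
}

/* Unsigned case: the 8-bit operands are promoted to (signed) int, the
   product is computed there, and only then converted to unsigned.  */
void widen_mult_u8 (const uint8_t *a, const uint8_t *b, uint16_t *prod,
                    size_t n)
{
  assert (a != NULL && b != NULL && prod != NULL);
  for (size_t i = 0; i < n; i++)
    prod[i] = (uint16_t) (a[i] * b[i]);
}

/* Constant case: one operand is a constant that fits the narrow type.  */
void widen_mult_const_s16 (const int16_t *a, int32_t *prod, size_t n)
{
  assert (a != NULL && prod != NULL);
  for (size_t i = 0; i < n; i++)
    prod[i] = (int32_t) a[i] * 2333;
}
```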

ChangeLog:

   * tree-vectorizer.h (vect_recog_func_ptr): Make last argument to be
   a pointer.
   * tree-vect-patterns.c (vect_recog_widen_sum_pattern,
   vect_recog_widen_mult_pattern, vect_recog_dot_prod_pattern,
   vect_recog_pow_pattern): Likewise.
   (vect_pattern_recog_1): Remove declaration.
   (widened_name_p): Remove declaration.  Add new argument to specify
   whether to check that both types are either signed or unsigned.
   (vect_recog_widen_mult_pattern): Update documentation.  Handle
   unsigned patterns and multiplication by constants.
   (vect_pattern_recog_1): Update vect_recog_func references.  Use
   statement information from the statement returned from pattern
   detection functions.
   (vect_pattern_recog): Update vect_recog_func reference.
   * tree-vect-stmts.c (vectorizable_type_promotion): For widening
   multiplication by a constant use the type of the other operand.

testsuite/ChangeLog:

   * lib/target-supports.exp
   (check_effective_target_vect_widen_mult_qi_to_hi):
   Add NEON as supporting target.
   (check_effective_target_vect_widen_mult_hi_to_si): Likewise.
   (check_effective_target_vect_widen_mult_qi_to_hi_pattern): New.
   (check_effective_target_vect_widen_mult_hi_to_si_pattern): New.
   * gcc.dg/vect/vect-widen-mult-u8.c: Expect to be vectorized using
   widening multiplication on targets that support it.
   * gcc.dg/vect/vect-widen-mult-u16.c: Likewise.
   * gcc.dg/vect/vect-widen-mult-const-s16.c: New test.
   * gcc.dg/vect/vect-widen-mult-const-u16.c: New test.
Index: tree-vectorizer.h
===
--- tree-vectorizer.h   (revision 174475)
+++ tree-vectorizer.h   (working copy)
@@ -896,7 +896,7 @@ extern void vect_slp_transform_bb (basic_block);
 /* Pattern recognition functions.
Additional pattern recognition functions can (and will) be added
in the future.  */
-typedef gimple (* vect_recog_func_ptr) (gimple, tree *, tree *);
+typedef gimple (* vect_recog_func_ptr) (gimple *, tree *, tree *);
 #define NUM_PATTERNS 4
 void vect_pattern_recog (loop_vec_info);
 
Index: tree-vect-patterns.c
===
--- tree-vect-patterns.c   (revision 174475)
+++ tree-vect-patterns.c   (working copy)
@@ -38,16 +38,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "recog.h"
 #include "diagnostic-core.h"
 
-/* Function prototypes */
-static void vect_pattern_recog_1
-  (gimple (* ) (gimple, tree *, tree *), gimple_stmt_iterator);
-static bool widened_name_p (tree, gimple, tree *, gimple *);
-
 /* Pattern recognition functions  */
-static gimple vect_recog_widen_sum_pattern (gimple, tree *, tree *);
-static gimple vect_recog_widen_mult_pattern (gimple, tree *, tree *);
-static gimple vect_recog_dot_prod_pattern (gimple, tree *, tree *);
-static gimple vect_recog_pow_pattern (gimple, tree *, tree *);
+static gimple vect_recog_widen_sum_pattern (gimple *, tree *, tree *);
+static gimple vect_recog_widen_mult_pattern (gimple *, tree *, tree *);
+static gimple vect_recog_dot_prod_pattern (gimple *, tree *, tree *);
+static gimple vect_recog_pow_pattern (gimple *, tree *, tree *);
 static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
vect_recog_widen_mult_pattern,
vect_recog_widen_sum_pattern,
@@ -61,10 +56,12 @@ static vect_recog_func_ptr vect_vect_recog_func_pt
is a result of a type-promotion, such that:
  DEF_STMT: NAME = NOP (name0)
where the type of name0 (HALF_TYPE) is smaller than the type of NAME.
-*/
+   If CHECK_SIGN is TRUE, check that either both types are signed or both are
+   unsigned.  */
 
 static bool
-widened_name_p (tree name, gimple use_stmt, tree *half_type, gimple *def_stmt)
+widened_name_p (tree name, gim

Re: RFC: explicitely mark out-of-scope deaths

2011-06-01 Thread Richard Sandiford
Michael Matz  writes:
> Stores are better than builtin functions here, so as to not artificially 
> take addresses of the decls in question.

For the record, you wouldn't need to take the address if you had an
internal function (internal-fn.def) of the form:

MEM_REF [] = internal_fn_that_returns_unknown_data ();

This was one of the reasons for adding internal functions, and we use
a similar technique for the interleaved load/stores.

Not an argument in favour of using calls.  There are probably other
reasons to prefer your representation.  It just seemed that, whatever
the arguments against using calls are, taking the address doesn't
need to be one of them.

Richard


Re: New options to disable/enable any pass for any functions (issue4550056)

2011-06-01 Thread Jan Hubicka
> On Wed, Jun 1, 2011 at 2:03 AM, Xinliang David Li  wrote:
> > Please discard the previous one. This is the right one:
> 
> See also Honzas comments (on the wrong patch presumably ;)).
> 
> +  if (node)
> +fprintf (dump_file, "\n;; Function %s (%s, funcdef_no=%d,
> decl_uid = %d, cgraph_uid=%d)",
> + dname, aname, fun->funcdef_no, DECL_UID(fdecl), node->uid);
> +  else
> +fprintf (dump_file, "\n;; Function %s (%s, funcdef_no=%d, decl_uid = 
> %d)",
> + dname, aname, fun->funcdef_no, DECL_UID(fdecl));
> +
> +  fprintf (dump_file, "%s\n\n",
> +   node->frequency == NODE_FREQUENCY_HOT
> 
> you also need to check for node == NULL here.
> 
> Ok with this (and honzas suggested change for not passing struct
> function *).
Also, I still think funcdef_no is a rather redundant UID.  For debugging
profiling I would go for cgraph_uid instead.  It also has the advantage of
being available at the WPA stage.  But I have nothing against printing it
here.

Honza


Re: [patch] Improve detection of widening multiplication in the vectorizer

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 11:23 AM, Ira Rosen  wrote:
> Hi,
>
> The vectorizer expects widening multiplication pattern to be:
>
>     type a_t, b_t;
>     TYPE a_T, b_T, prod_T;
>
>     a_T = (TYPE) a_t;
>     b_T = (TYPE) b_t;
>     prod_T = a_T * b_T;
>
> where type 'TYPE' is double the size of type 'type'. This works fine
> when the types are signed. For the unsigned types the code looks like:
>
>     unsigned type a_t, b_t;
>     unsigned TYPE u_prod_T;
>     TYPE a_T, b_T, prod_T;
>
>      a_T = (TYPE) a_t;
>      b_T = (TYPE) b_t;
>      prod_T = a_T * b_T;
>      u_prod_T = (unsigned TYPE) prod_T;
>
> i.e., the multiplication is done on signed, followed by a cast to unsigned.
> This patch adds support for such patterns and generates
> WIDEN_MULT_EXPR for the unsigned type.
>
> Another unsupported case is multiplication by a constant (e.g., b_T is
> a constant). This patch checks that the constant fits the smaller type
> 'type' and recognizes such cases as widening multiplication.
>
> Bootstrapped and tested on powerpc64-suse-linux. Tested the
> vectorization testsuite on arm-linux-gnueabi.
> I'll commit the patch shortly if there are no comments/objections.

Did you think about moving pass_optimize_widening_mul before
loop optimizations?  Does that pass catch the cases you are
teaching the pattern recognizer?  I think we should try to expose
these more complicated instructions to loop optimizers.

Thanks,
Richard.

> Ira
>
> ChangeLog:
>
>       * tree-vectorizer.h (vect_recog_func_ptr): Make last argument to be
>       a pointer.
>       * tree-vect-patterns.c (vect_recog_widen_sum_pattern,
>       vect_recog_widen_mult_pattern, vect_recog_dot_prod_pattern,
>       vect_recog_pow_pattern): Likewise.
>       (vect_pattern_recog_1): Remove declaration.
>       (widened_name_p): Remove declaration.  Add new argument to specify
>       whether to check that both types are either signed or unsigned.
>       (vect_recog_widen_mult_pattern): Update documentation.  Handle
>       unsigned patterns and multiplication by constants.
>       (vect_pattern_recog_1): Update vect_recog_func references.  Use
>       statement information from the statement returned from pattern
>       detection functions.
>       (vect_pattern_recog): Update vect_recog_func reference.
>       * tree-vect-stmts.c (vectorizable_type_promotion): For widening
>       multiplication by a constant use the type of the other operand.
>
> testsuite/ChangeLog:
>
>       * lib/target-supports.exp
>       (check_effective_target_vect_widen_mult_qi_to_hi):
>       Add NEON as supporting target.
>       (check_effective_target_vect_widen_mult_hi_to_si): Likewise.
>       (check_effective_target_vect_widen_mult_qi_to_hi_pattern): New.
>       (check_effective_target_vect_widen_mult_hi_to_si_pattern): New.
>       * gcc.dg/vect/vect-widen-mult-u8.c: Expect to be vectorized using
>       widening multiplication on targets that support it.
>       * gcc.dg/vect/vect-widen-mult-u16.c: Likewise.
>       * gcc.dg/vect/vect-widen-mult-const-s16.c: New test.
>       * gcc.dg/vect/vect-widen-mult-const-u16.c: New test.
>


Re: RFC: explicitely mark out-of-scope deaths

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 11:25 AM, Richard Sandiford
 wrote:
> Michael Matz  writes:
>> Stores are better than builtin functions here, so as to not artificially
>> take addresses of the decls in question.
>
> For the record, you wouldn't need to take the address if you had an
> internal function (internal-fn.def) of the form:
>
>    MEM_REF [] = internal_fn_that_returns_unknown_data ();
>
> This was one of the reasons for adding internal functions, and we use
> a similar technique for the interleaved load/stores.
>
> Not an argument in favour of using calls.  There are probably other
> reasons to prefer your representation.  It just seemed that, whatever
> the arguments against using calls are, taking the address doesn't
> need to be one of them.

True at least since we have internal functions ;)  Still an aggregate
assignment looks less disturbing to random optimizers than a call.

Richard.


Re: [PATH] PR/49139 fix always_inline failures diagnostics

2011-06-01 Thread Richard Guenther
On Tue, May 31, 2011 at 6:03 PM, Christian Bruel  wrote:
>
>
> On 05/31/2011 11:18 AM, Richard Guenther wrote:
>>
>> On Tue, May 31, 2011 at 9:54 AM, Christian Bruel
>>  wrote:
>>>
>>> Hello,
>>>
>>> The attached patch fixes a few diagnostic discrepancies for always_inline
>>> failures.
>>>
>>> Illustrated by the fail_always_inline[12].c attached cases, the current
>>> behavior is one of:
>>>
>>> - success (with and without -Winline), silently not honoring
>>> always_inline
>>>   gcc fail_always_inline1.c -S -Winline -O0 -fpic
>>>   gcc fail_always_inline1.c -S -O2 -fpic
>>>
>>> - error: with -Winline but not without
>>>   gcc fail_always_inline1.c -S -Winline -O2 -fpic
>>>
>>> - error: without -Winline
>>>   gcc fail_always_inline2.c -S -fno-early-inlining -O2
>>>   or the original c++ attachment in this defect
>>>
>>> note that -Winline never warns, as stated in the documentation
>>>
>>> This simple patch consistently emits a warning (changing the sorry
>>> unimplemented message) whenever the attribute is not honored.
>>>
>>> My first inclination was to generate an error instead of the warning,
>>> but since it is possible that inlining is only performed at LTO time,
>>> an error would be inappropriate (note that today this is not possible
>>> with -Winline, which would abort).
>>>
>>> Another alternative I considered would be to emit the warning under
>>> -Winline rather than unconditionally, but this is more a user misuse
>>> of the attribute, so it should always be warned about anyway. Or maybe
>>> a new -Winline-always that would be activated under -Wall? Other
>>> opinions welcome.
>>>
>>> Tested with standard bootstrap and regression on x86.
>>>
>>> Comments, and/or OK for trunk ?
>>
>> The patch is not ok, we may not fail to inline an always_inline
>> function.
>
> OK, I thought so that would be an error. but I was afraid to abort the
> inline of function with a missing body (provided by another file) by LTO,
> which would be a regression. rethinking about this and considering that a
> valid LTO program should be valid without LTO, and that the scope is the
> translation unit, that would be OK to always reject attribute_inline on
> functions without a body.
>
> To make this more consistent I proposed to warn
>>
>> whenever you take the address of an always_inline function
>> (because then you can confuse GCC by indirectly calling
>> such function which we might inline dependent on optimization
>> setting and which we might discover we didn't inline only
>> dependent on optimization setting).Honza proposed to move
>> the sorry()ing to when we feel the need to output the
>> always_inline function, thus when it was not optimized away,
>> but that would require us not preserving the body (do we?)
>> with -fpreserve-inline-functions.
>>
>
> But we don't know if taking the address of the function will result in a
> call to it, or how the pointer will be used. So I think the check should be
> done at the caller site. But code like:
>
> inline __attribute__((always_inline))  int foo() { return 0; }
>
> static int (*ptr)() = foo;
>
> int
> bar()
> {
>  return ptr();
> }
>
> is (legitimately) not inlined, but without a diagnostic. I don't know where
> to look at this at the moment.

Yeah, the issue is that we only warn if we end up seeing a direct
call to an always_inline function that wasn't inlined.  The idea of
sorrying() for remaining always_inline bodies instead would also
catch the above, but so would

inline __attribute__((always_inline))  int foo() { return 0; }
int (*ptr)() = foo;

(address-taken but not called).

>> For fail_always_inline1.c we should diagnose the apparent
>> misuse of always_inline with a warning, drop the attribute
>> but keep DECL_DISREGARD_INLINE_LIMITS set.
>>
>> Same for fail_always_inline2.c.
>>
>> I agree that sorry()ing for those cases is odd.  Either we
>> should reject the declarations upfront
>> ("always-inline function will not be inlinable"), or we should
>> emit a warning of that kind and make sure to not sorry later.
>
> In addition to the error at the caller site, I've added a specific warning
> about the missing inline keyword.

But not in a good place.  Please instead check it alongside the
other attribute checks in cgraphunit.c:process_function_and_variable_attributes

> Thanks for your comments, here is the new patch that I'm testing, (without
> the handling of indirect calls which can be treated separately).

Index: gcc/ipa-inline-transform.c
===
--- gcc/ipa-inline-transform.c  (revision 174264)
+++ gcc/ipa-inline-transform.c  (working copy)
@@ -302,9 +302,20 @@

   for (e = node->callees; e; e = e->next_callee)
 {
-  cgraph_redirect_edge_call_stmt_to_callee (e);
+  gimple call = cgraph_redirect_edge_call_stmt_to_callee (e);
+
+  if (!inline_p)
+   {
   if (!e->inline_failed || warn_inline)
 inline_p = true;
+ else
+   {
+ tree fn = g

[PATCH][all-langs] Defer size_t and sizetype setting to the middle-end

2011-06-01 Thread Richard Guenther

This patch defers the control over size_t and sizetype to the
middle-end which in turn consults the target.  This removes
various inconsistencies for frontends that do not seem to care
about size_t and will allow simplifying the global tree initialization.

Bootstrapped on x86_64-unknown-linux-gnu for all languages, testing
in progress.

Ok for trunk?  (the change is worthwhile from an LTO and middle-end
perspective and I'll apply leeway to frontends that appear to be
unmaintained - hello Java)

Thanks,
Richard.
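
The selection logic in the tree.c hunk of this patch follows a simple string dispatch on the target's SIZE_TYPE. A rough, self-contained sketch of that shape is given here. `mock_size_type_width` is a hypothetical name, and it returns a byte width on the host instead of a type node, purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration: map a SIZE_TYPE-style type name to the
   width in bytes of that type on the host.  Returns 0 for a name that
   is not one of the three recognized spellings; the patch itself
   treats that situation as unreachable.  */
size_t mock_size_type_width (const char *size_type)
{
  assert (size_type != NULL);
  if (strcmp (size_type, "unsigned int") == 0)
    return sizeof (unsigned int);
  else if (strcmp (size_type, "long unsigned int") == 0)
    return sizeof (long unsigned int);
  else if (strcmp (size_type, "long long unsigned int") == 0)
    return sizeof (long long unsigned int);
  return 0;
}
```

Note that the match is on the exact spelling, so "unsigned long" would not be recognized even though it names the same C type as "long unsigned int".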

2011-06-01  Richard Guenther  

* tree.c (build_common_tree_nodes): Also initialize size_type_node.
Call set_sizetype from here.

c-family/
* c-common.c (c_common_nodes_and_builtins): Do not set
size_type_node or call set_sizetype.

go/
* go-lang.c (go_langhook_init): Do not set
size_type_node or call set_sizetype.

fortran/
* f95-lang.c (gfc_init_decl_processing): Do not set
size_type_node or call set_sizetype.

java/
* decl.c (java_init_decl_processing): Properly initialize
size_type_node.

lto/
* lto-lang.c (lto_init): Do not set
size_type_node or call set_sizetype.

ada/
* gcc-interface/misc.c (gnat_init): Do not set
size_type_node or call set_sizetype.

Index: gcc/tree.c
===
--- gcc/tree.c  (revision 174520)
+++ gcc/tree.c  (working copy)
@@ -9142,6 +9142,7 @@ build_common_tree_nodes (bool signed_cha
 int128_unsigned_type_node = make_unsigned_type (128);
   }
 #endif
+
   /* Define a boolean type.  This type only represents boolean values but
  may be larger than char depending on the value of BOOL_TYPE_SIZE.
  Front ends which want to override this size (i.e. Java) can redefine
@@ -9151,6 +9152,17 @@ build_common_tree_nodes (bool signed_cha
   TYPE_MAX_VALUE (boolean_type_node) = build_int_cst (boolean_type_node, 1);
   TYPE_PRECISION (boolean_type_node) = 1;
 
+  /* Define what type to use for size_t.  */
+  if (strcmp (SIZE_TYPE, "unsigned int") == 0)
+size_type_node = unsigned_type_node;
+  else if (strcmp (SIZE_TYPE, "long unsigned int") == 0)
+size_type_node = long_unsigned_type_node;
+  else if (strcmp (SIZE_TYPE, "long long unsigned int") == 0)
+size_type_node = long_long_unsigned_type_node;
+  else
+gcc_unreachable ();
+  set_sizetype (size_type_node);
+
   /* Fill in the rest of the sized types.  Reuse existing type nodes
  when possible.  */
   intQI_type_node = make_or_reuse_type (GET_MODE_BITSIZE (QImode), 0);
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 174520)
+++ gcc/c-family/c-common.c (working copy)
@@ -4666,13 +4666,7 @@ c_common_nodes_and_builtins (void)
 TYPE_DECL, NULL_TREE,
 widest_unsigned_literal_type_node));
 
-  /* `unsigned long' is the standard type for sizeof.
- Note that stddef.h uses `unsigned long',
- and this must agree, even if long and int are the same size.  */
-  size_type_node =
-TREE_TYPE (identifier_global_value (get_identifier (SIZE_TYPE)));
   signed_size_type_node = c_common_signed_type (size_type_node);
-  set_sizetype (size_type_node);
 
   pid_type_node =
 TREE_TYPE (identifier_global_value (get_identifier (PID_TYPE)));
Index: gcc/go/go-lang.c
===
--- gcc/go/go-lang.c    (revision 174520)
+++ gcc/go/go-lang.c    (working copy)
@@ -87,15 +87,6 @@ go_langhook_init (void)
 {
   build_common_tree_nodes (false);
 
-  /* The sizetype may be "unsigned long" or "unsigned long long".  */
-  if (TYPE_MODE (long_unsigned_type_node) == ptr_mode)
-size_type_node = long_unsigned_type_node;
-  else if (TYPE_MODE (long_long_unsigned_type_node) == ptr_mode)
-size_type_node = long_long_unsigned_type_node;
-  else
-size_type_node = long_unsigned_type_node;
-  set_sizetype (size_type_node);
-
   build_common_tree_nodes_2 (0);
 
   /* We must create the gogo IR after calling build_common_tree_nodes
Index: gcc/fortran/f95-lang.c
===
--- gcc/fortran/f95-lang.c  (revision 174520)
+++ gcc/fortran/f95-lang.c  (working copy)
@@ -590,9 +590,6 @@ gfc_init_decl_processing (void)
  want double_type_node to actually have double precision.  */
   build_common_tree_nodes (false);
 
-  size_type_node = gfc_build_uint_type (POINTER_SIZE);
-  set_sizetype (size_type_node);
-
   build_common_tree_nodes_2 (0);
   void_list_node = build_tree_list (NULL_TREE, void_type_node);
 
Index: gcc/java/decl.c
===
--- gcc/java/decl.c (revision 174520)
+++ gcc/java/decl.c (working copy)
@@ -606,7 +606,14 @@ java_init_decl_pro

Re: [patch] Improve detection of widening multiplication in the vectorizer

2011-06-01 Thread Ira Rosen
On 1 June 2011 12:42, Richard Guenther  wrote:

> Did you think about moving pass_optimize_widening_mul before
> loop optimizations?  Does that pass catch the cases you are
> teaching the pattern recognizer?  I think we should try to expose
> these more complicated instructions to loop optimizers.
>

pass_optimize_widening_mul doesn't catch these cases, but I can try to
teach it instead of the vectorizer.
I am now testing

Index: passes.c
===
--- passes.c(revision 174391)
+++ passes.c(working copy)
@@ -870,6 +870,7 @@
   NEXT_PASS (pass_split_crit_edges);
   NEXT_PASS (pass_pre);
   NEXT_PASS (pass_sink_code);
+  NEXT_PASS (pass_optimize_widening_mul);
   NEXT_PASS (pass_tree_loop);
{
  struct opt_pass **p = &pass_tree_loop.pass.sub;
@@ -934,7 +935,6 @@
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_phiopt);
   NEXT_PASS (pass_fold_builtins);
-  NEXT_PASS (pass_optimize_widening_mul);
   NEXT_PASS (pass_tail_calls);
   NEXT_PASS (pass_rename_ssa_copies);
   NEXT_PASS (pass_uncprop);

to see how it affects other loop optimizations (vectorizer pattern
tests obviously fail).

Thanks,
Ira

> Thanks,
> Richard.
>


Re: [lto] Merge streamer hooks from pph branch. (issue4568043)

2011-06-01 Thread Richard Guenther
On Tue, 31 May 2011, Diego Novillo wrote:

> 
> This patch merges the LTO streamer hooks from the pph branch.
> 
> These hooks are meant to separate streaming work that is specific to
> GIMPLE from other modules that may want to stream their own data
> structures.  In the PPH branch, we are using the streamer to save
> front end data structures, so there was a bunch of work and checks
> that made no sense in the front end.
> 
> All this module-specific work can be moved to the hooks defined in
> struct lto_streamer_hooks.
> 
> One of the concerns I had with the hooks is potential overhead when
> streaming because of the new indirect calls added by the patch.  This
> does not seem to be the case.
> 
> I did several timing runs using insn-emit.o (the biggest object file
> we generate in a bootstrap) and using an LTO link of cc1.  The times
> are slightly in favour of the hooks, actually (the patch removes some
> checks from the core tree pickling routines), but the difference is
> insignificant.
> 
> Over 5 runs on an idle system: Without this patch, we compile
> insn-emit.o in 41.56 secs (wall time), with the patch the time is
> 41.34 secs (wall time).
> 
> The other change in the patch is to reserve enough tags in enum
> LTO_tags to cover all the possible tree codes.  This exposes the bug
> that I fixed yesterday in the pph branch (the sign of the shifts in
> lto_output_int_in_range).  This means that we need to use
> output_record_start (LTO_null) instead of output_zero to output
> 0-delimiters, which increases the size of the output by less than 1%
> on average.

Can you split this fix out from this merge?
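
For readers following the thread, the hook arrangement described in the quoted message (optional callbacks that the core routines invoke only when set) can be pictured with a minimal sketch. Every name below is a hypothetical stand-in chosen for illustration; none of these are the declarations from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a tree node; not a GCC type.  */
struct node_sketch { int code; int payload; };

/* Hypothetical hook table.  A NULL member means "no module-specific work".  */
struct streamer_hooks_sketch {
  void (*write_node) (const struct node_sketch *, int *sink);
};

/* Zero-initialized, so every hook starts out unset.  */
static struct streamer_hooks_sketch hooks_sketch;

/* Core routine: do the generic work, then call the hook only if it is set.  */
static void
generic_write (const struct node_sketch *n, int *sink)
{
  *sink += n->code;
  if (hooks_sketch.write_node)
    hooks_sketch.write_node (n, sink);
}

/* One module's implementation of the hook.  */
static void
module_write (const struct node_sketch *n, int *sink)
{
  *sink += n->payload;
}
```

The cost in question is the one indirect call per hook invocation, which is what the timing runs quoted above were measuring.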

> Tested with LTO profiledbootstrap on x86_64.  OK for trunk?

Comments inline

> 
> Diego.
> 
> 
>   * Makefile.in (cgraphunit.o): Add dependency on LTO_STREAMER_H.
>   * cgraphunit.c: Include lto-streamer.h
>   (cgraph_finalize_compilation_unit): Call gimple_streamer_hooks_init
>   if LTO is enabled.
>   * lto-streamer-in.c (unpack_value_fields): Do not check
>   for TS_SSA_NAME, TS_STATEMENT_LIST and TS_OMP_CLAUSE.
>   Call lto_streamer_hooks.unpack_value_fields if set.
>   (lto_materialize_tree): For unhandled nodes, first try to
>   call lto_streamer_hooks.alloc_tree, if it exists.
>   (lto_input_ts_decl_common_tree_pointers): Move reading of
>   DECL_INITIAL to gimple_streamer_read_tree.
>   (lto_input_tree_pointers): Do not handle TS_SSA_NAME,
>   TS_STATEMENT_LIST, TS_OMP_CLAUSE and TS_OPTIMIZATION.
>   (lto_read_tree): Call lto_streamer_hooks.read_tree if
>   set.
>   Only register symbols in symtab if
>   lto_streamer_hooks.register_decls_in_symtab_p is set.
>   (gimple_streamer_read_tree): New.
>   (lto_reader_init): Rename from lto_init_reader.
>   Move initialization code to gimple_streamer_reader_init.
>   Call lto_streamer_hooks.reader_init if set.
>   (gimple_streamer_reader_init): New.
>   * lto-streamer-out.c (pack_value_fields): Do not handle
>   TS_SSA_NAME, TS_STATEMENT_LIST and TS_OMP_CLAUSE.
>   Call lto_streamer_hooks.pack_value_fields if set.
>   (lto_output_tree_ref): For tree nodes that are not
>   normally indexable, call
>   lto_streamer_hooks.indexable_with_decls_p before giving
>   up.
>   (lto_output_ts_decl_common_tree_pointers): Move handling
>   for FUNCTION_DECL and TRANSLATION_UNIT_DECL to
>   gimple_streamer_write_tree.
>   Move assertion for NULL DECL_SAVED_TREEs to
>   gimple_streamer_write_tree.
>   (lto_output_ts_decl_with_vis_tree_pointers): Call
>   output_record_start with LTO_null instead of output_zero.
>   (lto_output_ts_binfo_tree_pointers): Likewise.
>   (lto_output_tree): Likewise.
>   (output_eh_try_list): Likewise.
>   (output_eh_region): Likewise.
>   (output_eh_lp): Likewise.
>   (output_eh_regions): Likewise.
>   (output_bb): Likewise.
>   (output_function): Likewise.
>   (output_unreferenced_globals): Likewise.
>   (lto_output_tree_pointers): Do not handle TS_SSA_NAME,
>   TS_STATEMENT_LIST, TS_OMP_CLAUSE and TS_OPTIMIZATION.
>   (lto_output_tree_header): Call
>   lto_streamer_hooks.is_streamable instead of
>   lto_is_streamable.
>   Call lto_streamer_hooks.output_tree_header if set.
>   (lto_write_tree): Call lto_streamer_hooks.write_tree if
>   set.
>   (gimple_streamer_write_tree): New.
>   (lto_output_tree): If
>   lto_streamer_hooks.has_unique_integer_csts_p is set,
>   lookup the constant in the streamer cache first.
>   * lto-streamer.c (lto_streamer_cache_create): Call
>   lto_streamer_hooks.preload_common_nodes instead of
>   lto_preload_common_nodes.
>   (lto_is_streamable): Move from lto-streamer.h
>   (gimple_streamer_hooks_init): New.
>   (streamer_hooks): New.
>   (streamer_hooks_): Declare.
>   (streamer_hooks_init): New.
>   * lto-streamer.h (struct output_block): Forwar

Re: [patch] Improve detection of widening multiplication in the vectorizer

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen  wrote:
> On 1 June 2011 12:42, Richard Guenther  wrote:
>
>> Did you think about moving pass_optimize_widening_mul before
>> loop optimizations?  Does that pass catch the cases you are
>> teaching the pattern recognizer?  I think we should try to expose
>> these more complicated instructions to loop optimizers.
>>
>
> pass_optimize_widening_mul doesn't catch these cases, but I can try to
> teach it instead of the vectorizer.
> I am now testing
>
> Index: passes.c
> ===
> --- passes.c    (revision 174391)
> +++ passes.c    (working copy)
> @@ -870,6 +870,7 @@
>       NEXT_PASS (pass_split_crit_edges);
>       NEXT_PASS (pass_pre);
>       NEXT_PASS (pass_sink_code);
> +      NEXT_PASS (pass_optimize_widening_mul);
>       NEXT_PASS (pass_tree_loop);
>        {
>          struct opt_pass **p = &pass_tree_loop.pass.sub;
> @@ -934,7 +935,6 @@
>       NEXT_PASS (pass_forwprop);
>       NEXT_PASS (pass_phiopt);
>       NEXT_PASS (pass_fold_builtins);
> -      NEXT_PASS (pass_optimize_widening_mul);
>       NEXT_PASS (pass_tail_calls);
>       NEXT_PASS (pass_rename_ssa_copies);
>       NEXT_PASS (pass_uncprop);
>
> to see how it affects other loop optimizations (vectorizer pattern
> tests obviously fail).

Thanks.  I would hope that we eventually can get rid of the
pattern recognizer ... at least for SSE there is also always
a scalar variant instruction for each vectorized one.

Richard.
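
For context, a widening multiplication is one whose result is wider than its operands. A minimal sketch of the kind of loop under discussion, assuming 16-bit inputs and a 32-bit result; the function name is hypothetical, and whether a particular compiler revision turns this into a widening multiply instruction depends on the target and on the pass ordering being debated here:

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply two arrays of 16-bit values into an array of 32-bit products.
   The casts make the widening explicit; integer promotion would perform
   the same conversion implicitly.  */
void
widen_mult_sketch (int32_t *restrict out, const int16_t *restrict a,
                   const int16_t *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (int32_t) a[i] * (int32_t) b[i];
}
```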


Re: [PATH] PR/49139 fix always_inline failures diagnostics

2011-06-01 Thread Christian Bruel



On 06/01/2011 12:02 PM, Richard Guenther wrote:

On Tue, May 31, 2011 at 6:03 PM, Christian Bruel  wrote:



On 05/31/2011 11:18 AM, Richard Guenther wrote:


On Tue, May 31, 2011 at 9:54 AM, Christian Bruel
  wrote:


Hello,

The attached patch fixes a few diagnostic discrepancies for always_inline
failures.

Illustrated by the fail_always_inline[12].c attached cases, the current
behavior is one of:

- success (with and without -Winline), silently not honoring
always_inline
   gcc fail_always_inline1.c -S -Winline -O0 -fpic
   gcc fail_always_inline1.c -S -O2 -fpic

- error: with -Winline but not without
   gcc fail_always_inline1.c -S -Winline -O2 -fpic

- error: without -Winline
   gcc fail_always_inline2.c -S -fno-early-inlining -O2
   or the original c++ attachment in this defect

note that -Winline never warns, as stated in the documentation

This simple patch consistently emits a warning (changing the sorry
unimplemented message) whenever the attribute is not honored.

My first inclination was to generate an error instead of the warning, but
since it is possible that inlining is only performed at LTO time, an error
would be inappropriate (note that today this is not possible with -Winline,
which would abort).

Another alternative I considered would be to emit the warning under
-Winline rather than unconditionally, but this is more a user misuse of the
attribute, so it should always be warned about anyway. Or maybe a new
-Winline-always that would be activated under -Wall? Other opinions welcome.

Tested with standard bootstrap and regression on x86.

Comments, and/or OK for trunk ?


The patch is not ok, we may not fail to inline an always_inline
function.


OK, I thought that would be an error, but I was afraid of aborting the
inlining by LTO of a function with a missing body (provided by another file),
which would be a regression. Rethinking this, and considering that a
valid LTO program should be valid without LTO and that the scope is the
translation unit, it would be OK to always reject attribute_inline on
functions without a body.

To make this more consistent I proposed to warn


whenever you take the address of an always_inline function
(because then you can confuse GCC by indirectly calling
such function which we might inline dependent on optimization
setting and which we might discover we didn't inline only
dependent on optimization setting).Honza proposed to move
the sorry()ing to when we feel the need to output the
always_inline function, thus when it was not optimized away,
but that would require us not preserving the body (do we?)
with -fpreserve-inline-functions.



But we don't know if taking the address of the function will result in a
call to it, or how the pointer will be used. So I think the check should be
done at the caller site. But code like:

inline __attribute__((always_inline))  int foo() { return 0; }

static int (*ptr)() = foo;

int
bar()
{
  return ptr();
}

is (legitimately) not inlined, but without a diagnostic. I don't know where
to look at this at the moment.


Yeah, the issue is that we only warn if we end up seeing a direct
call to an always_inline function that wasn't inlined.  The idea of
sorrying() for remaining always_inline bodies instead would also
catch the above, but so would

inline __attribute__((always_inline))  int foo() { return 0; }
int (*ptr)() = foo;

(address-taken but not called).


For fail_always_inline1.c we should diagnose the apparent
misuse of always_inline with a warning, drop the attribute
but keep DECL_DISREGARD_INLINE_LIMITS set.

Same for fail_always_inline2.c.

I agree that sorry()ing for those cases is odd.  Either we
should reject the declarations upfront
("always-inline function will not be inlinable"), or we should
emit a warning of that kind and make sure to not sorry later.


In addition to the error at the caller site, I've added the specific warning
about the missing inline keyword.


But not in a good place.  Please instead check it alongside the
other attribute checks in cgraphunit.c:process_function_and_variable_attributes


OK, the only difference is that we don't have the node analyzed here, so 
externally_visible, etc are not set. With the initial proposal the 
warning was emitted only if the function could not be inlined. Now it 
will be emitted for each  DECL_COMDAT (decl) && !DECL_DECLARED_INLINE_P 
(decl)) even if not preemptible, so conservatively we don't want to 
duplicate the availability check.


see attached new patch for that.





Thanks for your comments, here is the new patch that I'm testing, (without
the handling of indirect calls which can be treated separately).


Index: gcc/ipa-inline-transform.c
===
--- gcc/ipa-inline-transform.c  (revision 174264)
+++ gcc/ipa-inline-transform.c  (working copy)
@@ -302,9 +302,20 @@

for (e = node->callees; e; e = e->next_callee)
  {
-  cgraph_redirect_edge_call_stmt_to_callee (e);
+  gimple call =

Re: [pph] Renaming output/write and input/read to out/in + standardizing pph_stream_* to pph_* (issue4532102)

2011-06-01 Thread dnovillo

Looks OK.  One minor formatting comment that I will fix myself when I
commit the patch.


Diego.


http://codereview.appspot.com/4532102/diff/1/pph-streamer-in.c
File pph-streamer-in.c (right):

http://codereview.appspot.com/4532102/diff/1/pph-streamer-in.c#newcode42
pph-streamer-in.c:42: pph_register_shared_data (STREAM, DATA, IX);  \
   (DATA) = (ALLOC_EXPR);   \
-  pph_stream_register_shared_data (STREAM, DATA, IX);  \
+  pph_register_shared_data (STREAM, DATA, IX); \

Align the '\' with the ones above.

http://codereview.appspot.com/4532102/


Re: [PATH] PR/49139 fix always_inline failures diagnostics

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 3:01 PM, Christian Bruel  wrote:
>
>
> On 06/01/2011 12:02 PM, Richard Guenther wrote:
>>
>> On Tue, May 31, 2011 at 6:03 PM, Christian Bruel
>>  wrote:
>>>
>>>
>>> On 05/31/2011 11:18 AM, Richard Guenther wrote:

 On Tue, May 31, 2011 at 9:54 AM, Christian Bruel
  wrote:
>
> Hello,
>
> The attached patch fixes a few diagnostic discrepancies for
> always_inline
> failures.
>
> Illustrated by the fail_always_inline[12].c attached cases, the current
> behavior is one of:
>
> - success (with and without -Winline), silently not honoring
> always_inline
>   gcc fail_always_inline1.c -S -Winline -O0 -fpic
>   gcc fail_always_inline1.c -S -O2 -fpic
>
> - error: with -Winline but not without
>   gcc fail_always_inline1.c -S -Winline -O2 -fpic
>
> - error: without -Winline
>   gcc fail_always_inline2.c -S -fno-early-inlining -O2
>   or the original c++ attachment in this defect
>
> note that -Winline never warns, as stated in the documentation
>
> This simple patch consistently emits a warning (changing the sorry
> unimplemented message) whenever the attribute is not honored.
>
> My first inclination was to generate an error instead of the warning, but
> since it is possible that inlining is only performed at LTO time, an error
> would be inappropriate (note that today this is not possible with -Winline,
> which would abort).
>
> Another alternative I considered would be to emit the warning under
> -Winline rather than unconditionally, but this is more a user misuse of the
> attribute, so it should always be warned about anyway. Or maybe a new
> -Winline-always that would be activated under -Wall? Other opinions welcome.
>
> Tested with standard bootstrap and regression on x86.
>
> Comments, and/or OK for trunk ?

 The patch is not ok, we may not fail to inline an always_inline
 function.
>>>
>>> OK, I thought that would be an error, but I was afraid of aborting the
>>> inlining by LTO of a function with a missing body (provided by another
>>> file), which would be a regression. Rethinking this, and considering that a
>>> valid LTO program should be valid without LTO and that the scope is the
>>> translation unit, it would be OK to always reject attribute_inline on
>>> functions without a body.
>>>
>>> To make this more consistent I proposed to warn

 whenever you take the address of an always_inline function
 (because then you can confuse GCC by indirectly calling
 such function which we might inline dependent on optimization
 setting and which we might discover we didn't inline only
 dependent on optimization setting).Honza proposed to move
 the sorry()ing to when we feel the need to output the
 always_inline function, thus when it was not optimized away,
 but that would require us not preserving the body (do we?)
 with -fpreserve-inline-functions.

>>>
>>> But we don't know if taking the address of the function will result in a
>>> call to it, or how the pointer will be used. So I think the check should
>>> be done at the caller site. But code like:
>>>
>>> inline __attribute__((always_inline))  int foo() { return 0; }
>>>
>>> static int (*ptr)() = foo;
>>>
>>> int
>>> bar()
>>> {
>>>  return ptr();
>>> }
>>>
>>> is (legitimately) not inlined, but without a diagnostic. I don't know
>>> where to look at this at the moment.
>>
>> Yeah, the issue is that we only warn if we end up seeing a direct
>> call to an always_inline function that wasn't inlined.  The idea of
>> sorrying() for remaining always_inline bodies instead would also
>> catch the above, but so would
>>
>> inline __attribute__((always_inline))  int foo() { return 0; }
>> int (*ptr)() = foo;
>>
>> (address-taken but not called).
>>
 For fail_always_inline1.c we should diagnose the apparent
 misuse of always_inline with a warning, drop the attribute
 but keep DECL_DISREGARD_INLINE_LIMITS set.

 Same for fail_always_inline2.c.

 I agree that sorry()ing for those cases is odd.  Either we
 should reject the declarations upfront
 ("always-inline function will not be inlinable"), or we should
 emit a warning of that kind and make sure to not sorry later.
>>>
>>> In addition to the error at the caller site, I've added the specific
>>> warning about the missing inline keyword.
>>
>> But not in a good place.  Please instead check it alongside the
>> other attribute checks in
>> cgraphunit.c:process_function_and_variable_attributes
>
> OK, the only difference is that we don't have the node analyzed here, so
> externally_visible, etc are not set. With the initial proposal the warning
> was emitted only if the function could not be inlined. Now it will be
> emitted for each  DECL_COMDAT (decl) && !DECL_DECLARED_INLINE_P (decl)) even
> if no

Re: C++ PATCH for c++/44870 (wrong overload resolution error in template)

2011-06-01 Thread H.J. Lu
On Tue, May 31, 2011 at 11:05 AM, Jason Merrill  wrote:
> lvalue_kind has tried to give an approximate answer for value category in
> templates; in the past, it was OK to say that an arbitrary expression was an
> lvalue, as the only effect would be that errors we could have given at
> template definition time would be delayed until instantiation, which is
> still conforming.  But now that we have rvalue references that can't bind to
> lvalues, it has become important to get the right answer.  So this patch
> makes us look through NON_DEPENDENT_EXPR at the actual underlying tree
> structure.  We need to add a couple more cases for lvalue expressions that
> only appear in templates, and handle overloaded functions/operators that
> return class type; I will not be surprised if there are other cases I didn't
> think of, but we are still in stage 1... :)
>
> Tested x86_64-pc-linux-gnu, applying to trunk.
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49253

-- 
H.J.


ARM Cortex-R5 support

2011-06-01 Thread Paul Brook
The attached patch adds support for the ARM Cortex-R5 cpu core.  For compiler
purposes this is basically the same as the Cortex-R4, except it supports the 
integer division instructions in both ARM and Thumb mode.

The Cortex-A15 also supports this, so I've enabled it there at the same time.
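
As a hedged illustration of how user code might observe this: the ChangeLog below says TARGET_CPU_CPP_BUILTINS sets __ARM_ARCH_EXT_IDIV__, so a translation unit can test for that macro. The exact conditions under which it is defined are not shown in this message, so the sketch assumes only that the macro is defined when hardware division is usable in the selected mode. The function names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if the compiler advertises hardware integer division via
   __ARM_ARCH_EXT_IDIV__, 0 otherwise (including on non-ARM hosts).  */
int
have_hw_idiv_sketch (void)
{
#ifdef __ARM_ARCH_EXT_IDIV__
  return 1;
#else
  return 0;
#endif
}

/* Plain C division.  With hardware divide available the compiler can emit
   a divide instruction inline; otherwise ARM EABI targets typically call
   a runtime helper such as __aeabi_idiv.  */
int32_t
quotient_sketch (int32_t n, int32_t d)
{
  return n / d;
}
```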

Tested on arm-none-eabi
Applied to SVN head.

Paul

2011-06-01  Paul Brook  

gcc/
* config/arm/arm-cores.def: Add cortex-r5.  Add DIV flags to
Cortex-A15.
* config/arm/arm-tune.md: Regenerate.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.c (FL_DIV): Rename...
(FL_THUMB_DIV): ... to this.
(FL_ARM_DIV): Define.
(FL_FOR_ARCH7R, FL_FOR_ARCH7M): Use FL_THUMB_DIV.
(arm_arch_hwdiv): Remove.
(arm_arch_thumb_hwdiv, arm_arch_arm_hwdiv): New variables.
(arm_issue_rate): Add cortexr5.
* config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Set
__ARM_ARCH_EXT_IDIV__.
(TARGET_IDIV): Define.
(arm_arch_hwdiv): Remove.
(arm_arch_arm_hwdiv, arm_arch_thumb_hwdiv): New prototypes.
* config/arm/arm.md (tune_cortexr4): Add cortexr5.
(divsi3, udivsi3): New patterns.
* config/arm/thumb2.md (divsi3, udivsi3): Remove.
* doc/invoke.texi: Document ARM -mcpu=cortex-r5
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 174524)
+++ gcc/doc/invoke.texi	(working copy)
@@ -10241,7 +10241,8 @@ assembly code.  Permissible names are: @
 @samp{arm1136j-s}, @samp{arm1136jf-s}, @samp{mpcore}, @samp{mpcorenovfp},
 @samp{arm1156t2-s}, @samp{arm1156t2f-s}, @samp{arm1176jz-s}, @samp{arm1176jzf-s},
 @samp{cortex-a5}, @samp{cortex-a8}, @samp{cortex-a9}, @samp{cortex-a15},
-@samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-m4}, @samp{cortex-m3},
+@samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-r5},
+@samp{cortex-m4}, @samp{cortex-m3},
 @samp{cortex-m1},
 @samp{cortex-m0},
 @samp{xscale}, @samp{iwmmxt}, @samp{iwmmxt2}, @samp{ep9312}.
Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c	(revision 174524)
+++ gcc/config/arm/arm.c	(working copy)
@@ -662,12 +662,13 @@ static int thumb_call_reg_needed;
 #define FL_THUMB2 (1 << 16)	  /* Thumb-2.  */
 #define FL_NOTM	  (1 << 17)	  /* Instructions not present in the 'M'
 	 profile.  */
-#define FL_DIV	  (1 << 18)	  /* Hardware divide.  */
+#define FL_THUMB_DIV  (1 << 18)	  /* Hardware divide (Thumb mode).  */
 #define FL_VFPV3  (1 << 19)   /* Vector Floating Point V3.  */
 #define FL_NEON   (1 << 20)   /* Neon instructions.  */
 #define FL_ARCH7EM(1 << 21)	  /* Instructions present in the ARMv7E-M
 	 architecture.  */
 #define FL_ARCH7  (1 << 22)   /* Architecture 7.  */
+#define FL_ARM_DIV(1 << 23)	  /* Hardware divide (ARM mode).  */
 
 #define FL_IWMMXT (1 << 29)	  /* XScale v2 or "Intel Wireless MMX technology".  */
 
@@ -694,8 +695,8 @@ static int thumb_call_reg_needed;
 #define FL_FOR_ARCH6M	(FL_FOR_ARCH6 & ~FL_NOTM)
 #define FL_FOR_ARCH7	((FL_FOR_ARCH6T2 & ~FL_NOTM) | FL_ARCH7)
 #define FL_FOR_ARCH7A	(FL_FOR_ARCH7 | FL_NOTM | FL_ARCH6K)
-#define FL_FOR_ARCH7R	(FL_FOR_ARCH7A | FL_DIV)
-#define FL_FOR_ARCH7M	(FL_FOR_ARCH7 | FL_DIV)
+#define FL_FOR_ARCH7R	(FL_FOR_ARCH7A | FL_THUMB_DIV)
+#define FL_FOR_ARCH7M	(FL_FOR_ARCH7 | FL_THUMB_DIV)
 #define FL_FOR_ARCH7EM  (FL_FOR_ARCH7M | FL_ARCH7EM)
 
 /* The bits in this mask specify which
@@ -781,7 +782,8 @@ int arm_cpp_interwork = 0;
 int arm_arch_thumb2;
 
 /* Nonzero if chip supports integer division instruction.  */
-int arm_arch_hwdiv;
+int arm_arch_arm_hwdiv;
+int arm_arch_thumb_hwdiv;
 
 /* In case of a PRE_INC, POST_INC, PRE_DEC, POST_DEC memory reference,
we must report the mode of the memory reference from
@@ -1449,7 +1451,8 @@ arm_option_override (void)
   arm_tune_wbuf = (tune_flags & FL_WBUF) != 0;
   arm_tune_xscale = (tune_flags & FL_XSCALE) != 0;
   arm_arch_iwmmxt = (insn_flags & FL_IWMMXT) != 0;
-  arm_arch_hwdiv = (insn_flags & FL_DIV) != 0;
+  arm_arch_thumb_hwdiv = (insn_flags & FL_THUMB_DIV) != 0;
+  arm_arch_arm_hwdiv = (insn_flags & FL_ARM_DIV) != 0;
   arm_tune_cortex_a9 = (arm_tune == cortexa9) != 0;
 
   /* If we are not using the default (ARM mode) section anchor offset
@@ -23019,6 +23022,7 @@ arm_issue_rate (void)
 {
 case cortexr4:
 case cortexr4f:
+case cortexr5:
 case cortexa5:
 case cortexa8:
 case cortexa9:
Index: gcc/config/arm/thumb2.md
===
--- gcc/config/arm/thumb2.md	(revision 174524)
+++ gcc/config/arm/thumb2.md	(working copy)
@@ -779,26 +779,6 @@ (define_insn "*thumb2_addsi_short"
(set_attr "length" "2")]
 )
 
-(define_insn "divsi3"
-  [(set (match_operand:SI	  0 "s_register_operand" "=r")
-	(div:SI (match_operand:SI 1 "s_register_operand" 

Re: [PATCH] ENTRY_VALUE fixes (PR debug/48203)

2011-06-01 Thread Jakub Jelinek
On Mon, Mar 28, 2011 at 10:50:14AM -0700, Richard Henderson wrote:
> > I will look into creating helper inlines to reduce code duplication.
> 
> Please.  You can do this as a follow-up if you prefer.

Sorry it took so long, here it is.  Bootstrapped/regtested on x86_64-linux
and i686-linux; it makes no difference to the debug info generated for
cc1plus or gcc.dg/guality/*.c.  Ok for trunk?

2011-06-01  Jakub Jelinek  

* var-tracking.c (create_entry_value): New function.
(vt_add_function_parameter): Use it.

--- gcc/var-tracking.c.jj   2011-06-01 10:20:03.0 +0200
+++ gcc/var-tracking.c  2011-06-01 10:51:30.0 +0200
@@ -8378,6 +8378,39 @@ vt_get_decl_and_offset (rtx rtl, tree *d
   return false;
 }
 
+/* Helper function for vt_add_function_parameter.  RTL is
+   the expression and VAL corresponding cselib_val pointer
+   for which ENTRY_VALUE should be created.  */
+
+static void
+create_entry_value (rtx rtl, cselib_val *val)
+{
+  cselib_val *val2;
+  struct elt_loc_list *el;
+  el = (struct elt_loc_list *) ggc_alloc_cleared_atomic (sizeof (*el));
+  el->next = val->locs;
+  el->loc = gen_rtx_ENTRY_VALUE (GET_MODE (rtl));
+  ENTRY_VALUE_EXP (el->loc) = rtl;
+  el->setting_insn = get_insns ();
+  val->locs = el;
+  val2 = cselib_lookup_from_insn (el->loc, GET_MODE (rtl), true,
+ VOIDmode, get_insns ());
+  if (val2
+  && val2 != val
+  && val2->locs
+  && rtx_equal_p (val2->locs->loc, el->loc))
+{
+  struct elt_loc_list *el2;
+
+  preserve_value (val2);
+  el2 = (struct elt_loc_list *) ggc_alloc_cleared_atomic (sizeof (*el2));
+  el2->next = val2->locs;
+  el2->loc = val->val_rtx;
+  el2->setting_insn = get_insns ();
+  val2->locs = el2;
+}
+}
+
 /* Insert function parameter PARM in IN and OUT sets of ENTRY_BLOCK.  */
 
 static void
@@ -8501,32 +8534,8 @@ vt_add_function_parameter (tree parm)
 VAR_INIT_STATUS_INITIALIZED, NULL, INSERT);
   if (dv_is_value_p (dv))
{
- cselib_val *val = CSELIB_VAL_PTR (dv_as_value (dv)), *val2;
- struct elt_loc_list *el;
- el = (struct elt_loc_list *)
-   ggc_alloc_cleared_atomic (sizeof (*el));
- el->next = val->locs;
- el->loc = gen_rtx_ENTRY_VALUE (GET_MODE (incoming));
- ENTRY_VALUE_EXP (el->loc) = incoming;
- el->setting_insn = get_insns ();
- val->locs = el;
- val2 = cselib_lookup_from_insn (el->loc, GET_MODE (incoming),
- true, VOIDmode, get_insns ());
- if (val2
- && val2 != val
- && val2->locs
- && rtx_equal_p (val2->locs->loc, el->loc))
-   {
- struct elt_loc_list *el2;
-
- preserve_value (val2);
- el2 = (struct elt_loc_list *)
-   ggc_alloc_cleared_atomic (sizeof (*el2));
- el2->next = val2->locs;
- el2->loc = dv_as_value (dv);
- el2->setting_insn = get_insns ();
- val2->locs = el2;
-   }
+ cselib_val *val = CSELIB_VAL_PTR (dv_as_value (dv));
+ create_entry_value (incoming, val);
  if (TREE_CODE (TREE_TYPE (parm)) == REFERENCE_TYPE
  && INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (parm
{
@@ -8538,31 +8547,7 @@ vt_add_function_parameter (tree parm)
  if (val)
{
  preserve_value (val);
- el = (struct elt_loc_list *)
-   ggc_alloc_cleared_atomic (sizeof (*el));
- el->next = val->locs;
- el->loc = gen_rtx_ENTRY_VALUE (indmode);
- ENTRY_VALUE_EXP (el->loc) = mem;
- el->setting_insn = get_insns ();
- val->locs = el;
- val2 = cselib_lookup_from_insn (el->loc, GET_MODE (mem),
- true, VOIDmode,
- get_insns ());
- if (val2
- && val2 != val
- && val2->locs
- && rtx_equal_p (val2->locs->loc, el->loc))
-   {
- struct elt_loc_list *el2;
-
- preserve_value (val2);
- el2 = (struct elt_loc_list *)
-   ggc_alloc_cleared_atomic (sizeof (*el2));
- el2->next = val2->locs;
- el2->loc = val->val_rtx;
- el2->setting_insn = get_insns ();
- val2->locs = el2;
-   }
+ create_entry_value (mem, val);
}
}
}


Jakub


[PATCH] Decrease size of mem_loc_descriptor

2011-06-01 Thread Jakub Jelinek
On Tue, May 31, 2011 at 09:26:23AM -0700, Richard Henderson wrote:
> On 05/31/2011 02:19 AM, Jakub Jelinek wrote:
> > Hi!
> > 
> > - http://gcc.gnu.org/ml/gcc-patches/2011-05/msg01182.html   
> >
> >   various debug info improvements (typed DWARF stack etc.)  
> >
> 
> Ok.  Although it might be good to break up mem_loc_descriptor.
> You've got some rather big implementations of operations now.

Here is a patch that does that for a bunch of larger hunks,
mem_loc_descriptor linecount decreased by 40% or so.

Bootstrapped/regtested on x86_64-linux and i686-linux,
the patch should make no difference on generated debug info,
which I've verified on cc1plus debuginfo as well as debuginfo
of gcc.dg/guality/*.c at -g -O2 {-m32,-m64}.
Ok for trunk?
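
To illustrate the "fancy shifting up" that scompare_loc_descriptor in the patch below applies to operands narrower than the address size: both operands are shifted left so that the narrow sign bit lands in the top bit of the wide slot, after which an ordinary signed comparison gives the narrow result. The following is a minimal sketch in plain C, assuming a 4-byte slot and 1-byte operands held zero-extended; the function name is hypothetical and this is not the DWARF expression itself.

```c
#include <stdint.h>

/* Signed less-than on two 8-bit values stored zero-extended in 32-bit
   slots.  SHIFT plays the role of
   (DWARF2_ADDR_SIZE - GET_MODE_SIZE (op_mode)) * BITS_PER_UNIT.  */
int
narrow_signed_lt_sketch (uint32_t a, uint32_t b)
{
  const unsigned shift = (sizeof (uint32_t) - sizeof (int8_t)) * 8;

  /* Shift as unsigned to avoid undefined behavior, then reinterpret as
     signed.  The out-of-range conversion is implementation-defined in C;
     GCC defines it as modulo 2^32.  */
  int32_t wa = (int32_t) (a << shift);
  int32_t wb = (int32_t) (b << shift);
  return wa < wb;
}
```

With this, 0xFF (that is, -1 as an 8-bit value) compares less than 0x01, whereas comparing the zero-extended slots directly would say the opposite. For eq/ne on zero-extended operands the shift changes nothing, which is why the patch skips it in that case.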

2011-06-01  Jakub Jelinek  

* dwarf2out.c (compare_loc_descriptor, scompare_loc_descriptor,
ucompare_loc_descriptor, minmax_loc_descriptor, clz_loc_descriptor,
popcount_loc_descriptor, bswap_loc_descriptor, rotate_loc_descriptor):
New functions.
(mem_loc_descriptor): Use them.

--- gcc/dwarf2out.c.jj  2011-06-01 10:20:03.0 +0200
+++ gcc/dwarf2out.c 2011-06-01 12:23:32.0 +0200
@@ -13855,6 +13855,627 @@ convert_descriptor_to_signed (enum machi
   return op;
 }
 
+/* Return location descriptor for comparison OP with operands OP0 and OP1.  */
+
+static dw_loc_descr_ref
+compare_loc_descriptor (enum dwarf_location_atom op, dw_loc_descr_ref op0,
+   dw_loc_descr_ref op1)
+{
+  dw_loc_descr_ref ret = op0;
+  add_loc_descr (&ret, op1);
+  add_loc_descr (&ret, new_loc_descr (op, 0, 0));
+  if (STORE_FLAG_VALUE != 1)
+{
+  add_loc_descr (&ret, int_loc_descriptor (STORE_FLAG_VALUE));
+  add_loc_descr (&ret, new_loc_descr (DW_OP_mul, 0, 0));
+}
+  return ret;
+}
+
+/* Return location descriptor for signed comparison OP RTL.  */
+
+static dw_loc_descr_ref
+scompare_loc_descriptor (enum dwarf_location_atom op, rtx rtl,
+enum machine_mode mem_mode)
+{
+  enum machine_mode op_mode = GET_MODE (XEXP (rtl, 0));
+  dw_loc_descr_ref op0, op1;
+  int shift;
+
+  if (op_mode == VOIDmode)
+op_mode = GET_MODE (XEXP (rtl, 1));
+  if (op_mode == VOIDmode)
+return NULL;
+
+  if (dwarf_strict
+  && (GET_MODE_CLASS (op_mode) != MODE_INT
+ || GET_MODE_SIZE (op_mode) > DWARF2_ADDR_SIZE))
+return NULL;
+
+  op0 = mem_loc_descriptor (XEXP (rtl, 0), op_mode, mem_mode,
+   VAR_INIT_STATUS_INITIALIZED);
+  op1 = mem_loc_descriptor (XEXP (rtl, 1), op_mode, mem_mode,
+   VAR_INIT_STATUS_INITIALIZED);
+
+  if (op0 == NULL || op1 == NULL)
+return NULL;
+
+  if (GET_MODE_CLASS (op_mode) != MODE_INT
+  || GET_MODE_SIZE (op_mode) >= DWARF2_ADDR_SIZE)
+return compare_loc_descriptor (op, op0, op1);
+
+  shift = (DWARF2_ADDR_SIZE - GET_MODE_SIZE (op_mode)) * BITS_PER_UNIT;
+  /* For eq/ne, if the operands are known to be zero-extended,
+ there is no need to do the fancy shifting up.  */
+  if (op == DW_OP_eq || op == DW_OP_ne)
+{
+  dw_loc_descr_ref last0, last1;
+  for (last0 = op0; last0->dw_loc_next != NULL; last0 = last0->dw_loc_next)
+   ;
+  for (last1 = op1; last1->dw_loc_next != NULL; last1 = last1->dw_loc_next)
+   ;
+  /* deref_size zero extends, and for constants we can check
+whether they are zero extended or not.  */
+  if (((last0->dw_loc_opc == DW_OP_deref_size
+   && last0->dw_loc_oprnd1.v.val_int <= GET_MODE_SIZE (op_mode))
+  || (CONST_INT_P (XEXP (rtl, 0))
+  && (unsigned HOST_WIDE_INT) INTVAL (XEXP (rtl, 0))
+ == (INTVAL (XEXP (rtl, 0)) & GET_MODE_MASK (op_mode
+ && ((last1->dw_loc_opc == DW_OP_deref_size
+  && last1->dw_loc_oprnd1.v.val_int <= GET_MODE_SIZE (op_mode))
+ || (CONST_INT_P (XEXP (rtl, 1))
+ && (unsigned HOST_WIDE_INT) INTVAL (XEXP (rtl, 1))
+== (INTVAL (XEXP (rtl, 1)) & GET_MODE_MASK (op_mode)
+   return compare_loc_descriptor (op, op0, op1);
+}
+  add_loc_descr (&op0, int_loc_descriptor (shift));
+  add_loc_descr (&op0, new_loc_descr (DW_OP_shl, 0, 0));
+  if (CONST_INT_P (XEXP (rtl, 1)))
+op1 = int_loc_descriptor (INTVAL (XEXP (rtl, 1)) << shift);
+  else
+{
+  add_loc_descr (&op1, int_loc_descriptor (shift));
+  add_loc_descr (&op1, new_loc_descr (DW_OP_shl, 0, 0));
+}
+  return compare_loc_descriptor (op, op0, op1);
+}
+
+/* Return location descriptor for unsigned comparison OP RTL.  */
+
+static dw_loc_descr_ref
+ucompare_loc_descriptor (enum dwarf_location_atom op, rtx rtl,
+enum machine_mode mem_mode)
+{
+  enum machine_mode op_mode = GET_MODE (XEXP (rtl, 0));
+  dw_loc_descr_ref op0, op1;
+
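
The narrow-mode path in the hunk above shifts both operands left before comparing. A minimal sketch of why that works, assuming 8-byte stack slots and a signed full-width comparison; `SLOT_BYTES` and `narrow_signed_lt` are illustrative names, not anything in dwarf2out.c:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8-byte stack slots, standing in for DWARF2_ADDR_SIZE.  */
#define SLOT_BYTES 8

/* Signed "less than" on two values that are NARROW_BYTES wide, each held
   in a 64-bit slot whose upper bits are not trusted.  Shifting both
   operands left moves the narrow sign bit into the slot's sign bit, so a
   full-width signed comparison orders them correctly.  */
int
narrow_signed_lt (uint64_t a, uint64_t b, unsigned narrow_bytes)
{
  unsigned shift;

  assert (narrow_bytes >= 1 && narrow_bytes <= SLOT_BYTES);
  shift = (SLOT_BYTES - narrow_bytes) * 8;
  /* Converting an out-of-range uint64_t to int64_t is
     implementation-defined in C; GCC wraps modulo 2^64.  */
  return (int64_t) (a << shift) < (int64_t) (b << shift);
}
```

With 16-bit operands, the zero-extended slot value 0xFFFF stands for -1; compared directly as a 64-bit signed value it would look larger than 1, while after the shift it compares as smaller.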

[committed] Fix a typed DWARF stack unsigned comparison thinko

2011-06-01 Thread Jakub Jelinek
Hi!

While working on patches I'm going to post momentarily, I've noticed
a buglet in ucompare handling: mode is the mode of the comparison, while
obviously I meant op_mode, which is the mode of the comparison operands.
Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed as obvious.
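
As a loose analogy for the thinko (plain C, not the dwarf2out.c code): if the operands of an unsigned comparison are converted to the type of the comparison result rather than to their own type, wide operands get truncated. The function names below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Wrong: the operands are converted to the (narrower) type of the
   comparison result before being compared.  */
int
ult_via_result_type (uint64_t a, uint64_t b)
{
  return (uint32_t) a < (uint32_t) b;
}

/* Right: the operands are compared in their own type.  */
int
ult_via_operand_type (uint64_t a, uint64_t b)
{
  return a < b;
}

/* Self-check: the two disagree once an operand needs more than 32 bits.  */
void
check_example (void)
{
  assert (ult_via_result_type (0x100000000ULL, 1) == 1);   /* wrong answer */
  assert (ult_via_operand_type (0x100000000ULL, 1) == 0);  /* right answer */
}
```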

2011-06-01  Jakub Jelinek  

* dwarf2out.c (mem_loc_descriptor) : Call
base_type_for_mode with op_mode instead of mode.

--- gcc/dwarf2out.c.jj  2011-05-31 21:14:53.0 +0200
+++ gcc/dwarf2out.c 2011-06-01 12:34:45.985670994 +0200
@@ -14685,7 +14685,7 @@ mem_loc_descriptor (rtx rtl, enum machin
  }
else
  {
-   dw_die_ref type_die = base_type_for_mode (mode, 1);
+   dw_die_ref type_die = base_type_for_mode (op_mode, 1);
dw_loc_descr_ref cvt;
 
if (type_die == NULL)

Jakub


Add missing ChangeLog entry

2011-06-01 Thread Ian Lance Taylor
I noticed that we have a --with-specs option in gcc/configure.ac, added
in revision 155208 with this e-mail message:
http://gcc.gnu.org/ml/gcc-patches/2009-12/msg00132.html

Nathan forgot to commit the ChangeLog entry, so I have committed this
patch to ChangeLog-2009.  (Patches to ChangeLog files do not themselves
require ChangeLog entries.)

More seriously, this option is not documented in gcc/doc/install.texi,
as pointed out by Gerald in
http://gcc.gnu.org/ml/gcc-patches/2009-12/msg00947.html .  We've had
discussions circling around this idea in the past, and I was surprised
to discover that it has already been committed.

Nathan, would you be willing to write some docs for it?  Thanks.

Ian


Index: gcc/ChangeLog-2009
===
--- gcc/ChangeLog-2009	(revision 174368)
+++ gcc/ChangeLog-2009	(working copy)
@@ -474,6 +474,13 @@
 	* intl.c (get_spaces): New.
 	* intl.h (get_spaces): New.
 
+2009-12-14  Mark Mitchell  
+
+	* configure.ac (--with-specs): New option.
+	* configure: Regenerated.
+	* gcc.c (driver_self_specs): Include CONFIGURE_SPECS.
+	* Makefile.in (DRIVER_DEFINES): Add -DCONFIGURE_SPECS.
+
 2009-12-14  Jakub Jelinek  
 
 	PR bootstrap/42369


Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup

2011-06-01 Thread Ian Lance Taylor
Uros Bizjak  writes:

> On Tue, May 31, 2011 at 8:08 PM, Ian Lance Taylor  wrote:
>> Uros Bizjak  writes:
>>
>>> (BTW: Original calculation of Ctime_ns has a cut'n'paste error,
>>> stat.Ctime.Nsec should be used instead of stat.Atime.Nsec).
>>
>> Thanks.  Fixed like so.  Bootstrapped and ran Go testsuite on
>> x86_64-unknown-linux-gnu.  Committed to mainline.
>
> Using your latest fixes, I was able to compile libgo on
> alphaev68-pc-linux-gnu out of the box, without any additional patches.

Cool.

> One problem remains in the libgo testsuite: certain tests have to be
> compiled with -mieee, otherwise FPE is generated for unordered values.
> Any suggestions, where -mieee should be placed?

That's an interesting question.  I think that ideally we would like
-mieee to become the default when using gccgo.  Users could still use
-mno-ieee (well, they could if there were such an option) but it seems
that for Go -mieee ought to be on by default.  I'm not sure whether it
would be more appropriate to do that in code in gcc/go or in code in
gcc/config/alpha.
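
For context on "unordered values": under IEEE semantics a comparison involving a NaN is neither less, equal nor greater, and code is expected to be able to test for that quietly. A small C sketch of that behaviour (the function name and the return code 2 are arbitrary choices for illustration; this says nothing about how libgo does it):

```c
#include <assert.h>
#include <math.h>

/* Three-way compare that reports unordered operands (at least one NaN)
   as 2 instead of trapping or guessing.  The C99 comparison macros used
   here are specified not to raise "invalid" on quiet NaNs.  */
int
compare_doubles (double a, double b)
{
  if (isunordered (a, b))
    return 2;
  if (isless (a, b))
    return -1;
  if (isgreater (a, b))
    return 1;
  /* Ordered and neither less nor greater: the values are equal.  */
  assert (a == b);
  return 0;
}
```

On Alpha without -mieee, comparisons of this kind are what produce the FPE mentioned in the report.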

In gcc/go it would require some sort of #ifdef, or a new target hook.

In gcc/config/alpha one approach would be defining DRIVER_SELF_SPECS in
alpha.h with %{.go:-mieee}.

Thoughts?

Ian


Re: [pph] Renaming output/write and input/read to out/in + standardizing pph_stream_* to pph_* (issue4532102)

2011-06-01 Thread dnovillo

On 2011/06/01 13:06:15, Diego Novillo wrote:

> Looks OK.  One minor formatting comment that I will fix myself when I
> commit the patch.


Committed as rev 174530.


Diego.

http://codereview.appspot.com/4532102/


Re: [PATH] PR/49139 fix always_inline failures diagnostics

2011-06-01 Thread Jan Hubicka
> >>
> >> whenever you take the address of an always_inline function
> >> (because then you can confuse GCC by indirectly calling
> >> such function which we might inline dependent on optimization
> >> setting and which we might discover we didn't inline only
> >> dependent on optimization setting).Honza proposed to move
> >> the sorry()ing to when we feel the need to output the
> >> always_inline function, thus when it was not optimized away,
> >> but that would require us not preserving the body (do we?)
> >> with -fpreserve-inline-functions.

I don't think we can preserve them with -fpreserve-inline-functions because
the SSE intrinsics or functions calling va_arg_pack cannot be expanded
when not inlined into the proper context, and having those in headers would mean
that units using those headers fail to compile with -fpreserve-inline-functions...
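
To make the address-taken case from the quoted text concrete, here is a small sketch, assuming GCC (or a compatible compiler) and a function simple enough to be emitted out of line; `add_one`, `call_direct` and `call_indirect` are made-up names:

```c
#include <assert.h>

/* always_inline asks the compiler to inline every direct call,
   regardless of optimization level.  */
static inline __attribute__ ((always_inline)) int
add_one (int x)
{
  return x + 1;
}

/* Direct call: the body is expected to be inlined here.  */
int
call_direct (int x)
{
  return add_one (x);
}

/* Taking the address means an out-of-line copy has to exist as well.
   The volatile pointer keeps this call indirect, so the compiler cannot
   turn it back into a direct (inlinable) call.  */
int
call_indirect (int x)
{
  int (*volatile fp) (int) = add_one;

  assert (fp != 0);
  return fp (x);
}
```

A function like this can live out of line; the intrinsics and va_arg_pack users mentioned above are the ones that cannot.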

> 
> Honza - this conditional calling of optimize_inline_calls just if
> warn_inline is on is extra ugly.  Does it really save that much
> time to only conditionally run optimize_inline_calls?  If so
> we should re-write that function completely.

I don't think it is a big deal to call that function each time.  It used
to be more expensive than it is now and the conditional was there just
because it was always there as far as I recall.

Honza


Re: Use i386/crtfastmath.c on Solaris 2/x86

2011-06-01 Thread Rainer Orth
Uros Bizjak  writes:

> Please just put "if (edx & bit_SSE)" part inside existing check. You
> will need to split assignment of mxcsr from the declaration, though.
>
> OK with this change.

Here's the patch I've actually committed after a quick bootstrap on
i386-pc-solaris2.10.

Thanks.
Rainer


2011-05-28  Rainer Orth  

gcc:
* config/i386/crtfastmath.c [!__x86_64__ && __sun__ && __svr4__]:
Include , .
(sigill_caught): Define.
(sigill_hdlr): New function.
(set_fast_math) [!__x86_64__ && __sun__ && __svr4__]: Check if SSE
insns can be executed.
* config/sol2.h (ENDFILE_SPEC): Use crtfastmath.o if -ffast-math
etc.
* config/sparc/sol2.h (ENDFILE_SPEC): Remove.

libgcc:
* config.host (i[34567]86-*-solaris2*): Add i386/t-crtfm to
tmake_file.
Add crtfastmath.o to extra_parts.
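
The runtime check in this patch follows a general pattern: install a temporary SIGILL handler, try the operation, restore the old handler, and look at a flag. A simplified, portable sketch of that pattern follows. It uses raise() as a stand-in for a faulting instruction, so unlike the real handler it does not need SA_SIGINFO or to advance the saved program counter; all names are illustrative.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t sigill_seen;

static void
on_sigill (int sig)
{
  (void) sig;
  sigill_seen = 1;
}

/* Run PROBE with a temporary SIGILL handler installed and report whether
   the signal arrived (1), did not arrive (0), or the handler could not be
   installed (-1).  The previous disposition is restored before returning.  */
int
probe_raises_sigill (void (*probe) (void))
{
  struct sigaction act, oact;

  assert (probe != NULL);
  sigill_seen = 0;
  act.sa_handler = on_sigill;
  sigemptyset (&act.sa_mask);
  act.sa_flags = 0;
  if (sigaction (SIGILL, &act, &oact) != 0)
    return -1;

  probe ();

  sigaction (SIGILL, &oact, NULL);
  return sigill_seen != 0;
}

/* Stand-ins for "an instruction the OS may refuse to run".  raise()
   delivers the signal synchronously, so the handler can simply return.  */
void
probe_faulting (void)
{
  raise (SIGILL);
}

void
probe_harmless (void)
{
}
```

With a genuinely faulting instruction the handler cannot just return, which is why the patch skips over the instruction by adjusting the saved EIP.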

diff --git a/gcc/config/i386/crtfastmath.c b/gcc/config/i386/crtfastmath.c
--- a/gcc/config/i386/crtfastmath.c
+++ b/gcc/config/i386/crtfastmath.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2005, 2007, 2009 Free Software Foundation, Inc.
+ * Copyright (C) 2005, 2007, 2009, 2011 Free Software Foundation, Inc.
  *
  * This file is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License as published by the
@@ -30,6 +30,26 @@
 #include "cpuid.h"
 #endif
 
+#if !defined __x86_64 && defined __sun__ && defined __svr4__
+#include 
+#include 
+
+static volatile sig_atomic_t sigill_caught;
+
+static void
+sigill_hdlr (int sig __attribute((unused)),
+siginfo_t *sip __attribute__((unused)),
+ucontext_t *ucp)
+{
+  sigill_caught = 1;
+  /* Set PC to the instruction after the faulting one to skip over it,
+ otherwise we enter an infinite loop.  4 is the size of the stmxcsr
+ instruction.  */
+  ucp->uc_mcontext.gregs[EIP] += 4;
+  setcontext (ucp);
+}
+#endif
+
 static void __attribute__((constructor))
 #ifndef __x86_64__
 /* The i386 ABI only requires 4-byte stack alignment, so this is necessary
@@ -47,9 +67,31 @@ set_fast_math (void)
 
   if (edx & bit_SSE)
 {
-  unsigned int mxcsr = __builtin_ia32_stmxcsr ();
+  unsigned int mxcsr;
   
-  mxcsr |= MXCSR_FTZ;
+#if defined __sun__ && defined __svr4__
+  /* Solaris 2 before Solaris 9 4/04 cannot execute SSE instructions even
+if the CPU supports them.  Programs receive SIGILL instead, so check
+for that at runtime.  */
+  struct sigaction act, oact;
+
+  act.sa_handler = sigill_hdlr;
+  sigemptyset (&act.sa_mask);
+  /* Need to set SA_SIGINFO so a ucontext_t * is passed to the handler.  */
+  act.sa_flags = SA_SIGINFO;
+  sigaction (SIGILL, &act, &oact);
+
+  /* We need a single SSE instruction here so the handler can safely skip
+over it.  */
+  __asm__ volatile ("movss %xmm2,%xmm1");
+
+  sigaction (SIGILL, &oact, NULL);
+
+  if (sigill_caught)
+   return;
+#endif /* __sun__ && __svr4__ */
+
+  mxcsr = __builtin_ia32_stmxcsr () | MXCSR_FTZ;
 
   if (edx & bit_FXSAVE)
{
diff --git a/gcc/config/sol2.h b/gcc/config/sol2.h
--- a/gcc/config/sol2.h
+++ b/gcc/config/sol2.h
@@ -141,7 +141,9 @@ along with GCC; see the file COPYING3.  
  %{p|pg:-ldl} -lc}"
 
 #undef  ENDFILE_SPEC
-#define ENDFILE_SPEC "crtend.o%s crtn.o%s"
+#define ENDFILE_SPEC \
+  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
+   crtend.o%s crtn.o%s"
 
 /* We don't use the standard svr4 STARTFILE_SPEC because it's wrong for us.  */
 #undef STARTFILE_SPEC
diff --git a/gcc/config/sparc/sol2.h b/gcc/config/sparc/sol2.h
--- a/gcc/config/sparc/sol2.h
+++ b/gcc/config/sparc/sol2.h
@@ -117,11 +117,6 @@ along with GCC; see the file COPYING3.  
 #define NO_DBX_BNSYM_ENSYM 1
 
 
-#undef  ENDFILE_SPEC
-#define ENDFILE_SPEC \
-  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
-   crtend.o%s crtn.o%s"
-
 /* Select a format to encode pointers in exception handling data.  CODE
is 0 for data, 1 for code labels, 2 for function pointers.  GLOBAL is
true if the symbol may be affected by dynamic relocations.
diff --git a/libgcc/config.host b/libgcc/config.host
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -338,6 +338,8 @@ i[34567]86-*-rtems*)
tmake_file="${tmake_file} t-crtin i386/t-softfp i386/t-crtstuff t-rtems"
;;
 i[34567]86-*-solaris2*)
+   tmake_file="$tmake_file i386/t-crtfm"
+   extra_parts="$extra_parts crtfastmath.o"
;;
 i[4567]86-wrs-vxworks|i[4567]86-wrs-vxworksae)
;;


-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


better wpa [2/n]: merge some top-level trees

2011-06-01 Thread Michael Matz
Hi,

so here's some more of my patch queue.  It adds the facility to also merge 
trees other than types across compilation unit boundaries.  This specific 
patch only deals with STRING_CST and INTEGER_CST nodes.  Originally I used 
that place for merging declarations, hence the naming of some variables.  
That needs to wait for adjustments of the cgraph/varpool streamer.
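
The STRING_CST part boils down to interning: hash the node, look for an equal one already registered, and reuse it if found. A stand-alone sketch of that idea, using a tiny fixed-size open-addressing table instead of libiberty's htab; `str_node`, `uniquify` and the table size are assumptions made for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SLOTS 64		/* Power of two; tiny on purpose.  */

struct str_node
{
  const char *ptr;
  size_t len;
};

static const struct str_node *table[TABLE_SLOTS];
static size_t table_used;

/* FNV-1a style hash over the bytes, mixed with the length.  */
static unsigned long
hash_node (const struct str_node *n)
{
  unsigned long h = 2166136261UL;
  size_t i;

  for (i = 0; i < n->len; i++)
    h = (h ^ (unsigned char) n->ptr[i]) * 16777619UL;
  return h ^ n->len;
}

static int
eq_node (const struct str_node *a, const struct str_node *b)
{
  return a->len == b->len && memcmp (a->ptr, b->ptr, a->len) == 0;
}

/* Return the canonical node equal to N, registering N itself if no equal
   node has been seen yet.  */
const struct str_node *
uniquify (const struct str_node *n)
{
  size_t i = hash_node (n) & (TABLE_SLOTS - 1);

  while (table[i] != NULL)
    {
      if (eq_node (table[i], n))
	return table[i];
      i = (i + 1) & (TABLE_SLOTS - 1);
    }
  /* Keep at least one slot free so the probe loop always terminates.  */
  assert (table_used < TABLE_SLOTS - 1);
  table[i] = n;
  table_used++;
  return n;
}
```

Two nodes with the same bytes coming from different "units" resolve to whichever was registered first, so the later copy can be dropped.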

On building cc1 the overall numbers of trees that stay around after all 
global sections are streamed in goes down from 620818 to 598843 trees, 
overall saving 1 MB (out of 76 MB for just the trees).  Not much, but 
hey...

Regstrapping on x86_64-linux in progress.  Okay for trunk?


Ciao,
Michael.


Re: better wpa [2/n]: merge some top-level trees

2011-06-01 Thread Michael Matz
Hi,

On Wed, 1 Jun 2011, Michael Matz wrote:

> Hi,
> 
> so here's something more of my patch queue.  It adds the facility to merge 
> also other trees than types over compilation unit borders.  This specific 
> patch only deals with STRING_CST and INTEGER_CST nodes.  Originally I used 
> that place for merging declarations, hence the naming of some variables.  
> That needs to wait for adjustments of the cgraph/varpool streamer.
> 
> On building cc1 the overall numbers of trees that stay around after all 
> global sections are streamed in goes down from 620818 to 598843 trees, 
> overall saving 1 MB (out of 76 MB for just the trees).  Not much, but 
> hey...
> 
> Regstrapping on x86_64-linux in progress.  Okay for trunk?

Ahem.

* lto.c (top_decls): New hash table.
(simple_tree_hash, simple_tree_eq, uniquify_top): New.
(LTO_FIXUP_TREE): Call uniquify_top.
(uniquify_nodes): Ditto.
(read_cgraph_and_symbols): Allocate and destroy top_decls.

Index: lto/lto.c
===
*** lto/lto.c   (revision 174524)
--- lto/lto.c   (working copy)
*** lto_read_in_decl_state (struct data_in *
*** 231,236 
--- 231,316 
 that must be replaced with their prevailing variant.  */
  static GTY((if_marked ("ggc_marked_p"), param_is (union tree_node))) htab_t
tree_with_vars;
+ /* A hashtable of top-level trees that can potentially be merged with trees
+from other compilation units.  */
+ static GTY((if_marked ("ggc_marked_p"), param_is (union tree_node))) htab_t
+   top_decls;
+ 
+ /* Given a tree in ITEM, return a hash value designed to easily recognize
+equal trees.  */
+ static hashval_t
+ simple_tree_hash (const void *item)
+ {
+   tree t = CONST_CAST_TREE ((const_tree) item);
+   enum tree_code code = TREE_CODE (t);
+   hashval_t hashcode = 0;
+   hashcode = iterative_hash_object (code, hashcode);
+   if (TREE_CODE (t) == STRING_CST)
+ {
+   if (TREE_TYPE (t))
+   hashcode = iterative_hash_object (TYPE_HASH (TREE_TYPE (t)), hashcode);
+   hashcode = iterative_hash_hashval_t ((hashval_t)TREE_STRING_LENGTH (t),
+  hashcode);
+   hashcode = iterative_hash (TREE_STRING_POINTER (t),
+TREE_STRING_LENGTH (t), hashcode);
+   return hashcode;
+ }
+   gcc_unreachable ();
+ }
+ 
+ /* Given two trees in VA and VB return true if both trees can be considered
+equal for easy merging across different compilation units.  */
+ static int
+ simple_tree_eq (const void *va, const void *vb)
+ {
+   tree a = CONST_CAST_TREE ((const_tree) va);
+   tree b = CONST_CAST_TREE ((const_tree) vb);
+   if (a == b)
+ return 1;
+   if (TREE_CODE (a) != TREE_CODE (b))
+ return 0;
+   if (TREE_CODE (a) == STRING_CST)
+ {
+   return TREE_TYPE (a) == TREE_TYPE (b)
+&& TREE_STRING_LENGTH (a) == TREE_STRING_LENGTH (b)
+&& !memcmp (TREE_STRING_POINTER (a), TREE_STRING_POINTER (b),
+TREE_STRING_LENGTH (a));
+ }
+   gcc_unreachable ();
+ }
+ 
+ /* Given a tree T return its canonical variant, considering merging
+of equal trees across different compilation units.  */
+ static tree
+ uniquify_top (tree t)
+ {
+   switch (TREE_CODE (t))
+ {
+   case INTEGER_CST:
+   {
+ tree newtype = gimple_register_type (TREE_TYPE (t));
+ if (newtype != TREE_TYPE (t))
+   t = build_int_cst_wide (newtype, TREE_INT_CST_LOW (t),
+   TREE_INT_CST_HIGH (t));
+   }
+ break;
+   case STRING_CST:
+   {
+ tree *t2;
+ if (TREE_TYPE (t))
+   TREE_TYPE (t) = gimple_register_type (TREE_TYPE (t));
+ t2 = (tree *) htab_find_slot (top_decls, t, INSERT);
+ if (*t2)
+   t = *t2;
+ else
+   *t2 = t;
+   }
+ break;
+   default:
+   break;
+ }
+   return t;
+ }
  
  /* Remember that T is a tree that (potentially) refers to a variable
 or function decl that may be replaced with its prevailing variant.  */
*** remember_with_vars (tree t)
*** 247,252 
--- 327,334 
{ \
  if (TYPE_P (tt)) \
(tt) = gimple_register_type (tt); \
+ else\
+   (tt) = uniquify_top (tt); \
  if (VAR_OR_FUNCTION_DECL_P (tt) && TREE_PUBLIC (tt)) \
remember_with_vars (t); \
} \
*** lto_fixup_types (tree t)
*** 504,510 
  /* Given a streamer cache structure DATA_IN (holding a sequence of trees
     for one compilation unit) go over all trees starting at index FROM until the
 end of the sequence and replace fields of those trees, and the trees
!themself with their canonical variants as per gimple_register_type.  */
  
  static void
  uniquify_nodes (struct data_in *data_in, unsigned from)
--- 586,593 
  /* Given a streamer cache structure DATA

[PATCH, ARM] Make usage of MOVT/MOVW pairs (vs. constant pool) a tunable parameter

2011-06-01 Thread Julian Brown
This patch allows the usage of MOVT/MOVW pairs (vs. constant-pool loads)
to be controlled based on the target CPU (-mtune= option), using the ARM
backend's tuning infrastructure. This is to enable constant-pool loads
to be used in preference to MOVW/MOVT instruction pairs, when the
former are faster for a given core.

This patch just adds the field to the tuning structure, and adds
(dummy) tuning structures. There should be no effective change in
behaviour.
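
In outline, the new field is just a per-core boolean consulted when deciding how to materialize a constant. A cut-down sketch of that shape; the struct, the two instances and `use_movt` are hypothetical and do not reproduce the actual TARGET_USE_MOVT definition:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down tuning record; only the new field is modelled.  */
struct tune_sketch
{
  bool prefer_constant_pool;
};

const struct tune_sketch old_core_tune = { true };
const struct tune_sketch cortex_like_tune = { false };

/* Decide whether to build a 32-bit constant with a MOVW/MOVT pair.
   HAS_MOVT says whether the architecture provides the instructions.  */
bool
use_movt (const struct tune_sketch *tune, bool has_movt)
{
  assert (tune != NULL);
  return has_movt && !tune->prefer_constant_pool;
}
```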

Testing has not yet completed (but isn't expected to show up anything
untoward). OK to apply?

Thanks,

Julian

ChangeLog

gcc/
* arm-cores.def (arm1156t2-s, arm1156t2f-s): Use v6t2 tuning.
(cortex-a5, cortex-a8, cortex-a15, cortex-r4, cortex-r4f, cortex-m4)
(cortex-m3, cortex-m1, cortex-m0): Use cortex tuning.
* config/arm/arm-protos.h (tune_params): Add prefer_constant_pool
field.
* config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune)
(arm_xscale_tune, arm_9e_tune, arm_cortex_a9_tune)
(arm_fa726te_tune): Add prefer_constant_pool setting.
(arm_v6t2_tune, arm_cortex_tune): New.
* config/arm/arm.h (TARGET_USE_MOVT): Make dependent on
prefer_constant_pool setting.

commit 98c2f384567c54f41244b8869b31d15662afe95e
Author: Julian Brown 
Date:   Fri May 27 10:00:00 2011 -0700

Make usage of constant pool (vs movt) a tunable parameter.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 0bb9aa3..b315df7 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -122,15 +122,15 @@ ARM_CORE("arm1176jz-s",	  arm1176jzs,	6ZK, FL_LDSCHED, 9e)
 ARM_CORE("arm1176jzf-s",  arm1176jzfs,	6ZK, FL_LDSCHED | FL_VFPV2, 9e)
 ARM_CORE("mpcorenovfp",	  mpcorenovfp,	6K, FL_LDSCHED, 9e)
 ARM_CORE("mpcore",	  mpcore,	6K, FL_LDSCHED | FL_VFPV2, 9e)
-ARM_CORE("arm1156t2-s",	  arm1156t2s,	6T2, FL_LDSCHED, 9e)
-ARM_CORE("arm1156t2f-s",  arm1156t2fs,  6T2, FL_LDSCHED | FL_VFPV2, 9e)
-ARM_CORE("cortex-a5",	  cortexa5,	7A, FL_LDSCHED, 9e)
-ARM_CORE("cortex-a8",	  cortexa8,	7A, FL_LDSCHED, 9e)
+ARM_CORE("arm1156t2-s",	  arm1156t2s,	6T2, FL_LDSCHED, v6t2)
+ARM_CORE("arm1156t2f-s",  arm1156t2fs,  6T2, FL_LDSCHED | FL_VFPV2, v6t2)
+ARM_CORE("cortex-a5",	  cortexa5,	7A, FL_LDSCHED, cortex)
+ARM_CORE("cortex-a8",	  cortexa8,	7A, FL_LDSCHED, cortex)
 ARM_CORE("cortex-a9",	  cortexa9,	7A, FL_LDSCHED, cortex_a9)
-ARM_CORE("cortex-a15",	  cortexa15,	7A, FL_LDSCHED, 9e)
-ARM_CORE("cortex-r4",	  cortexr4,	7R, FL_LDSCHED, 9e)
-ARM_CORE("cortex-r4f",	  cortexr4f,	7R, FL_LDSCHED, 9e)
-ARM_CORE("cortex-m4",	  cortexm4,	7EM, FL_LDSCHED, 9e)
-ARM_CORE("cortex-m3",	  cortexm3,	7M, FL_LDSCHED, 9e)
-ARM_CORE("cortex-m1",	  cortexm1,	6M, FL_LDSCHED, 9e)
-ARM_CORE("cortex-m0",	  cortexm0,	6M, FL_LDSCHED, 9e)
+ARM_CORE("cortex-a15",	  cortexa15,	7A, FL_LDSCHED, cortex)
+ARM_CORE("cortex-r4",	  cortexr4,	7R, FL_LDSCHED, cortex)
+ARM_CORE("cortex-r4f",	  cortexr4f,	7R, FL_LDSCHED, cortex)
+ARM_CORE("cortex-m4",	  cortexm4,	7EM, FL_LDSCHED, cortex)
+ARM_CORE("cortex-m3",	  cortexm3,	7M, FL_LDSCHED, cortex)
+ARM_CORE("cortex-m1",	  cortexm1,	6M, FL_LDSCHED, cortex)
+ARM_CORE("cortex-m0",	  cortexm0,	6M, FL_LDSCHED, cortex)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index fa25283..8e0d54d 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -224,6 +224,7 @@ struct tune_params
   int num_prefetch_slots;
   int l1_cache_size;
   int l1_cache_line_size;
+  bool prefer_constant_pool;
 };
 
 extern const struct tune_params *current_tune;
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 1eda45e..8c8982d 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -854,48 +854,73 @@ const struct tune_params arm_slowmul_tune =
 {
   arm_slowmul_rtx_costs,
   NULL,
-  3,
-  ARM_PREFETCH_NOT_BENEFICIAL
+  3,		/* Constant limit.  */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true		/* Prefer constant pool.  */
 };
 
 const struct tune_params arm_fastmul_tune =
 {
   arm_fastmul_rtx_costs,
   NULL,
-  1,
-  ARM_PREFETCH_NOT_BENEFICIAL
+  1,		/* Constant limit.  */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true		/* Prefer constant pool.  */
 };
 
 const struct tune_params arm_xscale_tune =
 {
   arm_xscale_rtx_costs,
   xscale_sched_adjust_cost,
-  2,
-  ARM_PREFETCH_NOT_BENEFICIAL
+  2,		/* Constant limit.  */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true		/* Prefer constant pool.  */
 };
 
 const struct tune_params arm_9e_tune =
 {
   arm_9e_rtx_costs,
   NULL,
-  1,
-  ARM_PREFETCH_NOT_BENEFICIAL
+  1,		/* Constant limit.  */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  true		/* Prefer constant pool.  */
+};
+
+const struct tune_params arm_v6t2_tune =
+{
+  arm_9e_rtx_costs,
+  NULL,
+  1,		/* Constant limit.  */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  false		/* Prefer constant pool.  */
+};
+
+/* Generic Cortex tuning.  Use more specific tunings if

[PATCH, ARM] Make branch cost a tunable parameter

2011-06-01 Thread Julian Brown
This patch allows the BRANCH_COST macro to be altered for a given
target using the ARM backend's tuning infrastructure. It's not easy
to reduce the cost to e.g. a single integer or a set of integers (cores
may have different branch costing characteristics for ARM vs. Thumb-2
mode for instance, as in the existing BRANCH_COST definition), so I've
used a function pointer in the tuning structure for maximum flexibility.

This patch just uses the same hook for all existing cores (i.e. it
should result in unchanged behaviour). Later patches can then override
the default in specific cases.
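
A stand-alone sketch of the function-pointer approach, to show how a later patch could override the default for one core. Everything here apart from the (bool, bool) hook signature is made up for illustration, including the cost values:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down tuning record holding only the new hook.  */
struct branch_tune_sketch
{
  int (*branch_cost) (bool speed_p, bool predictable_p);
};

/* Default hook shared by all cores; the value is illustrative.  */
int
default_branch_cost (bool speed_p, bool predictable_p)
{
  (void) speed_p;
  (void) predictable_p;
  return 4;
}

/* Hypothetical per-core override: predictable branches are cheap.  */
int
cheap_predictable_branch_cost (bool speed_p, bool predictable_p)
{
  (void) speed_p;
  return predictable_p ? 1 : 4;
}

const struct branch_tune_sketch generic_tune = { default_branch_cost };
const struct branch_tune_sketch special_tune = { cheap_predictable_branch_cost };

/* Stand-in for what a BRANCH_COST-style macro would expand to.  */
int
branch_cost (const struct branch_tune_sketch *tune, bool speed_p,
	     bool predictable_p)
{
  assert (tune != NULL && tune->branch_cost != NULL);
  return tune->branch_cost (speed_p, predictable_p);
}
```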

Testing is still in progress. OK to apply, pending success with that?

Thanks,

Julian

ChangeLog

gcc/
* config/arm/arm-protos.h (tune_params): Add branch_cost hook.
* config/arm/arm.c (arm_default_branch_cost): New.
(arm_slowmul_tune, arm_fastmul_tune, arm_xscale_tune, arm_9e_tune)
(arm_v6t2_tune, arm_cortex_tune, arm_cortex_a9_tune)
(arm_fa726te_tune): Set branch_cost field using
arm_default_branch_cost.
* config/arm/arm.h (BRANCH_COST): Use branch_cost hook from
current_tune structure.
* dojump.c (tm_p.h): Include file.

commit 31c8614cbe32a81960c9d8634f2c06534492b515
Author: Julian Brown 
Date:   Fri May 27 10:39:01 2011 -0700

Make branch cost a tunable parameter.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 8e0d54d..c104d74 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -225,6 +225,7 @@ struct tune_params
   int l1_cache_size;
   int l1_cache_line_size;
   bool prefer_constant_pool;
+  int (*branch_cost) (bool, bool);
 };
 
 extern const struct tune_params *current_tune;
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8c8982d..c7eb5b0 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -255,6 +255,7 @@ static bool arm_builtin_support_vector_misalignment (enum machine_mode mode,
 static void arm_conditional_register_usage (void);
 static reg_class_t arm_preferred_rename_class (reg_class_t rclass);
 static unsigned int arm_autovectorize_vector_sizes (void);
+static int arm_default_branch_cost (bool, bool);
 
 
 /* Table of machine attributes.  */
@@ -856,7 +857,8 @@ const struct tune_params arm_slowmul_tune =
   NULL,
   3,		/* Constant limit.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
-  true		/* Prefer constant pool.  */
+  true,		/* Prefer constant pool.  */
+  arm_default_branch_cost
 };
 
 const struct tune_params arm_fastmul_tune =
@@ -865,7 +867,8 @@ const struct tune_params arm_fastmul_tune =
   NULL,
   1,		/* Constant limit.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
-  true		/* Prefer constant pool.  */
+  true,		/* Prefer constant pool.  */
+  arm_default_branch_cost
 };
 
 const struct tune_params arm_xscale_tune =
@@ -874,7 +877,8 @@ const struct tune_params arm_xscale_tune =
   xscale_sched_adjust_cost,
   2,		/* Constant limit.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
-  true		/* Prefer constant pool.  */
+  true,		/* Prefer constant pool.  */
+  arm_default_branch_cost
 };
 
 const struct tune_params arm_9e_tune =
@@ -883,7 +887,8 @@ const struct tune_params arm_9e_tune =
   NULL,
   1,		/* Constant limit.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
-  true		/* Prefer constant pool.  */
+  true,		/* Prefer constant pool.  */
+  arm_default_branch_cost
 };
 
 const struct tune_params arm_v6t2_tune =
@@ -892,7 +897,8 @@ const struct tune_params arm_v6t2_tune =
   NULL,
   1,		/* Constant limit.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
-  false		/* Prefer constant pool.  */
+  false,	/* Prefer constant pool.  */
+  arm_default_branch_cost
 };
 
 /* Generic Cortex tuning.  Use more specific tunings if appropriate.  */
@@ -902,7 +908,8 @@ const struct tune_params arm_cortex_tune =
   NULL,
   1,		/* Constant limit.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
-  false		/* Prefer constant pool.  */
+  false,	/* Prefer constant pool.  */
+  arm_default_branch_cost
 };
 
 const struct tune_params arm_cortex_a9_tune =
@@ -911,7 +918,8 @@ const struct tune_params arm_cortex_a9_tune =
   cortex_a9_sched_adjust_cost,
   1,		/* Constant limit.  */
   ARM_PREFETCH_BENEFICIAL(4,32,32),
-  false		/* Prefer constant pool.  */
+  false,	/* Prefer constant pool.  */
+  arm_default_branch_cost
 };
 
 const struct tune_params arm_fa726te_tune =
@@ -920,7 +928,8 @@ const struct tune_params arm_fa726te_tune =
   fa726te_sched_adjust_cost,
   1,		/* Constant limit.  */
   ARM_PREFETCH_NOT_BENEFICIAL,
-  true		/* Prefer constant pool.  */
+  true,		/* Prefer constant pool.  */
+  arm_default_branch_cost
 };
 
 
@@ -8080,6 +8089,15 @@ arm_adjust_cost (rtx insn, rtx link, rtx dep, int cost)
   return cost;
 }
 
+static int
+arm_default_branch_cost (bool speed_p, bool predictable_p ATTRIBUTE_UNUSED)
+{
+  if (TARGET_32BIT)
+return (TARGET_THUMB2 && !speed_p) ? 1 : 4;
+  else
+return (optimize > 0) ? 2 : 0;
+}
+
 static int fp_consts_inited =

Re: [PATCH][RFC] Init sizetypes based on target defs

2011-06-01 Thread Richard Guenther
On Tue, 31 May 2011, Richard Guenther wrote:

> 
> This initializes sizetypes correctly from the start, using target
> definitions available.  All Frontends initialize sizetypes from
> size_type_node for which there is a target macro SIZE_TYPE which
> tells what type to use for this (C runtime ABI) type.
> 
> Now, there are two front ends that do not honor SIZE_TYPE but
> have ideas of their own.  Those are Java (probably by accident) and
> Ada (of course).  Java does
> 
>   /* This is not a java type, however tree-dfa requires a definition for
>  size_type_node.  */
>   size_type_node = make_unsigned_type (POINTER_SIZE);
>   set_sizetype (size_type_node);
> 
> so the FE itself doesn't care and POINTER_SIZE for almost all targets
> yields the same result as following the SIZE_TYPE advice.  Ada has
> its own idea and thinks it can choose size_t freely,
> 
>   /* In Ada, we use the unsigned type corresponding to the width of Pmode as
>      SIZETYPE.  In most cases when ptr_mode and Pmode differ, C will use the
>      width of ptr_mode for SIZETYPE, but we get better code using the width
>      of Pmode.  Note that, although we manipulate negative offsets for some
>      internal constructs and rely on compile time overflow detection in size
>      computations, using unsigned types for SIZETYPEs is fine since they are
>      treated specially by the middle-end, in particular sign-extended.  */
>   size_type_node = gnat_type_for_mode (Pmode, 1);
>   set_sizetype (size_type_node);
>   TYPE_NAME (sizetype) = get_identifier ("size_type");
> 
> hmm, yes.  Again practically for most targets size_t will be following
> its SIZE_TYPE advice, but surely not for all.  OTOH while the above
> clearly doesn't look "accidental", it certainly looks wrong.  If
> not for sizetype then at least for size_type_node.  The comment hints
> that the patch at most will no longer "get better code", but if
> Pmode gets better code when used for sizetype(!) then we should do
> so unconditionally and could get rid of the size_t reverse-engineering
> in initialize_sizetypes completely (m32c might disagree here).
> 
> Not yet bootstrapped or tested (but I don't expect any issues other
> than eventual typos on the targets I have access to).
> 
> Now, any objections?  (Patch to be adjusted to really remove
> all set_sizetype calls)

And this one, on top of the previously posted patch to defer things
to the middle-end, passed bootstrap and regtest for all languages
on x86_64-unknown-linux-gnu.
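
The core of the new initialize_sizetypes is a lookup from the SIZE_TYPE spelling to a precision, plus a slightly wider precision for bitsizetype. A host-side sketch of those two steps, using host type widths in place of INT_TYPE_SIZE and friends, and returning 0 for unknown spellings where the patch calls gcc_unreachable; the function names are illustrative, and the rounding to a machine mode is left out:

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Map a SIZE_TYPE-style spelling to a precision in bits, or 0 if the
   spelling is not recognized.  */
unsigned
precision_for_size_type (const char *name)
{
  assert (name != NULL);
  if (strcmp (name, "unsigned int") == 0)
    return sizeof (unsigned int) * CHAR_BIT;
  if (strcmp (name, "long unsigned int") == 0)
    return sizeof (unsigned long) * CHAR_BIT;
  if (strcmp (name, "long long unsigned int") == 0)
    return sizeof (unsigned long long) * CHAR_BIT;
  return 0;
}

/* A size in bits needs log2 (bits per unit) more bits than the same size
   in units, plus one, capped at MAX_BITS.  */
unsigned
bit_precision (unsigned precision, unsigned bits_per_unit_log,
	       unsigned max_bits)
{
  unsigned wanted = precision + bits_per_unit_log + 1;

  return wanted < max_bits ? wanted : max_bits;
}
```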

Richard.

2011-05-31  Richard Guenther  

* stor-layout.c (initialize_sizetypes): Initialize all
sizetypes based on target definitions.
(set_sizetype): Remove.

Index: gcc/stor-layout.c
===
*** gcc/stor-layout.c.orig  2011-06-01 15:41:56.0 +0200
--- gcc/stor-layout.c   2011-06-01 16:14:03.0 +0200
*** make_accum_type (int precision, int unsi
*** 2189,2216 
return type;
  }
  
! /* Initialize sizetype and bitsizetype to a reasonable and temporary
!value to enable integer types to be created.  */
  
  void
  initialize_sizetypes (void)
  {
!   tree t = make_node (INTEGER_TYPE);
!   int precision = GET_MODE_BITSIZE (SImode);
  
!   SET_TYPE_MODE (t, SImode);
!   TYPE_ALIGN (t) = GET_MODE_ALIGNMENT (SImode);
!   TYPE_IS_SIZETYPE (t) = 1;
!   TYPE_UNSIGNED (t) = 1;
!   TYPE_SIZE (t) = build_int_cst (t, precision);
!   TYPE_SIZE_UNIT (t) = build_int_cst (t, GET_MODE_SIZE (SImode));
!   TYPE_PRECISION (t) = precision;
  
!   set_min_and_max_values_for_integral_type (t, precision,
/*is_unsigned=*/true);
  
!   sizetype = t;
!   bitsizetype = build_distinct_type_copy (t);
  }
  
  /* Make sizetype a version of TYPE, and initialize *sizetype accordingly.
--- 2189,2258 
return type;
  }
  
! /* Initialize sizetypes so layout_type can use them.  */
  
  void
  initialize_sizetypes (void)
  {
!   int precision, bprecision;
  
!   /* Get sizetypes precision from the SIZE_TYPE target macro.  */
!   if (strcmp (SIZE_TYPE, "unsigned int") == 0)
! precision = INT_TYPE_SIZE;
!   else if (strcmp (SIZE_TYPE, "long unsigned int") == 0)
! precision = LONG_TYPE_SIZE;
!   else if (strcmp (SIZE_TYPE, "long long unsigned int") == 0)
! precision = LONG_LONG_TYPE_SIZE;
!   else
! gcc_unreachable ();
! 
!   bprecision
! = MIN (precision + BITS_PER_UNIT_LOG + 1, MAX_FIXED_MODE_SIZE);
!   bprecision
! = GET_MODE_PRECISION (smallest_mode_for_size (bprecision, MODE_INT));
!   if (bprecision > HOST_BITS_PER_WIDE_INT * 2)
! bprecision = HOST_BITS_PER_WIDE_INT * 2;
! 
!   /* Create stubs for sizetype and bitsizetype so we can create constants.  */
!   sizetype = make_node (INTEGER_TYPE);
!   /* ???  We can't set a name for sizetype because it appears in C diagnostics
!  and pp_c_type_specifier doesn't deal with IDENTIFIER_NODE TYPE_NAMEs.  */
!   TYPE_PRECISIO

Re: better wpa [2/n]: merge some top-level trees

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 5:07 PM, Michael Matz  wrote:
> Hi,
>
> On Wed, 1 Jun 2011, Michael Matz wrote:
>
>> Hi,
>>
>> so here's something more of my patch queue.  It adds the facility to merge
>> also other trees than types over compilation unit borders.  This specific
>> patch only deals with STRING_CST and INTEGER_CST nodes.  Originally I used
>> that place for merging declarations, hence the naming of some variables.
>> That needs to wait for adjustments of the cgraph/varpool streamer.
>>
>> On building cc1 the overall numbers of trees that stay around after all
>> global sections are streamed in goes down from 620818 to 598843 trees,
>> overall saving 1 MB (out of 76 MB for just the trees).  Not much, but
>> hey...
>>
>> Regstrapping on x86_64-linux in progress.  Okay for trunk?

Ok.

Thanks,
Richard.

> Ahem.
>
>        * lto.c (top_decls): New hash table.
>        (simple_tree_hash, simple_tree_eq, uniquify_top): New.
>        (LTO_FIXUP_TREE): Call uniquify_top.
>        (uniquify_nodes): Ditto.
>        (read_cgraph_and_symbols): Allocate and destroy top_decls.
>
> Index: lto/lto.c
> ===
> *** lto/lto.c   (revision 174524)
> --- lto/lto.c   (working copy)
> *** lto_read_in_decl_state (struct data_in *
> *** 231,236 
> --- 231,316 
>     that must be replaced with their prevailing variant.  */
>  static GTY((if_marked ("ggc_marked_p"), param_is (union tree_node))) htab_t
>    tree_with_vars;
> + /* A hashtable of top-level trees that can potentially be merged with trees
> +    from other compilation units.  */
> + static GTY((if_marked ("ggc_marked_p"), param_is (union tree_node))) htab_t
> +   top_decls;
> +
> + /* Given a tree in ITEM, return a hash value designed to easily recognize
> +    equal trees.  */
> + static hashval_t
> + simple_tree_hash (const void *item)
> + {
> +   tree t = CONST_CAST_TREE ((const_tree) item);
> +   enum tree_code code = TREE_CODE (t);
> +   hashval_t hashcode = 0;
> +   hashcode = iterative_hash_object (code, hashcode);
> +   if (TREE_CODE (t) == STRING_CST)
> +     {
> +       if (TREE_TYPE (t))
> +       hashcode = iterative_hash_object (TYPE_HASH (TREE_TYPE (t)), hashcode);
> +       hashcode = iterative_hash_hashval_t ((hashval_t)TREE_STRING_LENGTH (t),
> +                                          hashcode);
> +       hashcode = iterative_hash (TREE_STRING_POINTER (t),
> +                                TREE_STRING_LENGTH (t), hashcode);
> +       return hashcode;
> +     }
> +   gcc_unreachable ();
> + }
> +
> + /* Given two trees in VA and VB return true if both trees can be considered
> +    equal for easy merging across different compilation units.  */
> + static int
> + simple_tree_eq (const void *va, const void *vb)
> + {
> +   tree a = CONST_CAST_TREE ((const_tree) va);
> +   tree b = CONST_CAST_TREE ((const_tree) vb);
> +   if (a == b)
> +     return 1;
> +   if (TREE_CODE (a) != TREE_CODE (b))
> +     return 0;
> +   if (TREE_CODE (a) == STRING_CST)
> +     {
> +       return TREE_TYPE (a) == TREE_TYPE (b)
> +            && TREE_STRING_LENGTH (a) == TREE_STRING_LENGTH (b)
> +            && !memcmp (TREE_STRING_POINTER (a), TREE_STRING_POINTER (b),
> +                        TREE_STRING_LENGTH (a));
> +     }
> +   gcc_unreachable ();
> + }
> +
> + /* Given a tree T return its canonical variant, considering merging
> +    of equal trees across different compilation units.  */
> + static tree
> + uniquify_top (tree t)
> + {
> +   switch (TREE_CODE (t))
> +     {
> +       case INTEGER_CST:
> +       {
> +         tree newtype = gimple_register_type (TREE_TYPE (t));
> +         if (newtype != TREE_TYPE (t))
> +           t = build_int_cst_wide (newtype, TREE_INT_CST_LOW (t),
> +                                   TREE_INT_CST_HIGH (t));
> +       }
> +         break;
> +       case STRING_CST:
> +       {
> +         tree *t2;
> +         if (TREE_TYPE (t))
> +           TREE_TYPE (t) = gimple_register_type (TREE_TYPE (t));
> +         t2 = (tree *) htab_find_slot (top_decls, t, INSERT);
> +         if (*t2)
> +           t = *t2;
> +         else
> +           *t2 = t;
> +       }
> +         break;
> +       default:
> +       break;
> +     }
> +   return t;
> + }
>
>  /* Remember that T is a tree that (potentially) refers to a variable
>     or function decl that may be replaced with its prevailing variant.  */
> *** remember_with_vars (tree t)
> *** 247,252 
> --- 327,334 
>        { \
>          if (TYPE_P (tt)) \
>            (tt) = gimple_register_type (tt); \
> +         else\
> +           (tt) = uniquify_top (tt); \
>          if (VAR_OR_FUNCTION_DECL_P (tt) && TREE_PUBLIC (tt)) \
>            remember_with_vars (t); \
>        } \
> *** lto_fixup_types (tree t)
> *** 504,510 
>  /* Given a streamer cache structure DATA_IN (holding a sequence of trees
>     for one compilation unit) go over all trees starting at index FROM
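
For readers following the patch above, the hash/eq/find-slot pattern behind
uniquify_top can be sketched outside of GCC.  This is only a minimal
illustration under stated assumptions: struct str_node, the fixed-size
bucket array and the FNV-style mixing are hypothetical stand-ins, not
GCC's tree, htab_t or iterative_hash machinery.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a STRING_CST: a type tag plus
   length-counted bytes.  Not GCC's tree representation.  */
struct str_node
{
  int type_id;
  size_t len;
  const char *bytes;
  struct str_node *next;	/* Bucket chain.  */
};

#define N_BUCKETS 64
static struct str_node *buckets[N_BUCKETS];

/* Mix type, length and contents, mirroring what simple_tree_hash
   feeds into its hash.  The FNV-1a constants are an arbitrary choice.  */
static unsigned
str_node_hash (const struct str_node *n)
{
  unsigned h = 2166136261u;
  size_t i;

  h = (h ^ (unsigned) n->type_id) * 16777619u;
  h = (h ^ (unsigned) n->len) * 16777619u;
  for (i = 0; i < n->len; i++)
    h = (h ^ (unsigned char) n->bytes[i]) * 16777619u;
  return h;
}

/* Equal if same type, same length and same bytes.  */
static int
str_node_eq (const struct str_node *a, const struct str_node *b)
{
  return a == b
	 || (a->type_id == b->type_id
	     && a->len == b->len
	     && memcmp (a->bytes, b->bytes, a->len) == 0);
}

/* Return the canonical node equal to N, registering N itself if no
   equal node has been seen before.  */
struct str_node *
uniquify (struct str_node *n)
{
  struct str_node **slot;
  struct str_node *p;

  assert (n != NULL);
  slot = &buckets[str_node_hash (n) % N_BUCKETS];
  for (p = *slot; p; p = p->next)
    if (str_node_eq (p, n))
      return p;
  n->next = *slot;
  *slot = n;
  return n;
}
```

The first node of each equivalence class wins and later equal nodes are
replaced by it, which is the same "find slot, reuse or insert" shape as
the STRING_CST case in the patch.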

[rfa] Give thunks correct RESULT_DECL

2011-06-01 Thread Michael Matz
Hi,

I noticed this a while ago while working on early merging of decls.  When
we build thunk decls ourselves we give their RESULT_DECL integer_type, even
when the thunk decl itself says something else.  (In particular, thunks can
very well return void or a pointer type.)  This fixes that glitch.

Regstrapping in progress (on top of the wpa[2/n] patch) on x86_64-linux.
Okay for trunk?


Ciao,
Michael.
-
* cgraphunit.c (assemble_thunk): Use correct return type.

Index: cgraphunit.c
===
*** cgraphunit.c        (revision 174523)
--- cgraphunit.c        (working copy)
*** assemble_thunk (struct cgraph_node *node
*** 1412,1421 
      {
        const char *fnname;
        tree fn_block;

        DECL_RESULT (thunk_fndecl)
!       = build_decl (DECL_SOURCE_LOCATION (thunk_fndecl),
!                     RESULT_DECL, 0, integer_type_node);
        fnname = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (thunk_fndecl));

        /* The back end expects DECL_INITIAL to contain a BLOCK, so we
--- 1412,1422 
      {
        const char *fnname;
        tree fn_block;
+       tree restype = TREE_TYPE (TREE_TYPE (thunk_fndecl));

        DECL_RESULT (thunk_fndecl)
!         = build_decl (DECL_SOURCE_LOCATION (thunk_fndecl),
!                       RESULT_DECL, 0, restype);
        fnname = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (thunk_fndecl));

        /* The back end expects DECL_INITIAL to contain a BLOCK, so we


Re: [rfa] Give thunks correct RESULT_DECL

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 5:36 PM, Michael Matz  wrote:
> Hi,
>
> I noticed this a while ago while working on early merging of decls.  When
> we build thunk decls ourselves we give their RESULT_DECL integer_type, even
> when the thunk decl itself says something else.  (In particular, thunks can
> very well return void or a pointer type.)  This fixes that glitch.
>
> Regstrapping in progress (on top of the wpa[2/n] patch) on x86_64-linux.
> Okay for trunk?

Ok.

Thanks,
Richard.

>
> Ciao,
> Michael.
> -
>        * cgraphunit.c (assemble_thunk): Use correct return type.
>
> Index: cgraphunit.c
> ===
> *** cgraphunit.c        (revision 174523)
> --- cgraphunit.c        (working copy)
> *** assemble_thunk (struct cgraph_node *node
> *** 1412,1421 
>      {
>        const char *fnname;
>        tree fn_block;
>
>        DECL_RESULT (thunk_fndecl)
> !       = build_decl (DECL_SOURCE_LOCATION (thunk_fndecl),
> !                     RESULT_DECL, 0, integer_type_node);
>        fnname = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (thunk_fndecl));
>
>        /* The back end expects DECL_INITIAL to contain a BLOCK, so we
> --- 1412,1422 
>      {
>        const char *fnname;
>        tree fn_block;
> +       tree restype = TREE_TYPE (TREE_TYPE (thunk_fndecl));
>
>        DECL_RESULT (thunk_fndecl)
> !         = build_decl (DECL_SOURCE_LOCATION (thunk_fndecl),
> !                       RESULT_DECL, 0, restype);
>        fnname = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (thunk_fndecl));
>
>        /* The back end expects DECL_INITIAL to contain a BLOCK, so we
>


Re: C6X port 5/11: Track predication conditions more accurately

2011-06-01 Thread Steve Ellcey
On Wed, 2011-06-01 at 12:18 +0400, Andrey Belevantsev wrote:
> On 31.05.2011 23:59, Andrey Belevantsev wrote:
> >
> > On 31.05.2011 22:24, Steve Ellcey wrote:
> >> Bernd,
> >>
> >> This patch (r174336) is causing me many testsuite failures on IA64.
> >> Tests like gcc.c-torture/compile/20010408-1.c are dying with a
> >> seg fault in vinsn_detach.
> > I will look at it tomorrow. Bernd, Steve, please let us know about any
> > issues with sel-sched code so we can help.
> I cannot reproduce this with today's trunk with a cross either to 
> ia64-linux or ia64-hpux, can you give me a test case with compiler options 
> etc.?
> 
> Andrey

gcc.c-torture/compile/20010408-1.c was the test case that the stack
trace was from.  That test failed on IA64 HP-UX with just the -O3
option.  It looks like that passes in 64 bit mode though (-mlp64) so
that is probably why it also doesn't fail on Linux.

On IA64 Linux the failures seem to require -g or -fomit-frame-pointer
in addition to -O3.

So for example, gcc.c-torture/execute/20020402-3.c fails on IA64
Linux with -O3 -fomit-frame-pointer or with -O3 -g.

Steve Ellcey
s...@cup.hp.com



[pph] Add new C test case (issue4559064)

2011-06-01 Thread Diego Novillo
This test case from the C testsuite started working after the last
merge from trunk.

Collin, this is the test I was referring to yesterday.  I'm going to
add some more C test cases to the testsuite that are currently not
working.  I think that's going to make our fixing job easier
(otherwise, it's not trivial to test these test cases outside my own
local tree).

Tested on x86_64.  Committed to pph.

Diego.


* g++.dg/pph/c1return-5.cc: New.
* g++.dg/pph/c1return-5.h: New.

diff --git a/gcc/testsuite/g++.dg/pph/c1return-5.cc b/gcc/testsuite/g++.dg/pph/c1return-5.cc
new file mode 100644
index 000..804e113
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pph/c1return-5.cc
@@ -0,0 +1 @@
+#include "c1return-5.h"
diff --git a/gcc/testsuite/g++.dg/pph/c1return-5.h b/gcc/testsuite/g++.dg/pph/c1return-5.h
new file mode 100644
index 000..8b72a51
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pph/c1return-5.h
@@ -0,0 +1,16 @@
+#ifndef __PPH_GUARD_H
+#define __PPH_GUARD_H
+/* { dg-options "-mpreferred-stack-boundary=4" } */
+/* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-64,\[^\\n\]*sp" } } */
+
+/* This compile only test is to detect an assertion failure in stack branch
+   development.  */
+struct bar
+{
+  int x;
+} __attribute__((aligned(64)));
+
+
+struct bar
+foo (void) { }
+#endif

--
This patch is available for review at http://codereview.appspot.com/4559064


Re: [PATCH][RFC] Init sizetypes based on target defs

2011-06-01 Thread Eric Botcazou
> This initializes sizetypes correctly from the start, using target
> definitions available.  All Frontends initialize sizetypes from
> size_type_node for which there is a target macro SIZE_TYPE which
> tells what type to use for this (C runtime ABI) type.

And this is a prerequisite if you want to do LTO in the language; otherwise, 
LTO doesn't work at all, for example for Ada on the 4.5 branch.

> so the FE itself doesn't care and POINTER_SIZE for almost all targets
> yields the same result as following the SIZE_TYPE advice.  Ada has
> its own idea and thinks it can choose size_t freely,

Yes, like for boolean_type_node, you can set size_type_node to whatever you 
want as long as you don't do LTO.  At least it must be unsigned now.

> hmm, yes.  Again practically for most targets size_t will be following
> its SIZE_TYPE advice, but surely not for all.  OTOH while the above
> clearly doesn't look "accidental", it certainly looks wrong.  If
> not for sizetype then at least for size_type_node.  The comment hints
> that the patch at most will no longer "get better code", but if
> Pmode gets better code when used for sizetype(!) then we should do
> so unconditionally and could get rid of the size_t reverse-engineering
> in initialize_sizetypes completely (m32c might disagree here).

The thing is, I don't think you can have different types for size_type_node and
the *sizetype series.  So, while for the C family of languages you are forced
to use SIZE_TYPE for both because of size_t, you still need to have the same
type for the other languages.

> Now, any objections?  (Patch to be adjusted to really remove
> all set_sizetype calls)

Fine with me at least.  When I was changing the signedness of sizetype in Ada, 
I hesitated to remove the Pmode vs ptr_mode kludge and eventually erred on the 
side of conservatism.  But it clearly needs to go and now seems a good time.

-- 
Eric Botcazou


[PATCH, ARM] Cortex-A5 tuning [1/2] - branch costs

2011-06-01 Thread Julian Brown
This patch overrides the branch cost for Cortex-A5 cores, building on
the previous patch:

  http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00045.html

(And also depending on:

  http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00044.html

to apply correctly.)

The rationale is as follows: branches are pretty much the only
instructions which can be dual-issued on Cortex-A5. This makes them
relatively cheap: in particular, cheaper than long sequences of
conditionally-executed instructions. Setting the cost to zero was
experimentally determined to work better than one (or several other
values).

Together with the follow-up patch to tweak the value of
max_insns_skipped (for the arm_final_prescan_insn function), we obtain
(on a popular embedded benchmark, geometric mean improvement):

  * 2.75% improvement in ARM mode (~0.9% with just this patch).

  * 0.91% improvement in Thumb-2 mode.

Caveat: based on only a single test run, although previous benchmarking
(on a 4.5-based branch IIRC) showed similar improvements.

Testing still in progress. OK to apply?

Thanks,

Julian

ChangeLog

gcc/
* config/arm/arm-cores.def (cortex-a5): Use cortex_a5 tuning.
* config/arm/arm.c (arm_cortex_a5_branch_cost): New.
(arm_cortex_a5_tune): New.

commit c027c802ea85090f54df7432709f12be33226266
Author: Julian Brown 
Date:   Fri May 27 11:05:49 2011 -0700

Branch cost for Cortex-A5.

diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index b315df7..4ff2324 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -124,7 +124,7 @@ ARM_CORE("mpcorenovfp",	  mpcorenovfp,	6K, FL_LDSCHED, 9e)
 ARM_CORE("mpcore",	  mpcore,	6K, FL_LDSCHED | FL_VFPV2, 9e)
 ARM_CORE("arm1156t2-s",	  arm1156t2s,	6T2, FL_LDSCHED, v6t2)
 ARM_CORE("arm1156t2f-s",  arm1156t2fs,  6T2, FL_LDSCHED | FL_VFPV2, v6t2)
-ARM_CORE("cortex-a5",	  cortexa5,	7A, FL_LDSCHED, cortex)
+ARM_CORE("cortex-a5",	  cortexa5,	7A, FL_LDSCHED, cortex_a5)
 ARM_CORE("cortex-a8",	  cortexa8,	7A, FL_LDSCHED, cortex)
 ARM_CORE("cortex-a9",	  cortexa9,	7A, FL_LDSCHED, cortex_a9)
 ARM_CORE("cortex-a15",	  cortexa15,	7A, FL_LDSCHED, cortex)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c7eb5b0..cd3f104 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -256,6 +256,7 @@ static void arm_conditional_register_usage (void);
 static reg_class_t arm_preferred_rename_class (reg_class_t rclass);
 static unsigned int arm_autovectorize_vector_sizes (void);
 static int arm_default_branch_cost (bool, bool);
+static int arm_cortex_a5_branch_cost (bool, bool);
 
 
 /* Table of machine attributes.  */
@@ -912,6 +913,16 @@ const struct tune_params arm_cortex_tune =
   arm_default_branch_cost
 };
 
+const struct tune_params arm_cortex_a5_tune =
+{
+  arm_9e_rtx_costs,
+  NULL,
+  1,		/* Constant limit.  */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  false,	/* Prefer constant pool.  */
+  arm_cortex_a5_branch_cost
+};
+
 const struct tune_params arm_cortex_a9_tune =
 {
   arm_9e_rtx_costs,
@@ -8098,6 +8109,12 @@ arm_default_branch_cost (bool speed_p, bool predictable_p ATTRIBUTE_UNUSED)
 return (optimize > 0) ? 2 : 0;
 }
 
+static int
+arm_cortex_a5_branch_cost (bool speed_p, bool predictable_p)
+{
+  return speed_p ? 0 : arm_default_branch_cost (speed_p, predictable_p);
+}
+
 static int fp_consts_inited = 0;
 
 /* Only zero is valid for VFP.  Other values are also valid for FPA.  */
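
To make the mechanism in this patch easier to follow outside the ARM
backend, here is a minimal standalone sketch of a per-core tuning
structure carrying a branch-cost callback.  The names tune_sketch,
optimize_level and branch_cost_for are hypothetical, and the default
cost is simplified to the one return statement visible in the hunk
above; the real arm_default_branch_cost may do more.

```c
#include <stdbool.h>

/* Hypothetical, cut-down analogue of the backend's tune_params:
   only the field this patch is about.  */
struct tune_sketch
{
  const char *name;
  int (*branch_cost) (bool speed_p, bool predictable_p);
};

/* Stand-in for the global optimization level (an assumption).  */
int optimize_level = 2;

/* Simplified default: cost 2 when optimizing, 0 otherwise.  */
int
default_branch_cost (bool speed_p, bool predictable_p)
{
  (void) speed_p;
  (void) predictable_p;
  return optimize_level > 0 ? 2 : 0;
}

/* Branches are treated as free when optimizing for speed; the
   default applies when optimizing for size.  */
int
a5_branch_cost (bool speed_p, bool predictable_p)
{
  return speed_p ? 0 : default_branch_cost (speed_p, predictable_p);
}

const struct tune_sketch generic_tune = { "generic", default_branch_cost };
const struct tune_sketch a5_tune = { "cortex-a5", a5_branch_cost };

/* Roughly what a BRANCH_COST-style macro would dispatch to.  */
int
branch_cost_for (const struct tune_sketch *tune, bool speed_p,
		 bool predictable_p)
{
  return tune->branch_cost (speed_p, predictable_p);
}
```

A function pointer rather than a plain integer keeps room for cores whose
cost depends on the arguments, which is the flexibility argued for when
the hook was introduced.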


[PATCH, ARM] Cortex-A5 tuning [2/2] - tweak instruction conditionalisation

2011-06-01 Thread Julian Brown
This patch tweaks the behaviour of arm_final_prescan_insn when tuning
for Cortex-A5 cores, since branches are cheaper than long sequences of
conditionalised instructions on those processors. As posted in the
previous patch, this provides a measurable increase in performance on a
popular embedded benchmark.

(I didn't use the tuning infrastructure for this one, though it could
easily be changed to do so, now I come to think of it.)

Testing is still in progress. OK to apply, pending success with that?

Thanks,

Julian

ChangeLog

gcc/
* config/arm/arm.c (arm_tune_cortex_a5): New variable.
(arm_option_override): Use above. Set max_insns_skipped to 1 when
tuning for Cortex-A5.
* config/arm/arm.h (arm_tune_cortex_a5): Add declaration.

commit 094f41f1d05322d24b76c7a680219a8549a9e717
Author: Julian Brown 
Date:   Fri May 27 11:26:57 2011 -0700

Tune max_insns_skipped for conditionalization for Cortex-A5.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index cd3f104..22b2a1d 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -763,6 +763,9 @@ int arm_tune_xscale = 0;
This typically means an ARM6 or ARM7 with MMU or MPU.  */
 int arm_tune_wbuf = 0;
 
+/* Nonzero if tuning for Cortex-A5.  */
+int arm_tune_cortex_a5 = 0;
+
 /* Nonzero if tuning for Cortex-A9.  */
 int arm_tune_cortex_a9 = 0;
 
@@ -1495,6 +1498,7 @@ arm_option_override (void)
   arm_tune_xscale = (tune_flags & FL_XSCALE) != 0;
   arm_arch_iwmmxt = (insn_flags & FL_IWMMXT) != 0;
   arm_arch_hwdiv = (insn_flags & FL_DIV) != 0;
+  arm_tune_cortex_a5 = (arm_tune == cortexa5) != 0;
   arm_tune_cortex_a9 = (arm_tune == cortexa9) != 0;
 
   /* If we are not using the default (ARM mode) section anchor offset
@@ -1737,6 +1741,11 @@ arm_option_override (void)
  that is worth skipping is shorter.  */
   if (arm_tune_strongarm)
 max_insns_skipped = 3;
+
+  /* Branches can be dual-issued on Cortex-A5, so conditional execution is
+	 less appealing.  */
+  if (arm_tune_cortex_a5)
+max_insns_skipped = 1;
 }
 
   /* Hot/Cold partitioning is not currently supported, since we can't
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index ae6b39c..f4c34c1 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -418,6 +418,9 @@ extern int arm_tune_xscale;
 /* Nonzero if tuning for stores via the write buffer.  */
 extern int arm_tune_wbuf;
 
+/* Nonzero if tuning for Cortex-A5.  */
+extern int arm_tune_cortex_a5;
+
 /* Nonzero if tuning for Cortex-A9.  */
 extern int arm_tune_cortex_a9;
 


Re: [PATCH] ENTRY_VALUE fixes (PR debug/48203)

2011-06-01 Thread Richard Henderson
On 06/01/2011 07:25 AM, Jakub Jelinek wrote:
> 2011-06-01  Jakub Jelinek  
> 
>   * var-tracking.c (create_entry_value): New function.
>   (vt_add_function_parameter): Use it.

Ok.


r~


Re: [PATCH] Decrease size of mem_loc_descriptor

2011-06-01 Thread Richard Henderson
On 06/01/2011 07:29 AM, Jakub Jelinek wrote:
> 2011-06-01  Jakub Jelinek  
> 
>   * dwarf2out.c (compare_loc_descriptor, scompare_loc_descriptor,
>   ucompare_loc_descriptor, minmax_loc_descriptor, clz_loc_descriptor,
>   popcount_loc_descriptor, bswap_loc_descriptor, rotate_loc_descriptor):
>   New functions.
>   (mem_loc_descriptor): Use them.

Ok.


r~


Re: [PATCH, ARM] Make usage of MOVT/MOVW pairs (vs. constant pool) a tunable parameter

2011-06-01 Thread Richard Earnshaw

On Wed, 2011-06-01 at 16:24 +0100, Julian Brown wrote:
> This patch allows the usage of MOVT/MOVW pairs (vs. constant-pool loads)
> to be controlled based on the target CPU (-mtune= option), using the ARM
> backend's tuning infrastructure. This is to enable constant-pool loads
> to be used in preference to MOVW/MOVT instruction pairs, when the
> former are faster for a given core.
> 
> This patch just adds the field to the tuning structure, and adds
> (dummy) tuning structures. There should be no effective change in
> behaviour.
> 
> Testing has not yet completed (but isn't expected to show up anything
> untoward). OK to apply?
> 
> Thanks,
> 
> Julian
> 
> ChangeLog
> 
> gcc/
> * arm-cores.def (arm1156t2-s, arm1156t2f-s): Use v6t2 tuning.
> (cortex-a5, cortex-a8, cortex-a15, cortex-r4, cortex-r4f, cortex-m4)
> (cortex-m3, cortex-m1, cortex-m0): Use cortex tuning.
> * config/arm/arm-protos.h (tune_params): Add prefer_constant_pool
> field.
> * config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune)
> (arm_xscale_tune, arm_9e_tune, arm_cortex_a9_tune)
> (arm_fa726te_tune): Add prefer_constant_pool setting.
> (arm_v6t2_tune, arm_cortex_tune): New.
> * config/arm/arm.h (TARGET_USE_MOVT): Make dependent on
> prefer_constant_pool setting.

OK.

R.




Re: [PATCH, ARM] Make branch cost a tunable parameter

2011-06-01 Thread Richard Earnshaw

On Wed, 2011-06-01 at 16:24 +0100, Julian Brown wrote:
> This patch allows the BRANCH_COST macro to be altered for a given
> target using the ARM backend's tuning infrastructure. It's not easy
> to reduce the cost to e.g. a single integer or a set of integers (cores
> may have different branch costing characteristics for ARM vs. Thumb-2
> mode for instance, as in the existing BRANCH_COST definition), so I've
> used a function pointer in the tuning structure for maximum flexibility.
> 
> This patch just uses the same hook for all existing cores (i.e. it
> should result in unchanged behaviour). Later patches can then override
> the default in specific cases.
> 
> Testing is still in progress. OK to apply, pending success with that?
> 
> Thanks,
> 
> Julian
> 
> ChangeLog
> 
> gcc/
> * config/arm/arm-protos.h (tune_params): Add branch_cost hook.
> * config/arm/arm.c (arm_default_branch_cost): New.
> (arm_slowmul_tune, arm_fastmul_tune, arm_xscale_tune, arm_9e_tune)
> (arm_v6t2_tune, arm_cortex_tune, arm_cortex_a9_tune)
> (arm_fa726_tune): Set branch_cost field using
> arm_default_branch_cost.
> * config/arm/arm.h (BRANCH_COST): Use branch_cost hook from
> current_tune structure.
> * dojump.c (tm_p.h): Include file.

OK.

R.




Re: [PATCH, ARM] Cortex-A5 tuning [1/2] - branch costs

2011-06-01 Thread Richard Earnshaw

On Wed, 2011-06-01 at 16:49 +0100, Julian Brown wrote:
> This patch overrides the branch cost for Cortex-A5 cores, building on
> the previous patch:
> 
>   http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00045.html
> 
> (And also depending on:
> 
>   http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00044.html
> 
> to apply correctly.)
> 
> The rationale is as follows: branches are pretty much the only
> instructions which can be dual-issued on Cortex-A5. This makes them
> relatively cheap: in particular, cheaper than long sequences of
> conditionally-executed instructions. Setting the cost to zero was
> experimentally determined to work better than one (or several other
> values).
> 
> Together with the follow-up patch to tweak the value of
> max_insns_skipped (for the arm_final_prescan_insn function), we obtain
> (on a popular embedded benchmark, geometric mean improvement):
> 
>   * 2.75% improvement in ARM mode (~0.9% with just this patch).
> 
>   * 0.91% improvement in Thumb-2 mode.
> 
> Caveat: based on only a single test run, although previous benchmarking
> (on a 4.5-based branch IIRC) showed similar improvements.
> 
> Testing still in progress. OK to apply?
> 
> Thanks,
> 
> Julian
> 
> ChangeLog
> 
> gcc/
> * config/arm/arm-cores.def (cortex-a5): Use cortex_a5 tuning.
> * config/arm/arm.c (arm_cortex_a5_branch_cost): New.
> (arm_cortex_a5_tune): New.

OK.

R.




Re: [PATCH][all-langs] Defer size_t and sizetype setting to the middle-end

2011-06-01 Thread Eric Botcazou
>   ada/
>   * gcc-interface/misc.c (gnat_init): Do not set
>   size_type_node or call set_sizetype.

OK, thanks.

-- 
Eric Botcazou


Re: [PATCH, ARM] Cortex-A5 tuning [2/2] - tweak instruction conditionalisation

2011-06-01 Thread Richard Earnshaw

On Wed, 2011-06-01 at 16:49 +0100, Julian Brown wrote:
> This patch tweaks the behaviour of arm_final_prescan_insn when tuning
> for Cortex-A5 cores, since branches are cheaper than long sequences of
> conditionalised instructions on those processors. As posted in the
> previous patch, this provides a measurable increase in performance on a
> popular embedded benchmark.
> 
> (I didn't use the tuning infrastructure for this one, though it could
> easily be changed to do so, now I come to think of it.)
> 
> Testing is still in progress. OK to apply, pending success with that?
> 
> Thanks,
> 
> Julian
> 
> ChangeLog
> 
> gcc/
> * config/arm/arm.c (arm_tune_cortex_a5): New variable.
> (arm_option_override): Use above. Set max_insns_skipped to 1 when
> tuning for Cortex-A5.
> * config/arm/arm.h (arm_tune_cortex_a5): Add declaration.

I would much prefer that this was done through the tuning
infrastructure.  If one core likes it this way, there's a strong chance
of another one coming along that has similar preferences.

R.
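
One possible shape for the request above, sketched standalone: the
limit becomes a data field looked up per core instead of a dedicated
arm_tune_cortex_a5 flag.  The struct and function names are
hypothetical, and DEFAULT_MAX_INSNS_SKIPPED is a placeholder value
that does not come from this thread; only the values 3 (StrongARM)
and 1 (Cortex-A5) appear in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-core tuning record with the conditional-execution
   limit stored as data.  */
struct core_tune
{
  const char *core;
  int max_insns_skipped;
};

/* Values for these two cores are taken from the patch under review.  */
static const struct core_tune tune_table[] = {
  { "strongarm", 3 },
  { "cortex-a5", 1 },
};

/* Placeholder default; an assumption made for this sketch only.  */
#define DEFAULT_MAX_INSNS_SKIPPED 5

/* Look up the limit for CORE, falling back to the default when the
   core has no specific preference.  */
int
max_insns_skipped_for (const char *core)
{
  size_t i;

  for (i = 0; i < sizeof tune_table / sizeof tune_table[0]; i++)
    if (strcmp (tune_table[i].core, core) == 0)
      return tune_table[i].max_insns_skipped;
  return DEFAULT_MAX_INSNS_SKIPPED;
}
```

With this layout, a later core with similar preferences only needs a new
table entry rather than another global flag and another test in the
option-override code.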




Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc

2011-06-01 Thread Rainer Orth
Mike Stump  writes:

> On May 30, 2011, at 8:43 AM, Rainer Orth wrote:
>> * The three users of MD_UNWIND_SUPPORT are modified to unconditionally
>>  include a new md-unwind-support.h header which is created from the
>>  info in config.host: if md_unwind_header exists, it is included in
>>  md-unwind-support.h, otherwise the generated header is empty.
>
>> diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
>> --- a/gcc/config/rs6000/darwin.h
>> +++ b/gcc/config/rs6000/darwin.h
>> @@ -381,10 +381,6 @@ extern int darwin_emit_branch_islands;
>> #include 
>> #endif
>> 
>> -#if !defined(__LP64__) && !defined(DARWIN_LIBSYSTEM_HAS_UNWIND)
>> -#define MD_UNWIND_SUPPORT "config/rs6000/darwin-unwind.h"
>> -#endif
>> -
>
> So, I'm wondering, can we just roll this check into the header, so instead of:
>
> #if A
> #include file
> #endif
>
> file:
> bla
>
> we have:
>
> #include file
>
> file:
> #if A
> bla
> #endif
>
> The advantage: any wrapping code is handled the exact same way.  Once this
> is done, the transformation to port is identical to every other port.
> Also, this general rule would apply to the other corner cases as well, if I
> read them right.
>
> ?

The problem with this approach is that some of the macros tested only
live in gcc, not libgcc once the libgcc sources no longer include tm.h
etc.  E.g. look at i386/mingw32.h:

#if !TARGET_64BIT_DEFAULT && !defined (TARGET_BI_ARCH)
#define MD_UNWIND_SUPPORT "config/i386/w32-unwind.h"
#endif

Both TARGET_64BIT_DEFAULT and TARGET_BI_ARCH live in gcc only, so at
least in the medium term, we need different tests here.

> Oh, once this is done, I think:
>
> /* libSystem contains unwind information for signal frames.  */
> #define DARWIN_LIBSYSTEM_HAS_UNWIND
>
> is only used by libgcc.  Does it have to move at the same time?  If so, then 
> it needs moving.  If it doesn't have to move, you can leave it behind if you 
> want, though my preference would be to move it.

It doesn't have to, but it could.  On the other hand, my question still
stands: DARWIN_LIBSYSTEM_HAS_UNWIND is defined in gcc/config/darwin9.h.
So if every release up to Darwin 8 on PowerPC is 32-bit only (I honestly
don't know), then we could just restrict rs6000/darwin-unwind.h to
darwin < 9 and be done with it, no need for the macros above.

> I think the darwin bits are Ok with this change.

I can certainly do it this way for now, but if we could do away with the
tests completely, that would be cleaner.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [lto] Merge streamer hooks from pph branch. (issue4568043)

2011-06-01 Thread Diego Novillo
On Wed, Jun 1, 2011 at 08:07, Richard Guenther  wrote:

>>  static void cgraph_expand_all_functions (void);
>>  static void cgraph_mark_functions_to_output (void);
>> @@ -1092,6 +1093,10 @@ cgraph_finalize_compilation_unit (void)
>>  {
>>    timevar_push (TV_CGRAPH);
>>
>> +  /* If LTO is enabled, initialize the streamer hooks needed by GIMPLE.  */
>> +  if (flag_lto)
>> +    gimple_streamer_hooks_init ();
>
> Ugh.  Isn't there a better entry for this?  Are you going to add
>
>  if (flag_pph)
>    init_hooks_some_other_way ();
>
> here?  It looks like it rather belongs in opts.c or toplev.c if the hooks
> are really initialized dependent on compiler flags.

Not at all, this is for gimple, specifically.  The front end
initializes hooks in its own way.  The problem here is that the gimple
hooks are needed by the middle end.  If we initialize gimple hooks too
early, the FE will override them.  So we need to initialize them after
the front end is done (hence the location for this call).

I'm happy to move this somewhere else, but it needs to happen right
before the middle end starts calling LTO pickling routines.

>
>>    /* If we're here there's no current function anymore.  Some frontends
>>       are lazy in clearing these.  */
>>    current_function_decl = NULL;
>> diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
>> index 88966f2..801fe6f 100644
>> --- a/gcc/lto-streamer-in.c
>> +++ b/gcc/lto-streamer-in.c
>> @@ -1833,6 +1833,7 @@ static void
>>  unpack_value_fields (struct bitpack_d *bp, tree expr)
>>  {
>>    enum tree_code code;
>> +  lto_streamer_hooks *h = streamer_hooks ();
>
> A function to access a global ... we have lang_hooks and targetm,
> so please simply use streamer_hooks as a variable.
> streamer_hooks ()->preload_common_nodes (cache) looks super-ugly.

I did not want to add yet another global.  I don't feel too strongly
about this one, given the presence of lang_hooks and targetm.  So, you
prefer the direct global access?
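
For reference, the direct-global variant with an optional hook and a
fallback (the shape the alloc_tree hunk in this patch uses) can be
sketched standalone.  Everything here is hypothetical: struct node,
hooks_sketch and make_node_default are illustration names, not the
streamer's actual types or functions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type standing in for a tree.  */
struct node
{
  int code;
  int from_hook;	/* Set when a hook allocated the node.  */
};

struct streamer_hooks_sketch
{
  /* Optional.  May be NULL, and may return NULL to decline.  */
  struct node *(*alloc_node) (int code);
};

/* A plain global accessed directly, in the style of lang_hooks or
   targetm, instead of going through an accessor function.  */
struct streamer_hooks_sketch hooks_sketch;

/* Default allocation, standing in for a raw make_node call.  */
struct node *
make_node_default (int code)
{
  struct node *n = calloc (1, sizeof *n);

  if (n)
    n->code = code;
  return n;
}

/* Ask the hook first; fall back to the default if there is no hook
   or the hook did not handle this code.  */
struct node *
materialize_node (int code)
{
  struct node *result = NULL;

  if (hooks_sketch.alloc_node)
    result = hooks_sketch.alloc_node (code);
  if (result == NULL)
    result = make_node_default (code);
  return result;
}

/* Example hook that only knows how to allocate code 42.  */
struct node *
alloc_only_42 (int code)
{
  struct node *n;

  if (code != 42)
    return NULL;
  n = make_node_default (code);
  if (n)
    n->from_hook = 1;
  return n;
}
```

Because the hook table is zero-initialized, a client that installs no
hooks gets the default behaviour without any extra setup.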

>> @@ -1864,26 +1865,11 @@ unpack_value_fields (struct bitpack_d *bp, tree expr)
>>    if (CODE_CONTAINS_STRUCT (code, TS_BLOCK))
>>      unpack_ts_block_value_fields (bp, expr);
>>
>> -  if (CODE_CONTAINS_STRUCT (code, TS_SSA_NAME))
>> -    {
>> -      /* We only stream the version number of SSA names.  */
>> -      gcc_unreachable ();
>> -    }
>> -
>> -  if (CODE_CONTAINS_STRUCT (code, TS_STATEMENT_LIST))
>> -    {
>> -      /* This is only used by GENERIC.  */
>> -      gcc_unreachable ();
>> -    }
>> -
>> -  if (CODE_CONTAINS_STRUCT (code, TS_OMP_CLAUSE))
>> -    {
>> -      /* This is only used by High GIMPLE.  */
>> -      gcc_unreachable ();
>> -    }
>> -
>>    if (CODE_CONTAINS_STRUCT (code, TS_TRANSLATION_UNIT_DECL))
>>      unpack_ts_translation_unit_decl_value_fields (bp, expr);
>> +
>> +  if (h->unpack_value_fields)
>> +    h->unpack_value_fields (bp, expr);
>
> I suppose the LTO implementation has a gcc_unreachable () for
> the cases we do not handle here?

Right.  This was already superfluous.  It's tested already by
lto_is_streamable().

>
>>  }
>>
>>
>> @@ -1935,8 +1921,17 @@ lto_materialize_tree (struct lto_input_block *ib, struct data_in *data_in,
>>      }
>>    else
>>      {
>> -      /* All other nodes can be materialized with a raw make_node call.  */
>> -      result = make_node (code);
>> +      lto_streamer_hooks *h = streamer_hooks ();
>> +
>> +      /* For all other nodes, see if the streamer knows how to allocate
>> +      it.  */
>> +      if (h->alloc_tree)
>> +     result = h->alloc_tree (code, ib, data_in);
>> +
>> +      /* If the hook did not handle it, materialize the tree with a raw
>> +      make_node call.  */
>> +      if (result == NULL_TREE)
>> +     result = make_node (code);
>>      }
>>
>>  #ifdef LTO_STREAMER_DEBUG
>> @@ -2031,12 +2026,8 @@ lto_input_ts_decl_common_tree_pointers (struct lto_input_block *ib,
>>  {
>>    DECL_SIZE (expr) = lto_input_tree (ib, data_in);
>>    DECL_SIZE_UNIT (expr) = lto_input_tree (ib, data_in);
>> -
>> -  if (TREE_CODE (expr) != FUNCTION_DECL
>> -      && TREE_CODE (expr) != TRANSLATION_UNIT_DECL)
>> -    DECL_INITIAL (expr) = lto_input_tree (ib, data_in);
>> -
>
> Why move those?  DECL_INITIAL _is_ in decl_common.

I needed to move the handling of DECL_INITIAL in the writer.  This
forces us to move the handling in the reader.  Otherwise, reader and
writer will be out of sync (DECL_INITIAL is now written last).
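
The ordering constraint can be illustrated with a toy stream.  This is
only a sketch with hypothetical types (struct buf, struct decl) and has
nothing to do with the real LTO bytecode format; it just shows that the
reader has to consume fields in the order the writer produced them.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SLOTS 8

/* Toy stream: a fixed array with separate write and read cursors.  */
struct buf
{
  int data[BUF_SLOTS];
  size_t wpos, rpos;
};

void
put (struct buf *b, int v)
{
  assert (b->wpos < BUF_SLOTS);
  b->data[b->wpos++] = v;
}

int
get (struct buf *b)
{
  assert (b->rpos < b->wpos);
  return b->data[b->rpos++];
}

/* Hypothetical decl with three streamed fields.  */
struct decl
{
  int size;
  int size_unit;
  int initial;
};

/* Writer: the initializer is emitted last.  */
void
write_decl (struct buf *b, const struct decl *d)
{
  put (b, d->size);
  put (b, d->size_unit);
  put (b, d->initial);
}

/* Reader: must read the fields in exactly the writer's order, so the
   initializer is read last as well.  */
void
read_decl (struct buf *b, struct decl *d)
{
  d->size = get (b);
  d->size_unit = get (b);
  d->initial = get (b);
}
```

If only the writer moved the initializer to the end, the reader would
silently assign the values to the wrong fields, since the stream
carries no field tags.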

> Where do those checks go?  Or do we simply lose them?

They already are in lto_is_streamable.  See above.

>> -  if (TREE_CODE (result) == VAR_DECL)
>> -    lto_register_var_decl_in_symtab (data_in, result);
>> -  else if (TREE_CODE (result) == FUNCTION_DECL && !DECL_BUILT_IN (result))
>> -    lto_register_function_decl_in_symtab (data_in, result);
>> +  if (h->register_decls_in_symtab_p)
>> +    {
>> +      if (TREE_CODE (result) == VAR_DECL)
>> +     lto_register_var_decl_in_symtab (data_in, result);
>> +      else if (TREE_CODE (result) == FUNCTION_DECL && !DECL_BUILT_

Re: -fdump-passes -fenable-xxx=func_name_list

2011-06-01 Thread Xinliang David Li
On Wed, Jun 1, 2011 at 1:51 AM, Richard Guenther
 wrote:
> On Wed, Jun 1, 2011 at 1:34 AM, Xinliang David Li  wrote:
>> The following patch implements a new option that dumps the gcc PASS
>> configuration. The sample output is attached.  There is one
>> limitation: some placeholder passes that are named with '*xxx' are
>> not registered, thus they are not listed. They are not important as
>> they can not be turned on/off anyway.
>>
>> The patch also enhances -fenable-xxx and -fdisable-xxx to allow a list
>> of function assembler names to be specified.
>>
>> Ok for trunk?
>
> Please split the patch.
>
> I'm not too happy how you dump the pass configuration.  Why not simply,
> at a _single_ place, walk the pass tree?  Instead of doing pieces of it
> at pass execution time when it's not already dumped - that really looks
> gross.

Yes, that was the original plan -- but it has problems:
1) the dumper needs to know the root pass lists -- which can change
frequently -- so it can be a long-term maintenance burden;
2) the centralized dumper needs to run after option processing;
3) I am not sure whether gate functions have side effects or depend on cfun.

The proposed solution is IMHO not that intrusive -- just three hooks
to do the dumping and track indentation.

>
> The documentation should also link this option to the -fenable/disable
> options as obviously the pass names in that dump are those to be
> used for those flags (and not readily available anywhere else).

Ok.

>
> I also think that it would be way more useful to note in the individual
> dump files the functions (at the place they would usually appear) that
> have the pass explicitly enabled/disabled.

Ok -- for ipa passes or tree/rtl passes where all functions are
explicitly disabled.

Thanks,

David

>
> Richard.
>
>> Thanks,
>>
>> David
>>
>


Re: approved but not committed? - [PATCH, ARM] Testcases incorrectly run in Thumb/Xscale

2011-06-01 Thread Jing Yu
On Wed, Jun 1, 2011 at 1:51 AM, Richard Earnshaw  wrote:
>
> On Tue, 2011-05-31 at 12:49 -0700, Jing Yu wrote:
>> Since this patch has been properly approved, if there is no objection
>> in 24 hours, I will commit this patch to trunk.
>>
>
> Once a patch has been approved by an appropriate maintainer, anybody
> with an account for gcc can commit the patch.

I see. I will commit this patch then.
Thanks!

Jing

>
> R.
>
>> Thanks,
>> Jing
>>
>> On Fri, May 27, 2011 at 3:55 PM, Jing Yu  wrote:
>> > Hi Sofiane,
>> >
>> > I found that your following patch was approved by Richard in October last
>> > year, but it is not in trunk.
>> > Is there any problem with it?
>> > http://gcc.gnu.org/ml/gcc-patches/2010-10/msg00266.html
>> >
>> > If you don't mind, I can help to commit the patch.
>> >
>> > Thanks,
>> > Jing
>> >
>>
>
>
>


Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc

2011-06-01 Thread Mike Stump
On Jun 1, 2011, at 9:01 AM, Rainer Orth wrote:
> Both TARGET_64BIT_DEFAULT and TARGET_BI_ARCH live in gcc only, so at
> least in the medium term, we need different tests here.

Ah, ick.  Oh well...  The next more general rule would be something like: one 
can set a feature (implicit -D__GCC_DO_UNWIND_BLA) in the compiler when 
TARGET_64BIT_DEFAULT and TARGET_BI_ARCH are set a certain way, and then in 
libgcc, one can just test that feature directly.  Ick, I hate inventing feature 
names here...

> I can certainly do it this way for now, but if we could do away with the
> tests completely, that would be cleaner.

Agreed, though, I don't believe the test is superfluous.



[PATCH] c-pragma: adding a data field to pragma_handler

2011-06-01 Thread Pierre

This patch is about the pragmas.

In c-family/c-pragma.h, we declare a pragma_handler, which is a function
accepting a cpp_reader as its parameter.


I have changed this handler to accept a second parameter, a void *,
which allows extra data to be passed to the handler. I think this
data field might be of general use: we can have a condition or data at
registration time that we want to express in the handler. I guess this is a
common way to pass data to a handler function.
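
To illustrate the idea outside of GCC, here is a minimal, self-contained sketch. The registry, `register_pragma`, `invoke_pragma` and `count_pragma` are toy names invented for this example; only the handler signature mirrors the typedef in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct cpp_reader;   /* opaque here; stands in for the real reader type */

/* Same shape as the handler type proposed in the patch.  */
typedef void (*pragma_handler) (struct cpp_reader *, void *data);

/* Toy registry entry: the handler is stored together with its data.  */
struct toy_pragma
{
  const char *name;
  pragma_handler handler;
  void *data;
};

#define MAX_PRAGMAS 8
static struct toy_pragma registered[MAX_PRAGMAS];
static size_t n_registered;

/* Toy registration: remember HANDLER and the DATA it should receive.  */
static void
register_pragma (const char *name, pragma_handler handler, void *data)
{
  assert (n_registered < MAX_PRAGMAS);
  registered[n_registered].name = name;
  registered[n_registered].handler = handler;
  registered[n_registered].data = data;
  n_registered++;
}

/* Toy dispatch: find NAME and call its handler with the stored data.
   Returns 1 if a handler ran, 0 otherwise.  */
static int
invoke_pragma (const char *name, struct cpp_reader *reader)
{
  for (size_t i = 0; i < n_registered; i++)
    if (strcmp (registered[i].name, name) == 0)
      {
        registered[i].handler (reader, registered[i].data);
        return 1;
      }
  return 0;
}

/* Example handler: DATA points at a counter chosen at registration time.  */
static void
count_pragma (struct cpp_reader *reader, void *data)
{
  (void) reader;
  ++*(int *) data;
}
```

Registering `count_pragma` twice with two different counters lets one function serve two pragmas, which is the kind of reuse the extra parameter is meant to allow.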


I would like your opinion on this patch! Thanks!

Pierre Vittet

Changelog

2011-06-01  Pierre Vittet  

* c-pragma.h (pragma_handler,internal_pragma_handler, c_register_pragma,
c_register_pragma_with_expansion): create internal_pragma_handler, add a
new void * data parameter.
* c-pragma.c (handle_pragma_pack, handle_pragma_weak,
handle_pragma_redefine_extname, handle_pragma_visibility,
handle_pragma_diagnostic, handle_pragma_target, handle_pragma_optimize,
handle_pragma_push_options, handle_pragma_pop_options,
handle_pragma_reset_options, handle_pragma_message,
handle_pragma_float_const_decimal64, registered_pragmas,
c_register_pragma_1, c_register_pragma,
c_register_pragma_with_expansion, init_pragma): add support of the
void * data field.



Index: gcc/c-family/c-pragma.h
===
--- gcc/c-family/c-pragma.h (revision 174521)
+++ gcc/c-family/c-pragma.h (working copy)
@@ -84,10 +84,19 @@ extern bool pop_visibility (int);
 extern void init_pragma (void);
 
 /* Front-end wrappers for pragma registration.  */
-typedef void (*pragma_handler)(struct cpp_reader *);
-extern void c_register_pragma (const char *, const char *, pragma_handler);
-extern void c_register_pragma_with_expansion (const char *, const char *,
- pragma_handler);
+/* The void * allows to pass extra data to the handler.  */
+typedef void (*pragma_handler)(struct cpp_reader *, void * );
+/* Internally use to keep the data of the handler.  */
+struct internal_pragma_handler_d{
+  pragma_handler handler;
+  void * data; 
+};
+typedef struct internal_pragma_handler_d internal_pragma_handler;
+
+extern void c_register_pragma (const char * space, const char * name,
+   pragma_handler handler, void * data);
+extern void c_register_pragma_with_expansion (const char * space, 
+  const char * name,  pragma_handler handler , void * data);
 extern void c_invoke_pragma_handler (unsigned int);
 
 extern void maybe_apply_pragma_weak (tree);
Index: gcc/c-family/c-pragma.c
===
--- gcc/c-family/c-pragma.c (revision 174521)
+++ gcc/c-family/c-pragma.c (working copy)
@@ -53,7 +53,7 @@ typedef struct GTY(()) align_stack {
 
 static GTY(()) struct align_stack * alignment_stack;
 
-static void handle_pragma_pack (cpp_reader *);
+static void handle_pragma_pack (cpp_reader *, void * data);
 
 /* If we have a "global" #pragma pack() in effect when the first
#pragma pack(push,) is encountered, this stores the value of
@@ -133,7 +133,7 @@ pop_alignment (tree id)
#pragma pack (pop)
#pragma pack (pop, ID) */
 static void
-handle_pragma_pack (cpp_reader * ARG_UNUSED (dummy))
+handle_pragma_pack (cpp_reader * ARG_UNUSED (dummy), void * ARG_UNUSED (data))
 {
   tree x, id = 0;
   int align = -1;
@@ -247,7 +247,7 @@ DEF_VEC_ALLOC_O(pending_weak,gc);
 static GTY(()) VEC(pending_weak,gc) *pending_weaks;
 
 static void apply_pragma_weak (tree, tree);
-static void handle_pragma_weak (cpp_reader *);
+static void handle_pragma_weak (cpp_reader *, void * data);
 
 static void
 apply_pragma_weak (tree decl, tree value)
@@ -334,7 +334,7 @@ maybe_apply_pending_pragma_weaks (void)
 
 /* #pragma weak name [= value] */
 static void
-handle_pragma_weak (cpp_reader * ARG_UNUSED (dummy))
+handle_pragma_weak (cpp_reader * ARG_UNUSED (dummy), void * ARG_UNUSED (data))
 {
   tree name, value, x, decl;
   enum cpp_ttype t;
@@ -411,11 +411,12 @@ DEF_VEC_ALLOC_O(pending_redefinition,gc);
 
 static GTY(()) VEC(pending_redefinition,gc) *pending_redefine_extname;
 
-static void handle_pragma_redefine_extname (cpp_reader *);
+static void handle_pragma_redefine_extname (cpp_reader *, void * data);
 
 /* #pragma redefine_extname oldname newname */
 static void
-handle_pragma_redefine_extname (cpp_reader * ARG_UNUSED (dummy))
+handle_pragma_redefine_extname (cpp_reader * ARG_UNUSED (dummy), 
+void * ARG_UNUSED (data))
 {
   tree oldname, newname, decl, x;
   enum cpp_ttype t;
@@ -481,7 +482,8 @@ static GTY(()) tree pragma_extern_prefix;
 
 /* #pragma extern_prefix "prefix" */
 static void
-handle_pragma_extern_prefix (cpp_reader * ARG_UNUSED (dummy))
+handle_pragma_extern_prefix (cpp_reader * ARG_UNUSED (dummy), 
+ void * ARG_UNUSED (data))
 {
   tree

Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup

2011-06-01 Thread Mike Stump
On Jun 1, 2011, at 7:37 AM, Ian Lance Taylor wrote:
>> One problem remains in the libgo testsuite: certain tests have to be
>> compiled with -mieee, otherwise FPE is generated for unordered values.
>> Any suggestions, where -mieee should be placed?
> 
> That's an interesting question.  I think that ideally we would like
> -mieee to become the default when using gccgo.

If the language spec requires it, then it should go into gcc/go.  See 
java_post_options:

static bool
java_post_options (const char **pfilename)
{
  /* Excess precision other than "fast" requires front-end
     support.  */
  if (flag_excess_precision_cmdline == EXCESS_PRECISION_STANDARD
      && TARGET_FLT_EVAL_METHOD_NON_DEFAULT)
    sorry ("-fexcess-precision=standard for Java");
  flag_excess_precision_cmdline = EXCESS_PRECISION_FAST;

so you could check the setting and reset any flag that should be off, or error
out on incompatible flags.  I'd like to think we could get more mileage out of
making a flag like -mieee machine independent; ports could then just
check the base flag when validating machine-specific flags.  Certainly alpha
isn't the only port that has -mieee.  There are likely to be very few flags
promoted because of this, ieee being the most obvious example.


Re: Use i386/crtfastmath.c on Solaris 2/x86

2011-06-01 Thread Richard Henderson
On 06/01/2011 07:51 AM, Rainer Orth wrote:
> +  /* Set PC to the instruction after the faulting one to skip over it,
> + otherwise we enter an infinite loop.  4 is the size of the stmxcsr
> + instruction.  */
...
> +  /* We need a single SSE instruction here so the handler can safely skip
> +  over it.  */
> +  __asm__ volatile ("movss %xmm2,%xmm1");

The comment referencing stmxcsr doesn't match the movss code.
It's still a 4 byte opcode, so the code still works.

I do wonder if using "movaps %xmm0,%xmm0" might be cleaner,
to avoid clobbering a register, even if that register is
surely dead anyway.  That's a 3 byte opcode though, so the
handler would need updating.
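
To make the size argument concrete, here is a small sketch. The byte sequences are the encodings as I understand them from the x86 instruction set reference (an assumption worth checking against an assembler listing), and `skip_insn` is a hypothetical helper, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* movss %xmm2,%xmm1: F3 0F 10 /r with ModRM 0xCA (assumed encoding).  */
static const unsigned char movss_xmm2_xmm1[] = { 0xf3, 0x0f, 0x10, 0xca };

/* movaps %xmm0,%xmm0: 0F 28 /r with ModRM 0xC0 (assumed encoding).  */
static const unsigned char movaps_xmm0_xmm0[] = { 0x0f, 0x28, 0xc0 };

/* Hypothetical helper: the address a handler should resume at after
   skipping a faulting instruction of LEN bytes that starts at PC.  */
static const unsigned char *
skip_insn (const unsigned char *pc, size_t len)
{
  assert (pc != NULL);
  return pc + len;
}
```

Deriving the skip distance from `sizeof` of the chosen encoding, instead of a literal 4, would keep the handler and the probe instruction in step if the instruction is changed.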


r~


Re: -fdump-passes -fenable-xxx=func_name_list

2011-06-01 Thread Xinliang David Li
The attached is the split #1 patch that enhances -fenable/disable.
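
As a rough illustration of the accepted syntax (a comma-separated list whose entries are a uid range `lo:hi`, a single uid, or an assembler name), a parser could be sketched as follows. `struct range_item` and `parse_range_list` are hypothetical names, and telling names from ranges by the first character is an assumption of this sketch, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One parsed entry: either a uid range (inclusive at both ends) or an
   assembler name.  */
struct range_item
{
  int is_name;
  unsigned long lo, hi;
  char name[64];
};

/* Parse LIST, e.g. "1:100,300,foo", into at most MAX items.
   Returns the number of items, or -1 on a malformed entry.  */
static int
parse_range_list (const char *list, struct range_item *items, int max)
{
  int n = 0;

  assert (list != NULL && items != NULL);
  while (*list != '\0')
    {
      size_t len = strcspn (list, ",");
      struct range_item *it;
      char tok[64];
      char *end;

      if (len == 0 || len >= sizeof tok || n == max)
        return -1;
      memcpy (tok, list, len);
      tok[len] = '\0';

      it = &items[n++];
      memset (it, 0, sizeof *it);
      if (tok[0] >= '0' && tok[0] <= '9')
        {
          /* A number, optionally followed by ":" and a second number.  */
          it->lo = strtoul (tok, &end, 10);
          it->hi = it->lo;
          if (*end == ':')
            {
              const char *hi_str = end + 1;
              if (*hi_str < '0' || *hi_str > '9')
                return -1;
              it->hi = strtoul (hi_str, &end, 10);
            }
          if (*end != '\0' || it->lo > it->hi)
            return -1;
        }
      else
        {
          /* Anything else is taken as an assembler name.  */
          it->is_name = 1;
          strcpy (it->name, tok);
        }

      list += len;
      if (*list == ',')
        {
          list++;
          if (*list == '\0')
            return -1;          /* trailing comma */
        }
    }
  return n;
}
```

With this sketch, "1:100,300,foo2" yields two ranges, [1,100] and [300,300], plus the name foo2.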

Ok after testing?

Thanks,
David

On Wed, Jun 1, 2011 at 9:16 AM, Xinliang David Li  wrote:
> On Wed, Jun 1, 2011 at 1:51 AM, Richard Guenther
>  wrote:
>> On Wed, Jun 1, 2011 at 1:34 AM, Xinliang David Li  wrote:
>>> The following patch implements a new option that dumps the gcc pass
>>> configuration. The sample output is attached.  There is one
>>> limitation: some placeholder passes that are named with '*xxx' are
>>> not registered, thus they are not listed. They are not important as
>>> they cannot be turned on/off anyway.
>>>
>>> The patch also enhances -fenable-xxx and -fdisable-xxx to allow a list
>>> of function assembler names to be specified.
>>>
>>> Ok for trunk?
>>
>> Please split the patch.
>>
>> I'm not too happy how you dump the pass configuration.  Why not simply,
>> at a _single_ place, walk the pass tree?  Instead of doing pieces of it
>> at pass execution time when it's not already dumped - that really looks
>> gross.
>
> Yes, that was the original plan -- but it has problems:
> 1) the dumper needs to know the root pass lists -- which can change
> frequently -- so it can be a long-term maintenance burden;
> 2) the centralized dumper needs to run after option processing;
> 3) I am not sure whether gate functions have side effects or depend on cfun.
>
> The proposed solution IMHO is not that intrusive -- just three hooks
> to do the dumping and track indentation.
>
>>
>> The documentation should also link this option to the -fenable/disable
>> options as obviously the pass names in that dump are those to be
>> used for those flags (and not readily available anywhere else).
>
> Ok.
>
>>
>> I also think that it would be way more useful to note in the individual
>> dump files the functions (at the place they would usually appear) that
>> have the pass explicitly enabled/disabled.
>
> Ok -- for ipa passes or tree/rtl passes where all functions are
> explicitly disabled.
>
> Thanks,
>
> David
>
>>
>> Richard.
>>
>>> Thanks,
>>>
>>> David
>>>
>>
>
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 174424)
+++ doc/invoke.texi	(working copy)
@@ -5056,11 +5056,12 @@ appended with a sequential number starti
 Disable rtl pass @var{pass}.  @var{pass} is the pass name.  If the same pass is
 statically invoked in the compiler multiple times, the pass name should be
 appended with a sequential number starting from 1.  @var{range-list} is a comma
-seperated list of function ranges.  Each range is a number pair seperated by a colon.
-The range is inclusive in both ends.  If the range is trivial, the number pair can be
-simplified a a single number.  If the function's cgraph node's @var{uid} is falling
-within one of the specified ranges, the @var{pass} is disabled for that function.
-The @var{uid} is shown in the function header of a dump file.
+seperated list of function ranges or assembler names.  Each range is a number
+pair seperated by a colon.  The range is inclusive in both ends.  If the range
+is trivial, the number pair can be simplified as a single number.  If the
+function's cgraph node's @var{uid} is falling within one of the specified ranges,
+the @var{pass} is disabled for that function.  The @var{uid} is shown in the
+function header of a dump file.
 
 @item -fdisable-tree-@var{pass}
 @item -fdisable-tree-@var{pass}=@var{range-list}
@@ -5090,7 +5091,8 @@ of option arguments.
-fenable-tree-cunroll=1
 # disable gcse2 for functions at the following ranges [1,1],
 # [300,400], and [400,1000]
-   -fdisable-rtl-gcse2=1:100,300,400:1000
+# disable gcse2 for functions foo and foo2
+   -fdisable-rtl-gcse2=foo,foo2
 # disable early inlining
-fdisable-tree-einline
 # disable ipa inlining
Index: testsuite/gcc.dg/inline_2.c
===
--- testsuite/gcc.dg/inline_2.c	(revision 0)
+++ testsuite/gcc.dg/inline_2.c	(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -fdisable-tree-einline=0:3 -fdisable-ipa-inline" } */
+int g;
+__attribute__((always_inline)) void bar (void)
+{
+  g++;
+}
+
+int foo (void)
+{
+  bar ();
+  return g;
+}
+
+int foo2 (void)
+{
+  bar();
+  return g + 1;
+}
+
+/* { dg-final { scan-tree-dump-times "bar" 5 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
+/* { dg-excess-errors "extra notes" } */
Index: testsuite/gcc.dg/inline_6.c
===
--- testsuite/gcc.dg/inline_6.c	(revision 0)
+++ testsuite/gcc.dg/inline_6.c	(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -fdisable-tree-einline=foo2 -fdisable-ipa-inline" } */
+int g;
+__attribute__((always_inline)) void bar (void)
+{
+  g++;
+}
+
+int foo (void)
+{
+  bar ();
+  return g;
+}
+
+int foo2 (void)
+{
+  bar();
+  return g + 1;
+}
+
+/* { dg-fi

Re: [PATCH][all-langs] Defer size_t and sizetype setting to the middle-end

2011-06-01 Thread Andrew Haley
On 06/01/2011 05:45 PM, Bryce McKinlay wrote:
> 
> Can I suggest that you cc such approvals to gcc-patches. Richard (and
> others) may not be subscribed to the j...@gcc.gnu.org list.

Sorry, I meant to do so.  This idiot mailer has its "reply list" button
replying to just the one list, not all the CC:s.

(I know, I know, a bad workman blames his tools.  But I wish I could
fix this in the mailer...  :-)

Andrew.



Re: [PATCH][all-langs] Defer size_t and sizetype setting to the middle-end

2011-06-01 Thread Andrew Haley
On 06/01/2011 12:34 PM, Richard Guenther wrote:
> >
> > java/
> > * decl.c (java_init_decl_processing): Properly initialize
> > size_type_node.
> >
> > Index: gcc/java/decl.c
> > ===
> > --- gcc/java/decl.c (revision 174520)
> > +++ gcc/java/decl.c (working copy)
> > @@ -606,7 +606,14 @@ java_init_decl_processing (void)
> >
> >/* This is not a java type, however tree-dfa requires a definition for
> >   size_type_node.  */
> > -  size_type_node = make_unsigned_type (POINTER_SIZE);
> > +  if (strcmp (SIZE_TYPE, "unsigned int") == 0)
> > +size_type_node = make_unsigned_type (INT_TYPE_SIZE);
> > +  else if (strcmp (SIZE_TYPE, "long unsigned int") == 0)
> > +size_type_node = make_unsigned_type (LONG_TYPE_SIZE);
> > +  else if (strcmp (SIZE_TYPE, "long long unsigned int") == 0)
> > +size_type_node = make_unsigned_type (LONG_LONG_TYPE_SIZE);
> > +  else
> > +gcc_unreachable ();
> >set_sizetype (size_type_node);
> >

OK.

Andrew.


[Patch ARM] Unbreak bootstrap for --with-fpu=neon.

2011-06-01 Thread Ramana Radhakrishnan
Hi,

It turns out that my effort last week in canonicalizing the vbic and
the vorn patterns in the neon bug exposed a latent bug while
bootstrapping trunk with Neon, which Michael's tester picked up. The
splitting is slightly tricky because in T2 state you've got the orn
instruction but in ARM state you don't.

I intend to follow this up with a separate patch that turns some of
these patterns off on the A8, in line with the other patches that have
come in recently. Before doing that I also need to reorganize
the arch attributes a bit and move the a8 and nota8 bits into a
separate attribute - so that's the matter of another patch.

Verified that the compiler passes bootstrap in both ARM and Thumb2
states. Regression tests are still running. It will be committed
after the tests finish.

cheers
Ramana


2011-05-31  Ramana Radhakrishnan  

* config/arm/neon.md (orndi3_neon): Actually split it.
Index: gcc/config/arm/neon.md
===
--- gcc/config/arm/neon.md  (revision 174266)
+++ gcc/config/arm/neon.md  (working copy)
@@ -801,17 +801,44 @@
   [(set_attr "neon_type" "neon_int_1")]
 )
 
-(define_insn "orndi3_neon"
-  [(set (match_operand:DI 0 "s_register_operand" "=w,?=&r,?&r")
-   (ior:DI (not:DI (match_operand:DI 2 "s_register_operand" "w,0,r"))
-   (match_operand:DI 1 "s_register_operand" "w,r,0")))]
+;; TODO: investigate whether we should disable 
+;; this and bicdi3_neon for the A8 in line with the other
+;; changes above. 
+(define_insn_and_split "orndi3_neon"
+  [(set (match_operand:DI 0 "s_register_operand" "=w,?=&r,?=&r,?&r")
+   (ior:DI (not:DI (match_operand:DI 2 "s_register_operand" "w,0,0,r"))
+   (match_operand:DI 1 "s_register_operand" "w,r,r,0")))]
   "TARGET_NEON"
   "@
vorn\t%P0, %P1, %P2
#
+   #
#"
-  [(set_attr "neon_type" "neon_int_1,*,*")
-   (set_attr "length" "*,8,8")]
+  "reload_completed && 
+   (TARGET_NEON && !(IS_VFP_REGNUM (REGNO (operands[0]))))"
+  [(set (match_dup 0) (ior:SI (not:SI (match_dup 2)) (match_dup 1)))
+   (set (match_dup 3) (ior:SI (not:SI (match_dup 4)) (match_dup 5)))]
+  "
+  {
+if (TARGET_THUMB2)
+  {
+operands[3] = gen_highpart (SImode, operands[0]);
+operands[0] = gen_lowpart (SImode, operands[0]);
+operands[4] = gen_highpart (SImode, operands[2]);
+operands[2] = gen_lowpart (SImode, operands[2]);
+operands[5] = gen_highpart (SImode, operands[1]);
+operands[1] = gen_lowpart (SImode, operands[1]);
+  }
+else
+  {
+emit_insn (gen_one_cmpldi2 (operands[0], operands[2]));
+emit_insn (gen_iordi3 (operands[0], operands[1], operands[0]));
+DONE;
+  }
+  }"
+  [(set_attr "neon_type" "neon_int_1,*,*,*")
+   (set_attr "length" "*,16,8,8")
+   (set_attr "arch" "any,a,t2,t2")]
 )
 
 (define_insn "bic3_neon"


Re: Use i386/crtfastmath.c on Solaris 2/x86

2011-06-01 Thread Rainer Orth
Richard Henderson  writes:

> On 06/01/2011 07:51 AM, Rainer Orth wrote:
>> +  /* Set PC to the instruction after the faulting one to skip over it,
>> + otherwise we enter an infinite loop.  4 is the size of the stmxcsr
>> + instruction.  */
> ...
>> +  /* We need a single SSE instruction here so the handler can safely 
>> skip
>> + over it.  */
>> +  __asm__ volatile ("movss %xmm2,%xmm1");
>
> The comment referencing stmxcsr doesn't match the movss code.
> It's still a 4 byte opcode, so the code still works.

Copy-and-paste error ;-(  We already have the same code in
libgfortran/config/fpu-387.h and (without the comment) in
gcc/testsuite/lib/target-supports.exp.  I still mean to fix
driver-i386.c to correctly handle -march=native on Solaris 8 and 9 which
cannot in general execute SSE insns.  I wonder if there's a better place
to share this code?

> I do wonder if using "movaps %xmm0,%xmm0" might be cleaner,
> to avoid clobbering a register, even if that register is
> surely dead anyway.  That's a 3 byte opcode though, so the
> handler would need updating.

I'll give it a try.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc

2011-06-01 Thread Richard Henderson
On 06/01/2011 09:01 AM, Rainer Orth wrote:
> The problem with this approach is that some of the macros tested only
> live in gcc, not libgcc once the libgcc sources no longer include tm.h
> etc.  E.g. look at i386/mingw32.h:
> 
> #if !TARGET_64BIT_DEFAULT && !defined (TARGET_BI_ARCH)
> #define MD_UNWIND_SUPPORT "config/i386/w32-unwind.h"
> #endif
> 
> Both TARGET_64BIT_DEFAULT and TARGET_BI_ARCH live in gcc only, so at
> least in the medium term, we need different tests here.

For this specific case, surely neither is relevant.
Surely the proper test, in the target header, is simply 

#ifndef __MINGW64__

as one would write in normal user-level code.


r~


Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc

2011-06-01 Thread Kai Tietz
2011/6/1 Richard Henderson :
> On 06/01/2011 09:01 AM, Rainer Orth wrote:
>> The problem with this approach is that some of the macros tested only
>> live in gcc, not libgcc once the libgcc sources no longer include tm.h
>> etc.  E.g. look at i386/mingw32.h:
>>
>> #if !TARGET_64BIT_DEFAULT && !defined (TARGET_BI_ARCH)
>> #define MD_UNWIND_SUPPORT "config/i386/w32-unwind.h"
>> #endif
>>
>> Both TARGET_64BIT_DEFAULT and TARGET_BI_ARCH live in gcc only, so at
>> least in the medium term, we need different tests here.
>
> For this specific case, surely neither is relevant.
> Surely the proper test, in the target header, is simply
>
> #ifndef __MINGW64__
>
> as one would write in normal user-level code.
>
>
> r~

Yes, thanks.  Well, we would lose the ability to build mingw-w64
with dw2 support for 32-bit (to be compatible with mingw.org's
32-bit variant, as they want to use this dw2 unwinder), but mingw-w64
doesn't want dw2-unwind in general, as dw2-unwind has some issues
when throwing through VC-generated code. So this test might be ok too.

Regards,
Kai


Re: [RFC PATCH, go]: Port to ALPHA arch - sysinfo.go fixup

2011-06-01 Thread Ian Lance Taylor
Mike Stump  writes:

> On Jun 1, 2011, at 7:37 AM, Ian Lance Taylor wrote:
>>> One problem remains in the libgo testsuite: certain tests have to be
>>> compiled with -mieee, otherwise FPE is generated for unordered values.
>>> Any suggestions, where -mieee should be placed?
>> 
>> That's an interesting question.  I think that ideally we would like
>> -mieee to become the default when using gccgo.
>
> If the language spec requires it, then it should go into gcc/go.  See 
> java_post_options:
>
> static bool
> java_post_options (const char **pfilename)
> {
>   /* Excess precision other than "fast" requires front-end
>      support.  */
>   if (flag_excess_precision_cmdline == EXCESS_PRECISION_STANDARD
>       && TARGET_FLT_EVAL_METHOD_NON_DEFAULT)
>     sorry ("-fexcess-precision=standard for Java");
>   flag_excess_precision_cmdline = EXCESS_PRECISION_FAST;

Sure, the Go frontend does stuff like that too.  But of course the Go
frontend can't directly set -mieee, because -mieee is a machine
dependent option.


> so, you could check the setting and reset any flag that should be off
> or error out on incompatible flags.  I'd like to think we could get
> more mileage out of making a flag like -mieee be machine independent
> and then ports could just check the base flag for validating machine
> specific flags.  Certainly alpha isn't the only port that has -mieee.
> There are likely to be very few flags promoted because of this, ieee
> being the most obvious example.

What I think you are suggesting here is another approach: Alpha should
set -mieee based on a machine-independent option, and then the Go
frontend can set that option instead.  I'm fine with that approach too.
I don't think we currently have a machine-independent option which
corresponds to the Alpha -mieee option.  According to the documentation,
-mieee does two things: adds support for NaN and infinity, and adds
support for denormal numbers.  The first is the -fno-finite-math-only
option, which is actually the default for other targets.  The second has
no machine independent option as far as I know.
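
For readers unfamiliar with the two terms, the following sketch shows what "unordered" and "denormal" mean at the C level, using only standard C99 classification macros. The helper names are made up for this example; it says nothing about how Alpha implements -mieee.

```c
#include <math.h>

/* Returns 1 when A and B are unordered, i.e. at least one is a NaN.
   isunordered is the C99 quiet comparison macro.  */
static int
is_unordered_pair (double a, double b)
{
  return isunordered (a, b) != 0;
}

/* Returns 1 when X is a denormal (subnormal) number.  */
static int
is_denormal (double x)
{
  return fpclassify (x) == FP_SUBNORMAL;
}
```

Comparisons involving a NaN fall in the first category, which is the case reported as raising FPE in the libgo tests without -mieee.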

Ian


Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc

2011-06-01 Thread Rainer Orth
Mike Stump  writes:

> On Jun 1, 2011, at 9:01 AM, Rainer Orth wrote:
>> Both TARGET_64BIT_DEFAULT and TARGET_BI_ARCH live in gcc only, so at
>> least in the medium term, we need different tests here.
>
> Ah, ick.  Oh well...  The next more general rule would be something like: one 
> can set a feature (implicit -D__GCC_DO_UNWIND_BLA) in the compiler when 
> TARGET_64BIT_DEFAULT and TARGET_BI_ARCH are set a certain way, and then in 
> libgcc, one can just test that feature directly.  Ick, I hate inventing 
> feature names here...

True, but only as a last resort.  Alternatively, one could try to
determine the feature with autoconf.

>> I can certainly do it this way for now, but if we could do away with the
>> tests completely, that would be cleaner.
>
> Agreed, though, I don't believe the test is superfluous.

You still haven't answered my question wrt. Darwin 8 vs. 64-bit on
PowerPC.  Perhaps we can do away with DARWIN_LIBSYSTEM_HAS_UNWIND
completely?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[pph] Add one failing C test (issue4524085)

2011-06-01 Thread Diego Novillo

This is the last of the large set of failing single-file C test cases I had
collected.  It still fails, but given that we are more people that may
be hacking on the branch now, I wanted to put it out there so I don't
have to keep testing my private set of files anymore.

This fails on read with:

c120060625-1.h:10:22: internal compiler error: invalid built-in macro "__FLT_MAX__"


Diego.


* g++.dg/pph/c120060625-1.cc: New.
* g++.dg/pph/c120060625-1.h: New.

diff --git a/gcc/testsuite/g++.dg/pph/c120060625-1.cc b/gcc/testsuite/g++.dg/pph/c120060625-1.cc
new file mode 100644
index 000..05c7929
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pph/c120060625-1.cc
@@ -0,0 +1 @@
+#include "c120060625-1.h"
diff --git a/gcc/testsuite/g++.dg/pph/c120060625-1.h b/gcc/testsuite/g++.dg/pph/c120060625-1.h
new file mode 100644
index 000..07266d9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pph/c120060625-1.h
@@ -0,0 +1,13 @@
+#ifndef __PPH_GUARD_H
+#define __PPH_GUARD_H
+/* PR middle-end/28151 */
+/* Testcase by Steven Bosscher  */
+
+_Complex float b;
+
+void foo (void)
+{
+  _Complex float a = __FLT_MAX__;
+  b = __FLT_MAX__ + a;
+}
+#endif

--
This patch is available for review at http://codereview.appspot.com/4524085


Re: [PATCH] c-pragma: adding a data field to pragma_handler

2011-06-01 Thread Basile Starynkevitch
On Wed, 01 Jun 2011 18:54:38 +0200
Pierre  wrote:

> This patch is about the pragmas.
> 
> In c-family/c-pragma.h, we declare a pragma_handler, which is a function
> accepting a cpp_reader as its parameter.
>
> I have changed this handler to accept a second parameter, a void *,
> which allows extra data to be passed to the handler. I think this
> data field might be of general use: we can have a condition or data at
> registration time that we want to express in the handler. I guess this is a
> common way to pass data to a handler function.

I find this patch interesting and useful (& not only for MELT).

A general coding rule in C seems to be that whenever functions are
called through indirect pointers (which can have several
different functions as their value), they had better take an extra data
argument. This is the case, in particular, inside Glib & GTK, inside the
Linux kernel, and in several other places in GCC.

A use case for such pragma handlers with data would be a
plugin which sends some messages to channels other than stdout/stderr
(e.g. other files, or a pipe, or the D-Bus, or a widget, or a web
service...). Then the same routine would handle
   #pragma GCCPLUGIN message_to_file "foo"
and
   #pragma GCCPLUGIN message_to_pipe "bar"
and the data pointer would be different (an fopen-ed or popen-ed
FILE*, or even an std::ostream& if the plugin is coded in C++).
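
A toy version of that use case, under stated assumptions: the `struct cpp_reader` below is a stand-in carrying the message text (the real reader is a token stream), and `handle_message_pragma` is a hypothetical handler name.

```c
#include <assert.h>
#include <stdio.h>

/* Toy reader: just carries the string the pragma would provide.  */
struct cpp_reader
{
  const char *pending_text;
};

/* One handler for every "message_to_*" pragma; DATA is the FILE * that
   was chosen when the pragma was registered.  */
static void
handle_message_pragma (struct cpp_reader *reader, void *data)
{
  FILE *out = (FILE *) data;

  assert (reader != NULL && out != NULL);
  fprintf (out, "%s\n", reader->pending_text);
}
```

Registering this handler once with an fopen-ed stream and once with a popen-ed one would give the two pragmas above without duplicating the routine.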


I am not authorized to ok the patch (I believe the changelog had some
typos), but I hope someone will review & ok it.

Regards




-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***


Re: Use i386/crtfastmath.c on Solaris 2/x86

2011-06-01 Thread Richard Henderson
On 06/01/2011 10:29 AM, Rainer Orth wrote:
> I still mean to fix
> driver-i386.c to correctly handle -march=native on Solaris 8 and 9 which
> cannot in general execute SSE insns.  I wonder if there's a better place
> to share this code?

I can't think of a good place.  :-(


r~


C++ PATCH for c++/49253 (v3 debug mode regression)

2011-06-01 Thread Jason Merrill
My changes to preserve reference semantics in templates broke the 
shenanigans build_x_arrow was using for non-dependent ARROW_EXPR; it 
called build_min_non_dep and then overwrote the TREE_TYPE, but that 
broke in the case of a reference to pointer because it ended up giving 
the ARROW_EXPR REFERENCE_TYPE and then changing the type of the implicit 
INDIRECT_REF.  We shouldn't be using build_min_non_dep when there's an 
additional implied operation, anyway.  So this patch fixes it to use 
build_min instead, and set TREE_SIDE_EFFECTS directly.


Tested x86_64-pc-linux-gnu, applied to trunk.
commit 99c854c1078c3fc486ecf9b336e3a274a0c46469
Author: Jason Merrill 
Date:   Wed Jun 1 13:31:33 2011 -0400

	PR c++/49253
	* typeck2.c (build_x_arrow): Don't use build_min_nt.

diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 031f076..4d5c21a 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1463,9 +1463,9 @@ build_x_arrow (tree expr)
 {
   if (processing_template_decl)
 	{
-	  expr = build_min_non_dep (ARROW_EXPR, last_rval, orig_expr);
-	  /* It will be dereferenced.  */
-	  TREE_TYPE (expr) = TREE_TYPE (TREE_TYPE (last_rval));
+	  expr = build_min (ARROW_EXPR, TREE_TYPE (TREE_TYPE (last_rval)),
+			orig_expr);
+	  TREE_SIDE_EFFECTS (expr) = TREE_SIDE_EFFECTS (last_rval);
 	  return expr;
 	}
 


Re: New options to disable/enable any pass for any functions (issue4550056)

2011-06-01 Thread H.J. Lu
On Mon, May 30, 2011 at 2:44 PM, Xinliang David Li  wrote:
> This is the complete patch for pass name fixes (with test case changes).
>
> David
>
>

I think your change caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49261


H.J.


Re: [lto] Merge streamer hooks from pph branch. (issue4568043)

2011-06-01 Thread Richard Guenther
On Wed, 1 Jun 2011, Diego Novillo wrote:

> On Wed, Jun 1, 2011 at 08:07, Richard Guenther  wrote:
> 
> >>  static void cgraph_expand_all_functions (void);
> >>  static void cgraph_mark_functions_to_output (void);
> >> @@ -1092,6 +1093,10 @@ cgraph_finalize_compilation_unit (void)
> >>  {
> >>    timevar_push (TV_CGRAPH);
> >>
> >> +  /* If LTO is enabled, initialize the streamer hooks needed by GIMPLE.  
> >> */
> >> +  if (flag_lto)
> >> +    gimple_streamer_hooks_init ();
> >
> > Ugh.  Isn't there a better entry for this?  Are you going to add
> >
> >  if (flag_pph)
> >    init_hooks_some_other_way ();
> >
> > here?  It looks it rather belongs to opts.c or toplev.c if the hooks
> > are really initialized dependent on compiler flags.
> 
> Not at all, this is for gimple, specifically.  The front end
> initializes hooks in its own way.  The problem here is that the gimple
> hooks are needed by the middle end.  If we initialize gimple hooks too
> early, the FE will override them.  So we need to initialize them after
> the front end is done (hence the location for this call).
> 
> I'm happy to move this somewhere else, but it needs to happen right
> before the middle end starts calling LTO pickling routines.
> 
> >
> >>    /* If we're here there's no current function anymore.  Some frontends
> >>       are lazy in clearing these.  */
> >>    current_function_decl = NULL;
> >> diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
> >> index 88966f2..801fe6f 100644
> >> --- a/gcc/lto-streamer-in.c
> >> +++ b/gcc/lto-streamer-in.c
> >> @@ -1833,6 +1833,7 @@ static void
> >>  unpack_value_fields (struct bitpack_d *bp, tree expr)
> >>  {
> >>    enum tree_code code;
> >> +  lto_streamer_hooks *h = streamer_hooks ();
> >
> > A function to access a global ... we have lang_hooks and targetm,
> > so please simply use streamer_hooks as a variable.
> > streamer_hooks ()->preload_common_nodes (cache) looks super-ugly.
> 
> I did not want to add yet another global.  I don't feel too strong
> about this one, given the presence of lang_hooks and targetm.  So, you
> prefer the direct global access?

Yes, I see no benefit of using a global function to get access
to the address of a global variable.
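
To make the two spellings concrete, here is a minimal sketch.  It is not
GCC code: struct demo_streamer_hooks, demo_hooks and every function
below are hypothetical names made up for illustration, and the real
lto_streamer_hooks has different fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified hook table.  The field names are
   illustrative only.  */
struct demo_streamer_hooks
{
  const char *name;
  int (*preload_common_nodes) (int cache);
};

/* Style 1: the table is a plain global, like lang_hooks or targetm.  */
struct demo_streamer_hooks demo_hooks;

/* Style 2: an accessor that merely returns the address of the global.  */
struct demo_streamer_hooks *
demo_hooks_get (void)
{
  return &demo_hooks;
}

/* Toy hook implementation, so that the calls below do something.  */
static int
demo_preload (int cache)
{
  return cache + 1;
}

/* Install the hooks, e.g. once the front end is done.  */
void
demo_hooks_init (void)
{
  demo_hooks.name = "gimple";
  demo_hooks.preload_common_nodes = demo_preload;
}

/* Call the hook through the global variable.  */
int
demo_preload_via_variable (int cache)
{
  assert (demo_hooks.preload_common_nodes != NULL);
  return demo_hooks.preload_common_nodes (cache);
}

/* Call the same hook through the accessor function.  */
int
demo_preload_via_accessor (int cache)
{
  struct demo_streamer_hooks *h = demo_hooks_get ();

  assert (h->preload_common_nodes != NULL);
  return h->preload_common_nodes (cache);
}
```

Both wrappers reach the same function pointer; the accessor only adds a
call in front of the same global.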

> >> @@ -1864,26 +1865,11 @@ unpack_value_fields (struct bitpack_d *bp, tree 
> >> expr)
> >>    if (CODE_CONTAINS_STRUCT (code, TS_BLOCK))
> >>      unpack_ts_block_value_fields (bp, expr);
> >>
> >> -  if (CODE_CONTAINS_STRUCT (code, TS_SSA_NAME))
> >> -    {
> >> -      /* We only stream the version number of SSA names.  */
> >> -      gcc_unreachable ();
> >> -    }
> >> -
> >> -  if (CODE_CONTAINS_STRUCT (code, TS_STATEMENT_LIST))
> >> -    {
> >> -      /* This is only used by GENERIC.  */
> >> -      gcc_unreachable ();
> >> -    }
> >> -
> >> -  if (CODE_CONTAINS_STRUCT (code, TS_OMP_CLAUSE))
> >> -    {
> >> -      /* This is only used by High GIMPLE.  */
> >> -      gcc_unreachable ();
> >> -    }
> >> -
> >>    if (CODE_CONTAINS_STRUCT (code, TS_TRANSLATION_UNIT_DECL))
> >>      unpack_ts_translation_unit_decl_value_fields (bp, expr);
> >> +
> >> +  if (h->unpack_value_fields)
> >> +    h->unpack_value_fields (bp, expr);
> >
> > I suppose the LTO implementation has a gcc_unreachable () for
> > the cases we do not handle here?
> 
> Right.  This was already superfluous.  It's tested already by
> lto_is_streamable().
> 
> >
> >>  }
> >>
> >>
> >> @@ -1935,8 +1921,17 @@ lto_materialize_tree (struct lto_input_block *ib, 
> >> struct data_in *data_in,
> >>      }
> >>    else
> >>      {
> >> -      /* All other nodes can be materialized with a raw make_node call.  
> >> */
> >> -      result = make_node (code);
> >> +      lto_streamer_hooks *h = streamer_hooks ();
> >> +
> >> +      /* For all other nodes, see if the streamer knows how to allocate
> >> +      it.  */
> >> +      if (h->alloc_tree)
> >> +     result = h->alloc_tree (code, ib, data_in);
> >> +
> >> +      /* If the hook did not handle it, materialize the tree with a raw
> >> +      make_node call.  */
> >> +      if (result == NULL_TREE)
> >> +     result = make_node (code);
> >>      }
> >>
> >>  #ifdef LTO_STREAMER_DEBUG
> >> @@ -2031,12 +2026,8 @@ lto_input_ts_decl_common_tree_pointers (struct 
> >> lto_input_block *ib,
> >>  {
> >>    DECL_SIZE (expr) = lto_input_tree (ib, data_in);
> >>    DECL_SIZE_UNIT (expr) = lto_input_tree (ib, data_in);
> >> -
> >> -  if (TREE_CODE (expr) != FUNCTION_DECL
> >> -      && TREE_CODE (expr) != TRANSLATION_UNIT_DECL)
> >> -    DECL_INITIAL (expr) = lto_input_tree (ib, data_in);
> >> -
> >
> > Why move those?  DECL_INITIAL _is_ in decl_common.
> 
> I needed to move the handling of DECL_INITIAL in the writer.  This
> forces us to move the handling in the reader.  Otherwise, reader and
> writer will be out of sync (DECL_INITIAL is now written last).
> 
> > Where do those checks go?  Or do we simply lose them?
> 
> They already are in lto_is_streamable.  See above.
> 
> >> -  if (TREE_CODE (result) == VAR_DECL)
> >> -

[Patch, Fortran] Fix -fcheck=pointer for F2008's NULL ptr to optional arguments

2011-06-01 Thread Tobias Burnus
The NULL pointer check (-fcheck=pointer) was wrong for Fortran 2008: it 
is now allowed to pass a null pointer (or an unallocated allocatable) 
to an optional dummy argument to denote an absent argument.


Built and regtested on x86-64-linux.
OK for the trunk?

Tobias
2011-06-01  Tobias Burnus  

	PR fortran/49255
	* trans-expr.c (gfc_conv_procedure_call): Fix -fcheck=pointer
	for F2008.

2011-06-01  Tobias Burnus  

	PR fortran/49255
	* gfortran.dg/pointer_check_9.f90: New.
	* gfortran.dg/pointer_check_10.f90: New.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index bfe966f..da4af1a 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -3269,6 +3269,12 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	  else
 	goto end_pointer_check;
 
+	  /*  In Fortran 2008 it's allowed to pass a NULL pointer/nonallocated
+	  allocatable to an optional dummy, cf. 12.5.2.12.  */
+	  if (fsym != NULL && fsym->attr.optional && !attr.proc_pointer
+	  && (gfc_option.allow_std & GFC_STD_F2008) != 0)
+	goto end_pointer_check;
+
   if (attr.optional)
 	{
   /* If the actual argument is an optional pointer/allocatable and
--- /dev/null	2011-05-31 07:23:47.047892583 +0200
+++ gcc/gcc/testsuite/gfortran.dg/pointer_check_9.f90	2011-06-01 20:18:52.0 +0200
@@ -0,0 +1,15 @@
+! { dg-do run }
+! { dg-options "-fcheck=all -std=f2008 -fall-intrinsics" }
+!
+! PR fortran/49255
+!
+! Valid F2008, invalid F95/F2003.
+!
+integer,pointer :: ptr => null()
+call foo (ptr)
+contains
+  subroutine foo (x)
+integer, optional :: x
+if (present (x)) call abort ()
+  end subroutine foo
+end
--- /dev/null	2011-05-31 07:23:47.047892583 +0200
+++ gcc/gcc/testsuite/gfortran.dg/pointer_check_10.f90	2011-06-01 20:19:05.0 +0200
@@ -0,0 +1,16 @@
+! { dg-do run }
+! { dg-options "-fcheck=all -std=f2003 -fall-intrinsics" }
+! { dg-shouldfail "Pointer actual argument 'ptr' is not associated" }
+!
+! PR fortran/49255
+!
+! Valid F2008, invalid F95/F2003.
+!
+integer,pointer :: ptr => null()
+call foo (ptr)
+contains
+  subroutine foo (x)
+integer, optional :: x
+if (present (x)) call abort ()
+  end subroutine foo
+end


[PATCH] gimple_val_nonnegative_real_p (PR46728 patch 7 of 7)

2011-06-01 Thread William J. Schmidt
This patch cleans up the FIXME logic in gimple_expand_builtin_pow by
introducing gimple_val_nonnegative_real_p for the same purpose that
tree_expr_nonnegative_p served in the expand logic.  This completes the
work for PR46728.

Bootstrapped/regtested on powerpc64-linux.
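
As a rough illustration of the idea, here is a toy sketch of a
recursive non-negativity check over a tiny expression tree.  It is not
the patch itself: struct toy_expr and the TOY_* codes are hypothetical
stand-ins for SSA definitions and tree codes, and NaNs are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical expression codes; only a handful of cases are modelled.  */
enum toy_code
{
  TOY_UNKNOWN,   /* Nothing is known about the value.  */
  TOY_CONST,     /* A literal constant.  */
  TOY_ABS,       /* abs (op0).  */
  TOY_PLUS,      /* op0 + op1.  */
  TOY_MIN,       /* min (op0, op1).  */
  TOY_MAX        /* max (op0, op1).  */
};

/* Toy node standing in for "the statement that defines this value".  */
struct toy_expr
{
  enum toy_code code;
  double value;                  /* Used by TOY_CONST only.  */
  const struct toy_expr *op0;
  const struct toy_expr *op1;
};

/* Return true if E is known to be non-negative.  Returning false only
   means "not proven", never "known to be negative".  */
bool
toy_nonnegative_p (const struct toy_expr *e)
{
  assert (e != NULL);

  switch (e->code)
    {
    case TOY_CONST:
      return e->value >= 0.0;

    case TOY_ABS:
      /* Always true, whatever the operand is.  */
      return true;

    case TOY_PLUS:
    case TOY_MIN:
      /* True if both operands are non-negative.  */
      return (toy_nonnegative_p (e->op0)
              && toy_nonnegative_p (e->op1));

    case TOY_MAX:
      /* True if either operand is non-negative.  */
      return (toy_nonnegative_p (e->op0)
              || toy_nonnegative_p (e->op1));

    default:
      return false;
    }
}
```

With x unknown, abs (x) + 2.0 and max (x, 2.0) are proven non-negative,
while min (x, 2.0) is not.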


2011-06-01  Bill Schmidt  

PR tree-optimization/46728
* tree-ssa-math-opts.c (gimple_expand_builtin_pow): Change FIXME
to use gimple_val_nonnegative_real_p.
* gimple-fold.c (gimple_val_nonnegative_real_p): New function.
* gimple.h (gimple_val_nonnegative_real_p): New declaration.


Index: gcc/tree-ssa-math-opts.c
===
--- gcc/tree-ssa-math-opts.c(revision 174535)
+++ gcc/tree-ssa-math-opts.c(working copy)
@@ -1172,13 +1172,7 @@ gimple_expand_builtin_pow (gimple_stmt_iterator *g
 
   if (flag_unsafe_math_optimizations
   && cbrtfn
-  /* FIXME: The following line was originally
-&& (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode)),
-but since arg0 is a gimple value, the first predicate
-will always return false.  It needs to be replaced with a
-call to a similar gimple_val_nonnegative_p function to be
- added in gimple-fold.c.  */
-  && !HONOR_NANS (mode)
+  && (gimple_val_nonnegative_real_p (arg0) || !HONOR_NANS (mode))
   && REAL_VALUES_EQUAL (c, dconst1_3))
 return build_and_insert_call (gsi, loc, &target, cbrtfn, arg0);
   
@@ -1190,13 +1184,7 @@ gimple_expand_builtin_pow (gimple_stmt_iterator *g
   if (flag_unsafe_math_optimizations
   && sqrtfn
   && cbrtfn
-  /* FIXME: The following line was originally
-&& (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode)),
-but since arg0 is a gimple value, the first predicate
-will always return false.  It needs to be replaced with a
-call to a similar gimple_val_nonnegative_p function to be
- added in gimple-fold.c.  */
-  && !HONOR_NANS (mode)
+  && (gimple_val_nonnegative_real_p (arg0) || !HONOR_NANS (mode))
   && optimize_function_for_speed_p (cfun)
   && hw_sqrt_exists
   && REAL_VALUES_EQUAL (c, dconst1_6))
@@ -1270,13 +1258,7 @@ gimple_expand_builtin_pow (gimple_stmt_iterator *g
 
   if (flag_unsafe_math_optimizations
   && cbrtfn
-  /* FIXME: The following line was originally
-&& (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode)),
-but since arg0 is a gimple value, the first predicate
-will always return false.  It needs to be replaced with a
-call to a similar gimple_val_nonnegative_p function to be
- added in gimple-fold.c.  */
-  && !HONOR_NANS (mode)
+  && (gimple_val_nonnegative_real_p (arg0) || !HONOR_NANS (mode))
   && real_identical (&c2, &c)
   && optimize_function_for_speed_p (cfun)
   && powi_cost (n / 3) <= POWI_MAX_MULTS)
Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 174535)
+++ gcc/gimple-fold.c   (working copy)
@@ -3433,3 +3433,224 @@ fold_const_aggregate_ref (tree t)
 {
   return fold_const_aggregate_ref_1 (t, NULL);
 }
+
+/* Return true iff VAL is a gimple expression that is known to be
+   non-negative.  Restricted to floating-point inputs.  When changing
+   this function, review fold-const.c:tree_expr_nonnegative_p to see
+   whether similar changes are required.  */
+
+bool
+gimple_val_nonnegative_real_p (tree val)
+{
+  gimple def_stmt;
+
+  /* Use existing logic for non-gimple trees.  */
+  if (tree_expr_nonnegative_p (val))
+return true;
+
+  if (TREE_CODE (val) != SSA_NAME)
+return false;
+
+  def_stmt = SSA_NAME_DEF_STMT (val);
+
+  if (is_gimple_assign (def_stmt))
+{
+  tree op0, op1;
+
+  /* If this is just a copy between SSA names, check the RHS.  */
+  if (gimple_assign_ssa_name_copy_p (def_stmt))
+   {
+ op0 = gimple_assign_rhs1 (def_stmt);
+ return gimple_val_nonnegative_real_p (op0);
+   }
+
+  switch (gimple_assign_rhs_code (def_stmt))
+   {
+   case ABS_EXPR:
+ /* Always true for floating-point operands.  */
+ return true;
+
+   case NOP_EXPR:
+   case CONVERT_EXPR:
+ /* True if the first operand is a nonnegative real.  */
+ op0 = gimple_assign_rhs1 (def_stmt);
+ return (TREE_CODE (TREE_TYPE (op0)) == REAL_TYPE
+ && gimple_val_nonnegative_real_p (op0));
+
+   case PLUS_EXPR:
+   case MIN_EXPR:
+   case RDIV_EXPR:
+ /* True if both operands are nonnegative.  */
+ op0 = gimple_assign_rhs1 (def_stmt);
+ op1 = gimple_assign_rhs2 (def_stmt);
+ return (gimple_val_nonnegative_real_p (op0)
+ && gimple_val_nonnegative_real_p (op1));
+
+   case MAX_EXPR:
+ /* True if either operand is nonnegative.  */
+ op0 = gimple_assign_rhs1 (def_stmt);
+ op1 = 

Re: -fdump-passes -fenable-xxx=func_name_list

2011-06-01 Thread Xinliang David Li
Attached is patch 2 (-fdump-passes) and a sample output:

Ok for trunk?

David

On Wed, Jun 1, 2011 at 9:16 AM, Xinliang David Li  wrote:
> On Wed, Jun 1, 2011 at 1:51 AM, Richard Guenther
>  wrote:
>> On Wed, Jun 1, 2011 at 1:34 AM, Xinliang David Li  wrote:
>>> The following patch implements a new option that dumps gcc PASS
>>> configuration. The sample output is attached.  There is one
>>> limitation: some placeholder passes that are named with '*xxx' are
>>> not registered thus they are not listed. They are not important as
>>> they can not be turned on/off anyway.
>>>
>>> The patch also enhanced -fenable-xxx and -fdisable-xxx to allow a list
>>> of function assembler names to be specified.
>>>
>>> Ok for trunk?
>>
>> Please split the patch.
>>
>> I'm not too happy how you dump the pass configuration.  Why not simply,
>> at a _single_ place, walk the pass tree?  Instead of doing pieces of it
>> at pass execution time when it's not already dumped - that really looks
>> gross.
>
> Yes, that was the original plan -- but it has problems
> 1) the dumper needs to know the root pass lists -- which can change
> frequently -- it can be a long term maintenance burden;
> 2) the centralized dumper needs to be done after option processing
> 3) not sure if gate functions have any side effects or have dependencies on
> cfun
>
> The proposed solution IMHO is not that intrusive -- just three hooks
> to do the dumping and tracking indentation.
>
>>
>> The documentation should also link this option to the -fenable/disable
>> options as obviously the pass names in that dump are those to be
>> used for those flags (and not readily available anywhere else).
>
> Ok.
>
>>
>> I also think that it would be way more useful to note in the individual
>> dump files the functions (at the place they would usually appear) that
>> have the pass explicitly enabled/disabled.
>
> Ok -- for ipa passes or tree/rtl passes where all functions are
> explicitly disabled.
>
> Thanks,
>
> David
>
>>
>> Richard.
>>
>>> Thanks,
>>>
>>> David
>>>
>>
>
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 174535)
+++ doc/invoke.texi	(working copy)
@@ -291,6 +291,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdump-translation-unit@r{[}-@var{n}@r{]} @gol
 -fdump-class-hierarchy@r{[}-@var{n}@r{]} @gol
 -fdump-ipa-all -fdump-ipa-cgraph -fdump-ipa-inline @gol
+-fdump-passes @gol
 -fdump-statistics @gol
 -fdump-tree-all @gol
 -fdump-tree-original@r{[}-@var{n}@r{]}  @gol
@@ -5060,7 +5061,8 @@ seperated list of function ranges.  Each
 The range is inclusive in both ends.  If the range is trivial, the number pair can be
 simplified a a single number.  If the function's cgraph node's @var{uid} is falling
 within one of the specified ranges, the @var{pass} is disabled for that function.
-The @var{uid} is shown in the function header of a dump file.
+The @var{uid} is shown in the function header of a dump file, and pass names can be
+dumped by using option @option{-fdump-passes}.
 
 @item -fdisable-tree-@var{pass}
 @item -fdisable-tree-@var{pass}=@var{range-list}
@@ -5483,6 +5485,11 @@ Dump after function inlining.
 
 @end table
 
+@item -fdump-passes
+@opindex fdump-passes
+Dump the list of optimization passes that are turned on and off by
+the current command line options.
+
 @item -fdump-statistics-@var{option}
 @opindex fdump-statistics
 Enable and control dumping of pass statistics in a separate file.  The
Index: common.opt
===
--- common.opt	(revision 174535)
+++ common.opt	(working copy)
@@ -1012,6 +1012,10 @@ fdump-noaddr
 Common Report Var(flag_dump_noaddr)
 Suppress output of addresses in debugging dumps
 
+fdump-passes
+Common Var(flag_dump_passes) Init(0)
+Dump optimization passes
+
 fdump-unnumbered
 Common Report Var(flag_dump_unnumbered)
 Suppress output of instruction numbers, line number notes and addresses in debugging dumps
Index: passes.c
===
--- passes.c	(revision 174536)
+++ passes.c	(working copy)
@@ -478,7 +478,7 @@ passr_eq (const void *p1, const void *p2
   return !strcmp (s1->unique_name, s2->unique_name);
 }
 
-static htab_t pass_name_tab = NULL;
+static htab_t name_to_pass_map = NULL;
 
 /* Register PASS with NAME.  */
 
@@ -488,11 +488,11 @@ register_pass_name (struct opt_pass *pas
   struct pass_registry **slot;
   struct pass_registry pr;
 
-  if (!pass_name_tab)
-pass_name_tab = htab_create (256, passr_hash, passr_eq, NULL);
+  if (!name_to_pass_map)
+name_to_pass_map = htab_create (256, passr_hash, passr_eq, NULL);
 
   pr.unique_name = name;
-  slot = (struct pass_registry **) htab_find_slot (pass_name_tab, &pr, INSERT);
+  slot = (struct pass_registry **) htab_find_slot (name_to_pass_map, &pr, INSERT);
   if (!*slot)
 {
   struct pass_registry *new_pr;
@@ -506,6 +506,117 @@ register_pass_name (struct opt_pas

Re: -fdump-passes -fenable-xxx=func_name_list

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 6:16 PM, Xinliang David Li  wrote:
> On Wed, Jun 1, 2011 at 1:51 AM, Richard Guenther
>  wrote:
>> On Wed, Jun 1, 2011 at 1:34 AM, Xinliang David Li  wrote:
>>> The following patch implements a new option that dumps gcc PASS
>>> configuration. The sample output is attached.  There is one
>>> limitation: some placeholder passes that are named with '*xxx' are
>>> not registered thus they are not listed. They are not important as
>>> they can not be turned on/off anyway.
>>>
>>> The patch also enhanced -fenable-xxx and -fdisable-xxx to allow a list
>>> of function assembler names to be specified.
>>>
>>> Ok for trunk?
>>
>> Please split the patch.
>>
>> I'm not too happy how you dump the pass configuration.  Why not simply,
>> at a _single_ place, walk the pass tree?  Instead of doing pieces of it
>> at pass execution time when it's not already dumped - that really looks
>> gross.
>
> Yes, that was the original plan -- but it has problems
> 1) the dumper needs to know the root pass lists -- which can change
> frequently -- it can be a long term maintenance burden;
> 2) the centralized dumper needs to be done after option processing
> 3) not sure if gate functions have any side effects or have dependencies on
> cfun
>
> The proposed solution IMHO is not that intrusive -- just three hooks
> to do the dumping and tracking indentation.

Well, if you have a CU that is empty or optimized to nothing at some point
you will not get a complete pass list.  I suppose optimize attributes might
also confuse output.  Your solution might not be that intrusive
but it is still ugly.  I don't see 1) as an issue, for 2) you can just call the
dumping from toplev_main before calling do_compile (), 3) gate functions
shouldn't have side-effects, but as they could gate on optimize_for_speed ()
your option summary output will be bogus anyway.

So - what is the output intended for if it isn't reliable?
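
For what it's worth, a single-place walk could look roughly like the
sketch below.  This is only an illustration under assumptions: struct
toy_pass is a hypothetical cut-down model with name, gate, sub and next
members, not the real struct opt_pass, and the output format is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, cut-down pass descriptor.  */
struct toy_pass
{
  const char *name;
  bool (*gate) (void);         /* NULL means the pass always runs.  */
  struct toy_pass *sub;        /* First sub-pass, or NULL.  */
  struct toy_pass *next;       /* Next sibling, or NULL.  */
};

/* Example gate that switches a pass off.  */
bool
toy_gate_off (void)
{
  return false;
}

/* Print PASS, its siblings and all of their sub-passes to OUT, indented
   by INDENT columns.  Return the number of passes visited.  */
int
toy_dump_passes (FILE *out, const struct toy_pass *pass, int indent)
{
  int count = 0;

  assert (out != NULL);
  for (; pass != NULL; pass = pass->next)
    {
      bool on = pass->gate == NULL || pass->gate ();

      fprintf (out, "%*s%-24s: %s\n", indent, "", pass->name,
               on ? "ON" : "OFF");
      count++;

      /* Sub-passes are listed one level deeper.  */
      count += toy_dump_passes (out, pass->sub, indent + 3);
    }

  return count;
}
```

Because the walk does not depend on any function being compiled, an
empty CU still gets the complete list; a gate that looks at per-function
state would of course still make the ON/OFF column unreliable.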

Richard.

>>
>> The documentation should also link this option to the -fenable/disable
>> options as obviously the pass names in that dump are those to be
>> used for those flags (and not readily available anywhere else).
>
> Ok.
>
>>
>> I also think that it would be way more useful to note in the individual
>> dump files the functions (at the place they would usually appear) that
>> have the pass explicitly enabled/disabled.
>
> Ok -- for ipa passes or tree/rtl passes where all functions are
> explicitly disabled.
>
> Thanks,
>
> David
>
>>
>> Richard.
>>
>>> Thanks,
>>>
>>> David
>>>
>>
>


[lto] Fix streaming of multi-byte enums (issue4526099)

2011-06-01 Thread Diego Novillo

This patch (split out of
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg4.html) fixes the
streaming of enum values when they are larger than a single byte.

Tested with LTO profiledbootstrap on x86_64.
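
To show why the shift direction matters, here is a small stand-alone
sketch of writing a value from a known range one byte at a time.  The
toy_* functions and the plain byte buffer are hypothetical; this is not
the lto_output_int_in_range code, only the same idea in miniature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write VAL, known to lie in [MIN, MAX], to BUF using only as many
   bytes as the range needs, low byte first.  Return the byte count.
   Writing (val << 8) & 0xff instead of (val >> 8) & 0xff would always
   store zero and lose everything above the low byte.  */
size_t
toy_write_in_range (unsigned char *buf, uint32_t min, uint32_t max,
                    uint32_t val)
{
  uint32_t range = max - min;
  size_t n = 0;

  assert (val >= min && val <= max);
  val -= min;

  buf[n++] = val & 0xff;
  if (range >= 0xff)
    buf[n++] = (val >> 8) & 0xff;
  if (range >= 0xffff)
    buf[n++] = (val >> 16) & 0xff;
  if (range >= 0xffffff)
    buf[n++] = (val >> 24) & 0xff;

  return n;
}

/* Read back a value written by toy_write_in_range with the same MIN
   and MAX.  */
uint32_t
toy_read_in_range (const unsigned char *buf, uint32_t min, uint32_t max)
{
  uint32_t range = max - min;
  uint32_t val = buf[0];

  if (range >= 0xff)
    val |= (uint32_t) buf[1] << 8;
  if (range >= 0xffff)
    val |= (uint32_t) buf[2] << 16;
  if (range >= 0xffffff)
    val |= (uint32_t) buf[3] << 24;

  return val + min;
}
```

For the range [0, 1000], the value 700 takes two bytes, 0xbc followed by
0x02, and reads back as 700.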

OK for trunk?


Diego.

* lto-streamer-out.c (lto_output_ts_decl_with_vis_tree_pointers): Call
output_record_start with LTO_null instead of output_zero.
(lto_output_ts_binfo_tree_pointers): Likewise.
(lto_output_tree): Likewise.
(output_eh_try_list): Likewise.
(output_eh_region): Likewise.
(output_eh_lp): Likewise.
(output_eh_regions): Likewise.
(output_bb): Likewise.
(output_function): Likewise.
(output_unreferenced_globals): Likewise.
* lto-streamer.h (enum LTO_tags): Reserve MAX_TREE_CODES
instead of NUM_TREE_CODES.
(lto_tag_is_tree_code_p): Check max value against MAX_TREE_CODES.
(lto_output_int_in_range): Change << to >> when shifting VAL.

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index b3b81bd..3d42483 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -955,7 +950,7 @@ lto_output_ts_decl_with_vis_tree_pointers (struct 
output_block *ob, tree expr,
   if (DECL_ASSEMBLER_NAME_SET_P (expr))
 lto_output_tree_or_ref (ob, DECL_ASSEMBLER_NAME (expr), ref_p);
   else
-output_zero (ob);
+output_record_start (ob, LTO_null);
 
   lto_output_tree_or_ref (ob, DECL_SECTION_NAME (expr), ref_p);
   lto_output_tree_or_ref (ob, DECL_COMDAT_GROUP (expr), ref_p);
@@ -1136,7 +1131,7 @@ lto_output_ts_binfo_tree_pointers (struct output_block 
*ob, tree expr,
  is needed to build the empty BINFO node on the reader side.  */
   FOR_EACH_VEC_ELT (tree, BINFO_BASE_BINFOS (expr), i, t)
 lto_output_tree_or_ref (ob, t, ref_p);
-  output_zero (ob);
+  output_record_start (ob, LTO_null);
 
   lto_output_tree_or_ref (ob, BINFO_OFFSET (expr), ref_p);
   lto_output_tree_or_ref (ob, BINFO_VTABLE (expr), ref_p);
@@ -1430,7 +1425,7 @@ lto_output_tree (struct output_block *ob, tree expr, bool 
ref_p)
 
   if (expr == NULL_TREE)
 {
-  output_zero (ob);
+  output_record_start (ob, LTO_null);
   return;
 }
 
@@ -1486,7 +1481,7 @@ output_eh_try_list (struct output_block *ob, eh_catch 
first)
   lto_output_tree_ref (ob, n->label);
 }
 
-  output_zero (ob);
+  output_record_start (ob, LTO_null);
 }
 
 
@@ -1501,7 +1496,7 @@ output_eh_region (struct output_block *ob, eh_region r)
 
   if (r == NULL)
 {
-  output_zero (ob);
+  output_record_start (ob, LTO_null);
   return;
 }
 
@@ -1564,7 +1559,7 @@ output_eh_lp (struct output_block *ob, eh_landing_pad lp)
 {
   if (lp == NULL)
 {
-  output_zero (ob);
+  output_record_start (ob, LTO_null);
   return;
 }
 
@@ -1633,9 +1628,9 @@ output_eh_regions (struct output_block *ob, struct 
function *fn)
}
 }
 
-  /* The 0 either terminates the record or indicates that there are no
- eh_records at all.  */
-  output_zero (ob);
+  /* The LTO_null either terminates the record or indicates that there
+ are no eh_records at all.  */
+  output_record_start (ob, LTO_null);
 }
 
 
@@ -1880,10 +1875,10 @@ output_bb (struct output_block *ob, basic_block bb, 
struct function *fn)
  output_sleb128 (ob, region);
}
  else
-   output_zero (ob);
+   output_record_start (ob, LTO_null);
}
 
-  output_zero (ob);
+  output_record_start (ob, LTO_null);
 
   for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); gsi_next (&bsi))
{
@@ -1896,7 +1891,7 @@ output_bb (struct output_block *ob, basic_block bb, 
struct function *fn)
output_phi (ob, phi);
}
 
-  output_zero (ob);
+  output_record_start (ob, LTO_null);
 }
 }
 
@@ -2053,7 +2048,7 @@ output_function (struct cgraph_node *node)
 output_bb (ob, bb, fn);
 
   /* The terminator for this function.  */
-  output_zero (ob);
+  output_record_start (ob, LTO_null);
 
   output_cfg (ob, fn);
 
@@ -2167,7 +2162,7 @@ output_unreferenced_globals (cgraph_node_set set, 
varpool_node_set vset)
   }
   symbol_alias_set_destroy (defined);
 
-  output_zero (ob);
+  output_record_start (ob, LTO_null);
 
   produce_asm (ob, NULL);
   destroy_output_block (ob);
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index e8410d4..9de24ff 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -186,7 +186,7 @@ enum LTO_tags
 
  Conversely, to map between LTO tags and tree/gimple codes, the
  reverse operation must be applied.  */
-  LTO_bb0 = 1 + NUM_TREE_CODES + LAST_AND_UNUSED_GIMPLE_CODE,
+  LTO_bb0 = 1 + MAX_TREE_CODES + LAST_AND_UNUSED_GIMPLE_CODE,
   LTO_bb1,
 
   /* EH region holding the previous statement.  */
@@ -957,7 +957,7 @@ extern VEC(lto_out_decl_state_ptr, heap) 
*lto_function_decl_states;
 static inline bool
 lto_tag_is_tree_code_p (enum LTO_tags tag)
 {
-  return tag > LTO_null && (unsigned) tag <= N

Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc

2011-06-01 Thread Mike Stump
On Jun 1, 2011, at 10:51 AM, Rainer Orth wrote:
>>> I can certainly do it this way for now, but if we could do away with the
>>> tests completely, that would be cleaner.
>> 
>> Agreed, though, I don't believe the test is superfluous.
> 
> You still haven't answered my question wrt. Darwin 8 vs. 64-bit on
> PowerPC.  Perhaps we can do away with DARWIN_LIBSYSTEM_HAS_UNWIND
> completely?

To quote my previous email:

>> I don't believe the test is superfluous.

This means that I can't say for sure that it is unneeded.  There was 64-bit 
support on darwin 8 as I recall.


Re: -fdump-passes -fenable-xxx=func_name_list

2011-06-01 Thread Xinliang David Li
On Wed, Jun 1, 2011 at 12:29 PM, Richard Guenther
 wrote:
> On Wed, Jun 1, 2011 at 6:16 PM, Xinliang David Li  wrote:
>> On Wed, Jun 1, 2011 at 1:51 AM, Richard Guenther
>>  wrote:
>>> On Wed, Jun 1, 2011 at 1:34 AM, Xinliang David Li  
>>> wrote:
>>>> The following patch implements a new option that dumps gcc PASS
>>>> configuration. The sample output is attached.  There is one
>>>> limitation: some placeholder passes that are named with '*xxx' are
>>>> not registered thus they are not listed. They are not important as
>>>> they can not be turned on/off anyway.
>>>>
>>>> The patch also enhanced -fenable-xxx and -fdisable-xxx to allow a list
>>>> of function assembler names to be specified.
>>>>
>>>> Ok for trunk?
>>>
>>> Please split the patch.
>>>
>>> I'm not too happy how you dump the pass configuration.  Why not simply,
>>> at a _single_ place, walk the pass tree?  Instead of doing pieces of it
>>> at pass execution time when it's not already dumped - that really looks
>>> gross.
>>
>> Yes, that was the original plan -- but it has problems
>> 1) the dumper needs to know the root pass lists -- which can change
>> frequently -- it can be a long term maintenance burden;
>> 2) the centralized dumper needs to be done after option processing
>> 3) not sure if gate functions have any side effects or have dependencies on
>> cfun
>>
>> The proposed solution IMHO is not that intrusive -- just three hooks
>> to do the dumping and tracking indentation.
>
> Well, if you have a CU that is empty or optimized to nothing at some point
> you will not get a complete pass list.  I suppose optimize attributes might
> also confuse output.  Your solution might not be that intrusive
> but it is still ugly.  I don't see 1) as an issue, for 2) you can just call 
> the
> dumping from toplev_main before calling do_compile (), 3) gate functions
> shouldn't have side-effects, but as they could gate on optimize_for_speed ()
> your option summary output will be bogus anyway.
>
> So - what is the output intended for if it isn't reliable?

This needs to be cleaned up at some point -- the gate function should
behave the same for all functions and per-function decisions need to
be pushed down to the executor body.  I will try to rework the patch
as you suggested to see if there are problems.

David


>
> Richard.
>
>>>
>>> The documentation should also link this option to the -fenable/disable
>>> options as obviously the pass names in that dump are those to be
>>> used for those flags (and not readily available anywhere else).
>>
>> Ok.
>>
>>>
>>> I also think that it would be way more useful to note in the individual
>>> dump files the functions (at the place they would usually appear) that
>>> have the pass explicitly enabled/disabled.
>>
>> Ok -- for ipa passes or tree/rtl passes where all functions are
>> explicitly disabled.
>>
>> Thanks,
>>
>> David
>>
>>>
>>> Richard.
>>>
>>>> Thanks,
>>>>
>>>> David
>>>>
>>>
>>
>


Re: [patch, ARM] Fix PR48808, PR48792: More work on CANNOT_CHANGE_MODE_CLASS

2011-06-01 Thread Richard Sandiford
Eric Botcazou  writes:
>>  * reload.c (push_reload): Check contains_reg_of_mode.
>>  * reload1.c (strip_paradoxical_subreg): New function.
>>  (gen_reload_chain_without_interm_reg_p): Use it to handle
>>  paradoxical subregs.
>>  (emit_output_reload_insns, gen_reload): Likewise.
>
> Testing (not a full cycle, but still) revealed no problems on SPARC
> (32-bit and 64-bit) or IA-64.  The patch is OK as far as I'm
> concerned, but you might want to get a second opinion from a reload
> expert.

Thanks.  Rather than hand-picking an expert, I compromised and waited for
a couple of days to see if anyone had any comments or objections.

> Now I have a couple of requests:
>
>   1. Could you add PR rtl-optimization/48830 to the ChangeLog and install the 
> testcase (attached) distilled by Hans-Peter as gcc.target/sparc/ultrasp12.c?

OK, done.

>   2. Could you rename the first parameter of the new function?
> "focus" sounds a little strange to me and is unheard of in the GCC
> codebase.  Maybe "target"?

I went for "op".  "target" might have been misleading because the
parameter is sometimes the source (rather than destination) of the
reload.

Here's what I installed after retesting on x86_64-linux-gnu.
Thanks again for the review.

Richard


gcc/
PR rtl-optimization/48830
PR rtl-optimization/48808
PR rtl-optimization/48792
* reload.c (push_reload): Check contains_reg_of_mode.
* reload1.c (strip_paradoxical_subreg): New function.
(gen_reload_chain_without_interm_reg_p): Use it to handle
paradoxical subregs.
(emit_output_reload_insns, gen_reload): Likewise.

gcc/testsuite/
2011-06-01  Eric Botcazou  
Hans-Peter Nilsson  

PR rtl-optimization/48830
* gcc.target/sparc/ultrasp12.c: New test.

Index: gcc/reload.c
===
--- gcc/reload.c2011-05-30 17:26:36.0 +0100
+++ gcc/reload.c2011-06-01 18:45:48.0 +0100
@@ -1019,6 +1019,7 @@ push_reload (rtx in, rtx out, rtx *inloc
 #ifdef CANNOT_CHANGE_MODE_CLASS
   && !CANNOT_CHANGE_MODE_CLASS (GET_MODE (SUBREG_REG (in)), inmode, rclass)
 #endif
+  && contains_reg_of_mode[(int) rclass][(int) GET_MODE (SUBREG_REG (in))]
   && (CONSTANT_P (SUBREG_REG (in))
  || GET_CODE (SUBREG_REG (in)) == PLUS
  || strict_low
@@ -1125,6 +1126,7 @@ push_reload (rtx in, rtx out, rtx *inloc
 #ifdef CANNOT_CHANGE_MODE_CLASS
   && !CANNOT_CHANGE_MODE_CLASS (GET_MODE (SUBREG_REG (out)), outmode, 
rclass)
 #endif
+  && contains_reg_of_mode[(int) rclass][(int) GET_MODE (SUBREG_REG (out))]
   && (CONSTANT_P (SUBREG_REG (out))
  || strict_low
  || (((REG_P (SUBREG_REG (out))
Index: gcc/reload1.c
===
--- gcc/reload1.c   2011-05-30 17:26:36.0 +0100
+++ gcc/reload1.c   2011-06-01 18:50:01.0 +0100
@@ -4471,6 +4471,43 @@ scan_paradoxical_subregs (rtx x)
}
 }
 }
+
+/* *OP_PTR and *OTHER_PTR are two operands to a conceptual reload.
+   If *OP_PTR is a paradoxical subreg, try to remove that subreg
+   and apply the corresponding narrowing subreg to *OTHER_PTR.
+   Return true if the operands were changed, false otherwise.  */
+
+static bool
+strip_paradoxical_subreg (rtx *op_ptr, rtx *other_ptr)
+{
+  rtx op, inner, other, tem;
+
+  op = *op_ptr;
+  if (GET_CODE (op) != SUBREG)
+return false;
+
+  inner = SUBREG_REG (op);
+  if (GET_MODE_SIZE (GET_MODE (op)) <= GET_MODE_SIZE (GET_MODE (inner)))
+return false;
+
+  other = *other_ptr;
+  tem = gen_lowpart_common (GET_MODE (inner), other);
+  if (!tem)
+return false;
+
+  /* If the lowpart operation turned a hard register into a subreg,
+ rather than simplifying it to another hard register, then the
+ mode change cannot be properly represented.  For example, OTHER
+ might be valid in its current mode, but not in the new one.  */
+  if (GET_CODE (tem) == SUBREG
+  && REG_P (other)
+  && HARD_REGISTER_P (other))
+return false;
+
+  *op_ptr = inner;
+  *other_ptr = tem;
+  return true;
+}
 
 /* A subroutine of reload_as_needed.  If INSN has a REG_EH_REGION note,
examine all of the reload insns between PREV and NEXT exclusive, and
@@ -5538,7 +5575,7 @@ gen_reload_chain_without_interm_reg_p (i
  chain reloads or do need an intermediate hard registers.  */
   bool result = true;
   int regno, n, code;
-  rtx out, in, tem, insn;
+  rtx out, in, insn;
   rtx last = get_last_insn ();
 
   /* Make r2 a component of r1.  */
@@ -5557,11 +5594,7 @@ gen_reload_chain_without_interm_reg_p (i
 
   /* If IN is a paradoxical SUBREG, remove it and try to put the
  opposite SUBREG on OUT.  Likewise for a paradoxical SUBREG on OUT.  */
-  if (GET_CODE (in) == SUBREG
-  && (GET_MODE_SIZE (GET_MODE (in))
- > GET_MODE_SIZE (GET_MODE (SUBREG_REG (in
-  && (tem = gen_lowpar

Re: Ping^2: PR target/45074: Check targets of multi-word operations

2011-06-01 Thread Richard Sandiford
Bernd Schmidt  writes:
> On 05/31/2011 07:51 PM, Richard Sandiford wrote:
>> Ping for:
>> 
>> http://gcc.gnu.org/ml/gcc-patches/2011-04/msg01327.html
>> 
>> It fixes the expansion of multiword operations in cases where the
>> suggested target is a hard register and where CANNOT_CHANGE_MODE_CLASS
>> forbids word-mode subparts.
>
> Can you call the new function valid_multiword_target_p? In a sense, we
> already know it's a multiword target, so the function name is a bit
> unfortunate.

Yeah, that's better.

> I see two copies of this code
>
> /* If TARGET is the same as one of the operands, the REG_EQUAL note
>won't be accurate, so use a new target.  */
> -   if (target == 0 || target == op0 || target == op1)
>
> in expand_binop, and you seem to be changing only one? Also, there's
>
>   xtarget = gen_reg_rtx (mode);
>
>   if (target == 0 || !REG_P (target))
> target = xtarget;
>
>   /* Indicate for flow that the entire target reg is being set.  */
>   if (REG_P (target))
> emit_clobber (xtarget);

Good catch.

> Ok with these changes (or if there's a good reason not to touch the ones
> you left out).

Thanks.  Here's what I installed after retesting on x86_64-linux-gnu.

Richard


gcc/
PR target/45074
* optabs.h (valid_multiword_target_p): Declare.
* expmed.c (extract_bit_field_1): Check valid_multiword_target_p when
doing multi-word operations.
* optabs.c (expand_binop): Likewise.
(expand_doubleword_bswap): Likewise.
(expand_absneg_bit): Likewise.
(expand_unop): Likewise.
(expand_copysign_bit): Likewise.
(valid_multiword_target_p): New function.

gcc/testsuite/
PR target/45074
* gcc.target/mips/pr45074.c: New test.
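
For readers without the tree at hand, the test added at each call site has roughly the shape below.  This is a standalone toy model, not the installed code: toy_reg, TOY_UNITS_PER_WORD and the toy_* functions are made-up names, and the word_subregs_ok field merely stands in for whatever CANNOT_CHANGE_MODE_CLASS would say about a hard register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_UNITS_PER_WORD 4	/* Hypothetical word size in bytes.  */

/* Hypothetical stand-in for an rtx REG.  */
struct toy_reg
{
  int size;			/* Size of the register's mode in bytes.  */
  bool hard_p;			/* True for a hard register.  */
  bool word_subregs_ok;		/* False if word-mode subparts are forbidden.  */
};

/* Stand-in for a subreg validity check on the word at byte OFFSET.
   Pseudos are always fine in this model; hard registers only if
   their class allows word-mode subparts.  */
bool
toy_word_subreg_valid_p (const struct toy_reg *reg, int offset)
{
  assert (offset >= 0 && offset < reg->size);
  return !reg->hard_p || reg->word_subregs_ok;
}

/* Return true if every word of TARGET can be accessed separately.  */
bool
toy_valid_multiword_target_p (const struct toy_reg *target)
{
  for (int offset = 0; offset < target->size; offset += TOY_UNITS_PER_WORD)
    if (!toy_word_subreg_valid_p (target, offset))
      return false;
  return true;
}

/* Mirrors the shape of the patched conditions: true means the caller
   should fall back to a fresh pseudo instead of the suggested target.  */
bool
toy_need_new_target_p (const struct toy_reg *target)
{
  return target == NULL || !toy_valid_multiword_target_p (target);
}
```

In this model a double-word hard register whose class forbids word-mode subparts is rejected, so the expander would allocate a new pseudo and copy the result afterwards.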

Index: gcc/optabs.h
===
--- gcc/optabs.h	2011-06-01 18:53:46.0 +0100
+++ gcc/optabs.h	2011-06-01 18:54:09.0 +0100
@@ -1059,6 +1059,8 @@ create_integer_operand (struct expand_op
   create_expand_operand (op, EXPAND_INTEGER, GEN_INT (intval), VOIDmode, false);
 }
 
+extern bool valid_multiword_target_p (rtx);
+
 extern bool maybe_legitimize_operands (enum insn_code icode,
   unsigned int opno, unsigned int nops,
   struct expand_operand *ops);
Index: gcc/expmed.c
===
--- gcc/expmed.c	2011-06-01 18:53:46.0 +0100
+++ gcc/expmed.c	2011-06-01 18:54:09.0 +0100
@@ -1341,7 +1341,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
   unsigned int nwords = (bitsize + (BITS_PER_WORD - 1)) / BITS_PER_WORD;
   unsigned int i;
 
-  if (target == 0 || !REG_P (target))
+  if (target == 0 || !REG_P (target) || !valid_multiword_target_p (target))
target = gen_reg_rtx (mode);
 
   /* Indicate for flow that the entire target reg is being set.  */
Index: gcc/optabs.c
===
--- gcc/optabs.c	2011-06-01 18:53:46.0 +0100
+++ gcc/optabs.c	2011-06-01 18:57:38.0 +0100
@@ -1537,7 +1537,10 @@ expand_binop (enum machine_mode mode, op
 
   /* If TARGET is the same as one of the operands, the REG_EQUAL note
 won't be accurate, so use a new target.  */
-  if (target == 0 || target == op0 || target == op1)
+  if (target == 0
+ || target == op0
+ || target == op1
+ || !valid_multiword_target_p (target))
target = gen_reg_rtx (mode);
 
   start_sequence ();
@@ -1605,7 +1608,10 @@ expand_binop (enum machine_mode mode, op
 
  /* If TARGET is the same as one of the operands, the REG_EQUAL note
 won't be accurate, so use a new target.  */
- if (target == 0 || target == op0 || target == op1)
+ if (target == 0
+ || target == op0
+ || target == op1
+ || !valid_multiword_target_p (target))
target = gen_reg_rtx (mode);
 
  start_sequence ();
@@ -1659,7 +1665,11 @@ expand_binop (enum machine_mode mode, op
 opportunities, and second because if target and op0 happen to be MEMs
 designating the same location, we would risk clobbering it too early
 in the code sequence we generate below.  */
-  if (target == 0 || target == op0 || target == op1 || ! REG_P (target))
+  if (target == 0
+ || target == op0
+ || target == op1
+ || !REG_P (target)
+ || !valid_multiword_target_p (target))
target = gen_reg_rtx (mode);
 
   start_sequence ();
@@ -1779,7 +1789,7 @@ expand_binop (enum machine_mode mode, op
 
   xtarget = gen_reg_rtx (mode);
 
-  if (target == 0 || !REG_P (target))
+  if (target == 0 || !REG_P (target) || !valid_multiword_target_p (target))
target = xtarget;
 
   /* Indicate for flow t

Allow alternatives for attr "predicable"

2011-06-01 Thread Bernd Schmidt
Currently, the predicable attribute can only be either "yes" or "no" for
an instruction, without distinguishing between alternatives. This can be
a problem - on C6X, there are move and add instructions where exactly
one alternative isn't predicable. The comment in gensupport.c mentions
that FRV also has this problem, and I seem to recall someone mentioning
a similar situation on ARM.

This patch extends gensupport to modify the attribute vector of the
conditional variant of each instruction so that the "predicable"
attribute is renamed to "ce_enabled" (an internal attribute with
default "yes"). The default definition of attribute "enabled" is then
modified to also test whether "ce_enabled" evaluates to "yes".

Definitions of "enabled" in a conditionalized pattern are renamed to
"nonce_enabled", and "enabled" is defined as "nonce_enabled" &&
"ce_enabled". It's done this way as we can't easily rewrite set_attr
definitions in gensupport.c without moving over a whole lot of code from
genattrtab.
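
To make the intended semantics concrete, here is a small standalone model of how "enabled" would come out per alternative for the conditionalized pattern.  It is only a sketch under assumptions: the toy_* helpers are invented for illustration and do not correspond to anything in gensupport.c, and attribute values are modelled as plain comma-separated strings with one field per alternative.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if the ALT-th comma-separated field of VALUES is "yes".
   A single value with no commas applies to every alternative.  */
bool
toy_attr_yes_p (const char *values, int alt)
{
  const char *p = values;
  size_t len;

  assert (alt >= 0);
  if (strchr (values, ',') != NULL)
    for (int i = 0; i < alt; i++)
      {
	p = strchr (p, ',');
	assert (p != NULL);	/* Too few alternatives listed.  */
	p++;
      }
  len = strcspn (p, ",");
  return len == 3 && strncmp (p, "yes", 3) == 0;
}

/* Model of "enabled" for alternative ALT of the cond_exec variant:
   the old "predicable" value plays the role of "ce_enabled", and the
   port's own "enabled" definition plays the role of "nonce_enabled".  */
bool
toy_cond_exec_enabled_p (const char *predicable, const char *nonce_enabled,
			 int alt)
{
  bool ce_enabled = toy_attr_yes_p (predicable, alt);
  return toy_attr_yes_p (nonce_enabled, alt) && ce_enabled;
}
```

With predicable set to "yes,no,yes", only the second alternative drops out of the conditional variant; the unconditional pattern is unaffected.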

Tested with a bootstrap on i686-linux, and regression tests with a
suitably modified 4.5 c6x-elf toolchain (one multilib successful so far
for the latest version of the patch and some more for earlier versions).
I've tried a few variants of "enabled" and "predicable" definitions in
the c6x machine description and it appears to work as intended.


Bernd
* gensupport.c (add_define_attr): New static function.
(is_predicable): Allow multi-alternative lists for the "predicable"
attribute.
(modify_attr_enabled_ce, alter_attrs_for_insn): New static functions.
(process_one_cond_exec): Call alter_attrs_for_insn.
* doc/md.texi (Defining Attributes): Mention some standard names.
(Conditional Execution): Update documentation for "predicable".

Index: gcc/gensupport.c
===
--- gcc/gensupport.c	(revision 174430)
+++ gcc/gensupport.c	(working copy)
@@ -368,6 +368,25 @@ queue_pattern (rtx pattern, struct queue
   return e;
 }
 
+/* Build a define_attr for a binary attribute with name NAME and
+   possible values "yes" and "no", and queue it.  */
+static void
+add_define_attr (const char *name)
+{
+  struct queue_elem *e = XNEW(struct queue_elem);
+  rtx t1 = rtx_alloc (DEFINE_ATTR);
+  XSTR (t1, 0) = name;
+  XSTR (t1, 1) = "no,yes";
+  XEXP (t1, 2) = rtx_alloc (CONST_STRING);
+  XSTR (XEXP (t1, 2), 0) = "yes";
+  e->data = t1;
+  e->filename = "built-in";
+  e->lineno = -1;
+  e->next = define_attr_queue;
+  define_attr_queue = e;
+
+}
+
 /* Recursively remove constraints from an rtx.  */
 
 static void
@@ -547,17 +566,10 @@ is_predicable (struct queue_elem *elem)
   return predicable_default;
 
  found:
-  /* Verify that predicability does not vary on the alternative.  */
-  /* ??? It should be possible to handle this by simply eliminating
- the non-predicable alternatives from the insn.  FRV would like
- to do this.  Delay this until we've got the basics solid.  */
+  /* Find out which value we're looking at.  Multiple alternatives means at
+ least one is predicable.  */
   if (strchr (value, ',') != NULL)
-{
-  error_with_line (elem->lineno, "multiple alternatives for `predicable'");
-  return 0;
-}
-
-  /* Find out which value we're looking at.  */
+return 1;
   if (strcmp (value, predicable_true) == 0)
 return 1;
   if (strcmp (value, predicable_false) == 0)
@@ -798,6 +810,146 @@ alter_test_for_insn (struct queue_elem *
XSTR (insn_elem->data, 2));
 }
 
+/* Modify VAL, which is an attribute expression for the "enabled" attribute,
+   to take "ce_enabled" into account.  Return the new expression.  */
+static rtx
+modify_attr_enabled_ce (rtx val)
+{
+  rtx eq_attr, str;
+  rtx ite;
+  eq_attr = rtx_alloc (EQ_ATTR);
+  ite = rtx_alloc (IF_THEN_ELSE);
+  str = rtx_alloc (CONST_STRING);
+
+  XSTR (eq_attr, 0) = "ce_enabled";
+  XSTR (eq_attr, 1) = "yes";
+  XSTR (str, 0) = "no";
+  XEXP (ite, 0) = eq_attr;
+  XEXP (ite, 1) = val;
+  XEXP (ite, 2) = str;
+
+  return ite;
+}
+
+/* Alter the attribute vector of INSN, which is a COND_EXEC variant created
+   from a define_insn pattern.  We must modify the "predicable" attribute
+   to be named "ce_enabled", and also change any "enabled" attribute that's
+   present so that it takes ce_enabled into account.
+   We rely on the fact that INSN was created with copy_rtx, and modify data
+   in-place.  */
+
+static void
+alter_attrs_for_insn (rtx insn)
+{
+  static bool global_changes_made = false;
+  rtvec vec = XVEC (insn, 4);
+  rtvec new_vec;
+  rtx val, set;
+  int num_elem;
+  int predicable_idx = -1;
+  int enabled_idx = -1;
+  int i;
+
+  if (! vec)
+return;
+
+  num_elem = GET_NUM_ELEM (vec);
+  for (i = num_elem - 1; i >= 0; --i)
+{
+  rtx sub = RTVEC_ELT (vec, i);
+  switch (GET_CODE (sub))
+   {
+   case SET_ATTR:
+ if (strcmp (XSTR (sub, 0), "predicable") == 0)
+

Re: [build] Move MD_UNWIND_SUPPORT to toplevel libgcc

2011-06-01 Thread IainS


On 1 Jun 2011, at 20:40, Mike Stump wrote:

> On Jun 1, 2011, at 10:51 AM, Rainer Orth wrote:
>> I can certainly do it this way for now, but if we could do away with the
>> tests completely, that would be cleaner.
>
> Agreed, though, I don't believe the test is superfluous.
>
>> You still haven't answered my question wrt. Darwin 8 vs. 64-bit on
>> PowerPC.  Perhaps we can do away with DARWIN_LIBSYSTEM_HAS_UNWIND
>> completely?
>
> To quote my previous email:
>
>>> I don't believe the test is superfluous.
>
> This means that I can't say for sure that it is unneeded.  There was
> 64-bit support on darwin 8 as I recall.

It was working last time I checked (around 3-ish months ago)...



[lto] Remove unnecessary assertion for DECL_SAVED_TREE (issue4530094)

2011-06-01 Thread Diego Novillo

This patchlet removes the assertion that DECL_SAVED_TREE should
be NULL.

As we discussed in http://gcc.gnu.org/ml/gcc-patches/2011-06/msg4.html,
it is no longer necessary.

Committed to trunk.


Diego.


* lto-streamer-out.c (lto_output_ts_decl_non_common_tree_pointers):
Remove assertion for DECL_SAVED_TREE in FUNCTION_DECL nodes.

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 7f3217b..3d42483 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -931,11 +931,6 @@ lto_output_ts_decl_non_common_tree_pointers (struct output_block *ob,
 {
   if (TREE_CODE (expr) == FUNCTION_DECL)
 {
-  /* DECL_SAVED_TREE holds the GENERIC representation for DECL.
-At this point, it should not exist.  Either because it was
-converted to gimple or because DECL didn't have a GENERIC
-representation in this TU.  */
-  gcc_assert (DECL_SAVED_TREE (expr) == NULL_TREE);
   lto_output_tree_or_ref (ob, DECL_ARGUMENTS (expr), ref_p);
   lto_output_tree_or_ref (ob, DECL_RESULT (expr), ref_p);
 }

--
This patch is available for review at http://codereview.appspot.com/4530094


Re: introduce --param max-vartrack-expr-depth

2011-06-01 Thread Alexandre Oliva
On May 31, 2011, Alexandre Oliva  wrote:

> On May 30, 2011, Bernd Schmidt  wrote:
>> On 05/30/2011 12:35 PM, Alexandre Oliva wrote:
>>> One of my patches for PR 48866 regressed guality/asm-1.c on
>>> x86_64-linux-gnu because what used to be a single complex debug value
>>> expression became a chain of debug temps holding simpler expressions,
>>> and this chain exceeded the default recursion depth in resolving
>>> location expressions.

>> What's the worst that can happen if you remove the limit altogether?

> Exponential behavior comes to mind.

It's unusual, but debug/pr41264-1.c exhibits it, given INT_MAX for the
param, even though under such a (lack of) limit bootstrap doesn't go
slower or faster, after restoring depth 5 for the reverse_op() use.  As
Jakub pointed out, that one probably shouldn't be affected by the
parameter, as depth 5 is exactly what we want for the kind of expression
we're looking for.  With unlimited depth for that one, not even
libiberty/md5.c compiles successfully, exhausting memory on a box with
some 40GB of total VM (8+32).

So I guess I'll stick with what I checked in, but keep a patch handy to
bump the limit a little bit up and revert to 5 in reverse_op.
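
To illustrate why an unlimited depth can blow up, here is a toy count of expansion steps.  It assumes, purely for illustration, that every debug temp refers to the previous temp twice; I am not claiming this is the exact shape of pr41264-1.c or md5.c, and toy_expand_visits is a made-up name, not a var-tracking function.

```c
#include <assert.h>

/* Count the nodes visited when expanding temp number LEVEL, where each
   temp refers to the previous one twice and temp 0 is a leaf.  DEPTH is
   the current recursion depth; expansion stops once it reaches
   MAX_DEPTH, mimicking a depth parameter.  */
unsigned long
toy_expand_visits (int level, int depth, int max_depth)
{
  assert (level >= 0 && depth >= 0);
  if (level == 0 || depth >= max_depth)
    return 1;
  return 1 + 2 * toy_expand_visits (level - 1, depth + 1, max_depth);
}
```

In this model a chain of 20 temps costs 63 visits with a limit of 5 but 2097151 with an effectively unlimited depth, and the count doubles with every temp added to the chain.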

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [PR debug/47590] rework md option overriding to delay var-tracking

2011-06-01 Thread Alexandre Oliva
On May  4, 2011, Bernd Schmidt  wrote:

> This comment looks very weird when added to ia64_option_override
> (likewise for other targets). Is there a reason it's not true anymore?

Dunno, but the patch definitely didn't work any more when I retested it.
Maybe it didn't work when I first tested it, I don't recall having
actually looked at debug dumps then, but it was a while ago.

Here's an alternate approach that works now.  Regstrapped on
32- and 64-bit x86-linux-gnu, and cross-built on x86_64-linux-gnu to the
4 affected platforms, checking manually the proper presence of debug
insns and location notes at the expected dump files.

Ok to install?

for  gcc/ChangeLog
from  Alexandre Oliva  

	PR debug/47590
	* config/bfin/bfin.c (output_file_start): Move flag_var_tracking
	overriding...
	(bfin_option_override): ... here.
	* config/ia64/ia64.c (ia64_file_start): Likewise...
	(ia64_option_override): ... ditto.
	* config/spu/spu.c (asm_file_start): Likewise...
	(spu_option_override): ... ditto.
	* config/picochip/picochip.c (picochip_asm_file_start): Likewise...
	(picochip_option_override): ... ditto.  Split previous code into...
	(picochip_override_options_after_change): ... this new function.
	(TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): Use the latter.

Index: gcc/config/bfin/bfin.c
===
--- gcc/config/bfin/bfin.c.orig	2011-05-31 12:51:54.361391409 -0300
+++ gcc/config/bfin/bfin.c	2011-05-31 17:57:54.264722118 -0300
@@ -86,14 +86,6 @@ const char *byte_reg_names[]   =  BYTE_R
 static int arg_regs[] = FUNCTION_ARG_REGISTERS;
 static int ret_regs[] = FUNCTION_RETURN_REGISTERS;
 
-/* Nonzero if -fschedule-insns2 was given.  We override it and
-   call the scheduler ourselves during reorg.  */
-static int bfin_flag_schedule_insns2;
-
-/* Determines whether we run variable tracking in machine dependent
-   reorganization.  */
-static int bfin_flag_var_tracking;
-
 struct bfin_cpu
 {
   const char *name;
@@ -375,13 +367,6 @@ output_file_start (void) 
   FILE *file = asm_out_file;
   int i;
 
-  /* Variable tracking should be run after all optimizations which change order
- of insns.  It also needs a valid CFG.  This can't be done in
- bfin_option_override, because flag_var_tracking is finalized after
- that.  */
-  bfin_flag_var_tracking = flag_var_tracking;
-  flag_var_tracking = 0;
-
   fprintf (file, ".file \"%s\";\n", input_filename);
   
   for (i = 0; arg_regs[i] >= 0; i++)
@@ -2772,11 +2757,6 @@ bfin_option_override (void)
 
   flag_schedule_insns = 0;
 
-  /* Passes after sched2 can break the helpful TImode annotations that
- haifa-sched puts on every insn.  Just do scheduling in reorg.  */
-  bfin_flag_schedule_insns2 = flag_schedule_insns_after_reload;
-  flag_schedule_insns_after_reload = 0;
-
   init_machine_status = bfin_init_machine_status;
 }
 
@@ -5550,7 +5530,7 @@ bfin_reorg (void)
  with old MDEP_REORGS that are not CFG based.  Recompute it now.  */
   compute_bb_for_insn ();
 
-  if (bfin_flag_schedule_insns2)
+  if (flag_schedule_insns_after_reload)
 {
   splitting_for_sched = 1;
   split_all_insns ();
@@ -5579,7 +5559,7 @@ bfin_reorg (void)
 
   workaround_speculation ();
 
-  if (bfin_flag_var_tracking)
+  if (flag_var_tracking)
 {
   timevar_push (TV_VAR_TRACKING);
   variable_tracking_main ();
@@ -6765,4 +6745,14 @@ bfin_conditional_register_usage (void)
 #undef TARGET_EXTRA_LIVE_ON_ENTRY
 #define TARGET_EXTRA_LIVE_ON_ENTRY bfin_extra_live_on_entry
 
+/* Passes after sched2 can break the helpful TImode annotations that
+   haifa-sched puts on every insn.  Just do scheduling in reorg.  */
+#undef TARGET_DELAY_SCHED2
+#define TARGET_DELAY_SCHED2 true
+
+/* Variable tracking should be run after all optimizations which
+   change order of insns.  It also needs a valid CFG.  */
+#undef TARGET_DELAY_VARTRACK
+#define TARGET_DELAY_VARTRACK true
+
 struct gcc_target targetm = TARGET_INITIALIZER;
Index: gcc/config/ia64/ia64.c
===
--- gcc/config/ia64/ia64.c.orig	2011-05-31 12:51:54.370391376 -0300
+++ gcc/config/ia64/ia64.c	2011-05-31 17:55:56.30254 -0300
@@ -103,14 +103,6 @@ static const char * const ia64_local_reg
 static const char * const ia64_output_reg_names[8] =
 { "out0", "out1", "out2", "out3", "out4", "out5", "out6", "out7" };
 
-/* Determines whether we run our final scheduling pass or not.  We always
-   avoid the normal second scheduling pass.  */
-static int ia64_flag_schedule_insns2;
-
-/* Determines whether we run variable tracking in machine dependent
-   reorganization.  */
-static int ia64_flag_var_tracking;
-
 /* Variables which are this size or smaller are put in the sdata/sbss
sections.  */
 
@@ -640,6 +632,14 @@ static const struct default_options ia64
 #undef TARGET_PREFERRED_RELOAD_CLASS
 #define TARGET_PREFERRED_RELOAD_CLASS ia64_preferred_reload_class
 
+#undef TARGET_DELAY_SCHED2
+#define TARGET_DELAY

Re: PR 49145: Another (zero_extend (const_int ...)) in combine

2011-06-01 Thread Eric Botcazou
> I've included the subreg handling as well as the zero_extend handling,
> even though make_compound_operation already handles that case correctly.
> However, I've cowardly not removed the known_cond subreg handling:
>
>   else if (code == SUBREG)
> {
>   enum machine_mode inner_mode = GET_MODE (SUBREG_REG (x));
>   rtx new_rtx, r = known_cond (SUBREG_REG (x), cond, reg, val);
>
>   if (SUBREG_REG (x) != r)
>   {
> /* We must simplify subreg here, before we lose track of the
>original inner_mode.  */
> new_rtx = simplify_subreg (GET_MODE (x), r,
>inner_mode, SUBREG_BYTE (x));
> if (new_rtx)
>   return new_rtx;
> else
>   SUBST (SUBREG_REG (x), r);
>   }
>
>   return x;
> }
>
> The new function would also handle the case described in the comment.
> However, I was afraid that we might rely on simplify_subreg being
> used for all simplified operands here, while also relying on it only
> being used for constants in subst.  (This comes from a general
> fear that combine is special in the way that it handles compound
> operations.  Calling the generic simplification routines too often
> could cause us to turn combine's preferred representation into
> the representation normally used elsewhere.)

Frankly, I'm not convinced that the benefits are worth the potential hassle in 
this case.  As you mentioned, the 3 functions already have their own handling 
for SUBREGs, slightly different from each other, so factoring it into a common 
form isn't trivial; now, if you do it and don't defer to the new function for 
the entire handling after that, you don't really simplify the code.

So we're essentially (see below) left with the ZERO_EXTEND case and I'd add the 
handful of missing lines to make_compound_operation and be done with it.

> The known_cond code has the comment:
>
>   /* We don't have to handle SIGN_EXTEND here, because even in the
>  case of replacing something with a modeless CONST_INT, a
>  CONST_INT is already (supposed to be) a valid sign extension for
>  its narrower mode, which implies it's already properly
>  sign-extended for the wider mode.  Now, for ZERO_EXTEND, the
>  story is different.  */
>
> While that is true, I don't see any point in going out of our way
> _not_ to handle sign_extend in the same way.  Or indeed all unary
> operators.  Testing UNARY_P seems justified on the basis that
> simplify_unary_operation requires the mode of the inner operand,
> and that losing that mode without giving simplify_unary_operation
> a chance could therefore lead to wrong results.  I'd rather not
> see us hard-code the knowledge of which operators actually care.

SUBREG and ZERO_EXTEND of CONST_INTs are treated somewhat specially in the 
entire file, see for example do_SUBST.  This isn't the case for other unary 
operators, presumably because this isn't really necessary here.  So I'm not 
convinced that such a generalization is really a good thing in this case.
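
For what it's worth, the asymmetry described in the quoted known_cond comment can be checked with plain integer arithmetic.  The sketch below is a standalone model, not combine code: the toy_* functions are invented, and 64-bit integers stand in for the modeless CONST_INT value.

```c
#include <assert.h>
#include <stdint.h>

/* Truncate VALUE to INNER_BITS and sign-extend the result, which is
   how a constant that is valid for that narrow mode is stored.  */
int64_t
toy_canonical_const (int64_t value, int inner_bits)
{
  assert (inner_bits > 0 && inner_bits < 64);
  uint64_t mask = (UINT64_C (1) << inner_bits) - 1;
  uint64_t low = (uint64_t) value & mask;
  uint64_t sign = UINT64_C (1) << (inner_bits - 1);
  return (int64_t) (low ^ sign) - (int64_t) sign;
}

/* Zero-extend VALUE from INNER_BITS.  This needs INNER_BITS, i.e. the
   mode that a modeless constant no longer carries.  */
int64_t
toy_zero_extend (int64_t value, int inner_bits)
{
  assert (inner_bits > 0 && inner_bits < 64);
  uint64_t mask = (UINT64_C (1) << inner_bits) - 1;
  return (int64_t) ((uint64_t) value & mask);
}
```

Taking an 8-bit all-ones constant, stored as -1: sign extension leaves it at -1 whatever the wider mode, whereas zero extension gives 255 from 8 bits and 65535 from 16 bits, so the inner mode has to be known at the point of simplification.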

-- 
Eric Botcazou


Re: [PR debug/47590] rework md option overriding to delay var-tracking

2011-06-01 Thread Bernd Schmidt
On 06/01/2011 10:10 PM, Alexandre Oliva wrote:
> On May  4, 2011, Bernd Schmidt  wrote:
> 
>> This comment looks very weird when added to ia64_option_override
>> (likewise for other targets). Is there a reason it's not true anymore?
> 
> Dunno, but the patch definitely didn't work any more when I retested it.
> Maybe it didn't work when I first tested it, I don't recall having
> actually looked at debug dumps then, but it was a while ago.
> 
> Here's an alternate approach that works now.  Regstrapped on
> 32- and 64-bit x86-linux-gnu, and cross-built on x86_64-linux-gnu to the
> 4 affected platforms, checking manually the proper presence of debug
> insns and location notes at the expected dump files.
> 
> Ok to install?

Looks ok, except I think you need to update tm.texi.in and tm.texi?


Bernd


Dump before flag

2011-06-01 Thread Xinliang David Li
Hi, this is a simple patch that supports a dump_before flag. E.g.,

-fdump-tree-pre-before

This is useful for diffing the IR before and after a pass.

GCC dumping needs more cleanups -- such as allowing IR-only dumps,
allowing IR dumping for a particular function etc. The exposure of
'dumpfile' (instead of a dumping_level () function) makes those changes
a little messy, but they can be done.

Ok for trunk?

Thanks,

David


Re: [patch, ARM] Fix PR48808, PR48792: More work on CANNOT_CHANGE_MODE_CLASS

2011-06-01 Thread Eric Botcazou
> I went for "op".  "target" might have been misleading because the
> parameter is sometimes the source (rather than destination) of the
> reload.

Thanks.

> Here's what I installed after retesting on x86_64-linux-gnu.
> Thanks again for the review.

FWIW I also bootstrapped/regtested the first version on sparc-sun-solaris2.10 
in the meantime.

-- 
Eric Botcazou


Re: [lto] Fix streaming of multi-byte enums (issue4526099)

2011-06-01 Thread Richard Guenther
On Wed, 1 Jun 2011, Diego Novillo wrote:

> 
> This patch (split out of
> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg4.html), fixes the
> streaming of enum values when they are larger than a single byte.
> 
> Tested with LTO profiledbootstrap on x86_64.
> 
> OK for trunk?

Ok.

Thanks,
Richard.

> 
> Diego.
> 
>   * lto-streamer-out.c (lto_output_ts_decl_with_vis_tree_pointers): Call
>   output_record_start with LTO_null instead of output_zero.
>   (lto_output_ts_binfo_tree_pointers): Likewise.
>   (lto_output_tree): Likewise.
>   (output_eh_try_list): Likewise.
>   (output_eh_region): Likewise.
>   (output_eh_lp): Likewise.
>   (output_eh_regions): Likewise.
>   (output_bb): Likewise.
>   (output_function): Likewise.
>   (output_unreferenced_globals): Likewise.
>   * lto-streamer.h (enum LTO_tags): Reserve MAX_TREE_CODES
>   instead of NUM_TREE_CODES.
>   (lto_tag_is_tree_code_p): Check max value against MAX_TREE_CODES.
>   (lto_output_int_in_range): Change << to >> when shifting VAL.
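
The last ChangeLog entry is the actual multi-byte fix: the higher bytes of VAL have to be obtained by shifting right, not left.  Below is a self-contained sketch of a range-based encoder and its decoder.  The byte thresholds and the least-significant-byte-first order are my assumptions for illustration, and the toy_* names are not the lto-streamer.h functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write VAL, known to lie in [MIN, MAX], to BUF using only as many
   bytes as the range requires, least significant byte first.
   Return the number of bytes written (at most 4).  */
size_t
toy_output_int_in_range (unsigned char *buf, int64_t min, int64_t max,
			 int64_t val)
{
  int64_t range = max - min;
  size_t n = 0;

  assert (val >= min && val <= max && range > 0 && range < 0x7fffffff);
  val -= min;
  buf[n++] = val & 0xff;
  if (range >= 0xff)
    buf[n++] = (val >> 8) & 0xff;	/* Shift right to reach byte 1.  */
  if (range >= 0xffff)
    buf[n++] = (val >> 16) & 0xff;
  if (range >= 0xffffff)
    buf[n++] = (val >> 24) & 0xff;
  return n;
}

/* Read back a value written by toy_output_int_in_range with the same
   MIN and MAX.  */
int64_t
toy_input_int_in_range (const unsigned char *buf, int64_t min, int64_t max)
{
  int64_t range = max - min;
  int64_t val;
  size_t n = 0;

  assert (range > 0 && range < 0x7fffffff);
  val = buf[n++];
  if (range >= 0xff)
    val |= (int64_t) buf[n++] << 8;
  if (range >= 0xffff)
    val |= (int64_t) buf[n++] << 16;
  if (range >= 0xffffff)
    val |= (int64_t) buf[n++] << 24;
  return val + min;
}
```

With a left shift on the writer side, the second byte of any value below 2^24 would always be zero after masking, which is consistent with the bug only showing up once an enum needs more than one byte.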
> 
> diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
> index b3b81bd..3d42483 100644
> --- a/gcc/lto-streamer-out.c
> +++ b/gcc/lto-streamer-out.c
> @@ -955,7 +950,7 @@ lto_output_ts_decl_with_vis_tree_pointers (struct output_block *ob, tree expr,
>if (DECL_ASSEMBLER_NAME_SET_P (expr))
>  lto_output_tree_or_ref (ob, DECL_ASSEMBLER_NAME (expr), ref_p);
>else
> -output_zero (ob);
> +output_record_start (ob, LTO_null);
>  
>lto_output_tree_or_ref (ob, DECL_SECTION_NAME (expr), ref_p);
>lto_output_tree_or_ref (ob, DECL_COMDAT_GROUP (expr), ref_p);
> @@ -1136,7 +1131,7 @@ lto_output_ts_binfo_tree_pointers (struct output_block *ob, tree expr,
>   is needed to build the empty BINFO node on the reader side.  */
>FOR_EACH_VEC_ELT (tree, BINFO_BASE_BINFOS (expr), i, t)
>  lto_output_tree_or_ref (ob, t, ref_p);
> -  output_zero (ob);
> +  output_record_start (ob, LTO_null);
>  
>lto_output_tree_or_ref (ob, BINFO_OFFSET (expr), ref_p);
>lto_output_tree_or_ref (ob, BINFO_VTABLE (expr), ref_p);
> @@ -1430,7 +1425,7 @@ lto_output_tree (struct output_block *ob, tree expr, bool ref_p)
>  
>if (expr == NULL_TREE)
>  {
> -  output_zero (ob);
> +  output_record_start (ob, LTO_null);
>return;
>  }
>  
> @@ -1486,7 +1481,7 @@ output_eh_try_list (struct output_block *ob, eh_catch first)
>lto_output_tree_ref (ob, n->label);
>  }
>  
> -  output_zero (ob);
> +  output_record_start (ob, LTO_null);
>  }
>  
>  
> @@ -1501,7 +1496,7 @@ output_eh_region (struct output_block *ob, eh_region r)
>  
>if (r == NULL)
>  {
> -  output_zero (ob);
> +  output_record_start (ob, LTO_null);
>return;
>  }
>  
> @@ -1564,7 +1559,7 @@ output_eh_lp (struct output_block *ob, eh_landing_pad lp)
>  {
>if (lp == NULL)
>  {
> -  output_zero (ob);
> +  output_record_start (ob, LTO_null);
>return;
>  }
>  
> @@ -1633,9 +1628,9 @@ output_eh_regions (struct output_block *ob, struct function *fn)
>   }
>  }
>  
> -  /* The 0 either terminates the record or indicates that there are no
> - eh_records at all.  */
> -  output_zero (ob);
> +  /* The LTO_null either terminates the record or indicates that there
> + are no eh_records at all.  */
> +  output_record_start (ob, LTO_null);
>  }
>  
>  
> @@ -1880,10 +1875,10 @@ output_bb (struct output_block *ob, basic_block bb, struct function *fn)
> output_sleb128 (ob, region);
>   }
> else
> - output_zero (ob);
> + output_record_start (ob, LTO_null);
>   }
>  
> -  output_zero (ob);
> +  output_record_start (ob, LTO_null);
>  
>for (bsi = gsi_start_phis (bb); !gsi_end_p (bsi); gsi_next (&bsi))
>   {
> @@ -1896,7 +1891,7 @@ output_bb (struct output_block *ob, basic_block bb, struct function *fn)
>   output_phi (ob, phi);
>   }
>  
> -  output_zero (ob);
> +  output_record_start (ob, LTO_null);
>  }
>  }
>  
> @@ -2053,7 +2048,7 @@ output_function (struct cgraph_node *node)
>  output_bb (ob, bb, fn);
>  
>/* The terminator for this function.  */
> -  output_zero (ob);
> +  output_record_start (ob, LTO_null);
>  
>output_cfg (ob, fn);
>  
> @@ -2167,7 +2162,7 @@ output_unreferenced_globals (cgraph_node_set set, varpool_node_set vset)
>}
>symbol_alias_set_destroy (defined);
>  
> -  output_zero (ob);
> +  output_record_start (ob, LTO_null);
>  
>produce_asm (ob, NULL);
>destroy_output_block (ob);
> diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
> index e8410d4..9de24ff 100644
> --- a/gcc/lto-streamer.h
> +++ b/gcc/lto-streamer.h
> @@ -186,7 +186,7 @@ enum LTO_tags
>  
>   Conversely, to map between LTO tags and tree/gimple codes, the
>   reverse operation must be applied.  */
> -  LTO_bb0 = 1 + NUM_TREE_CODES + LAST_AND_UNUSED_GIMPLE_CODE,
> + 

Re: Dump before flag

2011-06-01 Thread Richard Guenther
On Wed, Jun 1, 2011 at 10:26 PM, Xinliang David Li  wrote:
> Hi, this is a simple patch that supports a dump_before flag. E.g.,
>
> -fdump-tree-pre-before
>
> This is useful for diffing the IR before and after a pass.
>
> GCC dumping needs more cleanups -- such as allowing IR-only dumps,
> allowing IR dumping for a particular function etc. The exposure of
> 'dumpfile' (instead of a dumping_level () function) makes those changes
> a little messy, but they can be done.
>
> Ok for trunk?

-ENOPATCH

> Thanks,
>
> David
>


Re: Dump before flag

2011-06-01 Thread Xinliang David Li
Sorry about it. Here it is.

David


On Wed, Jun 1, 2011 at 1:36 PM, Richard Guenther wrote:
> On Wed, Jun 1, 2011 at 10:26 PM, Xinliang David Li  wrote:
>> Hi, this is a simple patch that supports a dump_before flag. E.g.,
>>
>> -fdump-tree-pre-before
>>
>> This is useful for diffing the IR before and after a pass.
>>
>> GCC dumping needs more cleanups -- such as allowing IR-only dumps,
>> allowing IR dumping for a particular function etc. The exposure of
>> 'dumpfile' (instead of a dumping_level () function) makes those changes
>> a little messy, but they can be done.
>>
>> Ok for trunk?
>
> -ENOPATCH
>
>> Thanks,
>>
>> David
>>
>
2011-06-01  David Li  

	* tree-dump.c: New dump flags.
	* tree-pass.h: New dump flags.
	* passes.c (execute_one_pass): Handle dump_before flag.

Index: tree-dump.c
===
--- tree-dump.c	(revision 174446)
+++ tree-dump.c	(working copy)
@@ -808,6 +808,7 @@ struct dump_option_value_info
in tree.h */
 static const struct dump_option_value_info dump_options[] =
 {
+  {"before",  TDF_BEFORE},
   {"address", TDF_ADDRESS},
   {"asmname", TDF_ASMNAME},
   {"slim", TDF_SLIM},
Index: tree-pass.h
===
--- tree-pass.h	(revision 174446)
+++ tree-pass.h	(working copy)
@@ -83,6 +83,7 @@ enum tree_dump_index
 #define TDF_ALIAS	(1 << 21)	/* display alias information  */
 #define TDF_ENUMERATE_LOCALS (1 << 22)	/* Enumerate locals by uid.  */
 #define TDF_CSELIB	(1 << 23)	/* Dump cselib details.  */
+#define TDF_BEFORE	(1 << 24)	/* Dump IR before pass.  */
 
 
 /* In tree-dump.c */
Index: passes.c
===
--- passes.c	(revision 174446)
+++ passes.c	(working copy)
@@ -1563,6 +1563,13 @@ execute_one_pass (struct opt_pass *pass)
 
   initializing_dump = pass_init_dump_file (pass);
 
+  /* Override dump TODOs.  */
+  if (dump_file && (pass->todo_flags_finish & TODO_dump_func)
+  && (dump_flags & TDF_BEFORE))
+{
+  pass->todo_flags_finish &= ~TODO_dump_func;
+  pass->todo_flags_start |= TODO_dump_func;
+}
   /* Run pre-pass verification.  */
   execute_todo (pass->todo_flags_start);
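
The override in the passes.c hunk can be exercised in isolation.  What follows is a minimal sketch, assuming made-up flag values and a cut-down pass structure; toy_pass, TOY_TODO_DUMP_FUNC and toy_apply_dump_before are hypothetical names, and only the condition and the two flag updates are taken from the patch.

```c
#include <assert.h>

#define TOY_TODO_DUMP_FUNC (1u << 0)	/* Hypothetical bit value.  */
#define TOY_TDF_BEFORE	   (1u << 24)	/* Bit used by the patch for TDF_BEFORE.  */

/* Cut-down stand-in for struct opt_pass.  */
struct toy_pass
{
  unsigned todo_flags_start;
  unsigned todo_flags_finish;
};

/* If a dump file is open and the pass would dump the function when it
   finishes, move that dump to the start of the pass instead.  */
void
toy_apply_dump_before (struct toy_pass *pass, int dump_file_open,
		       unsigned dump_flags)
{
  assert (pass != 0);
  if (dump_file_open
      && (pass->todo_flags_finish & TOY_TODO_DUMP_FUNC)
      && (dump_flags & TOY_TDF_BEFORE))
    {
      pass->todo_flags_finish &= ~TOY_TODO_DUMP_FUNC;
      pass->todo_flags_start |= TOY_TODO_DUMP_FUNC;
    }
}
```

One property worth noting from the model: the swap modifies the pass structure itself, so once applied it stays in effect and a second call is a no-op, because the finish flag is already cleared.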
 


[lra] a patch to build ARM

2011-06-01 Thread Vladimir Makarov

Here is the patch to build arm-elf target with simulator.
It has been committed to the branch.

2011-06-01  Vladimir Makarov 

* lra-eliminations.c (lra_eliminate_reg_if_possible): Fix a typo.
(process_insn_for_elimination): Invalidate insn data if the insn
code was changed.

* lra-constraints.c (check_and_process_move): Set up temporarily
reg_renumber for secondary_reload hook.
(process_addr_reg): Use class of elimination.
(curr_insn_transform): Remove subreg before address processing.
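
The typo in lra_eliminate_reg_if_possible is easy to miss in the diff, so here is a toy reproduction of its effect.  Everything in it is a stand-in: toy_rtx, TOY_REG_P, TOY_REGNO and TOY_FIRST_PSEUDO_REGISTER only imitate the real macros, and the register count of 53 is an arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_FIRST_PSEUDO_REGISTER 53	/* Arbitrary, for illustration.  */

enum toy_code { TOY_REG, TOY_MEM };

/* Hypothetical stand-in for an rtx.  */
struct toy_rtx
{
  enum toy_code code;
  int regno;
};

#define TOY_REG_P(X) ((X)->code == TOY_REG)
#define TOY_REGNO(X) ((X)->regno)

/* Shape of the old code: REGNO receives the predicate result (0 or 1),
   so the pseudo-register test can never succeed.  */
bool
toy_is_pseudo_buggy (const struct toy_rtx *x)
{
  int regno;

  assert (TOY_REG_P (x));
  return (regno = TOY_REG_P (x)) >= TOY_FIRST_PSEUDO_REGISTER;
}

/* Shape of the fixed code: REGNO receives the register number.  */
bool
toy_is_pseudo_fixed (const struct toy_rtx *x)
{
  int regno;

  assert (TOY_REG_P (x));
  return (regno = TOY_REGNO (x)) >= TOY_FIRST_PSEUDO_REGISTER;
}
```

With the old form every register looks like hard register 1, so the early return for pseudos is never taken and the subsequent checks are made against the wrong register number.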

Index: lra-eliminations.c
===
--- lra-eliminations.c  (revision 174485)
+++ lra-eliminations.c  (working copy)
@@ -1266,7 +1266,7 @@ lra_eliminate_reg_if_possible (rtx *loc)
   struct elim_table *ep;
 
   gcc_assert (REG_P (*loc));
-  if ((regno = REG_P (*loc)) >= FIRST_PSEUDO_REGISTER
+  if ((regno = REGNO (*loc)) >= FIRST_PSEUDO_REGISTER
   /* Virtual registers are not allocatable. ??? */
   || ! TEST_HARD_REG_BIT (lra_no_alloc_regs, regno))
 return;
@@ -1282,6 +1282,16 @@ process_insn_for_elimination (rtx insn, 
   eliminate_regs_in_insn (insn, final_p);
   if (! final_p)
 {
+  /* Check that insn changed its code.  This is a case when a move
+insn becomes an add insn and we do not want to process the
+insn as a move anymore.  */
+  int icode = recog (PATTERN (insn), insn, 0);
+
+  if (icode >= 0 && icode != INSN_CODE (insn))
+   {
+ INSN_CODE (insn) = icode;
+ lra_update_insn_recog_data (insn);
+   }
   lra_update_insn_regno_info (insn);
   lra_push_insn (insn);
   lra_set_used_insn_alternative (insn, -1);
Index: lra-constraints.c
===
--- lra-constraints.c   (revision 174485)
+++ lra-constraints.c   (working copy)
@@ -902,10 +902,11 @@ reg_class_from_constraints (const char *
 static bool
 check_and_process_move (bool *change_p)
 {
+  int regno;
   rtx set, dest, src, dreg, sr, dr, sreg, new_reg, before, x, scratch_reg;
-  enum reg_class dclass, sclass, rclass, secondary_class;
+  enum reg_class dclass, sclass, xclass, rclass, secondary_class;
   secondary_reload_info sri;
-  bool in_p;
+  bool in_p, temp_assign_p;
 
   *change_p = false;
   if ((set = single_set (curr_insn)) == NULL || side_effects_p (set))
@@ -981,18 +982,34 @@ check_and_process_move (bool *change_p)
   in_p = true;
   rclass = dclass;
   x = sreg;
+  xclass = sclass;
 }
   else if (sclass != NO_REGS)
 {
   in_p = false;
   rclass = sclass;
   x = dreg;
+  xclass = dclass;
 }
   else
 return false;
+  temp_assign_p = false;
+  /* Set up hard register for a reload pseudo for hook
+ secondary_reload because some targets just ignore pseudos in the
+ hook.  */
+  if (xclass != NO_REGS
+  && REG_P (x) && (regno = REGNO (x)) >= new_regno_start
+  && ! bitmap_bit_p (&lra_inheritance_pseudos, regno)
+  && lra_get_regno_hard_regno (regno) < 0)
+{
+  reg_renumber[regno] = ira_class_hard_regs[xclass][0];
+  temp_assign_p = true;
+}
   secondary_class
 = (enum reg_class) targetm.secondary_reload (in_p, x, (reg_class_t) rclass,
 GET_MODE (src), &sri);
+  if (temp_assign_p)
+reg_renumber [REGNO (x)] = -1;
   if (secondary_class == NO_REGS && sri.icode == CODE_FOR_nothing)
 return false;
   *change_p = true;
@@ -1090,7 +1107,7 @@ static int curr_swapped;
 static bool
 process_addr_reg (rtx *loc, rtx *before, rtx *after, enum reg_class cl)
 {
-  int regno;
+  int regno, final_regno;
   enum reg_class rclass, new_class;
   rtx reg = *loc;
   rtx new_reg;
@@ -1098,8 +1115,19 @@ process_addr_reg (rtx *loc, rtx *before,
   bool change_p = false;
 
   gcc_assert (REG_P (reg));
-  regno = REGNO (reg);
-  rclass = get_reg_class (regno);
+  final_regno = regno = REGNO (reg);
+  if (regno < FIRST_PSEUDO_REGISTER)
+{
+  rtx final_reg = reg;
+  rtx *final_loc = &final_reg;
+
+  lra_eliminate_reg_if_possible (final_loc);
+  final_regno = REGNO (*final_loc);
+}
+  /* Use class of hard register after elimination because some targets
+ do not recognize virtual hard registers as valid address
+ registers.  */
+  rclass = get_reg_class (final_regno);
   if ((*loc = get_equiv_substitution (reg)) != reg)
 {
   if (lra_dump_file != NULL)
@@ -1113,7 +1141,7 @@ process_addr_reg (rtx *loc, rtx *before,
   *loc = copy_rtx (*loc);
   change_p = true;
 }
-  if (*loc != reg || ! in_class_p (regno, cl, &new_class))
+  if (*loc != reg || ! in_class_p (final_regno, cl, &new_class))
 {
   mode = GET_MODE (reg);
   reg = *loc;
@@ -2629,16 +2657,9 @@ curr_insn_transform (void)
   curr_swapped = false;
   goal_alternative_swapped = false;
 
-  /* Reload address registers and displacements.  We do it before
- finding an alternative b

Re: Dump before flag

2011-06-01 Thread Basile Starynkevitch
On Wed, 1 Jun 2011 13:26:24 -0700
Xinliang David Li  wrote:

> Hi, this is a simple patch that supports a dump_before flag. E.g.,
> 
> -fdump-tree-pre-before
> 
> This is useful for diffing the IR before and after a pass.

Perhaps you forgot to actually attach the patch?

> GCC dumping needs more cleanups -- such as allowing IR-only dumps,
> allowing IR dumping for a particular function, etc. The exposure of
> 'dumpfile' (instead of a dumping_level () function) makes those changes
> a little messy, but they can be done.

I don't understand what you mean by a dumping_level () function. What
should that hypothetical function do? (I'm wrongly guessing it would
return an integer, but IIRC dumpfile is a FILE*)

Regards

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***


Re: Dump before flag

2011-06-01 Thread Xinliang David Li
On Wed, Jun 1, 2011 at 2:12 PM, Basile Starynkevitch
 wrote:
> On Wed, 1 Jun 2011 13:26:24 -0700
> Xinliang David Li  wrote:
>
>> Hi, this is a simple patch that supports a dump_before flag. E.g.,
>>
>> -fdump-tree-pre-before
>>
>> This is useful for diffing the IR before and after a pass.
>
> Perhaps you forgot to actually attach the patch?

Right -- attached in a follow-up email.
>
>> GCC dumping needs more cleanups -- such as allowing IR-only dumps,
>> allowing IR dumping for a particular function, etc. The exposure of
>> 'dumpfile' (instead of a dumping_level () function) makes those changes
>> a little messy, but they can be done.
>
> I don't understand what you mean by a dumping_level () function. What
> should that hypothetical function do? (I'm wrongly guessing it would
> return an integer, but IIRC dumpfile is a FILE*)

There are two sources of dumps:

1) IR dump performed by the pass manager
2) pass-specific debugging dump (the verbosity is controlled by the -details flag).

2) is the part that is messy and needs cleanup. Every pass just checks
whether dump_file is null and decides on that basis whether to dump the
debugging info -- there is no easy way to turn it on and off. Ideally,
each individual pass should call

  int debug_dump_level () -- dumps when it returns > 0.

With that in place, the dump flag -fdump-xxx-yyy-ir_only can be easily
implemented -- it only turns on pass manager dump, but lowers the
debug dump level to 0.
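
As a rough sketch of the idea (not the actual GCC interface): the
variables dump_verbosity and dump_ir_only and the function
example_pass_execute () below are hypothetical names invented for
illustration, and dump_file here is a local stand-in rather than GCC's
real global. A pass asks debug_dump_level () instead of testing
dump_file directly, so an ir_only style flag can silence pass-specific
output while the pass manager's IR dump stays on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for per-pass dump state; not GCC's real API.  */
static FILE *dump_file;         /* Non-NULL when dumping is enabled.  */
static int dump_verbosity = 1;  /* Raised by a -details style flag.  */
static bool dump_ir_only;       /* Set by a hypothetical ir_only flag.  */

/* Return the level of pass-specific debugging output; 0 means none.  */
int
debug_dump_level (void)
{
  if (dump_file == NULL || dump_ir_only)
    return 0;
  assert (dump_verbosity >= 0);
  return dump_verbosity;
}

/* What a pass would do instead of checking dump_file != NULL.
   Returns the number of debug messages it wrote.  */
int
example_pass_execute (void)
{
  int written = 0;

  if (debug_dump_level () > 0)
    {
      fprintf (dump_file, "example pass: processing function\n");
      written++;
    }
  if (debug_dump_level () > 1)
    {
      fprintf (dump_file, "example pass: detailed statistics\n");
      written++;
    }
  return written;
}
```

With this shape, the ir_only behaviour needs no change in any pass:
only debug_dump_level () has to know about the flag.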

David


>
> Regards
>
> --
> Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
> email: basilestarynkevitchnet mobile: +33 6 8501 2359
> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
> *** opinions {are only mine, sont seulement les miennes} ***
>


[committed] Fix var-tracking ICE with ENTRY_VALUE (PR debug/49250)

2011-06-01 Thread Jakub Jelinek
Hi!

This is something I've introduced through cselib_subst_to_values
substing ENTRY_VALUE to corresponding VALUE and have been fixing already
in the
* var-tracking.c  (replace_expr_with_values): Return NULL for
ENTRY_VALUE too.
hunk.  Apparently there are 3 other places where it needs to be handled
similarly.  If a MEM has ENTRY_VALUE address, equating of cselib_lookup
on the ENTRY_VALUE and cselib_subst_to_values of it (which returns
the same thing) results in set_slot_part ICEs.
Fixed by not doing that for ENTRY_VALUEs, like it isn't done e.g. for REGs
which have the same problem.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
as obvious.

2011-06-01  Jakub Jelinek  

PR debug/49250
* var-tracking.c (add_uses, add_stores): Don't call
cselib_subst_to_values on ENTRY_VALUE.

--- gcc/var-tracking.c.jj   2011-06-01 10:51:30.0 +0200
+++ gcc/var-tracking.c  2011-06-01 18:59:15.0 +0200
@@ -5052,6 +5052,7 @@ add_uses (rtx *ploc, void *data)
  if (MEM_P (vloc)
  && !REG_P (XEXP (vloc, 0))
  && !MEM_P (XEXP (vloc, 0))
+ && GET_CODE (XEXP (vloc, 0)) != ENTRY_VALUE
  && (GET_CODE (XEXP (vloc, 0)) != PLUS
  || XEXP (XEXP (vloc, 0), 0) != cfa_base_rtx
  || !CONST_INT_P (XEXP (XEXP (vloc, 0), 1
@@ -5130,6 +5131,7 @@ add_uses (rtx *ploc, void *data)
  if (MEM_P (oloc)
  && !REG_P (XEXP (oloc, 0))
  && !MEM_P (XEXP (oloc, 0))
+ && GET_CODE (XEXP (oloc, 0)) != ENTRY_VALUE
  && (GET_CODE (XEXP (oloc, 0)) != PLUS
  || XEXP (XEXP (oloc, 0), 0) != cfa_base_rtx
  || !CONST_INT_P (XEXP (XEXP (oloc, 0), 1
@@ -5383,6 +5385,7 @@ add_stores (rtx loc, const_rtx expr, voi
   if (MEM_P (loc) && type == MO_VAL_SET
  && !REG_P (XEXP (loc, 0))
  && !MEM_P (XEXP (loc, 0))
+ && GET_CODE (XEXP (loc, 0)) != ENTRY_VALUE
  && (GET_CODE (XEXP (loc, 0)) != PLUS
  || XEXP (XEXP (loc, 0), 0) != cfa_base_rtx
  || !CONST_INT_P (XEXP (XEXP (loc, 0), 1

Jakub


C++ PATCH for c++/44175 (segv with recursive decltype in template)

2011-06-01 Thread Jason Merrill
Two of the testcases in this PR were SEGVing for different reasons: one 
from excessive recursion exhausting the stack, and the other from 
dereferencing a null pointer.  This patch fixes both issues.
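
The recursion guard can be pictured with a small standalone sketch. It
is not the pt.c code: MAX_DEPTH, active_depth, excessive_depth and
substitute () are hypothetical names, a plain counter stands in for the
number of in-progress substitutions, and -1 stands in for
error_mark_node. Once the limit is exceeded, a static flag makes every
enclosing level fail too, and the flag is cleared only when the
outermost call unwinds.

```c
#include <stdbool.h>

#define MAX_DEPTH 64  /* Hypothetical limit, standing in for max_tinst_depth.  */

static int active_depth;      /* Number of substitutions in progress.  */
static bool excessive_depth;  /* Set once the limit has been exceeded.  */

/* Level N depends on level N + 1 until N reaches BASE.  Returns BASE on
   success, or -1 if the chain of dependencies is too deep.  Like the
   guard it illustrates, this is not reentrant.  */
int
substitute (int n, int base)
{
  int r;

  if (active_depth > MAX_DEPTH)
    {
      /* Trying to recurse across too many levels.  */
      excessive_depth = true;
      return -1;
    }

  active_depth++;
  r = (n >= base) ? n : substitute (n + 1, base);
  active_depth--;

  if (excessive_depth)
    {
      /* Fail every level back up to the initial call.  */
      r = -1;
      if (active_depth == 0)
        /* Reset once we're all the way out.  */
        excessive_depth = false;
    }
  return r;
}
```

Because the flag is reset at the outermost level, a later, shallower
call succeeds again after a failed deep one.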


Tested x86_64-pc-linux-gnu, applied to trunk.
commit 0d7f1941a19ce5133c30f8a917379afc6bb31b82
Author: Jason Merrill 
Date:   Tue May 31 17:11:31 2011 -0400

	PR c++/44175
	* pt.c (template_args_equal): Handle one arg being NULL_TREE.
	(deduction_tsubst_fntype): Handle excessive non-infinite recursion.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index ae3d83d..c1bee3e 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -6476,6 +6476,8 @@ template_args_equal (tree ot, tree nt)
 {
   if (nt == ot)
 return 1;
+  if (nt == NULL_TREE || ot == NULL_TREE)
+return false;
 
   if (TREE_CODE (nt) == TREE_VEC)
 /* For member templates */
@@ -13598,7 +13600,14 @@ static GTY((param_is (spec_entry))) htab_t current_deduction_htab;
 /* In C++0x, it's possible to have a function template whose type depends
on itself recursively.  This is most obvious with decltype, but can also
occur with enumeration scope (c++/48969).  So we need to catch infinite
-   recursion and reject the substitution at deduction time.
+   recursion and reject the substitution at deduction time; this function
+   will return error_mark_node for any repeated substitution.
+
+   This also catches excessive recursion such as when f depends on
+   f across all integers, and returns error_mark_node for all the
+   substitutions back up to the initial one.
+
+   This is, of course, not reentrant.
 
Use of a VEC here is O(n^2) in the depth of function template argument
deduction substitution, but using a hash table creates a lot of constant
@@ -13611,6 +13620,8 @@ static GTY((param_is (spec_entry))) htab_t current_deduction_htab;
 static tree
 deduction_tsubst_fntype (tree fn, tree targs)
 {
+  static bool excessive_deduction_depth;
+
   unsigned i;
   spec_entry **slot;
   spec_entry *p;
@@ -13656,6 +13667,14 @@ deduction_tsubst_fntype (tree fn, tree targs)
   /* If we've created a hash table, look there.  */
   if (current_deduction_htab)
 {
+  if (htab_elements (current_deduction_htab)
+	  > (unsigned) max_tinst_depth)
+	{
+	  /* Trying to recurse across all integers or some such.  */
+	  excessive_deduction_depth = true;
+	  return error_mark_node;
+	}
+
   hash = hash_specialization (&elt);
   slot = (spec_entry **)
 	htab_find_slot_with_hash (current_deduction_htab, &elt, hash, INSERT);
@@ -13701,6 +13720,13 @@ deduction_tsubst_fntype (tree fn, tree targs)
 	r = error_mark_node;
   VEC_pop (spec_entry, current_deduction_vec);
 }
+  if (excessive_deduction_depth)
+{
+  r = error_mark_node;
+  if (htab_elements (current_deduction_htab) == 0)
+	/* Reset once we're all the way out.  */
+	excessive_deduction_depth = false;
+}
   return r;
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype28.C b/gcc/testsuite/g++.dg/cpp0x/decltype28.C
new file mode 100644
index 000..0ab8932
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/decltype28.C
@@ -0,0 +1,16 @@
+// PR c++/44175
+// { dg-options -std=c++0x }
+
+template  struct enable_if { };
+template  struct enable_if  { typedef T type; };
+
+template 
+void ft (F f, typename enable_if::type) {}
+
+template< class F, int N >
+decltype(ft (F(), 0))
+ft (F f, typename enable_if::type) {}
+
+int main() {
+  ft (0, 0);
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype29.C b/gcc/testsuite/g++.dg/cpp0x/decltype29.C
new file mode 100644
index 000..1dd5a5f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/decltype29.C
@@ -0,0 +1,19 @@
+// PR c++/44175
+// { dg-options -std=c++0x }
+
+template  struct enable_if { };
+template  struct enable_if  { typedef T type; };
+
+template 
+typename enable_if::type
+ft() {}
+
+template
+decltype (ft (F()))
+ft() {}
+
+int main() {
+ft();		// { dg-error "no match" }
+}
+
+// { dg-prune-output "note" }

