RE: Ping: [PATCH] Enable bbro for -Os

2012-08-29 Thread Zhenqiang Chen
> -Original Message-
> From: Steven Bosscher [mailto:stevenb@gmail.com]
> Sent: Friday, August 24, 2012 8:17 PM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: Ping: [PATCH] Enable bbro for -Os
> 
> On Wed, Aug 22, 2012 at 8:49 AM, Zhenqiang Chen
>  wrote:
> >> The patch is to enable bbro for -Os. When optimizing for size, it
> >> * avoid duplicating block.
> >> * keep its original order if there is no chance to fall through.
> >> * ignore edge frequency and probability.
> >> * handle predecessor first if its index is smaller to break long trace.
> 
> You do this by inserting the index as a key. I don't fully understand this
> change. You're assuming that a block with a lower index has a lower pre-
> order number in the CFG's DFS spanning tree, IIUC (i.e. the blocks are
> numbered sequentially)? I'm not sure that's always true. I think you should
> add an explanation for this heuristic.

Thank you for the comments.

cleanup_cfg is called at the end of cfg_layout_initialize, before
reorder_basic_blocks. cleanup_cfg performs many optimizations on the CFG and
renumbers the basic blocks. After cleanup_cfg, the blocks are roughly
numbered sequentially.

The heuristic is based on the result of cleanup_cfg. It simply tries to keep
the order produced by cleanup_cfg, since logs show a code size improvement
(from cleanup_cfg) even if we do not call reorder_basic_blocks. Using the
"index as a key" is a simple way to keep the original order.
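
To illustrate the "index as a key" idea, here is a minimal sketch; it is not
the code from the patch. struct block, block_key and pick_next are
hypothetical simplifications, and the key used when not optimizing for size is
only a placeholder for the real frequency-based heuristic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a basic block.  */
struct block
{
  int index;      /* number assigned by the CFG cleanup */
  int frequency;  /* estimated execution frequency */
};

/* Smaller key means "handle this block earlier".  */
static long
block_key (const struct block *bb, bool for_size)
{
  if (for_size)
    return bb->index;            /* keep the order left by the cleanup */
  return -(long) bb->frequency;  /* placeholder: hotter blocks first */
}

/* Return the candidate with the smallest key, or NULL if N is 0.  */
static const struct block *
pick_next (const struct block *blocks, size_t n, bool for_size)
{
  const struct block *best = NULL;
  for (size_t i = 0; i < n; i++)
    if (!best || block_key (&blocks[i], for_size) < block_key (best, for_size))
      best = &blocks[i];
  return best;
}
```

With the size key, the candidate with the lowest index always wins, so the
layout follows the existing numbering regardless of frequencies.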

Comments are added in the updated patch.

> >> * only connect Trace n with Trace n + 1 to reduce long jump.
> ...
> >>   * bb-reorder.c (connect_better_edge_p): New added.
> >>   (find_traces_1_round): When optimizing for size, ignore edge frequency
> >>   and probability, and handle all in one round.
> >>   (bb_to_key): Use bb->index as key for size.
> >>   (better_edge_p): The smaller bb index is better for size.
> >>   (connect_traces): Connect block n with block n + 1;
> >>   connect trace m with trace m + 1 if falling through.
> >>   (copy_bb_p): Avoid duplicating blocks.
> >>   (gate_handle_reorder_blocks): Enable bbro when optimizing for -Os.
> 
> This probably fixes PR54364.

I tried the test case in PR54364; the patch does remove several jmp instructions.

> > @@ -1169,6 +1272,10 @@ copy_bb_p (const_basic_block bb, int code_may_grow)
> >int max_size = uncond_jump_length;
> >rtx insn;
> >
> > +  /* Avoid duplicating blocks for size.  */
> > +  if (optimize_function_for_size_p (cfun))
> > +    return false;
> > +
> >if (!bb->frequency)
> >  return false;
> 
> This shouldn't be necessary, due to the CODE_MAY_GROW argument, and
> this change should result in a code size increase because jumps to conditional
> jumps aren't removed anymore. What did you make this change for, do you
> have a test case where code size increases if you allow copy_bb_p to return
> true?

Thanks. It is not necessary.

Here is the updated ChangeLog. The updated patch is attached.

ChangeLog
2012-08-29  Zhenqiang Chen 

PR middle-end/54364
* bb-reorder.c (connect_better_edge_p): New added.
(find_traces_1_round): When optimizing for size, ignore edge frequency
and probability, and handle all in one round.
(bb_to_key): Use bb->index as key for size.
(better_edge_p): The smaller bb index is better for size.
(connect_traces): Connect block n with block n + 1;
connect trace m with trace m + 1 if falling through.
(gate_handle_reorder_blocks): Enable bbro when optimizing for -Os.

Enable-bbro-for-size-updated.patch
Description: Binary data


Re: [PATCH] MIPS16 TLS support for GCC

2012-08-29 Thread Chung-Lin Tang
On 2012/7/6 02:23 PM, Richard Sandiford wrote:
> Richard Sandiford  writes:
>>> (3) Also related to libraries, I edited CRT_CALL_STATIC_FUNCTION to emit
>>> a 32-bit code sequence under both MIPS/MIPS16 mode (under O32).
>>>
>>> As you can see in the original Feb. patch, I had changes to emit a
>>> MIPS16 version of these static calls, but with the changes in (2) above,
>>> they will not work with the usual situation of a 32-bit MIPS built /lib
>>> (.init/.fini will have 32/16-bit code improperly concatenated).
>>>
>>> The CodeSourcery builds use an independent mips16 sysroot for this, so a
>>> MIPS16 CRT_CALL_STATIC_FUNCTION works there. For the usual case, I think
>>> making it 32-bit is the compatible choice.
>>
>> Yeah, I agree that sounds like the right call.  Please do the same
>> for the n32/n64 version (i.e. explicitly make it nomips16 rather
>> than add the #error).
> 
> BTW, doing this has removed my main concern about having dead code.
> The original patch had a separate MIPS16 implementation that (as things
> stood) could never be used by stock sources.  That would make it difficult
> to maintain.
> 
> Now that the MIPS16 library support is purely adding nomips16 attributes
> to code that is obviously nomips16, those parts are OK on their own, thanks.
> (I.e. the mips.h change, the libgcc change, and the libgomp change.)
> Feel free to drop the multilib thing if you don't want to implement
> --with-multilib-list.

Hi Richard, just FYI, I just committed the said approved parts.
gcc/config/mips/t-linux64 had one additional change, adding
../lib/mips16 to the corresponding MULTILIB_OSDIRNAMES, or else we end
with a weird option-named directory for the mips16 libraries.

Thanks,
Chung-Lin



[AArch64] Merge from upstream trunk r190706

2012-08-29 Thread Sofiane Naci
Hi,

I've just merged upstream trunk on the aarch64-branch up to r190706.

Thanks
Sofiane






remove dependency on cp/parser.h from cp/lang.c

2012-08-29 Thread Aaron Gray
Just got my copyright assignment through, so here's my first GCC patch,

This is a one liner removing the unneeded dependency of cp-lang.c on
cp/parser.h. This has been tested on Linux.

Aaron

diff --git a/gcc/cp/cp-lang.c b/gcc/cp/cp-lang.c
index da7f1e1..5ca0b0a 100644
--- a/gcc/cp/cp-lang.c
+++ b/gcc/cp/cp-lang.c
@@ -32,7 +32,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "cp-objcp-common.h"
 #include "hashtab.h"
 #include "target.h"
-#include "parser.h"

 enum c_language_kind c_language = clk_cxx;
 static void cp_init_ts (void);


cp-lang.c.diff
Description: Binary data


Re: Inline hints

2012-08-29 Thread Martin Jambor
Hi,

On Tue, Aug 28, 2012 at 06:05:27PM +0200, Jan Hubicka wrote:
> > On Sun, Aug 19, 2012 at 07:43:45AM +0200, Jan Hubicka wrote:
> > > 
> > >   * gcc.dg/ipa/iinline-1.c: Update testcase to test inline hints.
> > >   
> > >   * ipa-inline.c (want_inline_small_function_p): Bypass
> > >   inline limits for hinted functions.
> > >   (edge_badness): Dump hints; decrease badness for hinted functions.
> > >   * ipa-inline.h (enum inline_hints_vals): New enum.
> > >   (inline_hints): New type.
> > >   (edge_growth_cache_entry): Add hints.
> > >   (dump_inline_summary): Update.
> > >   (dump_inline_hints): Declare.
> > >   (do_estimate_edge_hints): Declare.
> > >   (estimate_edge_hints): New inline function.
> > >   (reset_edge_growth_cache): Update.
> > >   * predict.c (cgraph_maybe_hot_edge_p): Do not ice on indirect edges.
> > >   * ipa-inline-analysis.c (dump_inline_hints): New function.
> > >   (estimate_edge_devirt_benefit): Return true when function should be
> > >   hinted.
> > >   (estimate_calls_size_and_time): New hints argument; set it when
> > >   devirtualization happens.
> > >   (estimate_node_size_and_time): New hints argument.
> > >   (do_estimate_edge_time): Cache hints.
> > >   (do_estimate_edge_growth): Update.  
> > >   (do_estimate_edge_hints): New function
> > 
> > ...
> > 
> > > Index: ipa-inline.h
> > > ===
> > > *** ipa-inline.h  (revision 190508)
> > > --- ipa-inline.h  (working copy)
> > > *** typedef struct GTY(()) condition
> > > *** 42,47 
> > > --- 42,54 
> > >   unsigned by_ref : 1;
> > > } condition;
> > >   
> > > > + /* Inline hints are reasons why inline heuristics should prefer inlining given function.
> > > > +They are represented as bitmap of the following values.  */
> > > + enum inline_hints_vals {
> > > +   INLINE_HINT_indirect_call = 1
> > > + };
> > > + typedef int inline_hints;
> > > + 
> > >   DEF_VEC_O (condition);
> > >   DEF_VEC_ALLOC_O (condition, gc);
> > >   
> > > *** extern VEC(inline_edge_summary_t,heap) *
> > > *** 158,163 
> > > --- 165,171 
> > >   typedef struct edge_growth_cache_entry
> > >   {
> > > int time, size;
> > > +   inline_hints hints;
> > >   } edge_growth_cache_entry;
> > >   DEF_VEC_O(edge_growth_cache_entry);
> > >   DEF_VEC_ALLOC_O(edge_growth_cache_entry,heap);
> > > *** extern VEC(edge_growth_cache_entry,heap)
> > > *** 168,174 
> > >   /* In ipa-inline-analysis.c  */
> > >   void debug_inline_summary (struct cgraph_node *);
> > >   void dump_inline_summaries (FILE *f);
> > > ! void dump_inline_summary (FILE * f, struct cgraph_node *node);
> > >   void inline_generate_summary (void);
> > >   void inline_read_summary (void);
> > >   void inline_write_summary (void);
> > > --- 176,183 
> > >   /* In ipa-inline-analysis.c  */
> > >   void debug_inline_summary (struct cgraph_node *);
> > >   void dump_inline_summaries (FILE *f);
> > > ! void dump_inline_summary (FILE *f, struct cgraph_node *node);
> > > ! void dump_inline_hints (FILE *f, inline_hints);
> > >   void inline_generate_summary (void);
> > >   void inline_read_summary (void);
> > >   void inline_write_summary (void);
> > > *** void inline_merge_summary (struct cgraph
> > > *** 185,190 
> > > --- 194,200 
> > >   void inline_update_overall_summary (struct cgraph_node *node);
> > >   int do_estimate_edge_growth (struct cgraph_edge *edge);
> > >   int do_estimate_edge_time (struct cgraph_edge *edge);
> > > + inline_hints do_estimate_edge_hints (struct cgraph_edge *edge);
> > >   void initialize_growth_caches (void);
> > >   void free_growth_caches (void);
> > >   void compute_inline_parameters (struct cgraph_node *, bool);
> > > *** estimate_edge_time (struct cgraph_edge *
> > > *** 257,262 
> > > --- 267,288 
> > >   }
> > >   
> > >   
> > > > + /* Return estimated callee runtime increase after inlining
> > > > +EDGE.  */
> > > + 
> > > + static inline inline_hints
> > > + estimate_edge_hints (struct cgraph_edge *edge)
> > > + {
> > > +   inline_hints ret;
> > > +   if ((int)VEC_length (edge_growth_cache_entry, edge_growth_cache) <= 
> > > edge->uid
> > > +   || !(ret = VEC_index (edge_growth_cache_entry,
> > > + edge_growth_cache,
> > > + edge->uid).hints))
> > > + return do_estimate_edge_time (edge);
> > 
> > Surely this was supposed to be do_estimate_edge_hints instead?
> Oops, surely. It is harmless, since we always query time first and thus 
> populate the cache, but it ought to be fixed.
> Can you please apply the obvious patch? Chinese internet is a bit restrictive
> and it is hard to find SSH access.
> Honza

I have committed the following (after it passed bootstrap and testing
along with another patch on x86_64-linux).

Martin


2012-08-29  Martin Jambor  

* ipa-inline.h (estimate_edge_hints): Call do_estimate_edge_
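
For readers following the fix, the cache-then-fallback shape of the quoted
inline function can be sketched as below. This is only an illustration under
stated assumptions: hint_cache, compute_hints, lookup_hints and CACHE_SIZE are
hypothetical names, and storing the hints biased by one (so that 0 can mean
"not cached") is an assumption of the sketch, not something shown in the
quoted hunk.

```c
typedef int inline_hints;

#define CACHE_SIZE 8   /* hypothetical fixed cache size */

/* Cached hints are stored biased by one so that 0 can mean
   "not cached yet" (an assumption of this sketch).  */
static int hint_cache[CACHE_SIZE];
static int slow_path_calls;

/* Hypothetical slow path: compute the hints and fill the cache.  */
static inline_hints
compute_hints (int uid)
{
  inline_hints hints = uid & 1;   /* placeholder computation */
  slow_path_calls++;
  if (uid >= 0 && uid < CACHE_SIZE)
    hint_cache[uid] = hints + 1;
  return hints;
}

/* Fast path: use the cache when possible; otherwise fall back to the
   routine that computes hints, not to an unrelated estimate.  */
static inline_hints
lookup_hints (int uid)
{
  int stored;
  if (uid < 0 || uid >= CACHE_SIZE || !(stored = hint_cache[uid]))
    return compute_hints (uid);
  return stored - 1;
}
```

The point of the review comment is visible in lookup_hints: the fallback has
to return the same kind of value as the cached path.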

Re: [PATCH, libstdc++] Make empty std::string storage readonly

2012-08-29 Thread Michael Haubenwallner

On 08/28/2012 08:12 PM, Jonathan Wakely wrote:
> On 28 August 2012 18:27, Michael Haubenwallner wrote:
>>>
>>> Does it actually produce a segfault? I suppose it might on some
>>> platforms, but not all, so I'm not sure it's worth changing.
>>
>> It does segfault here on (32bit each):
>>  i686-pc-linux-gnu
>>  ia64-hp-hpux11.31
>>  i386-pc-solaris2.10
>>  sparc-sun-solaris2.10
>>  powerpc-ibm-aix5.3.0.0
>>  powerpc-ibm-aix6.1.0.0
>>  powerpc-ibm-aix7.1.0.0
>>
>> It does not segfault here on:
>>  hppa2.0n-hp-hpux11.31
>>  i586-pc-interix5.2
>>  i586-pc-winnt5.2 (using MSVC)
>>
>> Maybe it could be made segfault on hppa2.0n-hp-hpux11.31 too using some 
>> linker flag,
>> but that's a deprecated platform anyway.
>>
>> As long as the major development platform (Linux) does segfault, it feels 
>> worth
>> changing - especially as string.clear() to write the '\0' back again won't 
>> help
>> as quick'n dirty workaround since gcc-4.4.4 any more.
> 
> Hmm, I tested it on x86_64-unknown-linux-gnu without getting a
> segfault - but I might have messed up my test.

Using this patch on my x86_64 Gentoo Linux Desktop with gcc-4.7.1 does segfault
as expected - when I make sure the correct libstdc++ is used at runtime,
having the '_S_empty_rep_storage' symbol in the .rodata section rather than 
.bss.

/haubi/


Re: [PATCH] Add counter histogram to fdo summary (issue6465057)

2012-08-29 Thread Jan Hubicka
> Index: libgcc/libgcov.c
> ===
> --- libgcc/libgcov.c  (revision 190736)
> +++ libgcc/libgcov.c  (working copy)
> @@ -276,6 +276,78 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned
>return 1;
>  }
>  
> +/* Insert counter VALUE into HISTOGRAM.  */
> +
> +static void
> +gcov_histogram_insert(gcov_bucket_type *histogram, gcov_type value)
> +{
> +  unsigned i;
> +
> +  i = gcov_histo_index(value);
> +  gcc_assert (i < GCOV_HISTOGRAM_SIZE);
Does checking_assert work in libgcov? I do not think internal consistency checks
should go into an --enable-checking=release libgcov. We want to keep it as
lightweight as possible. (I see there are two existing gcc_asserts; since they
report file format corruption, I think they should give a better diagnostic.)

The inliner will do a good job here, but perhaps an explicit inline fits.
> +  for (f_ix = 0; f_ix != gi_ptr->n_functions; f_ix++)
> +{
> +  gfi_ptr = gi_ptr->functions[f_ix];
> +
> +  if (!gfi_ptr || gfi_ptr->key != gi_ptr)
> +continue;
> +
> +  ci_ptr = &gfi_ptr->ctrs[ctr_info_ix];
> +  for (ix = 0; ix < ci_ptr->num; ix++)
> +gcov_histogram_insert(cs_ptr->histogram, ci_ptr->values[ix]);
Space before (.
> +}
> +}
> +}
> +
>  /* Dump the coverage counts. We merge with existing counts when
> possible, to avoid growing the .da files ad infinitum. We use this
> program's checksum to make sure we only accumulate whole program
> @@ -347,6 +419,7 @@ gcov_exit (void)
>   }
>   }
>  }
> +  gcov_compute_histogram (&this_prg);
> @@ -598,11 +669,18 @@ gcov_exit (void)
> if (gi_ptr->merge[t_ix])
>   {
> if (!cs_prg->runs++)
> - cs_prg->num = cs_tprg->num;
> +cs_prg->num = cs_tprg->num;
> +  else if (cs_prg->num != cs_tprg->num)
> +goto read_mismatch;

Doesn't this check that all the programs that contain this unit are the same?
I.e. will this survive profiledbootstrap where we interleave cc1 and cc1plus?
> +  /* Count number of non-zero histogram entries. The histogram is only
> + currently computed for arc counters.  */
> +  csum = &summary->ctrs[GCOV_COUNTER_ARCS];
> +  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
> +{
> +  if (csum->histogram[h_ix].num_counters > 0)
> +h_cnt++;
> +}
> +  gcov_write_tag_length (tag, GCOV_TAG_SUMMARY_LENGTH(h_cnt));
>gcov_write_unsigned (summary->checksum);
>for (csum = summary->ctrs, ix = GCOV_COUNTERS_SUMMABLE; ix--; csum++)
>  {
> @@ -380,6 +388,21 @@ gcov_write_summary (gcov_unsigned_t tag, const str
>gcov_write_counter (csum->sum_all);
>gcov_write_counter (csum->run_max);
>gcov_write_counter (csum->sum_max);
> +  if (ix != GCOV_COUNTER_ARCS)
> +{
> +  gcov_write_unsigned (0);
> +  continue;
> +}
> +  gcov_write_unsigned (h_cnt);
> +  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
> +{
> +  if (!csum->histogram[h_ix].num_counters)
> +continue;
> +  gcov_write_unsigned (h_ix);

It is kind of a waste to write a whole unsigned for each histogram index.
What about writing a bitmap of non-zero entries followed by each entry?
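
A minimal sketch of the bitmap idea, assuming 32-bit output words.
HISTO_SIZE, encode_histogram and decode_histogram are hypothetical names for
illustration and are not part of libgcov or of the patch under review; only
the per-bucket counter count is encoded here.

```c
#include <stddef.h>
#include <stdint.h>

#define HISTO_SIZE 252                        /* hypothetical bucket count */
#define BITMAP_WORDS ((HISTO_SIZE + 31) / 32)

/* Write a bitmap of the non-zero buckets, followed by the non-zero values
   in index order.  OUT must hold BITMAP_WORDS + HISTO_SIZE words.
   Returns the number of words written.  */
size_t
encode_histogram (const uint32_t counts[HISTO_SIZE], uint32_t *out)
{
  size_t n = BITMAP_WORDS;
  for (size_t w = 0; w < BITMAP_WORDS; w++)
    out[w] = 0;
  for (size_t i = 0; i < HISTO_SIZE; i++)
    if (counts[i] != 0)
      {
        out[i / 32] |= (uint32_t) 1 << (i % 32);
        out[n++] = counts[i];
      }
  return n;
}

/* Inverse of encode_histogram.  */
void
decode_histogram (const uint32_t *in, uint32_t counts[HISTO_SIZE])
{
  size_t n = BITMAP_WORDS;
  for (size_t i = 0; i < HISTO_SIZE; i++)
    counts[i] = ((in[i / 32] >> (i % 32)) & 1) ? in[n++] : 0;
}
```

Under these assumptions the bitmap costs a fixed BITMAP_WORDS words, while the
index-per-entry scheme costs one extra word per non-zero bucket, so the bitmap
is smaller once more than BITMAP_WORDS buckets are in use.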
> +/* Merge SRC_HISTO into TGT_HISTO.  */

Perhaps a comment about the overall concept of the merging routine would suit here.
> -#else /*!IN_GCOV */
> -#define GCOV_TYPE_SIZE (LONG_LONG_TYPE_SIZE > 32 ? 64 : 32)

Why do you need to move this out of !libgcov? I do not think this is correct
for all configurations,
i.e. gcov_type may be 16bit.

Patch is OK if it passed profiledbootstrap modulo the comments above.
Thanks!
Honza


Re: out-of-line and arch-specific random_device

2012-08-29 Thread Paolo Carlini

Hi,

On 8/28/12 1:41 PM, Ulrich Drepper wrote:

On Tue, Aug 28, 2012 at 4:44 AM, Paolo Carlini  wrote:

Again, without context, I think this is not the point: random_device is meant 
to be just a simple high level wrapper
around things like dev/random, inspired by facilities like dev/random on unix-like OSes. 
The brutal "fall back" we have
now in place wouldn't be useful anyway for the uses Marc is talking about, 
because there is no way to provide a seed.
That said, I can't check right now C++11 about random_device, I suppose Uli has 
already ;)

I did read it.  random_device is all about non-determinism.  Of course
I know that RNGs in some situations have to be repeatable.  That's
what all the engines are about.  random_device isn't.  You use
random_device to seed an engine etc.

The spec says that if there is no way to create non-deterministic data
the implementation may use a random number engine.  "may" being the
key.

Ok.

I perhaps didn't make myself clear as to what the big problem is.
Depending on whether or not you define _GLIBCXX_USE_RANDOM_TR1 you get
an object definition for 'random_device' which has the same name and
mangling but has a different size.  This means binary
incompatibilities.  Memory corruptions.
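
The size mismatch described above can be pictured with a small C sketch. Both
struct layouts are hypothetical stand-ins chosen for illustration, not the
actual libstdc++ definitions; the only point is that one name mapping to two
layouts of different size means callers and library code disagree about how
much storage an object occupies.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Layout one configuration might use: just a handle to a device file
   (hypothetical stand-in).  */
struct random_device_file
{
  FILE *file;
};

/* Layout another configuration might use: the state of a fallback
   engine (hypothetical stand-in).  */
struct random_device_engine
{
  uint32_t state[624];
  size_t pos;
};

/* A caller built against the small layout reserves this many bytes...  */
enum { CALLER_RESERVES = sizeof (struct random_device_file) };

/* ...while code built against the large layout touches this many.  */
enum { LIBRARY_WRITES = sizeof (struct random_device_engine) };
```

If both layouts are exported under one mangled name, the linker cannot tell
them apart, which is where the memory corruption would come from.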
But note that _GLIBCXX_USE_RANDOM_TR1, as *any* other such macro isn't 
supposed to be set by the user, definitely it's not. It's a 
configure-time macro. Thus, given your clarification above about "may", 
I think the issue here is whether normally people would like to see an 
abort, or the output of a fixed (no seed, that is, as we clarified 
already) deterministic engine as a fall back. In my opinion, having 
clarified the macro uses issue, the less bad solution is the 
deterministic engine. As a general maintainer of the library (that is 
not as a GNU/Linux maintainer) I would be more favorable to the abort if 
we had decently covered not just Unix-like systems but a few other 
systems, at least a bit of M$, etc.


Thus, all in all, I propose to just go ahead with your patch more or 
less, as-is, that is retain the MT fall back. Minor nit: are you sure we 
need to open a new minor version for the new symbol? Because it seemed 
to me that 4.7.x was behind by one. Please check. Also, again minor 
detail, we normally just use mangled names in the linker script, see all 
the examples the lines before, I don't think we should just now change 
that?!?


Paolo.


[PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Tulio Magno Quites Machado Filho
Add __builtin_ppc_get_timebase to read the time base register on PowerPC.
This is required for applications that measure time at high frequencies
with high precision that can't afford a syscall.

[gcc]
2012-08-29 Tulio Magno Quites Machado Filho 

* config/rs6000/rs6000-builtin.def: Add __builtin_ppc_get_timebase.
* config/rs6000/rs6000.c (rs6000_expand_noop_builtin): New
function to expand an expression that calls a builtin without
arguments.
(rs6000_expand_builtin): Add __builtin_ppc_get_timebase.
(rs6000_init_builtins): Likewise.
* config/rs6000/rs6000.md: Likewise.

[gcc/testsuite]
2012-08-29 Tulio Magno Quites Machado Filho 

* gcc.target/powerpc/ppc-get-timebase.c: New file.
---
 gcc/config/rs6000/rs6000-builtin.def   |3 ++
 gcc/config/rs6000/rs6000.c |   31 +
 gcc/config/rs6000/rs6000.md|   36 
 .../gcc.target/powerpc/ppc-get-timebase.c  |   22 
 4 files changed, 92 insertions(+), 0 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index c8f8f86..75ad184 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1429,6 +1429,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, "__builtin_rsqrt", RS6000_BTM_FRSQRTE,
 BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, "__builtin_rsqrtf", RS6000_BTM_FRSQRTES,
  RS6000_BTC_FP)
 
+BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase",
+RS6000_BTM_POWERPC, RS6000_BTC_MISC)
+
 /* Darwin CfString builtin.  */
 BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS,
  RS6000_BTC_MISC)
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6c58307..24e274d 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -9747,6 +9747,30 @@ rs6000_overloaded_builtin_p (enum rs6000_builtins fncode)
   return (rs6000_builtin_info[(int)fncode].attr & RS6000_BTC_OVERLOADED) != 0;
 }
 
+/* Expand an expression EXP that calls a builtin without arguments.  */
+static rtx
+rs6000_expand_noop_builtin (enum insn_code icode, rtx target)
+{
+  rtx pat;
+  enum machine_mode tmode = insn_data[icode].operand[0].mode;
+
+  if (icode == CODE_FOR_nothing)
+/* Builtin not supported on this processor.  */
+return 0;
+
+  if (target == 0
+  || GET_MODE (target) != tmode
+  || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
+target = gen_reg_rtx (tmode);
+
+  pat = GEN_FCN (icode) (target);
+  if (! pat)
+return 0;
+  emit_insn (pat);
+
+  return target;
+}
+
 
 static rtx
 rs6000_expand_unop_builtin (enum insn_code icode, tree exp, rtx target)
@@ -11336,6 +11360,9 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
   ? CODE_FOR_bpermd_di
   : CODE_FOR_bpermd_si), exp, target);
 
+case RS6000_BUILTIN_GET_TB:
+  return rs6000_expand_noop_builtin (CODE_FOR_get_timebase, target);
+
 case ALTIVEC_BUILTIN_MASK_FOR_LOAD:
 case ALTIVEC_BUILTIN_MASK_FOR_STORE:
   {
@@ -11620,6 +11647,10 @@ rs6000_init_builtins (void)
 POWER7_BUILTIN_BPERMD, "__builtin_bpermd");
   def_builtin ("__builtin_bpermd", ftype, POWER7_BUILTIN_BPERMD);
 
+  ftype = build_function_type_list (unsigned_intDI_type_node,
+   NULL_TREE);
+  def_builtin ("__builtin_ppc_get_timebase", ftype, RS6000_BUILTIN_GET_TB);
+
 #if TARGET_XCOFF
   /* AIX libm provides clog as __clog.  */
   if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE)
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index d5ffd81..09bdd80 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -136,6 +136,7 @@
UNSPECV_PROBE_STACK_RANGE   ; probe range of stack addresses
UNSPECV_EH_RR   ; eh_reg_restore
UNSPECV_ISYNC   ; isync instruction
+   UNSPECV_GETTB   ; get timebase built-in
   ])
 
 
@@ -14101,6 +14102,41 @@
   ""
   "")
 
+(define_expand "get_timebase"
+  [(use (match_operand:DI 0 "gpc_reg_operand" ""))]
+  ""
+  "
+{
+  if (TARGET_POWERPC64)
+emit_insn (gen_get_timebase_ppc64 (operands[0]));
+  else if (TARGET_POWERPC)
+emit_insn (gen_get_timebase_ppc32 (operands[0]));
+  else
+FAIL;
+  DONE;
+}")
+
+(define_insn "get_timebase_ppc32"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))
+   (clobber (match_scratch:SI 1 "=r"))]
+  "TARGET_POWERPC && !TARGET_POWERPC64"
+{
+return "mftbu %0\;"
+  "mftb %L0\;"
+  "mftbu %1\;"
+  "cmpw %0,%1\;"
+  "bne- $-16";
+})
+
+(define_insn "get_timebase_ppc64"

Re: [PATCH,PING]] gcc/config/freebsd-spec.h: Fix building PIE

2012-08-29 Thread Alexis Ballier
Hi Gerald,

On Sun, 26 Aug 2012 23:28:49 +0200 (CEST)
Gerald Pfeifer  wrote:

> I have tested this patch on i386-unknown-freebsd10.0 and volunteer
> to create a ChangeLog and apply if approved.

Thanks for taking care of this, I thought this patch had been
completely forgotten :)
FWIW, the git commit message was supposed to be the ChangeLog entry
(imho it is rather pointless to include it in the patch since it will
most likely conflict when/if it gets applied)

> 
> Any reviewer?
> 
> On Tue, 8 May 2012, Alexis Ballier wrote:
> > For the record, there's a similar logic in FreeBSD's gcc: 
> > http://svnweb.freebsd.org/base/head/contrib/gcc/config/freebsd-spec.h?revision=200038&view=markup
> 
> Thanks for the patch, Alexis.  One question: why do we have the same
> in freebsd-spec.h and i386/freebsd.h.  Isn't there a way to simplify
> this?  Like omitting this from i386/freebsd.h at all?

To be honest, I don't know why we have the same in these two headers
and wondered the same. I suppose it is possible to simply remove it
from i386/freebsd.h but I didn't test this since I didn't want to mix a
bugfix for PIE and cleanup of the code within the same patch.

Regards,

Alexis.

[...]


Re: faster random number engine

2012-08-29 Thread Paolo Carlini

On 8/29/12 4:19 PM, Ulrich Drepper wrote:

The  header so far contains the random number engines
documented in the header.  None of these are well suited for modern
CPUs.  There is a variant of the Mersenne twister engines which is
explicitly designed to perform well on CPUs with SIMD instructions.
The result is an engine with equal properties to the original Mersenne
twisters but several times faster.

Great.

The attached patch implements this new engine.  It's in the __gnu_cxx
namespace.  I added definitions for all the variants defined by the
authors.  The test suite checks the returned values based on results
obtained from the original code.  The SIMD optimization is so far done
for x86.  In all other cases a generic implementation is used.  The
generic implementation works correctly for little endian machines.
For big endian machines someone will come up with fixes.  Until then
the new default.cc test is expected to fail.

I hope this is an uncontroversial change.
The substance isn't of course. But normally we don't have __gnu_cxx 
things in the same std header. Can't we have a new ext/random and put it 
in there? If we can separate the new code to it, I think people would 
not even object to the target dependency, etc. In ext/ we are quite free 
to do extension / experimental work.


Paolo.


Re: [PATCH 2/3] Incorporate aggregate jump functions into inlining analysis

2012-08-29 Thread H.J. Lu
On Thu, Aug 2, 2012 at 12:28 PM, Martin Jambor  wrote:
> Hi,
>
> this patch uses the aggregate jump functions created by the previous
> patch in the series to determine benefits of inlining a particular
> call graph edge.  It has not changed much since the last time I posted
> it, except for the presence of by_ref flags and removal of checks
> required by TBAA which we now do not use.
>
> The patch works in a fairly straightforward way.  It adds two flags to
> struct condition to specify it actually refers to an aggregate passed
> by value or something passed by reference, in both cases at a
> particular offset, also newly stored in the structures.  Functions
> which build the predicates specifying under which conditions CFG edges
> will be taken or individual statements are actually executed then
> simply also look whether a value comes from an aggregate passed to us
> in a parameter (either by value or reference) and if so, create
> appropriate conditions.  Later on, predicates are evaluated as before,
> we only also look at aggregate contents of the jump functions of the
> edge we are considering to inline when evaluating the predicates, and
> also remap the offsets of the jump functions when remapping over an
> ancestor jump function.
>
> This patch alone makes us inline the function bar in testcase of PR
> 48636 in comment #4.  It also passes bootstrap and testing on
> x86_64-linux.  I successfully LTO-built Firefox with it too.
>
> Thanks for all comments and suggestions,
>
> Martin
>
>
> 2012-07-31  Martin Jambor  
>
> PR fortran/48636
> * ipa-inline.h (condition): New fields offset, agg_contents and 
> by_ref.
> * ipa-inline-analysis.c (agg_position_info): New type.
> (add_condition): New parameter aggpos, also store agg_contents, by_ref
> and offset.
> (dump_condition): Also dump aggregate conditions.
> (evaluate_conditions_for_known_args): Also handle aggregate
> conditions.  New parameter known_aggs.
> (evaluate_properties_for_edge): Gather known aggregate contents.
> (inline_node_duplication_hook): Pass NULL known_aggs to
> evaluate_conditions_for_known_args.
> (unmodified_parm): Split into unmodified_parm and unmodified_parm_1.
> (unmodified_parm_or_parm_agg_item): New function.
> (set_cond_stmt_execution_predicate): Handle values passed in
> aggregates.
> (set_switch_stmt_execution_predicate): Likewise.
> (will_be_nonconstant_predicate): Likewise.
> (estimate_edge_devirt_benefit): Pass new parameter known_aggs to
> ipa_get_indirect_edge_target.
> (estimate_calls_size_and_time): New parameter known_aggs, pass it
> recursively to itself and to estimate_edge_devirt_benefit.
> (estimate_node_size_and_time): New vector known_aggs, pass it to
> functions which need it.
> (remap_predicate): New parameter offset_map, use it to remap aggregate
> conditions.
> (remap_edge_summaries): New parameter offset_map, pass it recursively
> to itself and to remap_predicate.
> (inline_merge_summary): Also create and populate vector offset_map.
> (do_estimate_edge_time): New vector of known aggregate contents,
> passed to functions which need it.
> (inline_read_section): Stream new fields of condition.
> (inline_write_summary): Likewise.
> * ipa-cp.c (ipa_get_indirect_edge_target): Also examine the aggregate
> contents.  Let all local callers pass NULL for known_aggs.
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54409


H.J.


Re: out-of-line and arch-specific random_device

2012-08-29 Thread Ulrich Drepper
On Wed, Aug 29, 2012 at 9:48 AM, Paolo Carlini  wrote:
> Minor nit: are you sure we need to
> open a new minor version for the new symbol? Because it seemed to me that
> 4.7.x was behind by one.

I have 4.7 installed and that version already defines the symbols
defined in version 3.4.17.  This is a new symbol and requires a new
version to prevent startup of an app in case of a too old runtime
library.


Re: out-of-line and arch-specific random_device

2012-08-29 Thread Paolo Carlini

On 8/29/12 4:49 PM, Ulrich Drepper wrote:

On Wed, Aug 29, 2012 at 9:48 AM, Paolo Carlini  wrote:

Minor nit: are you sure we need to
open a new minor version for the new symbol? Because it seemed to me that
4.7.x was behind by one.

I have 4.7 installed and that version already defines the symbols
defined in version 3.4.17.  This is a new symbol and requires a new
version to prevent startup of an app in case of a too old runtime
library.
Ah, in that case we definitely have to bump the minor version. I thought -
if it wasn't clear - that current mainline was *already* ahead of the current
4_7-branch.


Paolo.


Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Segher Boessenkool

Hi Tulio,

Add __builtin_ppc_get_timebase to read the time base register on PowerPC.
This is required for applications that measure time at high frequencies
with high precision that can't afford a syscall.


For things that do mftb with high frequency, maybe you should also add a
builtin that does just an mftb, i.e. returns a 32-bit result on 32-bit
implementations.

Please add documentation for the new builtin(s).


--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1429,6 +1429,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, "__builtin_rsqrt", RS6000_BTM_FRSQRTE,
 BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, "__builtin_rsqrtf", RS6000_BTM_FRSQRTES,
  RS6000_BTC_FP)

+BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase",
+RS6000_BTM_POWERPC, RS6000_BTC_MISC)


RS6000_BTM_POWERPC does not exist anymore.  RS6000_BTM_ALWAYS?

+/* Expand an expression EXP that calls a builtin without arguments.  */

+static rtx
+rs6000_expand_noop_builtin (enum insn_code icode, rtx target)


"noop" gives the wrong idea, "zeroop" perhaps?


+(define_expand "get_timebase"


You should probably prefix this with powerpc_ or rs6000_ as well.
The existing code is not very consistent in this.


+  [(use (match_operand:DI 0 "gpc_reg_operand" ""))]
+  ""
+  "
+{
+  if (TARGET_POWERPC64)
+emit_insn (gen_get_timebase_ppc64 (operands[0]));
+  else if (TARGET_POWERPC)
+emit_insn (gen_get_timebase_ppc32 (operands[0]));
+  else
+FAIL;
+  DONE;
+}")


TARGET_POWERPC is always true.


+(define_insn "get_timebase_ppc32"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))
+   (clobber (match_scratch:SI 1 "=r"))]
+  "TARGET_POWERPC && !TARGET_POWERPC64"
+{
+return "mftbu %0\;"
+  "mftb %L0\;"
+  "mftbu %1\;"
+  "cmpw %0,%1\;"
+  "bne- $-16";
+})


This only works for WORDS_BIG_ENDIAN.

You should say you clobber CR0 here I think; actually, allow any CRn
instead.
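
For reference, the read sequence in the pattern above (upper half, lower
half, upper half again, retry on mismatch) can be sketched in C as follows.
This is only an illustration under stated assumptions: the tb_reader
callbacks are hypothetical stand-ins for mftbu and mftb, and the simulated
counter exists only to exercise the retry path; none of it is part of the
patch.

```c
#include <stdint.h>

/* Hypothetical accessors standing in for the mftbu and mftb
   instructions; they are not part of the patch.  */
typedef uint32_t (*tb_reader) (void);

/* Read a 64-bit time base from two 32-bit halves.  If the upper half
   changed while the lower half was being read, the lower half wrapped
   and the pair is inconsistent, so read again.  */
static uint64_t
read_timebase_32 (tb_reader read_upper, tb_reader read_lower)
{
  uint32_t hi, lo, hi2;

  do
    {
      hi = read_upper ();   /* mftbu %0 */
      lo = read_lower ();   /* mftb %L0 */
      hi2 = read_upper ();  /* mftbu %1 */
    }
  while (hi != hi2);        /* cmpw %0,%1 ; bne- $-16 */

  return ((uint64_t) hi << 32) | lo;
}

/* Simulated counter, used only to exercise the loop: every read of
   either half advances it by one tick, so a carry into the upper half
   can happen between two reads.  */
static uint64_t sim_tb;

static uint32_t
sim_upper (void)
{
  uint32_t v = (uint32_t) (sim_tb >> 32);
  sim_tb++;
  return v;
}

static uint32_t
sim_lower (void)
{
  uint32_t v = (uint32_t) sim_tb;
  sim_tb++;
  return v;
}
```

Starting the simulated counter at 0xFFFFFFFF forces one retry, because the
carry lands between the first upper read and the lower read. The sketch
assembles the result explicitly from the two halves, so it does not depend
on which register of a pair holds the high word.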

Does mftb work on all supported assemblers?  The machine instruction
is phased out, but some assemblers translate it to mfspr.


+(define_insn "get_timebase_ppc64"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))]
+  "TARGET_POWERPC64"
+{
+return "mfspr %0, 268";
+})


POWER3 needs mftb.


--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c
@@ -0,0 +1,22 @@
+/* { dg-do run { target { powerpc*-*-* } } } */
+
+/* Test if __builtin_ppc_get_timebase() is compatible with the current
+   processor and if it's changing between reads.  A read failure might indicate
+   a Power ISA or binutils change.  */
+
+#include 
+
+int
+main(void)
+{
+  uint64_t t1, t2, t3;
+
+  t1 = __builtin_ppc_get_timebase ();
+  t2 = __builtin_ppc_get_timebase ();
+  t3 = __builtin_ppc_get_timebase ();
+
+  if (t1 != t2 && t1 != t3 && t2 != t3)
+return 0;
+
+  return 1;
+}


On some systems the timebase runs at a rather low frequency, say 20MHz.
This test will spuriously fail there.  Waste a million CPU cycles before
reading TB the second time?


Segher



[PATCH] PR other/54411: libiberty: objalloc_alloc integer overflows (CVE-2012-3509)

2012-08-29 Thread Florian Weimer
This patch fixes an integer overflow in libiberty, which leads to
crashes in binutils.  The long version of the objalloc_alloc macro
would have needed another conditional, so I removed that and replaced
it with a call to the actual implementation.

This has been compile-tested only.  We do not use this function in
GCC, therefore I want to commit this just to the trunk.

2012-08-29  Florian Weimer  

PR other/54411
* objalloc.h (objalloc_alloc): Always use the simple definition of
the macro.

2012-08-29  Florian Weimer  

PR other/54411
* objalloc.c (_objalloc_alloc): Add overflow check covering
alignment and CHUNK_HEADER_SIZE addition.

Index: include/objalloc.h
===
--- include/objalloc.h  (revision 190780)
+++ include/objalloc.h  (working copy)
@@ -1,5 +1,5 @@
 /* objalloc.h -- routines to allocate memory for objects
-   Copyright 1997, 2001 Free Software Foundation, Inc.
+   Copyright 1997-2012 Free Software Foundation, Inc.
Written by Ian Lance Taylor, Cygnus Solutions.
 
 This program is free software; you can redistribute it and/or modify it
@@ -71,38 +71,8 @@
 
 extern void *_objalloc_alloc (struct objalloc *, unsigned long);
 
-/* The macro version of objalloc_alloc.  We only define this if using
-   gcc, because otherwise we would have to evaluate the arguments
-   multiple times, or use a temporary field as obstack.h does.  */
-
-#if defined (__GNUC__) && defined (__STDC__) && __STDC__
-
-/* NextStep 2.0 cc is really gcc 1.93 but it defines __GNUC__ = 2 and
-   does not implement __extension__.  But that compiler doesn't define
-   __GNUC_MINOR__.  */
-#if __GNUC__ < 2 || (__NeXT__ && !__GNUC_MINOR__)
-#define __extension__
-#endif
-
-#define objalloc_alloc(o, l)   \
-  __extension__ \
-  ({ struct objalloc *__o = (o);   \
- unsigned long __len = (l); \
- if (__len == 0)   \
-   __len = 1;  \
- __len = (__len + OBJALLOC_ALIGN - 1) &~ (OBJALLOC_ALIGN - 1); \
- (__len <= __o->current_space  \
-  ? (__o->current_ptr += __len,\
-__o->current_space -= __len,   \
-(void *) (__o->current_ptr - __len))   \
-  : _objalloc_alloc (__o, __len)); })
-
-#else /* ! __GNUC__ */
-
 #define objalloc_alloc(o, l) _objalloc_alloc ((o), (l))
 
-#endif /* ! __GNUC__ */
-
 /* Free an entire objalloc structure.  */
 
 extern void objalloc_free (struct objalloc *);
Index: libiberty/objalloc.c
===
--- libiberty/objalloc.c(revision 190780)
+++ libiberty/objalloc.c(working copy)
@@ -1,5 +1,5 @@
 /* objalloc.c -- routines to allocate memory for objects
-   Copyright 1997 Free Software Foundation, Inc.
+   Copyright 1997-2012 Free Software Foundation, Inc.
Written by Ian Lance Taylor, Cygnus Solutions.
 
 This program is free software; you can redistribute it and/or modify it
@@ -112,8 +112,9 @@
 /* Allocate space from an objalloc structure.  */
 
 PTR
-_objalloc_alloc (struct objalloc *o, unsigned long len)
+_objalloc_alloc (struct objalloc *o, unsigned long original_len)
 {
+  unsigned long len = original_len;
   /* We avoid confusion from zero sized objects by always allocating
  at least 1 byte.  */
   if (len == 0)
@@ -121,6 +122,11 @@
 
   len = (len + OBJALLOC_ALIGN - 1) &~ (OBJALLOC_ALIGN - 1);
 
+  /* Check for overflow in the alignment operator above and the malloc
+ argument below. */
+  if (len + CHUNK_HEADER_SIZE < original_len)
+return NULL;
+
   if (len <= o->current_space)
 {
   o->current_ptr += len;
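
The overflow check added to _objalloc_alloc above can be tried in isolation
with a sketch like the following. SKETCH_ALIGN and SKETCH_HEADER_SIZE are
made-up stand-ins for OBJALLOC_ALIGN and CHUNK_HEADER_SIZE (the real values
are computed in objalloc.c), and padded_len is a hypothetical helper, not a
libiberty function.

```c
#include <limits.h>

/* Hypothetical stand-ins for OBJALLOC_ALIGN and CHUNK_HEADER_SIZE.  */
#define SKETCH_ALIGN 8UL
#define SKETCH_HEADER_SIZE 16UL

/* Round ORIGINAL_LEN up to the alignment and store the result in *OUT.
   Return 0 if either the rounding or the later addition of the chunk
   header would wrap around, 1 otherwise.  */
static int
padded_len (unsigned long original_len, unsigned long *out)
{
  unsigned long len = original_len;

  /* Always allocate at least one byte, as the original code does.  */
  if (len == 0)
    len = 1;

  len = (len + SKETCH_ALIGN - 1) & ~(SKETCH_ALIGN - 1);

  /* After a wrap-around the sum is smaller than the request.  */
  if (len + SKETCH_HEADER_SIZE < original_len)
    return 0;

  *out = len;
  return 1;
}
```

A request of ULONG_MAX wraps in the alignment step, and a request of
ULONG_MAX - 20 survives the alignment but wraps once the header size is
added; both are rejected by the single comparison.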


Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Andrew Pinski
On Wed, Aug 29, 2012 at 6:56 AM, Tulio Magno Quites Machado Filho
 wrote:
> Add __builtin_ppc_get_timebase to read the time base register on PowerPC.
> This is required for applications that measure time at high frequencies
> with high precision that can't afford a syscall.
>
> [gcc]
> 2012-08-29 Tulio Magno Quites Machado Filho 
>
> * config/rs6000/rs6000-builtin.def: Add __builtin_ppc_get_timebase.
> * config/rs6000/rs6000.c (rs6000_expand_noop_builtin): New
> function to expand an expression that calls a builtin without
> arguments.
> (rs6000_expand_builtin): Add __builtin_ppc_get_timebase.
> (rs6000_init_builtins): Likewise.
> * config/rs6000/rs6000.md: Likewise.
>
> [gcc/testsuite]
> 2012-08-29 Tulio Magno Quites Machado Filho 
>
> * gcc.target/powerpc/ppc-get-timebase.c: New file.
> ---
>  gcc/config/rs6000/rs6000-builtin.def   |3 ++
>  gcc/config/rs6000/rs6000.c |   31 +
>  gcc/config/rs6000/rs6000.md|   36 
> 
>  .../gcc.target/powerpc/ppc-get-timebase.c  |   22 
>  4 files changed, 92 insertions(+), 0 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c
>
> diff --git a/gcc/config/rs6000/rs6000-builtin.def 
> b/gcc/config/rs6000/rs6000-builtin.def
> index c8f8f86..75ad184 100644
> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -1429,6 +1429,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, "__builtin_rsqrt", 
> RS6000_BTM_FRSQRTE,
>  BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, "__builtin_rsqrtf", RS6000_BTM_FRSQRTES,
>   RS6000_BTC_FP)
>
> +BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase",
> +RS6000_BTM_POWERPC, RS6000_BTC_MISC)
> +
>  /* Darwin CfString builtin.  */
>  BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", 
> RS6000_BTM_ALWAYS,
>   RS6000_BTC_MISC)
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 6c58307..24e274d 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -9747,6 +9747,30 @@ rs6000_overloaded_builtin_p (enum rs6000_builtins 
> fncode)
>return (rs6000_builtin_info[(int)fncode].attr & RS6000_BTC_OVERLOADED) != 
> 0;
>  }
>
> +/* Expand an expression EXP that calls a builtin without arguments.  */
> +static rtx
> +rs6000_expand_noop_builtin (enum insn_code icode, rtx target)
> +{
> +  rtx pat;
> +  enum machine_mode tmode = insn_data[icode].operand[0].mode;
> +
> +  if (icode == CODE_FOR_nothing)
> +/* Builtin not supported on this processor.  */
> +return 0;
> +
> +  if (target == 0
> +  || GET_MODE (target) != tmode
> +  || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
> +target = gen_reg_rtx (tmode);
> +
> +  pat = GEN_FCN (icode) (target);
> +  if (! pat)
> +return 0;
> +  emit_insn (pat);
> +
> +  return target;
> +}
> +
>
>  static rtx
>  rs6000_expand_unop_builtin (enum insn_code icode, tree exp, rtx target)
> @@ -11336,6 +11360,9 @@ rs6000_expand_builtin (tree exp, rtx target, rtx 
> subtarget ATTRIBUTE_UNUSED,
>? CODE_FOR_bpermd_di
>: CODE_FOR_bpermd_si), exp, 
> target);
>
> +case RS6000_BUILTIN_GET_TB:
> +  return rs6000_expand_noop_builtin (CODE_FOR_get_timebase, target);
> +
>  case ALTIVEC_BUILTIN_MASK_FOR_LOAD:
>  case ALTIVEC_BUILTIN_MASK_FOR_STORE:
>{
> @@ -11620,6 +11647,10 @@ rs6000_init_builtins (void)
>  POWER7_BUILTIN_BPERMD, "__builtin_bpermd");
>def_builtin ("__builtin_bpermd", ftype, POWER7_BUILTIN_BPERMD);
>
> +  ftype = build_function_type_list (unsigned_intDI_type_node,
> +   NULL_TREE);
> +  def_builtin ("__builtin_ppc_get_timebase", ftype, RS6000_BUILTIN_GET_TB);
> +
>  #if TARGET_XCOFF
>/* AIX libm provides clog as __clog.  */
>if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE)
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index d5ffd81..09bdd80 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -136,6 +136,7 @@
> UNSPECV_PROBE_STACK_RANGE   ; probe range of stack addresses
> UNSPECV_EH_RR   ; eh_reg_restore
> UNSPECV_ISYNC   ; isync instruction
> +   UNSPECV_GETTB   ; get timebase built-in
>])
>
>
> @@ -14101,6 +14102,41 @@
>""
>"")
>
> +(define_expand "get_timebase"
> +  [(use (match_operand:DI 0 "gpc_reg_operand" ""))]
> +  ""
> +  "
> +{
> +  if (TARGET_POWERPC64)
> +emit_insn (gen_get_timebase_ppc64 (operands[0]));
> +  else if (TARGET_POWERPC)
> +emit_insn (gen_get_timebase_ppc32 (operands[0]));
> +  else
> +FAIL;
> +  DONE;
> +}")
> +
> +(define_insn "get_timebase_ppc32"
> +  [(set (match_operand:DI 0 "gp

Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Hans-Peter Nilsson
On Wed, 29 Aug 2012, Segher Boessenkool wrote:
> > +++ b/gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c
> > @@ -0,0 +1,22 @@
> > +/* { dg-do run { target { powerpc*-*-* } } } */
> > +
> > +/* Test if __builtin_ppc_get_timebase() is compatible with the current
> > +   processor and if it's changing between reads.  A read failure might
> > indicate
> > +   a Power ISA or binutils change.  */
> > +
> > +#include 
> > +
> > +int
> > +main(void)
> > +{
> > +  uint64_t t1, t2, t3;
> > +
> > +  t1 = __builtin_ppc_get_timebase ();
> > +  t2 = __builtin_ppc_get_timebase ();
> > +  t3 = __builtin_ppc_get_timebase ();
> > +
> > +  if (t1 != t2 && t1 != t3 && t2 != t3)
> > +return 0;
> > +
> > +  return 1;
> > +}
>
> On some systems the timebase runs at a rather low frequency, say 20MHz.
> This test will spuriously fail there.  Waste a million CPU cycles before
> reading TB the second time?

Waste said million cycles portably by calling sched_yield()?
(Available only on POSIX systems. :)

brgds, H-P



Re: Loop iterations inline hint

2012-08-29 Thread Martin Jambor
Hi,

On Tue, Aug 21, 2012 at 08:55:02AM +0200, Jan Hubicka wrote:
> 
> Hi,
> this patch adds a hint that if inlining makes bounds on loop iterations known,
> it is probably a good idea.  This is primarily targeting Fortran's array
> descriptors, but should be generally useful.
> 
> Fortran will still need a bit more work. Often we disregard inlining because we
> think the call is cold (because it comes from Main) so the inlining heuristic will
> need more updating and apparently we will also need to update for PHI
> conditionals as done in Martin's patch 3/3.

My patch helps only a bit, for example on the pr48636.f90 testcase it
still does not help to discover a loop bound hint because the patch,
being overly simple, looks at edge->aux predicates to construct
predicates of phi results constantness and those are computed before
we start populating nonconstant_names.  So phi nodes for any
conditions based on expressions (as opposed to direct parameter
values) are not considered.  I'll see how far I can get by
re-evaluating the condition instead (but eventually we will probably
want to do the full propagation, though perhaps not in 4.8).

> At the moment the hint is interpreted the same way as the indirect_call hint from
> the previous patch.
> 
> Martin: I think ipa-cp should also make use of this hint. Resolving
> number of loop iterations is important enough reason to specialize
> in many cases.  I think it already has logic for devirtualization
> but perhaps it should be made more aggressive? I was sort of
> surprised that for Mozilla the inlining hint makes us catch 20
> times more cases than before. Most of the cases sound like good
> ipa-cp candidates.

Interesting, I can experiment with that, sure.  On the other hand, I'd
be careful about any measurements taken after August 13 because of PR
54394 which might have caused some edges to be considered much cooler
than they are.  I'll post a patch for it in a minute.

> 
> Also can you please try to finally make param notes be used by the virtual
> clones machinery and thus make it possible for ipa-cp to specialize for known
> aggregate parameters? This should make a lot of difference for Fortran, I think.

Yeah, that's the next big item on my list to do after I finish all the
little ones (like the PHIs, for example), but hopefully, I'll have
that done soon.

Martin


Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Michael Meissner
On Wed, Aug 29, 2012 at 01:56:05PM -0400, Hans-Peter Nilsson wrote:
> On Wed, 29 Aug 2012, Segher Boessenkool wrote:
> > On some systems the timebase runs at a rather low frequency, say 20MHz.
> > This test will spuriously fail there.  Waste a million CPU cycles before
> > reading TB the second time?
> 
> Waste said million cycles portably by calling sched_yield()?
> (Available only on POSIX systems. :)

Well, only for a test environment.  You don't want to call sched_yield in the
normal case, since the apps that do this many millions of times need this to be
as fast as possible.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899



[PATCH, PR 54394] Compute loops when generating inline summaries

2012-08-29 Thread Martin Jambor
Hi,

the patch below fixes PR 54394.  The problem is that since revision
190346 we depend on bb->loop_father being non-NULL to get loop_depth.
However, with loops not computed, the loop_father is NULL, loop_depth
is thus considered zero and call graph edges out of such BB can be
considered much cooler, leading to inlining regressions.
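
The failure mode described above can be illustrated with a small
self-contained sketch. The structures below are simplified stand-ins, not
GCC's real loop and basic_block types, and bb_loop_depth_sketch is a
hypothetical name; the only assumption carried over from the description is
that a NULL loop_father makes the depth lookup fall back to zero.

```c
#include <stddef.h>

/* Simplified stand-ins for GCC's loop and basic-block structures.  */
struct loop_sketch
{
  int depth;
};

struct bb_sketch
{
  struct loop_sketch *loop_father;  /* NULL when loops are not computed.  */
};

/* Depth lookup that silently reports zero when loop information is
   missing, which is what makes edges out of such a block look cooler
   than they are.  */
static int
bb_loop_depth_sketch (const struct bb_sketch *bb)
{
  return bb->loop_father ? bb->loop_father->depth : 0;
}
```

A block nested two loops deep reports depth 2 when loop_father is set and
depth 0 when it is not, with no error either way.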

This patch fixes that by recomputing loops whenever optimizing, not
only for loop bounds hints.  We might put the computation elsewhere or
do it only under more restrictive circumstances, but I believe that
after rev. 190346 we have to do it.  In particular, I am not sure
whether we had (semi)correct loop_depths when doing early inlining or
not; this patch re-calculates them for the early inliner too.

Bootstrapped and tested on x86_64-linux; fixes the fatigue run-time on
x86_64-linux and i686-linux for me.  What do you think?

Thanks,

Martin


2012-08-29  Martin Jambor  

PR middle-end/54394
* ipa-inline-analysis.c (estimate_function_body_sizes): Compute
dominance info and loops whenever optimizing.


Index: src/gcc/ipa-inline-analysis.c
===
--- src.orig/gcc/ipa-inline-analysis.c
+++ src/gcc/ipa-inline-analysis.c
@@ -2102,6 +2102,11 @@ estimate_function_body_sizes (struct cgr
   info->conds = 0;
   info->entry = 0;
 
+  if (optimize)
+{
+  calculate_dominance_info (CDI_DOMINATORS);
+  loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
+}
 
   if (dump_file)
 fprintf (dump_file, "\nAnalyzing function body size: %s\n",
@@ -2270,9 +2275,6 @@ estimate_function_body_sizes (struct cgr
   loop_iterator li;
   predicate loop_iterations = true_predicate ();
 
-  calculate_dominance_info (CDI_DOMINATORS);
-  loop_optimizer_init (LOOPS_NORMAL
-  | LOOPS_HAVE_RECORDED_EXITS);
   if (dump_file && (dump_flags & TDF_DETAILS))
flow_loops_dump (dump_file, NULL, 0);
   scev_initialize ();
@@ -2305,12 +2307,15 @@ estimate_function_body_sizes (struct cgr
   *inline_summary (node)->loop_iterations = loop_iterations;
}
   scev_finalize ();
-  loop_optimizer_finalize ();
-  free_dominance_info (CDI_DOMINATORS);
 }
   inline_summary (node)->self_time = time;
   inline_summary (node)->self_size = size;
   VEC_free (predicate_t, heap, nonconstant_names);
+  if (optimize)
+{
+  loop_optimizer_finalize ();
+  free_dominance_info (CDI_DOMINATORS);
+}
   if (dump_file)
 {
   fprintf (dump_file, "\n");



Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Tulio Magno Quites Machado Filho


Hi Segher,

Segher Boessenkool  writes:
Add __builtin_ppc_get_timebase to read the time base register on PowerPC.
This is required for applications that measure time at high frequencies
with high precision that can't afford a syscall.


For things that do mftb with high frequency, maybe you should also add a
builtin that does just an mftb, i.e. returns a 32-bit result on 32-bit
implementations.


Are you thinking of a function that returns only the TBL?
I don't think such a builtin would make sense on a 64-bit environment, right?

Do you have a suggestion for its name?


Please add documentation for the new builtin(s).


Sure!


--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1429,6 +1429,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, "__builtin_rsqrt", RS6000_BTM_FRSQRTE,
 BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, "__builtin_rsqrtf", RS6000_BTM_FRSQRTES,
   RS6000_BTC_FP)

+BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase",
+RS6000_BTM_POWERPC, RS6000_BTC_MISC)


RS6000_BTM_POWERPC does not exist anymore.  RS6000_BTM_ALWAYS?


I'm replacing it.

+/* Expand an expression EXP that calls a builtin without arguments.  */
+static rtx
+rs6000_expand_noop_builtin (enum insn_code icode, rtx target)


"noop" gives the wrong idea, "zeroop" perhaps?


zeroop is much better.




+(define_expand "get_timebase"


You should probably prefix this with powerpc_ or rs6000_ as well.
The existing code is not very consistent in this.


OK.


+  [(use (match_operand:DI 0 "gpc_reg_operand" ""))]
+  ""
+  "
+{
+  if (TARGET_POWERPC64)
+emit_insn (gen_get_timebase_ppc64 (operands[0]));
+  else if (TARGET_POWERPC)
+emit_insn (gen_get_timebase_ppc32 (operands[0]));
+  else
+FAIL;
+  DONE;
+}")


TARGET_POWERPC is always true.


OK.


+(define_insn "get_timebase_ppc32"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))
+   (clobber (match_scratch:SI 1 "=r"))]
+  "TARGET_POWERPC && !TARGET_POWERPC64"
+{
+return "mftbu %0\;"
+  "mftb %L0\;"
+  "mftbu %1\;"
+  "cmpw %0,%1\;"
+  "bne- $-16";
+})


This only works for WORDS_BIG_ENDIAN.


Yes.

You should say you clobber CR0 here I think; actually, allow any CRn
instead.


Yes.

Does mftb work on all supported assemblers?  The machine instruction
is phased out, but some assemblers translate it to mfspr.


According to the Power ISA 2.06 they should translate it to mfspr.


+(define_insn "get_timebase_ppc64"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))]
+  "TARGET_POWERPC64"
+{
+return "mfspr %0, 268";
+})


POWER3 needs mftb.


Nice catch!


+int
+main(void)
+{
+  uint64_t t1, t2, t3;
+
+  t1 = __builtin_ppc_get_timebase ();
+  t2 = __builtin_ppc_get_timebase ();
+  t3 = __builtin_ppc_get_timebase ();
+
+  if (t1 != t2 && t1 != t3 && t2 != t3)
+return 0;
+
+  return 1;
+}


On some systems the timebase runs at a rather low frequency, say 20MHz.
This test will spuriously fail there.  Waste a million CPU cycles before
reading TB the second time?


Yes.

Thank you,

--
Tulio Magno



Re: faster random number engine

2012-08-29 Thread Ulrich Drepper
On Wed, Aug 29, 2012 at 11:43 AM, Paolo Carlini  wrote:
> The substance isn't of course. But normally we don't have __gnu_cxx things
> in the same std header. Can't we have a new ext/random and put it in there?
> If we can separate the new code to it, I think people would not even object
> to the target dependency, etc. In ext/ we are quite free to do extension /
> experimental work.

OK, I moved the definition to ext.  Will check in the result.


Re: [PATCH] MIPS16 TLS support for GCC

2012-08-29 Thread Richard Sandiford
Chung-Lin Tang  writes:
> On 2012/7/6 02:23 PM, Richard Sandiford wrote:
>> Richard Sandiford  writes:
>>>> (3) Also related to libraries, I edited CRT_CALL_STATIC_FUNCTION to emit
>>>> a 32-bit code sequence under both MIPS/MIPS16 mode (under O32).
>>>>
>>>> As you can see in the original Feb. patch, I had changes to emit a
>>>> MIPS16 version of these static calls, but with the changes in (2) above,
>>>> they will not work with the usual situation of a 32-bit MIPS built /lib
>>>> (.init/.fini will have 32/16-bit code improperly concatenated).
>>>>
>>>> The CodeSourcery builds use an independent mips16 sysroot for this, so a
>>>> MIPS16 CRT_CALL_STATIC_FUNCTION works there. For the usual case, I think
>>>> making it 32-bit is the compatible choice.
>>>
>>> Yeah, I agree that sounds like the right call.  Please do the same
>>> for the n32/n64 version (i.e. explicitly make it nomips16 rather
>>> than add the #error).
>> 
>> BTW, doing this has removed my main concern about having dead code.
>> The original patch had a separate MIPS16 implementation that (as things
>> stood) could never be used by stock sources.  That would make it difficult
>> to maintain.
>> 
>> Now that the MIPS16 library support is purely adding nomips16 attributes
>> to code that is obviously nomips16, those parts are OK on their own, thanks.
>> (I.e. the mips.h change, the libgcc change, and the libgomp change.)
>> Feel free to drop the multilib thing if you don't want to implement
>> --with-multilib-list.
>
> Hi Richard, just FYI, I just committed the said approved parts.
> gcc/config/mips/t-linux64 had one additional change, adding
> ../lib/mips16 to the corresponding MULTILIB_OSDIRNAMES, or else we end
> with a weird option-named directory for the mips16 libraries.

Sorry, but the t-linux64 stuff wasn't approved.  It was just the mips.h
change, the libgcc change and the libgomp change.

Please revert the patch to t-linux64.  My original objection to adding
mips16 unconditionally still stands: it isn't correct for people who
configure for processors that don't have the MIPS16 ASE (such as Octeon).

Thanks,
Richard


Re: remove dependency on cp/parser.h from cp/lang.c

2012-08-29 Thread Marc Glisse

On Wed, 29 Aug 2012, Aaron Gray wrote:


Just got my copyright assignment through, so here's my first GCC patch,


Welcome!


This is a one liner removing the unneeded dependency of cp-lang.c on
cp/parser.h. This has been tested on Linux.


I think you need to attach a ChangeLog entry with every patch.

Also, dependencies are often repeated in makefiles. Is there anything to 
update with your patch? (maybe not, just asking)


--
Marc Glisse


Re: [patch] Fix problems with -fdebug-types-section and local types

2012-08-29 Thread Cary Coutant
> Ping.
>
> http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00398.html

Because much of this patch was superseded by this recent patch:

   http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01968.html

I'll combine the two and submit a new patch.

-cary


Re: [MIPS, committed] Add missing COSTS_N_INSNS call.

2012-08-29 Thread Richard Sandiford
Richard Sandiford  writes:
> Hans-Peter Nilsson  writes:
>> On Tue, 28 Aug 2012, Richard Sandiford wrote:
>>> Hans-Peter Nilsson  writes:
>>> > On Sun, 26 Aug 2012, Richard Sandiford wrote:
>>> >> I'm preparing a patch to turn gcc.target/mips into a torture-like
>>> >> testsuite.
>>> >
>>> > While on the subject of gcc.target/mips and its extensions, it
>>> > also doesn't handle a build configured with --with-synci=yes.
>>> > (Well, not on the 4.7 branch at least.)
>>>
>>> What goes wrong?
>>
>> I don't remember details, but IIRC some synci-related tests go
>> wrong for mipsisa32r2el-linux-gnu due to -msynci being the
>> default.  Don't worry, I've fixed it in the local import. :)
>> I though the above would entice you to try it, but I guess I
>> need to report better for that to happen.  Maybe later.
>
> Trying it now.  I suspect it was the problem that Steve hit:
> the implicit -msynci is still (deliberately) kept when a lower
> architecture is selected.
>
> I'm testing a patch to make the testsuite work out the default
> -m{no,}synci, which ought to be enough.  The usual rules should
> then kick in and force -mno-synci where necessary.  Hopefully.

Here's the patch.  Tested on mipsisa64r2-elf, where mips.exp
comes out clean.  I looked at the logs to make sure that -mno-synci
was being passed for lower architectures but that no explicit
-msynci or -mno-synci option was used when testing mips64r2.
Applied.

Richard


gcc/
* config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): Define __mips_synci
if TARGET_SYNCI.

gcc/testsuite/
* gcc.target/mips/mips.exp: Work out default -msynci setting.

Index: gcc/config/mips/mips.h
===
--- gcc/config/mips/mips.h  2012-08-29 19:40:47.0 +0100
+++ gcc/config/mips/mips.h  2012-08-29 19:50:50.144982449 +0100
@@ -517,6 +517,9 @@ #define TARGET_CPU_CPP_BUILTINS() \
   if (TARGET_OCTEON) \
 builtin_define ("__OCTEON__"); \
 \
+  if (TARGET_SYNCI) \
+   builtin_define ("__mips_synci"); \
+ \
   /* Macros dependent on the C dialect.  */ \
   if (preprocessing_asm_p ()) \
 { \
Index: gcc/testsuite/gcc.target/mips/mips.exp
===
--- gcc/testsuite/gcc.target/mips/mips.exp  2012-08-27 17:27:13.0 +0100
+++ gcc/testsuite/gcc.target/mips/mips.exp  2012-08-29 19:50:50.141982450 +0100
@@ -767,6 +767,12 @@ proc mips-dg-init {} {
"-mno-smartmips",
#endif
 
+   #ifdef __mips_synci
+   "-msynci",
+   #else
+   "-mno-synci",
+   #endif
+
0
};
 }]


[patch] Fix problems with -fdebug-types-section

2012-08-29 Thread Cary Coutant
I've combined these two pending patches into one:

   http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00398.html
   http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01968.html

The first patch fixed a problem with copying too much of a referenced
type into a type unit, by changing clone_tree_hash() to copy subprograms
as declarations.  In the second patch, I found that clone_tree_hash()
was still copying too much, and determined that it shouldn't be called
at all.

Notes from the first patch:

With --std=c++11, a template parameter can refer to a local type defined
within a function.  Because that local type doesn't qualify for its own
type unit, we copy it as an "unworthy" type into the type unit that refers
to it, but we copy too much, leading to a comdat type unit that contains a
DIE with subprogram definitions rather than declarations.  These DIEs may
have DW_AT_low_pc/high_pc or DW_AT_ranges attributes, and consequently can
refer to range list entries that don't get emitted because they're not
marked when the compile unit is scanned, eventually causing an undefined
symbol at link time.

In addition, while debugging this problem, I found that the
DW_AT_object_pointer attribute, when left in the skeletons that are left
behind in the compile unit, causes duplicate copies of the types to be
copied back into the compile unit.

This patch fixes these problems by removing the DW_AT_object_pointer
attribute from the skeleton left behind in the compile unit, and by
copying DW_TAG_subprogram DIEs as declarations when copying "unworthy"
types into a type unit.  In order to preserve information in the DIE
structure, I also added DW_AT_abstract_origin as an attribute that
should be copied when cloning a DIE as a declaration.

I also fixed the dwarf4-typedef.C test, which should be turning on
the -fdebug-types-section flag.

Notes from the second patch:

When a class template instantiation is moved into a separate type unit,
it can bring along a lot of other referenced types into the type unit,
especially if the template is derived from another (large) type that
does not actually have a type definition in a type unit of its
own. When there are many instantiations of the same template, we get
a lot of duplication, and in the worst case (a template with several
parameters, instantiated multiple times along each dimension), GCC
can end up taking a long time and exhausting available memory.

This combinatorial explosion is being caused by copy_decls_walk, where
it finds a type DIE that is referenced by the type unit, but is not
itself a type unit, and copies a declaration for that type into the
type unit in order to resolve the reference within the type unit.
In the process, copy_decls_walk also copies all of the children of
that DIE. In the case of a base class with member function templates,
every one of the instantiated member functions is copied into every
type unit that references the base class.

I don't believe that it's necessary to copy the children of the class
declaration at all, and this patch simply removes the code that copies
those children. If there's a reference in the type unit to one of the
children of that class, that one child will get copied in as needed.

Bootstraps and passes regression tests. Also tested with a large
internal test case that previously resulted in out-of-memory during
compilation.

OK for trunk?


2012-08-29   Cary Coutant  

gcc/
* dwarf2out.c (clone_as_declaration): Copy DW_AT_abstract_origin
attribute.
(generate_skeleton_bottom_up): Remove DW_AT_object_pointer attribute
from original DIE.
(clone_tree_hash): Remove.
(copy_decls_walk): Don't copy children of a declaration into a
type unit.

gcc/testsuite/
* testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C: New test case.
* testsuite/g++.dg/debug/dwarf2/dwarf4-typedef.C: Add
-fdebug-types-section flag.


Index: gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C
===
--- gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C   (revision 0)
+++ gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C   (revision 0)
@@ -0,0 +1,55 @@
+// { dg-do compile }
+// { dg-options "--std=c++11 -dA -gdwarf-4 -fdebug-types-section -fno-merge-debug-strings" }
+
+// Check that -fdebug-types-sections does not copy a full referenced type
+// into a type unit.
+
+// Checks that at least one type unit is generated.
+//
+// { dg-final { scan-assembler "DIE \\(\[^\n\]*\\) DW_TAG_type_unit" } }
+//
+// Check that func is declared exactly twice in the debug info:
+// once in the type unit for struct D, and once in the compile unit.
+//
+// { dg-final { scan-assembler-times "\\.ascii \"func0\"\[^\n\]*DW_AT_name" 2 } }
+//
+// Check to make sure that no type unit contains a DIE with DW_AT_low_pc
+// or DW_AT_ranges.  These patterns assume that the compile unit is always
+// emitted after all type units.
+//
+// { dg-final {

Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Segher Boessenkool
On some systems the timebase runs at a rather low frequency, say 20MHz.
This test will spuriously fail there.  Waste a million CPU cycles before
reading TB the second time?


Waste said million cycles portably by calling sched_yield()?
(Available only on POSIX systems. :)


I was thinking more along the lines of

int j;
for (j = 0; j < 100; j++)
asm("" : : "r"(j));

which is more portable (and a lot more predictable).
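
Put together with the test from the patch, the idea could look roughly like
the sketch below. It is not the committed test: timebase_fn is a
hypothetical stand-in for __builtin_ppc_get_timebase so the logic can be
exercised on any target, the two simulated readers exist only for that
purpose, and the iteration count is copied unchanged from the loop above.
The empty asm requires GCC or a compatible compiler.

```c
#include <stdint.h>

/* Hypothetical stand-in for __builtin_ppc_get_timebase.  */
typedef uint64_t (*timebase_fn) (void);

/* Burn some cycles; the empty asm consumes J so the loop is not
   optimized away.  */
static void
waste_cycles (void)
{
  int j;

  for (j = 0; j < 100; j++)
    __asm__ ("" : : "r" (j));
}

/* Return 1 if the time base moved between two reads separated by the
   delay loop, 0 otherwise.  */
static int
timebase_advances (timebase_fn get_tb)
{
  uint64_t t1, t2;

  t1 = get_tb ();
  waste_cycles ();
  t2 = get_tb ();

  return t1 != t2;
}

/* Simulated readers, used only to exercise the check.  */
static uint64_t sim_ticks;

static uint64_t
counting_tb (void)
{
  return sim_ticks++;
}

static uint64_t
stuck_tb (void)
{
  return 42;
}
```

On a PowerPC target the callback would simply wrap the builtin; whether the
delay is long enough for a slow time base depends on the iteration count
and the core clock.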


Segher



Re: [google/gcc-4_7, trunk] Fix problem with -fdebug-types-section and template instantiations, take 2

2012-08-29 Thread Cary Coutant
> This patch is for trunk and the google/gcc-4_7 branch.
>
> 2012-08-28   Cary Coutant  
>
> * gcc/dwarf2out.c (clone_tree_partial): Remove.
> (copy_decls_walk): Don't copy children of a declaration
> into a type unit.

For trunk, I've submitted a new patch that combines this one with a
previous pending patch.

Still looking for an approval for google/gcc-4_7 branch...

-cary


Re: [google/gcc-4_7, trunk] Fix problem with -fdebug-types-section and template instantiations, take 2

2012-08-29 Thread Sterling Augustine
On Wed, Aug 29, 2012 at 12:03 PM, Cary Coutant  wrote:
>> This patch is for trunk and the google/gcc-4_7 branch.
>>
>> 2012-08-28   Cary Coutant  
>>
>> * gcc/dwarf2out.c (clone_tree_partial): Remove.
>> (copy_decls_walk): Don't copy children of a declaration
>> into a type unit.
>
> For trunk, I've submitted a new patch that combines this one with a
> previous pending patch.
>
> Still looking for an approval for google/gcc-4_7 branch...
>
> -cary

This is OK for google 4.7


Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Segher Boessenkool
For things that do mftb with high frequency, maybe you should also add a
builtin that does just an mftb, i.e. returns a 32-bit result on 32-bit
implementations.


Are you thinking of a function that returns only the TBL?


On 32-bit, just TBL; on 64-bit, the whole TB (there is no machine
instruction to read just TBL on 64-bit, so it doesn't make much
sense to have it return a 32-bit number).


+(define_insn "get_timebase_ppc32"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))
+   (clobber (match_scratch:SI 1 "=r"))]
+  "TARGET_POWERPC && !TARGET_POWERPC64"
+{
+return "mftbu %0\;"
+  "mftb %L0\;"
+  "mftbu %1\;"
+  "cmpw %0,%1\;"
+  "bne- $-16";
+})
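
For illustration only (not part of the patch): the mftbu/mftb/mftbu sequence above is the usual way to read a 64-bit counter through two 32-bit halves. A minimal C sketch of that retry loop follows, assuming hypothetical accessor callbacks in place of the real instructions; `read_timebase64`, `read_half_fn` and the `fake_tb` simulation are made-up names.

```c
#include <stdint.h>

/* Hypothetical accessor type standing in for mftbu / mftb.  */
typedef uint32_t (*read_half_fn) (void *ctx);

/* Read upper, lower, then upper again.  If the upper half changed,
   the lower half wrapped between the reads, so try again.  */
static uint64_t
read_timebase64 (read_half_fn read_upper, read_half_fn read_lower, void *ctx)
{
  uint32_t hi, lo, hi2;

  do
    {
      hi = read_upper (ctx);
      lo = read_lower (ctx);
      hi2 = read_upper (ctx);
    }
  while (hi != hi2);

  return ((uint64_t) hi << 32) | lo;
}

/* Simulated counter (an assumption for this sketch) that advances by
   STEP on every half read, so a carry can happen between reads.  */
struct fake_tb
{
  uint64_t value;
  uint64_t step;
};

static uint32_t
fake_upper (void *ctx)
{
  struct fake_tb *tb = ctx;
  uint32_t half = (uint32_t) (tb->value >> 32);
  tb->value += tb->step;
  return half;
}

static uint32_t
fake_lower (void *ctx)
{
  struct fake_tb *tb = ctx;
  uint32_t half = (uint32_t) tb->value;
  tb->value += tb->step;
  return half;
}
```

Starting the simulated counter just below a carry forces one retry, which is the case the cmpw/bne- pair in the pattern guards against.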


This only works for WORDS_BIG_ENDIAN.


Yes.


Do you mean you are fixing it?  :-)


Does mftb work on all supported assemblers?  The machine instruction
is phased out, but some assemblers translate it to mfspr.


According to the Power ISA 2.06 they should translate it to mfspr.


Yes, I realised that later.

But then a binary compiled with an assembler that emits mfspr for mftb
will not run on POWER3 or 601.  I don't know what to do about that;
maybe just document it.


Segher



Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Hans-Peter Nilsson
On Wed, 29 Aug 2012, Michael Meissner wrote:
> On Wed, Aug 29, 2012 at 01:56:05PM -0400, Hans-Peter Nilsson wrote:
> > On Wed, 29 Aug 2012, Segher Boessenkool wrote:
> > > On some systems the timebase runs at a rather low frequency, say 20MHz.
> > > This test will spuriously fail there.  Waste a million CPU cycles before
> > > reading TB the second time?
> >
> > Waste said million cycles portably by calling sched_yield()?
> > (Available only on POSIX systems. :)
>
> Well only for a test environment.  You don't want to call sched_yield in the
> normal case, since the apps that do this many millions of times need this to
> be as fast as possible.

Surely, but IMHO what goes for the normal case is not a valid
reading of "waste"..."millions of cycles". ;)

Point being, for simulator environments, you may not want the
loop that was suggested later.  On the other hand, that might
not be an observable period, either.

brgds, H-P


Re: [MIPS, committed] Add missing COSTS_N_INSNS call.

2012-08-29 Thread Hans-Peter Nilsson
On Wed, 29 Aug 2012, Richard Sandiford wrote:
> Richard Sandiford  writes:
> > I'm testing a patch to make the testsuite work out the default
> > -m{no,}synci, which ought to be enough.  The usual rules should
> > then kick in and force -mno-synci where necessary.  Hopefully.
>
> Here's the patch.

> Index: gcc/testsuite/gcc.target/mips/mips.exp
> ===
> --- gcc/testsuite/gcc.target/mips/mips.exp	2012-08-27 17:27:13.0 +0100
> +++ gcc/testsuite/gcc.target/mips/mips.exp	2012-08-29 19:50:50.141982450 +0100
> @@ -767,6 +767,12 @@ proc mips-dg-init {} {
>   "-mno-smartmips",
>   #endif
>
> + #ifdef __mips_synci

JFTR, I came up with something very similar locally, but without
new builtin defines and with the invalid assumption of
configuring with --with-synci=yes, hence "#if (__mips == 32 ||
__mips == 64) && __mips_isa_rev == 2 && !defined(__mips16)"

brgds, H-P


[wwwdocs] SH 4.8 changes update

2012-08-29 Thread Oleg Endo
Hello,

The new SH option -menable-tas has been renamed to -mtas in rev 190782.
I have committed the attached patch to reflect this in the changes.html
for 4.8.

Cheers,
Oleg
? sh_mtas_rename.patch
Index: htdocs/gcc-4.8/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.23
diff -u -r1.23 changes.html
--- htdocs/gcc-4.8/changes.html	26 Aug 2012 21:48:50 -	1.23
+++ htdocs/gcc-4.8/changes.html	29 Aug 2012 19:21:06 -
@@ -232,9 +232,9 @@
   Minor improvements to code generated for software atomic sequences
   that are enabled by -msoft-atomic.
 
-  A new option -menable-tas will make the compiler
-  generate the tas.b instruction for the
-  __atomic_test_and_set built-in function.
+  A new option -mtas will make the compiler generate the
+  tas.b instruction for the __atomic_test_and_set
+  built-in function.
 
   The SH4A instructions movco.l and
   movli.l are now supported.  They are used to implement some
@@ -281,9 +281,9 @@
 
 The behavior of the -mieee option has been fixed and the
 negative form -mno-ieee has been added to control the IEEE
-conformance of floating point comparisons.  By default-mieee is
-now enabled and the option -ffinite-math-only implicitly sets
--mno-ieee.
+conformance of floating point comparisons.  By default -mieee
+is now enabled and the option -ffinite-math-only implicitly
+sets -mno-ieee.
 
   
 


Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Tulio Magno Quites Machado Filho

Segher Boessenkool  writes:

For things that do mftb with high frequency, maybe you should also add a
builtin that does just an mftb, i.e. returns a 32-bit result on 32-bit
implementations.


Are you thinking of a function that returns only the TBL?


On 32-bit, just TBL; on 64-bit, the whole TB (there is no machine
instruction to read just TBL on 64-bit, so it doesn't make much
sense to have it return a 32-bit number).


OK.


+(define_insn "get_timebase_ppc32"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))
+   (clobber (match_scratch:SI 1 "=r"))]
+  "TARGET_POWERPC && !TARGET_POWERPC64"
+{
+return "mftbu %0\;"
+  "mftb %L0\;"
+  "mftbu %1\;"
+  "cmpw %0,%1\;"
+  "bne- $-16";
+})


This only works for WORDS_BIG_ENDIAN.


Yes.


Do you mean you are fixing it?  :-)


Yes. At least I'll try to.  :-)

Does mftb work on all supported assemblers?  The machine instruction
is phased out, but some assemblers translate it to mfspr.


According to the Power ISA 2.06 they should translate it to mfspr.


Yes, I realised that later.

But then a binary compiled with an assembler that emits mfspr for mftb
will not run on POWER3 or 601.  I don't know what to do about that;
maybe just document it.


We can easily fix this at runtime, which isn't the case here.

Thanks again,

--
Tulio Magno



Re: [Fortran] PR37336 - FINAL patch [1/n]: Implement the finalization wrapper subroutine

2012-08-29 Thread Tobias Burnus

Dear all,

this is the revised version of the patch at
http://gcc.gnu.org/ml/fortran/2012-08/msg00095.html, taking the review
comments into account.


Reminder: This patch only generates the finalization wrapper, which is
in the virtual table. It does not add the required calls; hence, it
still does not make finalization usable.



The patch consists of three parts:

a) The main patch, which implements the wrapper.
  I am asking for approval for that patch.

b) A patch which removes the gfc_error "not yet implemented"
  I suggest removing the error only after finalization calls have been added.


c) A patch which bumps the .mod version
   - or alternatively -
   a patch which disables the _final generation in the vtable.

I have built and regtested (on x86-64-linux) the patch with (a) and
(a)+(b) applied.



I would like to include the patch (c) as modifying the vtable changes 
the ABI. Bumping the .mod version is a reliable way to force 
recompilation. The alternative is to wait until the final FINAL patch 
before bumping the .mod version (and disable the "_final" generation).


One possibility, if deemed useful, is to combine the .mod version bump 
with backward compatible reading of .mod files, i.e., only error out 
when BT_CLASS is encountered in an old .mod file.



Is the patch (a) OK for the trunk? With which version of (c)?

(I am slightly inclined to do the .mod bump now. As a follow up, one can
also commit Janus' proc-pointer patch,
http://gcc.gnu.org/ml/fortran/2012-04/msg00033.html, though I think
someone still has to review it.)


Tobias

PS: When doing the ABI change, I am going to document it in the release 
notes / wiki.
2012-08-29  Alessandro Fanfarillo  
Tobias Burnus  

	PR fortran/37336
	* gfortran.h (symbol_attribute): Add artificial.
	* module.c (mio_symbol_attribute): Handle attr.artificial
	* class.c (gfc_build_class_symbol): Defer creation of the vtab
	if the DT has finalizers, mark generated symbols as
	attr.artificial.
	(has_finalizer_component, finalize_component,
	finalization_scalarizer, generate_finalization_wrapper):
	New static functions.
	(gfc_find_derived_vtab): Add _final component and call
	generate_finalization_wrapper.
	* dump-parse-tree.c (show_f2k_derived): Use resolved
	proc_tree->n.sym rather than unresolved proc_sym.
	(show_attr): Handle attr.artificial.
	* resolve.c (gfc_resolve_finalizers): Ensure that the vtab exists.
	(resolve_fl_derived): Resolve finalizers before
	generating the vtab.
	(resolve_symbol): Also allow assumed-rank arrays with CONTIGUOUS;
	skip artificial symbols.
	(resolve_fl_derived0): Skip artificial symbols.

2012-08-29  Tobias Burnus  

	PR fortran/51632
	* gfortran.dg/coarray_class_1.f90: New.

diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index 21a91ba..9d58aab 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -34,7 +34,7 @@ along with GCC; see the file COPYING3.  If not see
  declared type of the class variable and its attributes
  (pointer/allocatable/dimension/...).
 * _vptr: A pointer to the vtable entry (see below) of the dynamic type.
-
+
For each derived type we set up a "vtable" entry, i.e. a structure with the
following fields:
 * _hash: A hash value serving as a unique identifier for this type.
@@ -42,6 +42,9 @@ along with GCC; see the file COPYING3.  If not see
 * _extends:  A pointer to the vtable entry of the parent derived type.
 * _def_init: A pointer to a default initialized variable of this type.
 * _copy: A procedure pointer to a copying procedure.
+* _final:A procedure pointer to a wrapper function, which frees
+		 allocatable components and calls FINAL subroutines.
+
After these follow procedure pointer components for the specific
type-bound procedures.  */
 
@@ -572,7 +575,9 @@ gfc_build_class_symbol (gfc_typespec *ts, symbol_attribute *attr,
   if (gfc_add_component (fclass, "_vptr", &c) == FAILURE)
 	return FAILURE;
   c->ts.type = BT_DERIVED;
-  if (delayed_vtab)
+  if (delayed_vtab
+	  || (ts->u.derived->f2k_derived
+	  && ts->u.derived->f2k_derived->finalizers))
 	c->ts.u.derived = NULL;
   else
 	{
@@ -689,6 +694,702 @@ copy_vtab_proc_comps (gfc_symbol *declared, gfc_symbol *vtype)
 }
 
 
+/* Returns true if any of its nonpointer nonallocatable components or
+   their nonpointer nonallocatable subcomponents has a finalization
+   subroutine.  */
+
+static bool
+has_finalizer_component (gfc_symbol *derived)
+{
+   gfc_component *c;
+
+  for (c = derived->components; c; c = c->next)
+{
+  if (c->ts.type == BT_DERIVED && c->ts.u.derived->f2k_derived
+	  && c->ts.u.derived->f2k_derived->finalizers)
+	return true;
+
+  if (c->ts.type == BT_DERIVED
+	  && !c->attr.pointer && !c->attr.allocatable
+	  && has_finalizer_component (c->ts.u.derived))
+	return true;
+}
+  return false;
+}
+
+
+/* Call DEALLOCATE for the passed component if it is allocatable, if i

Re: remove dependency on cp/parser.h from cp/lang.c

2012-08-29 Thread Aaron Gray
On 29 August 2012 19:47, Marc Glisse  wrote:
>
> On Wed, 29 Aug 2012, Aaron Gray wrote:
>
>> Just got my copyright assignment through, so here's my first GCC patch,
>
>
> Welcome!

Thanks Marc !

>
>
>
>> This is a one liner removing the unneeded dependency of cp-lang.c on
>> cp/parser.h. This has been tested on Linux.
>
>
> I think you need to attach a ChangeLog entry with every patch.

Okay

>
>
> Also, dependencies are often repeated in makefiles. Is there anything to 
> update with your patch? (maybe not, just asking)

Yes there is a dependency in cp/Make-lang.in, I will resubmit the patch soon.

--
Aaron


[patch] Fix CFG dumping of blocks with no predecessors or successors

2012-08-29 Thread Steven Bosscher
Will commit as obvious.

* cfg.c (dump_bb_info): Print a newline if there were no edges to dump.

Index: cfg.c
===
--- cfg.c   (revision 190785)
+++ cfg.c   (working copy)
@@ -764,6 +764,8 @@ dump_bb_info (FILE *outf, basic_block bb, int inde
  dump_edge_info (outf, e, flags, 0);
  fputc ('\n', outf);
}
+  if (first)
+   fputc ('\n', outf);
 }

   if (do_footer)
@@ -784,6 +786,8 @@
  dump_edge_info (outf, e, flags, 1);
  fputc ('\n', outf);
}
+  if (first)
+   fputc ('\n', outf);
 }
 }
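
For illustration only (not the actual cfg.c code): a minimal C sketch of the pattern the fix relies on, where a `first` flag that is still set after the loop means no edge was printed and the header line still needs its newline. The function name, the line prefixes and the return value are assumptions made for this sketch.

```c
#include <stdio.h>

/* Print the predecessor list of a block, one edge per line.  PREDS
   holds N_PREDS block indices.  Returns the number of lines written.
   All names and the output layout here are hypothetical.  */
static int
dump_preds (FILE *outf, const int *preds, int n_preds)
{
  int first = 1;
  int lines = 0;
  int i;

  fputs (";; pred:", outf);
  for (i = 0; i < n_preds; i++)
    {
      /* Continuation lines get padding instead of the header.  */
      if (!first)
        fputs (";;      ", outf);
      first = 0;
      fprintf (outf, " %d\n", preds[i]);
      lines++;
    }

  /* No edge was printed, so the header line is still unterminated.  */
  if (first)
    {
      fputc ('\n', outf);
      lines++;
    }

  return lines;
}
```

Without the final check, a block with no predecessors would leave ";; pred:" unterminated and the next dump line would run into it.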


[PATCH] Remove dependency of cp/cp-lang.c on cp/parser.h

2012-08-29 Thread Aaron Gray
Patch removing the dependency of cp/cp-lang.c on cp/parser.h. This has
been tested on Linux.

[gcc/cp]
2012-08-29 Aaron Gray 

* cp/cp-lang.c: removed #include "parser.h"
* cp/Make-lang.in: removed dependency of cp/cp-lang.c on cp/parser.h


diff --git a/gcc/cp/cp-lang.c b/gcc/cp/cp-lang.c
index da7f1e1..5ca0b0a 100644
--- a/gcc/cp/cp-lang.c
+++ b/gcc/cp/cp-lang.c
@@ -32,7 +32,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "cp-objcp-common.h"
 #include "hashtab.h"
 #include "target.h"
-#include "parser.h"

 enum c_language_kind c_language = clk_cxx;
 static void cp_init_ts (void);
diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index 6233f06..78296ae 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -270,7 +270,7 @@ cp/lex.o: cp/lex.c $(CXX_TREE_H) $(TM_H) $(FLAGS_H) \
   c-family/c-objc.h
 cp/cp-lang.o: cp/cp-lang.c $(CXX_TREE_H) $(TM_H) debug.h langhooks.h \
   $(LANGHOOKS_DEF_H) $(C_COMMON_H) gtype-cp.h gt-cp-cp-lang.h \
-  cp/cp-objcp-common.h $(EXPR_H) $(TARGET_H) $(CXX_PARSER_H)
+  cp/cp-objcp-common.h $(EXPR_H) $(TARGET_H) tree.h c-family/c-pragma.h
 cp/decl.o: cp/decl.c $(CXX_TREE_H) $(TM_H) $(FLAGS_H) cp/decl.h \
   output.h toplev.h $(HASHTAB_H) $(RTL_H) \
   cp/operators.def $(TM_P_H) $(TREE_INLINE_H) $(DIAGNOSTIC_H) $(C_PRAGMA_H) \




Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC

2012-08-29 Thread Segher Boessenkool

Point being, for simulator environments, you may not want the
loop that was suggested later.  On the other hand, that might
not be an observable period, either.


I don't think looping a million times would be too slow for the
testsuite: there are many tests that do a lot more work than that,
already.

The worst case for hardware that I know of can take about 100
clock cycles for one timebase tick.

But how about this then, which only iterates much if the test
fails:


int
main (void)
{
  uint64_t t = __builtin_ppc_get_timebase ();
  int j;

  for (j = 0; j < 1000000; j++)
    if (t != __builtin_ppc_get_timebase ())
      break;

  return (j == 1000000);
}
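
For illustration only: the same early-exit idea can be tried on any host by passing the clock in as a callback. This is a minimal sketch, assuming a hypothetical `clock_fn` type and a simulated slow clock; it does not use `__builtin_ppc_get_timebase` itself.

```c
#include <stdint.h>

/* Hypothetical clock source; on PowerPC this would wrap the builtin.  */
typedef uint64_t (*clock_fn) (void *ctx);

/* Return 0 as soon as the clock is seen to change, or 1 if it never
   changes within MAX_ITERS further reads.  */
static int
clock_stuck_p (clock_fn read_clock, void *ctx, long max_iters)
{
  uint64_t t = read_clock (ctx);
  long j;

  for (j = 0; j < max_iters; j++)
    if (t != read_clock (ctx))
      break;

  return j == max_iters;
}

/* Simulated clock (an assumption for this sketch) that ticks once
   every READS_PER_TICK reads.  */
struct slow_clock
{
  uint64_t reads;
  uint64_t reads_per_tick;
};

static uint64_t
slow_clock_read (void *ctx)
{
  struct slow_clock *c = ctx;
  return c->reads++ / c->reads_per_tick;
}
```

With a generous bound the loop stops at the first tick, so the cost is only paid in full when the clock really does not advance.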


Segher



Re: [PATCH] Add counter histogram to fdo summary (issue6465057)

2012-08-29 Thread Teresa Johnson
On Wed, Aug 29, 2012 at 6:12 AM, Jan Hubicka  wrote:
>> Index: libgcc/libgcov.c
>> ===
>> --- libgcc/libgcov.c  (revision 190736)
>> +++ libgcc/libgcov.c  (working copy)
>> @@ -276,6 +276,78 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned
>>return 1;
>>  }
>>
>> +/* Insert counter VALUE into HISTOGRAM.  */
>> +
>> +static void
>> +gcov_histogram_insert(gcov_bucket_type *histogram, gcov_type value)
>> +{
>> +  unsigned i;
>> +
>> +  i = gcov_histo_index(value);
>> +  gcc_assert (i < GCOV_HISTOGRAM_SIZE);
> Does checking_assert work in libgcov? I do not think internal consistency
> checks should go to --enable-checking=release libgcov. We want to maintain it
> as lightweight as possible. (I see there are two existing gcc_asserts; since
> they report file format corruption, I think they should give better
> diagnostics.)

gcc_checking_assert isn't available, since tsystem.h not system.h is
included. I could probably just remove the assert (to be safe,
silently return if i is out of bounds?).

>
> Inliner will do good job here, but perhaps explicit inline fits.
>> +  for (f_ix = 0; f_ix != gi_ptr->n_functions; f_ix++)
>> +{
>> +  gfi_ptr = gi_ptr->functions[f_ix];
>> +
>> +  if (!gfi_ptr || gfi_ptr->key != gi_ptr)
>> +continue;
>> +
>> +  ci_ptr = &gfi_ptr->ctrs[ctr_info_ix];
>> +  for (ix = 0; ix < ci_ptr->num; ix++)
>> +gcov_histogram_insert(cs_ptr->histogram, ci_ptr->values[ix]);
> Space before (.

Ok.

>> +}
>> +}
>> +}
>> +
>>  /* Dump the coverage counts. We merge with existing counts when
>> possible, to avoid growing the .da files ad infinitum. We use this
>> program's checksum to make sure we only accumulate whole program
>> @@ -347,6 +419,7 @@ gcov_exit (void)
>>   }
>>   }
>>  }
>> +  gcov_compute_histogram (&this_prg);
>> @@ -598,11 +669,18 @@ gcov_exit (void)
>> if (gi_ptr->merge[t_ix])
>>   {
>> if (!cs_prg->runs++)
>> - cs_prg->num = cs_tprg->num;
>> +cs_prg->num = cs_tprg->num;
>> +  else if (cs_prg->num != cs_tprg->num)
>> +goto read_mismatch;
>
> Doesn't this check that all the programs that contain this unit are the same?
> I.e. will this survive profiledbootstrap where we interleave cc1 and cc1plus?

Ok, removing that check and I am switching the histogram merging code
to handle the case where there are different numbers of counters. It
will end up with the same number of counters as in the summary we are
merging into since that is the num we keep above when runs > 0 to
start with.

>> +  /* Count number of non-zero histogram entries. The histogram is only
>> + currently computed for arc counters.  */
>> +  csum = &summary->ctrs[GCOV_COUNTER_ARCS];
>> +  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
>> +{
>> +  if (csum->histogram[h_ix].num_counters > 0)
>> +h_cnt++;
>> +}
>> +  gcov_write_tag_length (tag, GCOV_TAG_SUMMARY_LENGTH(h_cnt));
>>gcov_write_unsigned (summary->checksum);
>>for (csum = summary->ctrs, ix = GCOV_COUNTERS_SUMMABLE; ix--; csum++)
>>  {
>> @@ -380,6 +388,21 @@ gcov_write_summary (gcov_unsigned_t tag, const str
>>gcov_write_counter (csum->sum_all);
>>gcov_write_counter (csum->run_max);
>>gcov_write_counter (csum->sum_max);
>> +  if (ix != GCOV_COUNTER_ARCS)
>> +{
>> +  gcov_write_unsigned (0);
>> +  continue;
>> +}
>> +  gcov_write_unsigned (h_cnt);
>> +  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
>> +{
>> +  if (!csum->histogram[h_ix].num_counters)
>> +continue;
>> +  gcov_write_unsigned (h_ix);
>
> It is kind of a waste to write a whole unsigned for each histogram index.
> What about writing a bitmap of non-zero entries followed by each entry?

Sure, I will do that instead.
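
For illustration only (the updated patch may well differ): a minimal C sketch of the bitmap layout suggested above, with a fixed-size bitmap of non-zero buckets followed by just the non-zero values. `HISTO_SIZE`, `BITMAP_WORDS` and `encode_histogram` are hypothetical names, and the histogram is reduced to one 32-bit count per bucket.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size; stands in for GCOV_HISTOGRAM_SIZE.  */
#define HISTO_SIZE 64
#define BITMAP_WORDS ((HISTO_SIZE + 31) / 32)

/* Write into OUT a bitmap marking the non-zero entries of HISTO,
   followed by the non-zero entries in index order.  OUT must have room
   for BITMAP_WORDS + HISTO_SIZE words.  Returns the words written.  */
static size_t
encode_histogram (const uint32_t *histo, uint32_t *out)
{
  size_t n = BITMAP_WORDS;
  size_t i;

  for (i = 0; i < BITMAP_WORDS; i++)
    out[i] = 0;

  for (i = 0; i < HISTO_SIZE; i++)
    if (histo[i] != 0)
      {
        out[i / 32] |= (uint32_t) 1 << (i % 32);
        out[n++] = histo[i];
      }

  return n;
}
```

The bitmap costs a fixed number of words regardless of how many buckets are populated, instead of one index word per non-zero bucket.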

>> +/* Merge SRC_HISTO into TGT_HISTO.  */
>
> Perhaps comment about overall concept of the merging routine would suit here.

Ok.

>> -#else /*!IN_GCOV */
>> -#define GCOV_TYPE_SIZE (LONG_LONG_TYPE_SIZE > 32 ? 64 : 32)
>
> Why do you need to move this out of !libgcov? I do not think this is correct
> for all configurations, i.e. gcov_type may be 16-bit.

From my understanding of the mode attribute meanings, which I thought
are defined in terms of the number of smallest addressable units, the
code in gcov-io.h that sets up the gcov_type typedef will always end
up with a gcov_type that is 32 or 64 bits? I.e. when BITS_PER_UNIT is
8 it will use either SI or DI which will end up either 32 or 64, and
when BITS_PER_UNIT is 16 it would use either HI or SI which would
again be either 32 or 64. Is that wrong and we can end up with a 16
bit gcov_type?
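
For illustration only: the arithmetic in the paragraph above, written out as a minimal C sketch. It assumes, as the message does, that a mode's width is its size in addressable units times BITS_PER_UNIT; the enum and function names are hypothetical.

```c
/* Mode sizes in addressable units, as assumed in the message.  */
enum { QI_UNITS = 1, HI_UNITS = 2, SI_UNITS = 4, DI_UNITS = 8 };

/* Width in bits of a mode that spans UNITS addressable units.  */
static int
mode_bits (int units, int bits_per_unit)
{
  return units * bits_per_unit;
}
```

Under that assumption SI and DI give 32 and 64 bits when BITS_PER_UNIT is 8, and HI and SI give 32 and 64 bits when it is 16.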

The GCOV_TYPE_SIZE was being defined everywhere except when IN_GCOV (so
it was being defined IN_LIBGCOV), but I wanted it defined
unconditionally because 

Re: [PATCH 1/6] Thread pointer built-in functions, core parts

2012-08-29 Thread Richard Henderson
On 2012-08-28 01:13, Chung-Lin Tang wrote:
> +  icode = optab_handler (get_thread_pointer_optab, Pmode);

Until we decide there's no point in the distinction, this should
be spelled direct_optab_handler, to match OPTAB_D with which the
optab is declared.

Otherwise ok.


r~


Re: [PATCH 2/6] Thread pointer built-in functions, alpha

2012-08-29 Thread Richard Henderson
On 2012-08-28 01:13, Chung-Lin Tang wrote:
> Alpha patch updated to use MD pattern.

Ok.


r~


[PATCH] limited C++ parsing support for gengtype

2012-08-29 Thread Aaron Gray
First of two patches for class'ized cp/parser.c|h gives limited
support for gengtype to parse C++ classes and enums as first class
citizens.

Patch to SVN HEAD

2012-08-30 Aaron Gray 

* gengtype-lex.l: Support for FILE
Support for C++ single line Comments
Support for classes
Support for enums
ignore 'static'
ignore 'inline'
ignore 'public:'
ignore 'protected:'
ignore 'private:'
ignore 'friend'
support for 'operator' token
support for 'new'
support for 'delete'
added support for '+' as a token for summations in enum bodies

* gengtype.h: added 'TYPE_ENUM' to 'enum typekind'
added enum TYPE_ENUM to 'struct type' union
added OPERATOR_KEYWORD and OPERATOR keywords to Token Code enum

* gengtype-parser.c: updated 'token_names[]'
(direct_declarator): support for parsing limited operators
support for parsing constructors with no parameters
support for parsing enums

* gengtype.c: added 'type_p enums'  to maintain list of enums
(resolve_typedef): added support for structure types and enums
added 'new_enum()'


diff --git a/gcc/gengtype-lex.l b/gcc/gengtype-lex.l
index 5788a6a..af9696a 100644
--- a/gcc/gengtype-lex.l
+++ b/gcc/gengtype-lex.l
@@ -53,11 +53,11 @@ update_lineno (const char *l, size_t len)
 ID [[:alpha:]_][[:alnum:]_]*
 WS [[:space:]]+
 HWS[ \t\r\v\f]*
-IWORD  short|long|(un)?signed|char|int|HOST_WIDE_INT|HOST_WIDEST_INT|bool|size_t|BOOL_BITFIELD|CPPCHAR_SIGNED_T|ino_t|dev_t|HARD_REG_SET
+IWORD  short|long|(un)?signed|char|int|HOST_WIDE_INT|HOST_WIDEST_INT|bool|size_t|BOOL_BITFIELD|CPPCHAR_SIGNED_T|ino_t|dev_t|HARD_REG_SET|FILE
 ITYPE  {IWORD}({WS}{IWORD})*
 EOID   [^[:alnum:]_]

-%x in_struct in_struct_comment in_comment
+%x in_struct in_struct_comment in_comment in_line_comment in_line_struct_comment
 %option warn noyywrap nounput nodefault perf-report
 %option 8bit never-interactive
 %%
@@ -83,6 +83,14 @@ EOID [^[:alnum:]_]
   BEGIN(in_struct);
   return UNION;
 }
+^{HWS}class/{EOID} {
+  BEGIN(in_struct);
+  return STRUCT;
+}
+^{HWS}enum/{EOID} {
+  BEGIN(in_struct);
+  return ENUM;
+}
 ^{HWS}extern/{EOID} {
   BEGIN(in_struct);
   return EXTERN;
@@ -101,10 +109,20 @@ EOID  [^[:alnum:]_]
 \\\n   { lexer_line.line++; }

 "const"/{EOID} /* don't care */
+"static"/{EOID}/* don't care */
+"inline"/{EOID}/* don't care */
+"public:"  /* don't care */
+"private:" /* don't care */
+"protected:"   /* don't care */
+"operator"/{EOID}   { return OPERATOR_KEYWORD; }
+"new"/{EOID}{ *yylval = XDUPVAR (const char, yytext+1, yyleng-2, yyleng-1); return OPERATOR; }
+"delete"/{EOID} { *yylval = XDUPVAR (const char, yytext+1, yyleng-2, yyleng-1); return OPERATOR; }
+"friend"/{EOID}
 "GTY"/{EOID}   { return GTY_TOKEN; }
 "VEC"/{EOID}   { return VEC_TOKEN; }
 "union"/{EOID} { return UNION; }
 "struct"/{EOID}{ return STRUCT; }
+"class"/{EOID} { return CLASS; }
 "enum"/{EOID}  { return ENUM; }
 "ptr_alias"/{EOID} { return PTR_ALIAS; }
 "nested_ptr"/{EOID}{ return NESTED_PTR; }
@@ -148,7 +166,7 @@ EOID[^[:alnum:]_]
 }

 "..."  { return ELLIPSIS; }
-[(){},*:<>;=%|-]   { return yytext[0]; }
+[(){},*:<>;=%|\-\+]{ return yytext[0]; }

/* ignore pp-directives */
 ^{HWS}"#"{HWS}[a-z_]+[^\n]*\n   {lexer_line.line++;}
@@ -159,6 +177,7 @@ EOID[^[:alnum:]_]
 }

 "/*"   { BEGIN(in_comment); }
+"//"   { BEGIN(in_line_comment); }
 \n { lexer_line.line++; }
 {ID}   |
 "'"("\\".|[^\\])"'"|
@@ -172,8 +191,17 @@ EOID   [^[:alnum:]_]
 [^*\n] /* do nothing */
 "*"/[^/]   /* do nothing */
 }
+
+{
+[^*\n]{16} |
+[^*\n] /* do nothing */
+"*"/[^/]   /* do nothing */
+}
+
 "*/"   { BEGIN(INITIAL); }
 "*/"{ BEGIN(in_struct); }
+\n{ lexer_line.line++; BEGIN(INITIAL); }
+\n { lexer_line.line++; BEGIN(in_struct); }

 ["/]   |
 "*"  {
diff --git a/gcc/gengtype-parse.c b/gcc/gengtype-parse.c
index 03ee781..663db56 100644
--- a/gcc/gengtype-parse.c
+++ b/gcc/gengtype-parse.c
@@ -3,7 +3,7 @@

This file is part of GCC.

-   GCC is free software; you can redistribute it and/or modify it under
+   /GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
@@ -75,6 +75,7 @@ static const char *const token_names[] = {
   "static",
   "union",
   "struct

[middle-end] Add machine_mode to address_cost target hook

2012-08-29 Thread Oleg Endo
Hello,

While experimenting a little bit with an idea for an address mode
selection RTL pass for SH, I realized that SH's sh_address_cost function
is quite broken.  When trying to fix it, I ran against a wall, since the
mode of the MEM is not passed to the target hook function, as it is e.g.
in legitimate_address.  This circumstance makes it a bit difficult to
return useful answers in the address_cost hook.  Like on SH,
displacement address modes for anything < SImode are considered slightly
more expensive due to increased pressure on R0.

Since everything in the middle-end already seems to pass the mode to the
'address_cost' function in rtlanal.c, I'd like to propose to forward the
mode arg to the target hook.  The change is quite obvious, as it only
adds one new (mostly) unused argument to the various address_cost
functions in the targets.
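
For illustration only (not taken from the patch): a minimal C sketch of why the mode matters to such a hook, using the SH example from above where displacement addressing for anything narrower than SImode is considered slightly more expensive. The enums, the function name and the cost values are all made up for this sketch.

```c
/* Hypothetical stand-ins for machine modes and address forms.  */
enum toy_mode { TOY_QImode, TOY_HImode, TOY_SImode };
enum toy_addr { TOY_ADDR_REG, TOY_ADDR_REG_DISP };

/* Return a relative cost for using ADDR to access a MEM of MODE.
   Higher means more expensive; the numbers are arbitrary.  */
static int
toy_address_cost (enum toy_addr addr, enum toy_mode mode)
{
  int cost = 1;

  /* Without the mode argument this distinction cannot be made.  */
  if (addr == TOY_ADDR_REG_DISP && mode != TOY_SImode)
    cost += 1;

  return cost;
}
```

A hook that sees only the address rtx has to return the same answer for both displacement cases.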

I went through all the targets' code and fixed the hook function.  It
seems some other targets than SH could also benefit from the mode wisdom
in their address_cost estimation.

There are a few peculiarities I ran across (respective target
maintainers CC'ed):

mn10300:
  The function mn10300_address_cost calls itself recursively, so I added
  a GET_MODE (x).  However, it never looks at the mode, so there should 
  be no problem.

iq2000:
  Similar to mn10300.  Mode arg is passed to itself, but effectively 
  never used.  Should be no problem.

rs6000:
  I've added the mode to the logging message.  I hope this is OK.

epiphany:
  There's probably no need for the offset alignment workaround anymore.

arm:
  In the function 'thumb1_size_rtx_costs' the 'case MEM' looks wrong.
  I guess it is meant to look at XEXP (x, 0) when checking for 
  SYMBOL_REF?  As it stands now, it seems that GET_CODE (x) == 
  SYMBOL_REF will never be true, because GET_CODE (x) == MEM.

microblaze:
  The microblaze_address_cost takes the mode of the address rtx.
  Maybe it is meant to take the mode of the MEM?


I've checked the patch only on my SH xgcc config with 'make all-gcc',
but others should build fine since there are no functional changes.  I
hope I didn't miss anything.

Feedback appreciated!

Cheers,
Oleg


ChangeLog:

* hooks.c (hook_int_rtx_mode_bool_0): New function.
* hooks.h (hook_int_rtx_mode_bool_0): Declare it.
* output.h (default_address_cost): Add machine_mode 
argument.
* target.def (address_cost): Likewise.
* rtlanal.c (address_cost): Pass mode to target hook.
(default_address_cost): Add machine_mode argument.
* doc/tm.texi: Regenerate.
* config/alpha/alpha.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0.
* config/arm/arm.c (arm_address_cost): Add machine_mode 
argument.
* config/avr/avr.c (avr_address_cost): Likewise.
* config/bfin/bfin.c (bfin_address_cost): Likewise.
* config/cr16/cr16.c (cr16_address_cost): Likewise.
* config/cris/cris.c (cris_address_cost): Likewise.
* config/epiphany/epiphany.c (epiphany_address_cost): Likewise.
* config/i386/i386.c (ix86_address_cost): Likewise.
* config/ia64/ia64.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0.
* config/iq2000/iq2000.c (iq2000_address_cost): Add 
machine_mode argument.  Pass it on in recursive invocation.
* config/lm32/lm32.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0.
* config/m32c/m32c.c (m32c_address_cost): Add machine_mode 
argument.
* config/m32r/m32r.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0.
* config/mcore/mcore.c (TARGET_ADDRESS_COST): Likewise.
* config/mep/mep.c (mep_address_cost): Add machine_mode 
argument.
* config/microblaze/microblaze.c (microblaze_address_cost): 
Likewise.
* config/mips/mips.c (mips_address_cost): Likewise.
* config/mmix/mmix.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0.
* config/mn10300/mn10300.c (mn10300_address_cost): Add
machine_mode argument.  Use GET_MODE (x) in recursive 
invocation.
* config/pa/pa.c (hppa_address_cost): Add machine_mode argument.
* config/rs6000/rs6000.c (rs6000_debug_address_cost): Add
machine_mode argument and print it.
(TARGET_ADDRESS_COST): Use hook_int_rtx_mode_bool_0 instead of
hook_int_rtx_bool_0.
* config/rx/rx.c (rx_address_cost): Add machine_mode argument.
* config/s390/s390.c (s390_address_cost): Likewise.
* config/score/score-protos.h (score_address_cost): Likewise.
* config/score/score.c (score_address_cost): Likewise.
* config/sh/sh.c (sh_address_cost): Likewise.
* config/sparc/sparc.c (TARGET_ADDRESS_COST): Use 
hook_int_rtx_mode_bool_0 instead of

Re: [PATCH] C++'ization of cp/parser.c/h

2012-08-29 Thread Gabriel Dos Reis
On Wed, Aug 29, 2012 at 8:01 PM, Aaron Gray  wrote:
> Patch to SVN HEAD that initially C++'izes cp/parser.h and cp/parser.c
> by class'izing the cp_lexer and cp_parser group of functions.
>
> For C programmers and for context all method calls are preceded by
> 'this->' and static method calls by 'cp_parser::' or 'cp_lexer::'.
>
> I have made minimal non orthogonal changes to the code on purpose at
> this stage. This is still a work in progress and I am not sure about
> how to go about preparing a change log for this patch.
>
> There are a number of loose ends :-
>
>   - struct's are used rather than classes for now as the whole file
> gives encapsulation for now.
>   - const's need to be applied
>   - cp_parser_context_free_list is still static and not a member of
> cp_parser yet. This also needs a gengtype change to support
> GTY((deletable)) as a node and not just on gcroots.
>   - cp_token functions have not been class'ized yet.
>   - cp_debug functions are still in global space
>   - cp_unevaluated_operand is still in global space
>   - cp_lexer::get_preprocessor_token() needs rationalizing
>   - there are still #define's associated with VEC operations that
> should be moved to inline methods
>   - constructors and new methods are still functions as PCH call
> ordering conflicts with them, this also allows keeping code changes
> parallel and recording incremental changes in the code.
>   - no_parameters has been left in but is not used

I think this is heading in the wrong direction.  A class with a lot of
member functions is a manifestation of a poor C++ design.  Some
call that a "fat interface."  Ideally a good class design should have very
few observer (and mutation) functions.  Those should form the computational
basis of the class, out of which all other functions should be
implemented -- as non-member functions.

Have a look at

 http://liz.axiomatics.org/trac/browser/trunk/src/Parser.C

The parser there is defined from a very limited computational basis:

  http://liz.axiomatics.org/trac/browser/trunk/src/Parser.H#L79

As a matter of fact, I prefer the non-member functions defined as
static functions (i.e. with internal linkage) so that we get an unambiguous
message from the compiler when a function definition becomes dead code.  Do we
need to have a separate parser.h file that contains code previously defined in
parser.c?  Why?
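
For illustration only: a minimal sketch of the "small computational basis" idea, rendered in C rather than C++ so it stands alone. Two primitives touch the state and everything else is a static helper built from them. The `toy_lexer` type and all function names are hypothetical and unrelated to cp/parser.c.

```c
#include <ctype.h>
#include <stddef.h>

struct toy_lexer
{
  const char *cur;
};

/* The basis: the only two functions that touch lx->cur.  */
static int
toy_peek (const struct toy_lexer *lx)
{
  return (unsigned char) *lx->cur;
}

static void
toy_advance (struct toy_lexer *lx)
{
  if (*lx->cur != '\0')
    lx->cur++;
}

/* Derived helpers with internal linkage, written only in terms of
   the basis above.  */
static void
toy_skip_spaces (struct toy_lexer *lx)
{
  while (isspace (toy_peek (lx)))
    toy_advance (lx);
}

/* Copy the next alphanumeric word into BUF (at most SIZE - 1 chars)
   and return its length; 0 means no word was found.  */
static size_t
toy_read_word (struct toy_lexer *lx, char *buf, size_t size)
{
  size_t n = 0;

  toy_skip_spaces (lx);
  while (isalnum (toy_peek (lx)) && n + 1 < size)
    {
      buf[n++] = (char) toy_peek (lx);
      toy_advance (lx);
    }
  if (size > 0)
    buf[n] = '\0';
  return n;
}
```

Because the helpers are static, a compiler run with unused-function warnings enabled would report any of them that stops being called.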

-- Gaby


RE: [Patch, test] Enable to prune warnings for tests defined in one exp file

2012-08-29 Thread Terry Guo
> -----Original Message-----
> From: Mike Stump [mailto:mikest...@comcast.net]
> Sent: Tuesday, August 28, 2012 1:21 AM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org; Richard Guenther
> Subject: Re: [Patch, test] Enable to prune warnings for tests defined
> in one exp file
> 
> On Aug 27, 2012, at 1:14 AM, Terry Guo wrote:
> > This patch intends to provide a chance to prune common warning messages for
> > tests defined in an exp file.
> 
> > Is it OK to trunk?
> 
> Ok.
> 
> If you can find where to document this...  :-)  That'd be nice.
> 

I checked the texi files in the gcc/doc folder but couldn't find a suitable
place, so I resorted to README.gcc in gcc/testsuite, which is described as
listing notes for those writing testcases and those writing expect scripts.
The patch follows. Is it OK?

BR,
Terry

2012-08-30  Terry Guo  

* README.gcc: Document new variable dg_runtest_extra_prunes.

Index: gcc/testsuite/README.gcc
===
--- gcc/testsuite/README.gcc	(revision 190795)
+++ gcc/testsuite/README.gcc	(working copy)
@@ -79,6 +79,11 @@
 
 If a test does not fit into the torture framework, use the dg framework.
 
+If some tests in an exp file need to skip same warning messages, just define
+variable dg_runtest_extra_prunes in this exp file and let it contain this warning
+message pattern.  This can avoid duplicating dg-prune in these cases.
+Always remember to clear this variable when leave this exp file.
+
 

 Copyright (C) 1997, 1998, 2004 Free Software Foundation, Inc.




Re: [PATCH] limited C++ parsing support for gengtype

2012-08-29 Thread Laurynas Biveinis
Hi -

2012/8/30 Aaron Gray :
> First of two patches for class'ized cp/parser.c|h gives limited
> support for gengtype to parse C++ classes and enums as first class
> citizens.

Please sync with Diego to avoid duplicate work and/or conflicting designs.

Thanks,
-- 
Laurynas


Re: [PATCH] MIPS16 TLS support for GCC

2012-08-29 Thread Chung-Lin Tang
On 2012/8/30 02:44 AM, Richard Sandiford wrote:
> Chung-Lin Tang  writes:
>> On 2012/7/6 02:23 PM, Richard Sandiford wrote:
>>> Richard Sandiford  writes:
>>>>> (3) Also related to libraries, I edited CRT_CALL_STATIC_FUNCTION to emit
>>>>> a 32-bit code sequence under both MIPS/MIPS16 mode (under O32).
>>>>>
>>>>> As you can see in the original Feb. patch, I had changes to emit a
>>>>> MIPS16 version of these static calls, but with the changes in (2) above,
>>>>> they will not work with the usual situation of a 32-bit MIPS built /lib
>>>>> (.init/.fini will have 32/16-bit code improperly concatenated).
>>>>>
>>>>> The CodeSourcery builds use an independent mips16 sysroot for this, so a
>>>>> MIPS16 CRT_CALL_STATIC_FUNCTION works there. For the usual case, I think
>>>>> making it 32-bit is the compatible choice.
>>>>
>>>> Yeah, I agree that sounds like the right call.  Please do the same
>>>> for the n32/n64 version (i.e. explicitly make it nomips16 rather
>>>> than add the #error).
>>>
>>> BTW, doing this has removed my main concern about having dead code.
>>> The original patch had a separate MIPS16 implementation that (as things
>>> stood) could never be used by stock sources.  That would make it difficult
>>> to maintain.
>>>
>>> Now that the MIPS16 library support is purely adding nomips16 attributes
>>> to code that is obviously nomips16, those parts are OK on their own, thanks.
>>> (I.e. the mips.h change, the libgcc change, and the libgomp change.)
>>> Feel free to drop the multilib thing if you don't want to implement
>>> --with-multilib-list.
>>
>> Hi Richard, just FYI, I just committed the said approved parts.
>> gcc/config/mips/t-linux64 had one additional change, adding
>> ../lib/mips16 to the corresponding MULTILIB_OSDIRNAMES, or else we end
>> with a weird option-named directory for the mips16 libraries.
> 
> Sorry, but the t-linux64 stuff wasn't approved.  It was just the mips.h
> change, the libgcc change and the libgomp change.
> 
> Please revert the patch to t-linux64.  My original objection to adding
> mips16 unconditionally still stands: it isn't correct for people who
> configure for processors that don't have the MIPS16 ASE (such as Octeon).

I have reverted that part.
Maybe adding a list of the appropriate march=XXX/mips16 combinations to
MULTILIB_EXCLUSIONS would address your concern, though I haven't tested
that yet.

Thanks,
Chung-Lin