RE: Ping: [PATCH] Enable bbro for -Os
> -----Original Message-----
> From: Steven Bosscher [mailto:stevenb@gmail.com]
> Sent: Friday, August 24, 2012 8:17 PM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: Ping: [PATCH] Enable bbro for -Os
>
> On Wed, Aug 22, 2012 at 8:49 AM, Zhenqiang Chen wrote:
> >> The patch is to enable bbro for -Os. When optimizing for size, it
> >> * avoid duplicating block.
> >> * keep its original order if there is no chance to fall through.
> >> * ignore edge frequency and probability.
> >> * handle predecessor first if its index is smaller to break long trace.
>
> You do this by inserting the index as a key. I don't fully understand this
> change. You're assuming that a block with a lower index has a lower
> pre-order number in the CFG's DFS spanning tree, IIUC (i.e. the blocks are
> numbered sequentially)? I'm not sure that's always true. I think you should
> add an explanation for this heuristic.

Thank you for the comments.

cleanup_cfg is called at the end of cfg_layout_initialize, before
reorder_basic_blocks. cleanup_cfg does a lot of optimization on the CFG and
renumbers the basic blocks. After cleanup_cfg, the blocks are roughly
numbered sequentially.

The heuristic is based on the result of cleanup_cfg. It just wants to keep
the order produced by cleanup_cfg, since logs show we get a code size
improvement (from cleanup_cfg) even if we do not call reorder_basic_blocks.

"Index as a key" is a simple way to keep the original order.

Comments are added in the updated patch.

> >> * only connect Trace n with Trace n + 1 to reduce long jump.
> ...
> >> * bb-reorder.c (connect_better_edge_p): New added.
> >> (find_traces_1_round): When optimizing for size, ignore edge
> >> frequency
> >> and probability, and handle all in one round.
> >> (bb_to_key): Use bb->index as key for size.
> >> (better_edge_p): The smaller bb index is better for size.
> >> (connect_traces): Connect block n with block n + 1;
> >> connect trace m with trace m + 1 if falling through.
> >> (copy_bb_p): Avoid duplicating blocks.
> >> (gate_handle_reorder_blocks): Enable bbro when optimizing for -Os.
>
> This probably fixes PR54364.

I tried the case in PR54364; the patch does remove several jmp instructions.

> > @@ -1169,6 +1272,10 @@ copy_bb_p (const_basic_block bb, int code_may_grow)
> >    int max_size = uncond_jump_length;
> >    rtx insn;
> >
> > +  /* Avoid duplicating blocks for size.  */
> > +  if (optimize_function_for_size_p (cfun))
> > +    return false;
> > +
> >    if (!bb->frequency)
> >      return false;
>
> This shouldn't be necessary, due to the CODE_MAY_GROW argument, and
> this change should result in a code size increase because jumps to conditional
> jumps aren't removed anymore. What did you make this change for, do you
> have a test case where code size increases if you allow copy_bb_p to return
> true?

Thanks. It is not necessary.

Here is the updated ChangeLog. The updated patch is attached.

ChangeLog
2012-08-29  Zhenqiang Chen

        PR middle-end/54364
        * bb-reorder.c (connect_better_edge_p): New added.
        (find_traces_1_round): When optimizing for size, ignore edge frequency
        and probability, and handle all in one round.
        (bb_to_key): Use bb->index as key for size.
        (better_edge_p): The smaller bb index is better for size.
        (connect_traces): Connect block n with block n + 1;
        connect trace m with trace m + 1 if falling through.
        (gate_handle_reorder_blocks): Enable bbro when optimizing for -Os.
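As an illustration of the "index as a key" idea discussed in this message, here is a minimal standalone sketch. It is not the code from the patch: the struct, the function names and the speed-mode key (hotter blocks first) are assumptions made for the example, and it assumes that the candidate with the smallest key is handled first.

```c
#include <assert.h>

/* Hypothetical stand-in for a basic block; only the two fields the
   sketch needs.  Not GCC's basic_block.  */
struct sketch_block
{
  int index;      /* block number, assumed roughly sequential after cleanup */
  int frequency;  /* estimated execution frequency */
};

/* Assumed convention: the smallest key is handled first.  When optimizing
   for size the key is just the block index, which preserves the original
   order.  The speed-mode key below is a simplification for illustration.  */
static int
sketch_bb_to_key (const struct sketch_block *bb, int for_size)
{
  if (for_size)
    return bb->index;
  return -bb->frequency;
}

/* Return the position of the block that would be handled next.  */
static int
sketch_pick_next (const struct sketch_block *blocks, int n, int for_size)
{
  int best = 0;
  int i;

  assert (n > 0);
  for (i = 1; i < n; i++)
    if (sketch_bb_to_key (&blocks[i], for_size)
        < sketch_bb_to_key (&blocks[best], for_size))
      best = i;
  return best;
}
```

With blocks numbered 5, 2 and 7, the size-mode key picks the block numbered 2 regardless of its frequency, which is the "keep the original order" behaviour described above.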
Re: [PATCH] MIPS16 TLS support for GCC
On 2012/7/6 02:23 PM, Richard Sandiford wrote: > Richard Sandiford writes: >>> (3) Also related to libraries, I edited CRT_CALL_STATIC_FUNCTION to emit >>> a 32-bit code sequence under both MIPS/MIPS16 mode (under O32). >>> >>> As you can see in the original Feb. patch, I had changes to emit a >>> MIPS16 version of these static calls, but with the changes in (2) above, >>> they will not work with the usual situation of a 32-bit MIPS built /lib >>> (.init/.fini will have 32/16-bit code improperly concatenated). >>> >>> The CodeSourcery builds use an independent mips16 sysroot for this, so a >>> MIPS16 CRT_CALL_STATIC_FUNCTION works there. For the usual case, I think >>> making it 32-bit is the compatible choice. >> >> Yeah, I agree that sounds like the right call. Please do the same >> for the n32/n64 version (i.e. explicitly make it nomips16 rather >> than add the #error). > > BTW, doing this has removed my main concern about having dead code. > The original patch had a separate MIPS16 implementation that (as things > stood) could never be used by stock sources. That would make it difficult > to maintain. > > Now that the MIPS16 library support is purely adding nomips16 attributes > to code that is obviously nomips16, those parts are OK on their own, thanks. > (I.e. the mips.h change, the libgcc change, and the libgomp change.) > Feel free to drop the multilib thing if you don't want to implement > --with-multilib-list. Hi Richard, just FYI, I just committed the said approved parts. gcc/config/mips/t-linux64 had one additional change, adding ../lib/mips16 to the corresponding MULTILIB_OSDIRNAMES, or else we end with a weird option-named directory for the mips16 libraries. Thanks, Chung-Lin
[AArch64] Merge from upstream trunk r190706
Hi, I've just merged upstream trunk on the aarch64-branch up to r190706. Thanks Sofiane
remove dependency on cp/parser.h from cp/lang.c
Just got my copyright assignment through, so here's my first GCC patch.

This is a one-liner removing the unneeded dependency of cp-lang.c on cp/parser.h.

This has been tested on Linux.

Aaron

diff --git a/gcc/cp/cp-lang.c b/gcc/cp/cp-lang.c
index da7f1e1..5ca0b0a 100644
--- a/gcc/cp/cp-lang.c
+++ b/gcc/cp/cp-lang.c
@@ -32,7 +32,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "cp-objcp-common.h"
 #include "hashtab.h"
 #include "target.h"
-#include "parser.h"
 
 enum c_language_kind c_language = clk_cxx;
 static void cp_init_ts (void);
Re: Inline hints
Hi, On Tue, Aug 28, 2012 at 06:05:27PM +0200, Jan Hubicka wrote: > > On Sun, Aug 19, 2012 at 07:43:45AM +0200, Jan Hubicka wrote: > > > > > > * gcc.dg/ipa/iinline-1.c: Update testcase to test inline hints. > > > > > > * ipa-inline.c (want_inline_small_function_p): Bypass > > > inline limits for hinted functions. > > > (edge_badness): Dump hints; decrease badness for hinted funcitons. > > > * ipa-inline.h (enum inline_hints_vals): New enum. > > > (inline_hints): New type. > > > (edge_growth_cache_entry): Add hints. > > > (dump_inline_summary): Update. > > > (dump_inline_hints): Declare. > > > (do_estimate_edge_hints): Declare. > > > (estimate_edge_hints): New inline function. > > > (reset_edge_growth_cache): Update. > > > * predict.c (cgraph_maybe_hot_edge_p): Do not ice on indirect edges. > > > * ipa-inline-analysis.c (dump_inline_hints): New function. > > > (estimate_edge_devirt_benefit): Return true when function should be > > > hinted. > > > (estimate_calls_size_and_time): New hints argument; set it when > > > devritualization happens. > > > (estimate_node_size_and_time): New hints argument. > > > (do_estimate_edge_time): Cache hints. > > > (do_estimate_edge_growth): Update. > > > (do_estimate_edge_hints): New function > > > > ... > > > > > Index: ipa-inline.h > > > === > > > *** ipa-inline.h (revision 190508) > > > --- ipa-inline.h (working copy) > > > *** typedef struct GTY(()) condition > > > *** 42,47 > > > --- 42,54 > > > unsigned by_ref : 1; > > > } condition; > > > > > > + /* Inline hints are reasons why inline heuristics should preffer > > > inlining given function. > > > +They are represtented as bitmap of the following values. 
*/ > > > + enum inline_hints_vals { > > > + INLINE_HINT_indirect_call = 1 > > > + }; > > > + typedef int inline_hints; > > > + > > > DEF_VEC_O (condition); > > > DEF_VEC_ALLOC_O (condition, gc); > > > > > > *** extern VEC(inline_edge_summary_t,heap) * > > > *** 158,163 > > > --- 165,171 > > > typedef struct edge_growth_cache_entry > > > { > > > int time, size; > > > + inline_hints hints; > > > } edge_growth_cache_entry; > > > DEF_VEC_O(edge_growth_cache_entry); > > > DEF_VEC_ALLOC_O(edge_growth_cache_entry,heap); > > > *** extern VEC(edge_growth_cache_entry,heap) > > > *** 168,174 > > > /* In ipa-inline-analysis.c */ > > > void debug_inline_summary (struct cgraph_node *); > > > void dump_inline_summaries (FILE *f); > > > ! void dump_inline_summary (FILE * f, struct cgraph_node *node); > > > void inline_generate_summary (void); > > > void inline_read_summary (void); > > > void inline_write_summary (void); > > > --- 176,183 > > > /* In ipa-inline-analysis.c */ > > > void debug_inline_summary (struct cgraph_node *); > > > void dump_inline_summaries (FILE *f); > > > ! void dump_inline_summary (FILE *f, struct cgraph_node *node); > > > ! 
void dump_inline_hints (FILE *f, inline_hints); > > > void inline_generate_summary (void); > > > void inline_read_summary (void); > > > void inline_write_summary (void); > > > *** void inline_merge_summary (struct cgraph > > > *** 185,190 > > > --- 194,200 > > > void inline_update_overall_summary (struct cgraph_node *node); > > > int do_estimate_edge_growth (struct cgraph_edge *edge); > > > int do_estimate_edge_time (struct cgraph_edge *edge); > > > + inline_hints do_estimate_edge_hints (struct cgraph_edge *edge); > > > void initialize_growth_caches (void); > > > void free_growth_caches (void); > > > void compute_inline_parameters (struct cgraph_node *, bool); > > > *** estimate_edge_time (struct cgraph_edge * > > > *** 257,262 > > > --- 267,288 > > > } > > > > > > > > > + /* Return estimated callee runtime increase after inlning > > > +EDGE. */ > > > + > > > + static inline inline_hints > > > + estimate_edge_hints (struct cgraph_edge *edge) > > > + { > > > + inline_hints ret; > > > + if ((int)VEC_length (edge_growth_cache_entry, edge_growth_cache) <= > > > edge->uid > > > + || !(ret = VEC_index (edge_growth_cache_entry, > > > + edge_growth_cache, > > > + edge->uid).hints)) > > > + return do_estimate_edge_time (edge); > > > > Surely this was supposed to be do_estimate_edge_hints instead? > Oops, surely. It is harmless, since we always query time first and thus > populate the cache, but it ought to be fixed. > Can you please apply the obvious patch? Chinese internet is bit restrictive > and it is hard to find SSH access.. > Honza I have committed the following (after it passed bootstrap and testing along with another patch on x86_64-linux). Martin 2012-08-29 Martin Jambor * ipa-inline.h (estimate_edge_hints): Call do_estimate_edge_
Re: [PATCH, libstdc++] Make empty std::string storage readonly
On 08/28/2012 08:12 PM, Jonathan Wakely wrote:
> On 28 August 2012 18:27, Michael Haubenwallner wrote:
>>>
>>> Does it actually produce a segfault? I suppose it might on some
>>> platforms, but not all, so I'm not sure it's worth changing.
>>
>> It does segfault here on (32bit each):
>> i686-pc-linux-gnu
>> ia64-hp-hpux11.31
>> i386-pc-solaris2.10
>> sparc-sun-solaris2.10
>> powerpc-ibm-aix5.3.0.0
>> powerpc-ibm-aix6.1.0.0
>> powerpc-ibm-aix7.1.0.0
>>
>> It does not segfault here on:
>> hppa2.0n-hp-hpux11.31
>> i586-pc-interix5.2
>> i586-pc-winnt5.2 (using MSVC)
>>
>> Maybe it could be made segfault on hppa2.0n-hp-hpux11.31 too using some
>> linker flag,
>> but that's a deprecated platform anyway.
>>
>> As long as the major development platform (Linux) does segfault, it feels
>> worth
>> changing - especially as string.clear() to write the '\0' back again won't
>> help
>> as quick'n dirty workaround since gcc-4.4.4 any more.
>
> Hmm, I tested it on x86_64-unknown-linux-gnu without getting a
> segfault - but I might have messed up my test.

Using this patch on my x86_64 Gentoo Linux Desktop with gcc-4.7.1 does
segfault as expected - when I make sure the correct libstdc++ is used at
runtime, having the '_S_empty_rep_storage' symbol in the .rodata section
rather than .bss.

/haubi/
Re: [PATCH] Add counter histogram to fdo summary (issue6465057)
> Index: libgcc/libgcov.c
> ===
> --- libgcc/libgcov.c (revision 190736)
> +++ libgcc/libgcov.c (working copy)
> @@ -276,6 +276,78 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned
>    return 1;
>  }
>
> +/* Insert counter VALUE into HISTOGRAM.  */
> +
> +static void
> +gcov_histogram_insert(gcov_bucket_type *histogram, gcov_type value)
> +{
> +  unsigned i;
> +
> +  i = gcov_histo_index(value);
> +  gcc_assert (i < GCOV_HISTOGRAM_SIZE);

Does checking_assert work in libgcov? I do not think an internal consistency
check should go into an --enable-checking=release libgcov. We want to keep it
as lightweight as possible. (I see there are two existing gcc_asserts; since
they report file format corruption, I think they should give a better
diagnostic.)

The inliner will do a good job here, but perhaps an explicit inline fits.

> +  for (f_ix = 0; f_ix != gi_ptr->n_functions; f_ix++)
> +    {
> +      gfi_ptr = gi_ptr->functions[f_ix];
> +
> +      if (!gfi_ptr || gfi_ptr->key != gi_ptr)
> +        continue;
> +
> +      ci_ptr = &gfi_ptr->ctrs[ctr_info_ix];
> +      for (ix = 0; ix < ci_ptr->num; ix++)
> +        gcov_histogram_insert(cs_ptr->histogram, ci_ptr->values[ix]);

Space before (.

> +    }
> +    }
> +}
> +
>  /* Dump the coverage counts. We merge with existing counts when
>     possible, to avoid growing the .da files ad infinitum. We use this
>     program's checksum to make sure we only accumulate whole program
> @@ -347,6 +419,7 @@ gcov_exit (void)
>      }
>      }
>      }
> +  gcov_compute_histogram (&this_prg);
> @@ -598,11 +669,18 @@ gcov_exit (void)
>        if (gi_ptr->merge[t_ix])
>          {
>            if (!cs_prg->runs++)
> -            cs_prg->num = cs_tprg->num;
> +            cs_prg->num = cs_tprg->num;
> +          else if (cs_prg->num != cs_tprg->num)
> +            goto read_mismatch;

Doesn't this check that all the programs that contain this unit are the same?
I.e. will this survive profiledbootstrap where we interleave cc1 and cc1plus?

> +  /* Count number of non-zero histogram entries. The histogram is only
> +     currently computed for arc counters.  */
> +  csum = &summary->ctrs[GCOV_COUNTER_ARCS];
> +  for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
> +    {
> +      if (csum->histogram[h_ix].num_counters > 0)
> +        h_cnt++;
> +    }
> +  gcov_write_tag_length (tag, GCOV_TAG_SUMMARY_LENGTH(h_cnt));
>    gcov_write_unsigned (summary->checksum);
>    for (csum = summary->ctrs, ix = GCOV_COUNTERS_SUMMABLE; ix--; csum++)
>      {
> @@ -380,6 +388,21 @@ gcov_write_summary (gcov_unsigned_t tag, const str
>        gcov_write_counter (csum->sum_all);
>        gcov_write_counter (csum->run_max);
>        gcov_write_counter (csum->sum_max);
> +      if (ix != GCOV_COUNTER_ARCS)
> +        {
> +          gcov_write_unsigned (0);
> +          continue;
> +        }
> +      gcov_write_unsigned (h_cnt);
> +      for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++)
> +        {
> +          if (!csum->histogram[h_ix].num_counters)
> +            continue;
> +          gcov_write_unsigned (h_ix);

It is kind of a waste to write a whole unsigned for each histogram index.
What about writing a bitmap of non-zero entries followed by each entry?

> +/* Merge SRC_HISTO into TGT_HISTO.  */

Perhaps a comment about the overall concept of the merging routine would
suit here.

> -#else /*!IN_GCOV */
> -#define GCOV_TYPE_SIZE (LONG_LONG_TYPE_SIZE > 32 ? 64 : 32)

Why do you need to move this out of !libgcov? I do not think this is correct
for all configurations, i.e. gcov_type may be 16bit.

Patch is OK if it passed profiledbootstrap modulo the comments above.

Thanks!
Honza
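The bitmap suggestion in the review above can be sketched as follows. This is only an illustration of the idea, not the gcov file format: the histogram size of 64, the 32-bit bitmap words and all names are assumptions made for the example. A writer would emit the bitmap words first and then only the non-zero entries, instead of one full unsigned index per entry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes for the sketch; the real GCOV_HISTOGRAM_SIZE may
   differ.  */
#define SKETCH_HISTOGRAM_SIZE 64
#define SKETCH_BITMAP_WORDS ((SKETCH_HISTOGRAM_SIZE + 31) / 32)

/* Set bit IX of BITMAP for every non-empty bucket in NUM_COUNTERS and
   return how many buckets are non-empty.  */
static unsigned
sketch_histogram_bitmap (const uint32_t *num_counters, uint32_t *bitmap)
{
  unsigned ix;
  unsigned nonzero = 0;

  assert (num_counters != 0 && bitmap != 0);

  for (ix = 0; ix < SKETCH_BITMAP_WORDS; ix++)
    bitmap[ix] = 0;

  for (ix = 0; ix < SKETCH_HISTOGRAM_SIZE; ix++)
    if (num_counters[ix] > 0)
      {
        bitmap[ix / 32] |= (uint32_t) 1 << (ix % 32);
        nonzero++;
      }

  return nonzero;
}
```

Under these assumptions the index overhead is a fixed two words, however many buckets are in use, and a reader recovers each entry's index from the position of its bit.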
Re: out-of-line and arch-specific random_device
Hi,

On 8/28/12 1:41 PM, Ulrich Drepper wrote:
> On Tue, Aug 28, 2012 at 4:44 AM, Paolo Carlini wrote:
>> Again, without context, I think this is not the point: random_device is
>> meant to be just a simple high level wrapper around things like
>> dev/random, inspired by facilities like dev/random on unix-like OSes.
>> The brutal "fall back" we have now in place wouldn't be useful anyway for
>> the uses Marc is talking about, because there is no way to provide a seed.
>> That said, I can't check right now C++11 about random_device, I suppose
>> Uli has already ;)
>
> I did read it. random_device is all about non-determinism. Of course I
> know that RNGs in some situations have to be repeatable. That's what all
> the engines are about. random_device isn't. You use random_device to seed
> an engine etc. The spec says that if there is no way to create
> non-deterministic data the implementation may use a random number engine.
> "may" being the key.

Ok.

> I perhaps didn't make myself clear as to what the big problem is.
> Depending on whether or not you define _GLIBCXX_USE_RANDOM_TR1 you get an
> object definition for random_device which has the same name and mangling
> but has a different size. This means binary incompatibilities. Memory
> corruptions.

But note that _GLIBCXX_USE_RANDOM_TR1, like *any* other such macro,
definitely isn't supposed to be set by the user. It's a configure-time
macro. Thus, given your clarification above about "may", I think the issue
here is whether people would normally like to see an abort, or the output of
a fixed (no seed, that is, as we clarified already) deterministic engine as
a fallback. In my opinion, having clarified the macro usage issue, the least
bad solution is the deterministic engine. As a general maintainer of the
library (that is, not as a GNU/Linux maintainer) I would be more favorable
to the abort if we had decently covered not just Unix-like systems but a few
other systems, at least a bit of M$, etc.

Thus, all in all, I propose to just go ahead with your patch more or less
as-is, that is, to retain the MT fallback.

Minor nit: are you sure we need to open a new minor version for the new
symbol? Because it seemed to me that 4.7.x was behind by one. Please check.
Also, again a minor detail, we normally just use mangled names in the linker
script, see all the examples in the lines before; I don't think we should
change that just now?!?

Paolo.
[PATCH] rs6000: Add a builtin to read the time base register on PowerPC
Add __builtin_ppc_get_timebase to read the time base register on PowerPC. This is required for applications that measure time at high frequencies with high precision that can't afford a syscall. [gcc] 2012-08-29 Tulio Magno Quites Machado Filho * config/rs6000/rs6000-builtin.def: Add __builtin_ppc_get_timebase. * config/rs6000/rs6000.c (rs6000_expand_noop_builtin): New function to expand an expression that calls a builtin without arguments. (rs6000_expand_builtin): Add __builtin_ppc_get_timebase. (rs6000_init_builtins): Likewise. * config/rs6000/rs6000.md: Likewise. [gcc/testsuite] 2012-08-29 Tulio Magno Quites Machado Filho * gcc.target/powerpc/ppc-get-timebase.c: New file. --- gcc/config/rs6000/rs6000-builtin.def |3 ++ gcc/config/rs6000/rs6000.c | 31 + gcc/config/rs6000/rs6000.md| 36 .../gcc.target/powerpc/ppc-get-timebase.c | 22 4 files changed, 92 insertions(+), 0 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index c8f8f86..75ad184 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -1429,6 +1429,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, "__builtin_rsqrt", RS6000_BTM_FRSQRTE, BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, "__builtin_rsqrtf", RS6000_BTM_FRSQRTES, RS6000_BTC_FP) +BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase", +RS6000_BTM_POWERPC, RS6000_BTC_MISC) + /* Darwin CfString builtin. */ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS, RS6000_BTC_MISC) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 6c58307..24e274d 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -9747,6 +9747,30 @@ rs6000_overloaded_builtin_p (enum rs6000_builtins fncode) return (rs6000_builtin_info[(int)fncode].attr & RS6000_BTC_OVERLOADED) != 0; } +/* Expand an expression EXP that calls a builtin without arguments. 
*/ +static rtx +rs6000_expand_noop_builtin (enum insn_code icode, rtx target) +{ + rtx pat; + enum machine_mode tmode = insn_data[icode].operand[0].mode; + + if (icode == CODE_FOR_nothing) +/* Builtin not supported on this processor. */ +return 0; + + if (target == 0 + || GET_MODE (target) != tmode + || ! (*insn_data[icode].operand[0].predicate) (target, tmode)) +target = gen_reg_rtx (tmode); + + pat = GEN_FCN (icode) (target); + if (! pat) +return 0; + emit_insn (pat); + + return target; +} + static rtx rs6000_expand_unop_builtin (enum insn_code icode, tree exp, rtx target) @@ -11336,6 +11360,9 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, ? CODE_FOR_bpermd_di : CODE_FOR_bpermd_si), exp, target); +case RS6000_BUILTIN_GET_TB: + return rs6000_expand_noop_builtin (CODE_FOR_get_timebase, target); + case ALTIVEC_BUILTIN_MASK_FOR_LOAD: case ALTIVEC_BUILTIN_MASK_FOR_STORE: { @@ -11620,6 +11647,10 @@ rs6000_init_builtins (void) POWER7_BUILTIN_BPERMD, "__builtin_bpermd"); def_builtin ("__builtin_bpermd", ftype, POWER7_BUILTIN_BPERMD); + ftype = build_function_type_list (unsigned_intDI_type_node, + NULL_TREE); + def_builtin ("__builtin_ppc_get_timebase", ftype, RS6000_BUILTIN_GET_TB); + #if TARGET_XCOFF /* AIX libm provides clog as __clog. 
*/ if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE) diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index d5ffd81..09bdd80 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -136,6 +136,7 @@ UNSPECV_PROBE_STACK_RANGE ; probe range of stack addresses UNSPECV_EH_RR ; eh_reg_restore UNSPECV_ISYNC ; isync instruction + UNSPECV_GETTB ; get timebase built-in ]) @@ -14101,6 +14102,41 @@ "" "") +(define_expand "get_timebase" + [(use (match_operand:DI 0 "gpc_reg_operand" ""))] + "" + " +{ + if (TARGET_POWERPC64) +emit_insn (gen_get_timebase_ppc64 (operands[0])); + else if (TARGET_POWERPC) +emit_insn (gen_get_timebase_ppc32 (operands[0])); + else +FAIL; + DONE; +}") + +(define_insn "get_timebase_ppc32" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") +(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB)) + (clobber (match_scratch:SI 1 "=r"))] + "TARGET_POWERPC && !TARGET_POWERPC64" +{ +return "mftbu %0\;" + "mftb %L0\;" + "mftbu %1\;" + "cmpw %0,%1\;" + "bne- $-16"; +}) + +(define_insn "get_timebase_ppc64"
Re: [PATCH,PING]] gcc/config/freebsd-spec.h: Fix building PIE
Hi Gerald, On Sun, 26 Aug 2012 23:28:49 +0200 (CEST) Gerald Pfeifer wrote: > I have tested this patch on i386-unknown-freebsd10.0 and volunteer > to create a ChangeLog and apply if approved. Thanks for taking care of this, I thought this patch had been completely forgotten :) FWIW, the git commit message was supposed to be the ChangeLog entry (imho it is rather pointless to include it in the patch since it will most likely conflict when/if it gets applied) > > Any reviewer? > > On Tue, 8 May 2012, Alexis Ballier wrote: > > For the record, there's a similar logic in FreeBSD's gcc: > > http://svnweb.freebsd.org/base/head/contrib/gcc/config/freebsd-spec.h?revision=200038&view=markup > > Thanks for the patch, Alexis. One question: why do we have the same > in freebsd-spec.h and i386/freebsd.h. Isn't there a way to simplify > this? Like omitting this from i386/freebsd.h at all? To be honest, I don't know why we have the same in these two headers and wondered the same. I suppose it is possible to simply remove it from i386/freebsd.h but I didn't test this since I didn't want to mix a bugfix for PIE and cleanup of the code within the same patch. Regards, Alexis. [...]
Re: faster random number engine
On 8/29/12 4:19 PM, Ulrich Drepper wrote:
> The header so far contains the random number engines documented in the
> header. None of these are well suited for modern CPUs. There is a variant
> of the Mersenne twister engines which is explicitly designed to perform
> well on CPUs with SIMD instructions. The result is an engine with equal
> properties to the original Mersenne twisters but several times faster.

Great.

> The attached patch implements this new engine. It's in the __gnu_cxx
> namespace. I added definitions for all the variants defined by the
> authors. The test suite checks the returned values based on results
> obtained from the original code.
>
> The SIMD optimization is so far done for x86. In all other cases a generic
> implementation is used. The generic implementation works correctly for
> little endian machines. For big endian machines someone will come up with
> fixes. Until then the new default.cc test is expected to fail.
>
> I hope this is an uncontroversial change.

The substance isn't controversial, of course. But normally we don't put
__gnu_cxx things in the same std header. Can't we have a new ext/random and
put it in there? If we can separate the new code into it, I think people
would not even object to the target dependency, etc. In ext/ we are quite
free to do extension / experimental work.

Paolo.
Re: [PATCH 2/3] Incorporate aggregate jump functions into inlining analysis
On Thu, Aug 2, 2012 at 12:28 PM, Martin Jambor wrote: > Hi, > > this patch uses the aggregate jump functions created by the previous > patch in the series to determine benefits of inlining a particular > call graph edge. It has not changed much since the last time I posted > it, except for the presence of by_ref flags and removal of checks > required by TBAA which we now do not use. > > The patch works in fairly straightforward way. It ads two flags to > struct condition to specify it actually refers to an aggregate passed > by value or something passed by reference, in both cases at a > particular offset, also newly stored in the structures. Functions > which build the predicates specifying under which conditions CFG edges > will be taken or individual statements are actually executed then > simply also look whether a value comes from an aggregate passed to us > in a parameter (either by value or reference) and if so, create > appropriate conditions. Later on, predicates are evaluated as before, > we only also look at aggregate contents of the jump functions of the > edge we are considering to inline when evaluating the predicates, and > also remap the offsets of the jump functions when remapping over an > ancestor jump function. > > This patch alone makes us inline the function bar in testcase of PR > 48636 in comment #4. It also passes bootstrap and testing on > x86_64-linux. I successfully LTO-built Firefox with it too. > > Thanks for all comments and suggestions, > > Martin > > > 2012-07-31 Martin Jambor > > PR fortran/48636 > * ipa-inline.h (condition): New fields offset, agg_contents and > by_ref. > * ipa-inline-analysis.c (agg_position_info): New type. > (add_condition): New parameter aggpos, also store agg_contents, by_ref > and offset. > (dump_condition): Also dump aggregate conditions. > (evaluate_conditions_for_known_args): Also handle aggregate > conditions. New parameter known_aggs. > (evaluate_properties_for_edge): Gather known aggregate contents. 
> (inline_node_duplication_hook): Pass NULL known_aggs to > evaluate_conditions_for_known_args. > (unmodified_parm): Split into unmodified_parm and unmodified_parm_1. > (unmodified_parm_or_parm_agg_item): New function. > (set_cond_stmt_execution_predicate): Handle values passed in > aggregates. > (set_switch_stmt_execution_predicate): Likewise. > (will_be_nonconstant_predicate): Likewise. > (estimate_edge_devirt_benefit): Pass new parameter known_aggs to > ipa_get_indirect_edge_target. > (estimate_calls_size_and_time): New parameter known_aggs, pass it > recrsively to itself and to estimate_edge_devirt_benefit. > (estimate_node_size_and_time): New vector known_aggs, pass it o > functions which need it. > (remap_predicate): New parameter offset_map, use it to remap aggregate > conditions. > (remap_edge_summaries): New parameter offset_map, pass it recursively > to itself and to remap_predicate. > (inline_merge_summary): Also create and populate vector offset_map. > (do_estimate_edge_time): New vector of known aggregate contents, > passed to functions which need it. > (inline_read_section): Stream new fields of condition. > (inline_write_summary): Likewise. > * ipa-cp.c (ipa_get_indirect_edge_target): Also examine the aggregate > contents. Let all local callers pass NULL for known_aggs. > This caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54409 H.J.
Re: out-of-line and arch-specific random_device
On Wed, Aug 29, 2012 at 9:48 AM, Paolo Carlini wrote:
> Minor nit: are you sure we need to
> open a new minor version for the new symbol? Because it seemed to me that
> 4.7.x was behind by one.

I have 4.7 installed and that version already defines the symbols defined in
version 3.4.17. This is a new symbol and requires a new version to prevent
startup of an app in case of a too old runtime library.
Re: out-of-line and arch-specific random_device
On 8/29/12 4:49 PM, Ulrich Drepper wrote:
> On Wed, Aug 29, 2012 at 9:48 AM, Paolo Carlini wrote:
>> Minor nit: are you sure we need to open a new minor version for the new
>> symbol? Because it seemed to me that 4.7.x was behind by one.
>
> I have 4.7 installed and that version already defines the symbols defined
> in version 3.4.17. This is a new symbol and requires a new version to
> prevent startup of an app in case of a too old runtime library.

Ah, in that case we definitely have to bump the minor version. I thought
- if it wasn't clear - that current mainline was *already* ahead of the
current 4_7-branch.

Paolo.
Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC
Hi Tulio,

> Add __builtin_ppc_get_timebase to read the time base register on PowerPC.
> This is required for applications that measure time at high frequencies
> with high precision that can't afford a syscall.

For things that do mftb with high frequency, maybe you should also add a
builtin that does just an mftb, i.e. returns a 32-bit result on 32-bit
implementations.

Please add documentation for the new builtin(s).

> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -1429,6 +1429,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, "__builtin_rsqrt", RS6000_BTM_FRSQRTE,
>  BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, "__builtin_rsqrtf", RS6000_BTM_FRSQRTES,
>                RS6000_BTC_FP)
>
> +BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase",
> +              RS6000_BTM_POWERPC, RS6000_BTC_MISC)

RS6000_BTM_POWERPC does not exist anymore. RS6000_BTM_ALWAYS?

> +/* Expand an expression EXP that calls a builtin without arguments.  */
> +static rtx
> +rs6000_expand_noop_builtin (enum insn_code icode, rtx target)

"noop" gives the wrong idea, "zeroop" perhaps?

> +(define_expand "get_timebase"

You should probably prefix this with powerpc_ or rs6000_ as well. The
existing code is not very consistent in this.

> +  [(use (match_operand:DI 0 "gpc_reg_operand" ""))]
> +  ""
> +  "
> +{
> +  if (TARGET_POWERPC64)
> +    emit_insn (gen_get_timebase_ppc64 (operands[0]));
> +  else if (TARGET_POWERPC)
> +    emit_insn (gen_get_timebase_ppc32 (operands[0]));
> +  else
> +    FAIL;
> +  DONE;
> +}")

TARGET_POWERPC is always true.

> +(define_insn "get_timebase_ppc32"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +        (unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))
> +   (clobber (match_scratch:SI 1 "=r"))]
> +  "TARGET_POWERPC && !TARGET_POWERPC64"
> +{
> +  return "mftbu %0\;"
> +         "mftb %L0\;"
> +         "mftbu %1\;"
> +         "cmpw %0,%1\;"
> +         "bne- $-16";
> +})

This only works for WORDS_BIG_ENDIAN. You should say you clobber CR0 here
I think; actually, allow any CRn instead.

Does mftb work on all supported assemblers? The machine instruction is
phased out, but some assemblers translate it to mfspr.

> +(define_insn "get_timebase_ppc64"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +        (unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))]
> +  "TARGET_POWERPC64"
> +{
> +  return "mfspr %0, 268";
> +})

POWER3 needs mftb.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c
> @@ -0,0 +1,22 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */
> +
> +/* Test if __builtin_ppc_get_timebase() is compatible with the current
> +   processor and if it's changing between reads. A read failure might indicate
> +   a Power ISA or binutils change.  */
> +
> +#include
> +
> +int
> +main(void)
> +{
> +  uint64_t t1, t2, t3;
> +
> +  t1 = __builtin_ppc_get_timebase ();
> +  t2 = __builtin_ppc_get_timebase ();
> +  t3 = __builtin_ppc_get_timebase ();
> +
> +  if (t1 != t2 && t1 != t3 && t2 != t3)
> +    return 0;
> +
> +  return 1;
> +}

On some systems the timebase runs at a rather low frequency, say 20MHz.
This test will spuriously fail there. Waste a million CPU cycles before
reading TB the second time?

Segher
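For readers unfamiliar with the 32-bit sequence quoted above (mftbu, mftb, mftbu, compare, branch back), here is a portable C sketch of the same retry idea. It does not touch real hardware: the counter is simulated, and the struct and function names are hypothetical. It assumes the counter advances only a little between reads, so the loop terminates.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated 64-bit time base that advances by STEP on every read.
   Purely a stand-in for the hardware register.  */
struct sketch_tb
{
  uint64_t now;
  uint64_t step;
};

/* Stand-in for reading the upper 32 bits (mftbu in the quoted patch).  */
static uint32_t
sketch_read_upper (struct sketch_tb *tb)
{
  tb->now += tb->step;
  return (uint32_t) (tb->now >> 32);
}

/* Stand-in for reading the lower 32 bits (mftb in the quoted patch).  */
static uint32_t
sketch_read_lower (struct sketch_tb *tb)
{
  tb->now += tb->step;
  return (uint32_t) tb->now;
}

/* Read upper, lower, upper again; retry if the upper half changed,
   because the lower half may then have wrapped between the reads.  */
static uint64_t
sketch_get_timebase (struct sketch_tb *tb)
{
  uint32_t hi, lo, hi2;

  assert (tb != 0);
  do
    {
      hi = sketch_read_upper (tb);
      lo = sketch_read_lower (tb);
      hi2 = sketch_read_upper (tb);
    }
  while (hi != hi2);

  return ((uint64_t) hi << 32) | lo;
}
```

If the lower half wraps between the first two reads, a single pass would pair the old upper half with the new lower half and return a value far in the past; the second upper read detects that case and the loop retries.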
[PATCH] PR other/54411: libiberty: objalloc_alloc integer overflows (CVE-2012-3509)
This patch fixes an integer overflow in libiberty, which leads to
crashes in binutils. The long version of the objalloc_alloc macro
would have needed another conditional, so I removed that and replaced
it with a call to the actual implementation.

This has been compile-tested only. We do not use this function in GCC,
therefore I want to commit this just to the trunk.

2012-08-29 Florian Weimer

PR other/54411
* objalloc.h (objalloc_alloc): Always use the simple definition of
the macro.

2012-08-29 Florian Weimer

PR other/54411
* objalloc.c (_objalloc_alloc): Add overflow check covering
alignment and CHUNK_HEADER_SIZE addition.

Index: include/objalloc.h
===
--- include/objalloc.h (revision 190780)
+++ include/objalloc.h (working copy)
@@ -1,5 +1,5 @@
 /* objalloc.h -- routines to allocate memory for objects
- Copyright 1997, 2001 Free Software Foundation, Inc.
+ Copyright 1997-2012 Free Software Foundation, Inc.
 Written by Ian Lance Taylor, Cygnus Solutions.

 This program is free software; you can redistribute it and/or modify it
@@ -71,38 +71,8 @@

 extern void *_objalloc_alloc (struct objalloc *, unsigned long);

-/* The macro version of objalloc_alloc. We only define this if using
- gcc, because otherwise we would have to evaluate the arguments
- multiple times, or use a temporary field as obstack.h does. */
-
-#if defined (__GNUC__) && defined (__STDC__) && __STDC__
-
-/* NextStep 2.0 cc is really gcc 1.93 but it defines __GNUC__ = 2 and
- does not implement __extension__. But that compiler doesn't define
- __GNUC_MINOR__. */
-#if __GNUC__ < 2 || (__NeXT__ && !__GNUC_MINOR__)
-#define __extension__
-#endif
-
-#define objalloc_alloc(o, l) \
- __extension__ \
- ({ struct objalloc *__o = (o); \
- unsigned long __len = (l); \
- if (__len == 0) \
- __len = 1; \
- __len = (__len + OBJALLOC_ALIGN - 1) &~ (OBJALLOC_ALIGN - 1); \
- (__len <= __o->current_space \
- ? (__o->current_ptr += __len,\
-__o->current_space -= __len, \
-(void *) (__o->current_ptr - __len)) \
- : _objalloc_alloc (__o, __len)); })
-
-#else /* ! __GNUC__ */
-
 #define objalloc_alloc(o, l) _objalloc_alloc ((o), (l))

-#endif /* ! __GNUC__ */
-
 /* Free an entire objalloc structure. */

 extern void objalloc_free (struct objalloc *);
Index: libiberty/objalloc.c
===
--- libiberty/objalloc.c (revision 190780)
+++ libiberty/objalloc.c (working copy)
@@ -1,5 +1,5 @@
 /* objalloc.c -- routines to allocate memory for objects
- Copyright 1997 Free Software Foundation, Inc.
+ Copyright 1997-2012 Free Software Foundation, Inc.
 Written by Ian Lance Taylor, Cygnus Solutions.

 This program is free software; you can redistribute it and/or modify it
@@ -112,8 +112,9 @@
 /* Allocate space from an objalloc structure. */

 PTR
-_objalloc_alloc (struct objalloc *o, unsigned long len)
+_objalloc_alloc (struct objalloc *o, unsigned long original_len)
 {
+ unsigned long len = original_len;
 /* We avoid confusion from zero sized objects by always allocating at
 least 1 byte. */
 if (len == 0)
@@ -121,6 +122,11 @@

 len = (len + OBJALLOC_ALIGN - 1) &~ (OBJALLOC_ALIGN - 1);

+ /* Check for overflow in the alignment operator above and the malloc
+ argument below. */
+ if (len + CHUNK_HEADER_SIZE < original_len)
+return NULL;
+
 if (len <= o->current_space)
 {
 o->current_ptr += len;
Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC
On Wed, Aug 29, 2012 at 6:56 AM, Tulio Magno Quites Machado Filho wrote: > Add __builtin_ppc_get_timebase to read the time base register on PowerPC. > This is required for applications that measure time at high frequencies > with high precision that can't afford a syscall. > > [gcc] > 2012-08-29 Tulio Magno Quites Machado Filho > > * config/rs6000/rs6000-builtin.def: Add __builtin_ppc_get_timebase. > * config/rs6000/rs6000.c (rs6000_expand_noop_builtin): New > function to expand an expression that calls a builtin without > arguments. > (rs6000_expand_builtin): Add __builtin_ppc_get_timebase. > (rs6000_init_builtins): Likewise. > * config/rs6000/rs6000.md: Likewise. > > [gcc/testsuite] > 2012-08-29 Tulio Magno Quites Machado Filho > > * gcc.target/powerpc/ppc-get-timebase.c: New file. > --- > gcc/config/rs6000/rs6000-builtin.def |3 ++ > gcc/config/rs6000/rs6000.c | 31 + > gcc/config/rs6000/rs6000.md| 36 > > .../gcc.target/powerpc/ppc-get-timebase.c | 22 > 4 files changed, 92 insertions(+), 0 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c > > diff --git a/gcc/config/rs6000/rs6000-builtin.def > b/gcc/config/rs6000/rs6000-builtin.def > index c8f8f86..75ad184 100644 > --- a/gcc/config/rs6000/rs6000-builtin.def > +++ b/gcc/config/rs6000/rs6000-builtin.def > @@ -1429,6 +1429,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, "__builtin_rsqrt", > RS6000_BTM_FRSQRTE, > BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, "__builtin_rsqrtf", RS6000_BTM_FRSQRTES, > RS6000_BTC_FP) > > +BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase", > +RS6000_BTM_POWERPC, RS6000_BTC_MISC) > + > /* Darwin CfString builtin. 
*/ > BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", > RS6000_BTM_ALWAYS, > RS6000_BTC_MISC) > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index 6c58307..24e274d 100644 > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -9747,6 +9747,30 @@ rs6000_overloaded_builtin_p (enum rs6000_builtins > fncode) >return (rs6000_builtin_info[(int)fncode].attr & RS6000_BTC_OVERLOADED) != > 0; > } > > +/* Expand an expression EXP that calls a builtin without arguments. */ > +static rtx > +rs6000_expand_noop_builtin (enum insn_code icode, rtx target) > +{ > + rtx pat; > + enum machine_mode tmode = insn_data[icode].operand[0].mode; > + > + if (icode == CODE_FOR_nothing) > +/* Builtin not supported on this processor. */ > +return 0; > + > + if (target == 0 > + || GET_MODE (target) != tmode > + || ! (*insn_data[icode].operand[0].predicate) (target, tmode)) > +target = gen_reg_rtx (tmode); > + > + pat = GEN_FCN (icode) (target); > + if (! pat) > +return 0; > + emit_insn (pat); > + > + return target; > +} > + > > static rtx > rs6000_expand_unop_builtin (enum insn_code icode, tree exp, rtx target) > @@ -11336,6 +11360,9 @@ rs6000_expand_builtin (tree exp, rtx target, rtx > subtarget ATTRIBUTE_UNUSED, >? CODE_FOR_bpermd_di >: CODE_FOR_bpermd_si), exp, > target); > > +case RS6000_BUILTIN_GET_TB: > + return rs6000_expand_noop_builtin (CODE_FOR_get_timebase, target); > + > case ALTIVEC_BUILTIN_MASK_FOR_LOAD: > case ALTIVEC_BUILTIN_MASK_FOR_STORE: >{ > @@ -11620,6 +11647,10 @@ rs6000_init_builtins (void) > POWER7_BUILTIN_BPERMD, "__builtin_bpermd"); >def_builtin ("__builtin_bpermd", ftype, POWER7_BUILTIN_BPERMD); > > + ftype = build_function_type_list (unsigned_intDI_type_node, > + NULL_TREE); > + def_builtin ("__builtin_ppc_get_timebase", ftype, RS6000_BUILTIN_GET_TB); > + > #if TARGET_XCOFF >/* AIX libm provides clog as __clog. 
*/ >if ((tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE) > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index d5ffd81..09bdd80 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -136,6 +136,7 @@ > UNSPECV_PROBE_STACK_RANGE ; probe range of stack addresses > UNSPECV_EH_RR ; eh_reg_restore > UNSPECV_ISYNC ; isync instruction > + UNSPECV_GETTB ; get timebase built-in >]) > > > @@ -14101,6 +14102,41 @@ >"" >"") > > +(define_expand "get_timebase" > + [(use (match_operand:DI 0 "gpc_reg_operand" ""))] > + "" > + " > +{ > + if (TARGET_POWERPC64) > +emit_insn (gen_get_timebase_ppc64 (operands[0])); > + else if (TARGET_POWERPC) > +emit_insn (gen_get_timebase_ppc32 (operands[0])); > + else > +FAIL; > + DONE; > +}") > + > +(define_insn "get_timebase_ppc32" > + [(set (match_operand:DI 0 "gp
Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC
On Wed, 29 Aug 2012, Segher Boessenkool wrote: > > +++ b/gcc/testsuite/gcc.target/powerpc/ppc-get-timebase.c > > @@ -0,0 +1,22 @@ > > +/* { dg-do run { target { powerpc*-*-* } } } */ > > + > > +/* Test if __builtin_ppc_get_timebase() is compatible with the current > > + processor and if it's changing between reads. A read failure might > > indicate > > + a Power ISA or binutils change. */ > > + > > +#include > > + > > +int > > +main(void) > > +{ > > + uint64_t t1, t2, t3; > > + > > + t1 = __builtin_ppc_get_timebase (); > > + t2 = __builtin_ppc_get_timebase (); > > + t3 = __builtin_ppc_get_timebase (); > > + > > + if (t1 != t2 && t1 != t3 && t2 != t3) > > +return 0; > > + > > + return 1; > > +} > > On some systems the timebase runs at a rather low frequency, say 20MHz. > This test will spuriously fail there. Waste a million CPU cycles before > reading TB the second time? Waste said million cycles portably by calling sched_yield()? (Available only on POSIX systems. :) brgds, H-P
Re: Loop iterations inline hint
Hi,

On Tue, Aug 21, 2012 at 08:55:02AM +0200, Jan Hubicka wrote:
> Hi,
> this patch adds a hint that if inlining makes bounds on loop iterations
> known, it is probably a good idea. This is primarily targeting Fortran's
> array descriptors, but should be generally useful.
>
> Fortran will still need a bit more work. Often we disregard inlining
> because we think the call is cold (because it comes from Main) so the
> inlining heuristic will need more updating and apparently we will also
> need to update for PHI conditionals as done in Martin's patch 3/3.

My patch helps only a bit. For example, on the pr48636.f90 testcase it
still does not help to discover a loop bound hint because the patch,
being overly simple, looks at edge->aux predicates to construct
predicates of phi results constantness and those are computed before
we start populating nonconstant_names. So phi nodes for any
conditions based on expressions (as opposed to direct parameter
values) are not considered.

I'll see how far I can get by re-evaluating the condition instead (but
eventually we will probably want to do the full propagation, though
perhaps not in 4.8).

> At the moment the hint is interpreted the same way as the indirect_call
> hint from the previous patch.
>
> Martin: I think ipa-cp should also make use of this hint. Resolving
> number of loop iterations is important enough reason to specialize
> in many cases. I think it already has logic for devirtualization
> but perhaps it should be made more aggressive? I was sort of
> surprised that for Mozilla the inlining hint makes us catch 20
> times more cases than before. Most of the cases sound like good
> ipa-cp candidates.

Interesting, I can experiment with that, sure. On the other hand, I'd
be careful about any measurements taken after August 13 because of PR
54394 which might have caused some edges to be considered much cooler
than they are. I'll post a patch for it in a minute.

> Also can you please try to finally make param notes be used by the
> virtual clones machinery and thus make it possible for ipa-cp to
> specialize for known aggregate parameters? This should make a lot of
> difference for Fortran, I think.

Yeah, that's the next big item on my list to do after I finish all the
little ones (like the PHIs, for example), but hopefully, I'll have
that done soon.

Martin
Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC
On Wed, Aug 29, 2012 at 01:56:05PM -0400, Hans-Peter Nilsson wrote:
> On Wed, 29 Aug 2012, Segher Boessenkool wrote:
> > On some systems the timebase runs at a rather low frequency, say 20MHz.
> > This test will spuriously fail there. Waste a million CPU cycles before
> > reading TB the second time?
>
> Waste said million cycles portably by calling sched_yield()?
> (Available only on POSIX systems. :)

Well, only for a test environment. You don't want to call sched_yield in the
normal case, since the apps that do this many millions of times need this to
be as fast as possible.

--
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
[PATCH, PR 54394] Compute loops when generating inline summaries
Hi, the patch below fixes PR 54394. The problem is that since revision 190346 we depend on bb->loop_father being non-NULL to get loop_depth. However, with loops not computed, the loop_father is NULL, loop_depth is thus considered zero and call graph edges out of such BB can be considered much cooler, leading to inlining regressions. This patch fixes that by recomputing loops whenever optimizing, not only for loop bounds hints. We might put the computation elsewhere or do it only under more restrictive circumstances, but I believe that after rev. 190346 we have to do it. In particular, I am not sure whether we had (semi)correct loop_depths when doing early inlining or not, this patch re-calculates it for early inliner too. Bootstrapped and tested on x86_64-linux, fixes fatigue run-time on an x86_64-linux and i686-linux for me. What do you think? Thanks, Martin 2012-08-29 Martin Jambor PR middle-end/54394 * ipa-inline-analysis.c (estimate_function_body_sizes): Compute dominance info and loops whenever optimizing. 
Index: src/gcc/ipa-inline-analysis.c === --- src.orig/gcc/ipa-inline-analysis.c +++ src/gcc/ipa-inline-analysis.c @@ -2102,6 +2102,11 @@ estimate_function_body_sizes (struct cgr info->conds = 0; info->entry = 0; + if (optimize) +{ + calculate_dominance_info (CDI_DOMINATORS); + loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); +} if (dump_file) fprintf (dump_file, "\nAnalyzing function body size: %s\n", @@ -2270,9 +2275,6 @@ estimate_function_body_sizes (struct cgr loop_iterator li; predicate loop_iterations = true_predicate (); - calculate_dominance_info (CDI_DOMINATORS); - loop_optimizer_init (LOOPS_NORMAL - | LOOPS_HAVE_RECORDED_EXITS); if (dump_file && (dump_flags & TDF_DETAILS)) flow_loops_dump (dump_file, NULL, 0); scev_initialize (); @@ -2305,12 +2307,15 @@ estimate_function_body_sizes (struct cgr *inline_summary (node)->loop_iterations = loop_iterations; } scev_finalize (); - loop_optimizer_finalize (); - free_dominance_info (CDI_DOMINATORS); } inline_summary (node)->self_time = time; inline_summary (node)->self_size = size; VEC_free (predicate_t, heap, nonconstant_names); + if (optimize) +{ + loop_optimizer_finalize (); + free_dominance_info (CDI_DOMINATORS); +} if (dump_file) { fprintf (dump_file, "\n");
Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC
Hi Segher,

Segher Boessenkool writes:

>> Add __builtin_ppc_get_timebase to read the time base register on PowerPC.
>> This is required for applications that measure time at high frequencies
>> with high precision that can't afford a syscall.
>
> For things that do mftb with high frequency, maybe you should also add
> a builtin that does just an mftb, i.e. returns a 32-bit result on 32-bit
> implementations.

Are you thinking of a function that returns only the TBL?
I don't think such a builtin would make sense in a 64-bit environment, right?
Do you have a suggestion for its name?

> Please add documentation for the new builtin(s).

Sure!

>> --- a/gcc/config/rs6000/rs6000-builtin.def
>> +++ b/gcc/config/rs6000/rs6000-builtin.def
>> @@ -1429,6 +1429,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_RSQRT, "__builtin_rsqrt", RS6000_BTM_FRSQRTE,
>> BU_SPECIAL_X (RS6000_BUILTIN_RSQRTF, "__builtin_rsqrtf", RS6000_BTM_FRSQRTES,
>> RS6000_BTC_FP)
>>
>> +BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, "__builtin_ppc_get_timebase",
>> +RS6000_BTM_POWERPC, RS6000_BTC_MISC)
>
> RS6000_BTM_POWERPC does not exist anymore. RS6000_BTM_ALWAYS?

I'll replace it.

>> +/* Expand an expression EXP that calls a builtin without arguments. */
>> +static rtx
>> +rs6000_expand_noop_builtin (enum insn_code icode, rtx target)
>
> "noop" gives the wrong idea, "zeroop" perhaps?

zeroop is much better.

>> +(define_expand "get_timebase"
>
> You should probably prefix this with powerpc_ or rs6000_ as well.
> The existing code is not very consistent in this.

OK.

>> + [(use (match_operand:DI 0 "gpc_reg_operand" ""))]
>> + ""
>> + "
>> +{
>> + if (TARGET_POWERPC64)
>> +emit_insn (gen_get_timebase_ppc64 (operands[0]));
>> + else if (TARGET_POWERPC)
>> +emit_insn (gen_get_timebase_ppc32 (operands[0]));
>> + else
>> +FAIL;
>> + DONE;
>> +}")
>
> TARGET_POWERPC is always true.

OK.

>> +(define_insn "get_timebase_ppc32"
>> + [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
>> +(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))
>> + (clobber (match_scratch:SI 1 "=r"))]
>> + "TARGET_POWERPC && !TARGET_POWERPC64"
>> +{
>> +return "mftbu %0\;"
>> + "mftb %L0\;"
>> + "mftbu %1\;"
>> + "cmpw %0,%1\;"
>> + "bne- $-16";
>> +})
>
> This only works for WORDS_BIG_ENDIAN.

Yes.

> You should say you clobber CR0 here I think; actually, allow any CRn
> instead.

Yes.

> Does mftb work on all supported assemblers? The machine instruction
> is phased out, but some assemblers translate it to mfspr.

According to the Power ISA 2.06 they should translate it to mfspr.

>> +(define_insn "get_timebase_ppc64"
>> + [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
>> +(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))]
>> + "TARGET_POWERPC64"
>> +{
>> +return "mfspr %0, 268";
>> +})
>
> POWER3 needs mftb.

Nice catch!

>> +int
>> +main(void)
>> +{
>> + uint64_t t1, t2, t3;
>> +
>> + t1 = __builtin_ppc_get_timebase ();
>> + t2 = __builtin_ppc_get_timebase ();
>> + t3 = __builtin_ppc_get_timebase ();
>> +
>> + if (t1 != t2 && t1 != t3 && t2 != t3)
>> +return 0;
>> +
>> + return 1;
>> +}
>
> On some systems the timebase runs at a rather low frequency, say 20MHz.
> This test will spuriously fail there. Waste a million CPU cycles before
> reading TB the second time?

Yes.

Thank you,

--
Tulio Magno
Re: faster random number engine
On Wed, Aug 29, 2012 at 11:43 AM, Paolo Carlini wrote:
> The substance isn't of course. But normally we don't have __gnu_cxx things
> in the same std header. Can't we have a new ext/random and put it in there?
> If we can separate the new code to it, I think people would not even object
> to the target dependency, etc. In ext/ we are quite free to do extension /
> experimental work.

OK, I moved the definition to ext. Will check in the result.
Re: [PATCH] MIPS16 TLS support for GCC
Chung-Lin Tang writes: > On 2012/7/6 02:23 PM, Richard Sandiford wrote: >> Richard Sandiford writes: (3) Also related to libraries, I edited CRT_CALL_STATIC_FUNCTION to emit a 32-bit code sequence under both MIPS/MIPS16 mode (under O32). As you can see in the original Feb. patch, I had changes to emit a MIPS16 version of these static calls, but with the changes in (2) above, they will not work with the usual situation of a 32-bit MIPS built /lib (.init/.fini will have 32/16-bit code improperly concatenated). The CodeSourcery builds use an independent mips16 sysroot for this, so a MIPS16 CRT_CALL_STATIC_FUNCTION works there. For the usual case, I think making it 32-bit is the compatible choice. >>> >>> Yeah, I agree that sounds like the right call. Please do the same >>> for the n32/n64 version (i.e. explicitly make it nomips16 rather >>> than add the #error). >> >> BTW, doing this has removed my main concern about having dead code. >> The original patch had a separate MIPS16 implementation that (as things >> stood) could never be used by stock sources. That would make it difficult >> to maintain. >> >> Now that the MIPS16 library support is purely adding nomips16 attributes >> to code that is obviously nomips16, those parts are OK on their own, thanks. >> (I.e. the mips.h change, the libgcc change, and the libgomp change.) >> Feel free to drop the multilib thing if you don't want to implement >> --with-multilib-list. > > Hi Richard, just FYI, I just committed the said approved parts. > gcc/config/mips/t-linux64 had one additional change, adding > ../lib/mips16 to the corresponding MULTILIB_OSDIRNAMES, or else we end > with a weird option-named directory for the mips16 libraries. Sorry, but the t-linux64 stuff wasn't approved. It was just the mips.h change, the libgcc change and the libgomp change. Please revert the patch to t-linux64. 
My original objection to adding mips16 unconditionally still stands: it isn't correct for people who configure for processors that don't have the MIPS16 ASE (such as Octeon). Thanks, Richard
Re: remove dependency on cp/parser.h from cp/lang.c
On Wed, 29 Aug 2012, Aaron Gray wrote:

> Just got my copyright assignment through, so here's my first GCC patch,

Welcome!

> This is a one liner removing the unneeded dependency of cp-lang.c on
> cp/parser.h. This has been tested on Linux.

I think you need to attach a ChangeLog entry with every patch. Also,
dependencies are often repeated in makefiles. Is there anything to update
with your patch? (maybe not, just asking)

--
Marc Glisse
Re: [patch] Fix problems with -fdebug-types-section and local types
> Ping.
>
> http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00398.html

Because much of this patch was superseded by this recent patch:

http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01968.html

I'll combine the two and submit a new patch.

-cary
Re: [MIPS, committed] Add missing COSTS_N_INSNS call.
Richard Sandiford writes: > Hans-Peter Nilsson writes: >> On Tue, 28 Aug 2012, Richard Sandiford wrote: >>> Hans-Peter Nilsson writes: >>> > On Sun, 26 Aug 2012, Richard Sandiford wrote: >>> >> I'm preparing a patch to turn gcc.target/mips into a torture-like >>> >> testsuite. >>> > >>> > While on the subject of gcc.target/mips and its extensions, it >>> > also doesn't handle a build configured with --with-synci=yes. >>> > (Well, not on the 4.7 branch at least.) >>> >>> What goes wrong? >> >> I don't remember details, but IIRC some synci-related tests go >> wrong for mipsisa32r2el-linux-gnu due to -msynci being the >> default. Don't worry, I've fixed it in the local import. :) >> I though the above would entice you to try it, but I guess I >> need to report better for that to happen. Maybe later. > > Trying it now. I suspect it was the problem that Steve hit: > the implicit -msynci is still (deliberately) kept when a lower > architecture is selected. > > I'm testing a patch to make the testsuite work out the default > -m{no,}synci, which ought to be enough. The usual rules should > then kick in and force -mno-synci where necessary. Hopefully. Here's the patch. Tested on mipsisa64r2-elf, where mips.exp comes out clean. I looked at the logs to make sure that -mno-synci was being passed for lower architectures but that no explicit -msynci or -mno-synci option was used when testing mips64r2. Applied. Richard gcc/ * config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): Define __mips_synci if TARGET_SYNCI. gcc/testsuite/ * gcc.target/mips/mips.exp: Work out default -msynci setting. Index: gcc/config/mips/mips.h === --- gcc/config/mips/mips.h 2012-08-29 19:40:47.0 +0100 +++ gcc/config/mips/mips.h 2012-08-29 19:50:50.144982449 +0100 @@ -517,6 +517,9 @@ #define TARGET_CPU_CPP_BUILTINS() \ if (TARGET_OCTEON) \ builtin_define ("__OCTEON__"); \ \ + if (TARGET_SYNCI) \ + builtin_define ("__mips_synci");\ + \ /* Macros dependent on the C dialect. 
*/ \ if (preprocessing_asm_p ()) \ { \ Index: gcc/testsuite/gcc.target/mips/mips.exp === --- gcc/testsuite/gcc.target/mips/mips.exp 2012-08-27 17:27:13.0 +0100 +++ gcc/testsuite/gcc.target/mips/mips.exp 2012-08-29 19:50:50.141982450 +0100 @@ -767,6 +767,12 @@ proc mips-dg-init {} { "-mno-smartmips", #endif + #ifdef __mips_synci + "-msynci", + #else + "-mno-synci", + #endif + 0 }; }]
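The mips.exp change above relies on a small idiom: compile a snippet with the default options and let a predefined macro reveal which setting is in effect. A minimal C sketch of the same idiom follows. `default_synci_option` is a hypothetical helper, not part of the testsuite, and `__mips_synci` is only defined by a MIPS compiler that carries the mips.h change above, so on any other target the function simply reports the negative form.

```c
/* Report which synci option the compiler's defaults imply.
   __mips_synci is the macro added by the mips.h hunk above; it is
   undefined on other targets and on compilers without that change.  */
static const char *default_synci_option(void)
{
#ifdef __mips_synci
  return "-msynci";
#else
  return "-mno-synci";
#endif
}
```

The design point is that the answer comes from the compiler under test rather than from a guess based on the architecture level, which is what made the earlier `__mips_isa_rev`-style checks fragile.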
[patch] Fix problems with -fdebug-types-section
I've combined these two pending patches into one: http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00398.html http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01968.html The first patch fixed a problem with copying too much of a referenced type into a type unit, by changing clone_tree_hash() to copy subprograms as declarations. In the second patch, I found that clone_tree_hash() was still copying too much, and determined that it shouldn't be called at all. Notes from the first patch: With --std=c++11, a template parameter can refer to a local type defined within a function. Because that local type doesn't qualify for its own type unit, we copy it as an "unworthy" type into the type unit that refers to it, but we copy too much, leading to a comdat type unit that contains a DIE with subprogram definitions rather than declarations. These DIEs may have DW_AT_low_pc/high_pc or DW_AT_ranges attributes, and consequently can refer to range list entries that don't get emitted because they're not marked when the compile unit is scanned, eventually causing an undefined symbol at link time. In addition, while debugging this problem, I found that the DW_AT_object_pointer attribute, when left in the skeletons that are left behind in the compile unit, causes duplicate copies of the types to be copied back into the compile unit. This patch fixes these problems by removing the DW_AT_object_pointer attribute from the skeleton left behind in the compile unit, and by copying DW_TAG_subprogram DIEs as declarations when copying "unworthy" types into a type unit. In order to preserve information in the DIE structure, I also added DW_AT_abstract_origin as an attribute that should be copied when cloning a DIE as a declaration. I also fixed the dwarf4-typedef.C test, which should be turning on the -fdebug-types-section flag. 
Notes from the second patch:

When a class template instantiation is moved into a separate type
unit, it can bring along a lot of other referenced types into the
type unit, especially if the template is derived from another (large)
type that does not actually have a type definition in a type unit of
its own. When there are many instantiations of the same template, we
get a lot of duplication, and in the worst case (a template with
several parameters, instantiated multiple times along each dimension),
GCC can end up taking a long time and exhausting available memory.

This combinatorial explosion is being caused by copy_decls_walk, where
it finds a type DIE that is referenced by the type unit, but is not
itself a type unit, and copies a declaration for that type into the
type unit in order to resolve the reference within the type unit.
In the process, copy_decls_walk also copies all of the children of
that DIE. In the case of a base class with member function templates,
every one of the instantiated member functions is copied into every
type unit that references the base class.

I don't believe that it's necessary to copy the children of the class
declaration at all, and this patch simply removes the code that copies
those children. If there's a reference in the type unit to one of the
children of that class, that one child will get copied in as needed.

Bootstraps and passes regression tests. Also tested with a large
internal test case that previously resulted in out-of-memory during
compilation.

OK for trunk?

2012-08-29 Cary Coutant

gcc/
* dwarf2out.c (clone_as_declaration): Copy DW_AT_abstract_origin
attribute.
(generate_skeleton_bottom_up): Remove DW_AT_object_pointer attribute
from original DIE.
(clone_tree_hash): Remove.
(copy_decls_walk): Don't copy children of a declaration into a
type unit.

gcc/testsuite/
* testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C: New test case.
* testsuite/g++.dg/debug/dwarf2/dwarf4-typedef.C: Add
-fdebug-types-section flag.
Index: gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C === --- gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C (revision 0) +++ gcc/testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C (revision 0) @@ -0,0 +1,55 @@ +// { dg-do compile } +// { dg-options "--std=c++11 -dA -gdwarf-4 -fdebug-types-section -fno-merge-debug-strings" } + +// Check that -fdebug-types-sections does not copy a full referenced type +// into a type unit. + +// Checks that at least one type unit is generated. +// +// { dg-final { scan-assembler "DIE \\(\[^\n\]*\\) DW_TAG_type_unit" } } +// +// Check that func is declared exactly twice in the debug info: +// once in the type unit for struct D, and once in the compile unit. +// +// { dg-final { scan-assembler-times "\\.ascii \"func0\"\[^\n\]*DW_AT_name" 2 } } +// +// Check to make sure that no type unit contains a DIE with DW_AT_low_pc +// or DW_AT_ranges. These patterns assume that the compile unit is always +// emitted after all type units. +// +// { dg-final {
Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC
>> On some systems the timebase runs at a rather low frequency, say 20MHz.
>> This test will spuriously fail there. Waste a million CPU cycles before
>> reading TB the second time?
>
> Waste said million cycles portably by calling sched_yield()?
> (Available only on POSIX systems. :)

I was thinking more along the lines of

  int j;
  for (j = 0; j < 100; j++)
    asm("" : : "r"(j));

which is more portable (and a lot more predictable).

Segher
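The busy-wait idea above can be put into a self-contained form. The sketch below is an illustration under assumptions, not the test from the patch: `read_counter` uses the standard `clock()` function as a portable stand-in so it runs on any host, and switches to `__builtin_ppc_get_timebase` only if the build defines `SKETCH_HAVE_PPC_TIMEBASE`, a hypothetical macro that is not provided by GCC. The function names are likewise illustrative.

```c
#include <stdint.h>
#include <time.h>

/* Read a monotonically increasing counter.  The PowerPC branch is only
   compiled when the (hypothetical) opt-in macro is defined by the user.  */
static uint64_t read_counter(void)
{
#if defined(SKETCH_HAVE_PPC_TIMEBASE)
  return __builtin_ppc_get_timebase();
#else
  return (uint64_t) clock();
#endif
}

/* Burn roughly n loop iterations.  The empty asm takes j as an input,
   which keeps the compiler from deleting the loop.  */
static void waste_cycles(long n)
{
  long j;

  for (j = 0; j < n; j++)
    __asm__ volatile ("" : : "r" (j));
}

/* Returns 1 if the counter moved forward across the delay, else 0.  */
static int counter_advances(long delay_iterations)
{
  uint64_t t1 = read_counter();
  waste_cycles(delay_iterations);
  uint64_t t2 = read_counter();
  return t2 > t1;
}
```

The iteration count needed for the counter to tick depends on the counter frequency and the CPU speed, so a real test would have to pick it conservatively for the slowest supported time base.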
Re: [google/gcc-4_7, trunk] Fix problem with -fdebug-types-section and template instantiations, take 2
> This patch is for trunk and the google/gcc-4_7 branch. > > 2012-08-28 Cary Coutant > > * gcc/dwarf2out.c (clone_tree_partial): Remove. > (copy_decls_walk): Don't copy children of a declaration > into a type unit. For trunk, I've submitted a new patch that combines this one with a previous pending patch. Still looking for an approval for google/gcc-4_7 branch... -cary
Re: [google/gcc-4_7, trunk] Fix problem with -fdebug-types-section and template instantiations, take 2
On Wed, Aug 29, 2012 at 12:03 PM, Cary Coutant wrote: >> This patch is for trunk and the google/gcc-4_7 branch. >> >> 2012-08-28 Cary Coutant >> >> * gcc/dwarf2out.c (clone_tree_partial): Remove. >> (copy_decls_walk): Don't copy children of a declaration >> into a type unit. > > For trunk, I've submitted a new patch that combines this one with a > previous pending patch. > > Still looking for an approval for google/gcc-4_7 branch... > > -cary This is OK for google 4.7
Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC
>> For things that do mftb with high frequency, maybe you should also add
>> a builtin that does just an mftb, i.e. returns a 32-bit result on 32-bit
>> implementations.
>
> Are you thinking of a function that returns only the TBL?

On 32-bit, just TBL; on 64-bit, the whole TB (there is no machine
instruction to read just TBL on 64-bit, so it doesn't make much
sense to have it return a 32-bit number).

>>> +(define_insn "get_timebase_ppc32"
>>> + [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
>>> +(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB))
>>> + (clobber (match_scratch:SI 1 "=r"))]
>>> + "TARGET_POWERPC && !TARGET_POWERPC64"
>>> +{
>>> +return "mftbu %0\;"
>>> + "mftb %L0\;"
>>> + "mftbu %1\;"
>>> + "cmpw %0,%1\;"
>>> + "bne- $-16";
>>> +})
>>
>> This only works for WORDS_BIG_ENDIAN.
>
> Yes.

Do you mean you are fixing it? :-)

>> Does mftb work on all supported assemblers? The machine instruction
>> is phased out, but some assemblers translate it to mfspr.
>
> According to the Power ISA 2.06 they should translate it to mfspr.

Yes, I realised that later. But then a binary compiled with an
assembler that emits mfspr for mftb will not run on POWER3 or 601.
I don't know what to do about that; maybe just document it.

Segher
Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC
On Wed, 29 Aug 2012, Michael Meissner wrote:
> On Wed, Aug 29, 2012 at 01:56:05PM -0400, Hans-Peter Nilsson wrote:
> > On Wed, 29 Aug 2012, Segher Boessenkool wrote:
> > > On some systems the timebase runs at a rather low frequency, say 20MHz.
> > > This test will spuriously fail there. Waste a million CPU cycles before
> > > reading TB the second time?
> >
> > Waste said million cycles portably by calling sched_yield()?
> > (Available only on POSIX systems. :)
>
> Well only for a test environment. You don't want to call sched_yield in the
> normal case, since the apps that do this many millions of times need this to
> be as fast as possible.

Surely, but IMHO what goes for the normal case is not a valid
reading of "waste"..."millions of cycles". ;)

Point being, for simulator environments, you may not want the
loop that was suggested later. On the other hand, that might
not be an observable period, either.

brgds, H-P
Re: [MIPS, committed] Add missing COSTS_N_INSNS call.
On Wed, 29 Aug 2012, Richard Sandiford wrote: > Richard Sandiford writes: > > I'm testing a patch to make the testsuite work out the default > > -m{no,}synci, which ought to be enough. The usual rules should > > then kick in and force -mno-synci where necessary. Hopefully. > > Here's the patch. > Index: gcc/testsuite/gcc.target/mips/mips.exp > === > --- gcc/testsuite/gcc.target/mips/mips.exp2012-08-27 17:27:13.0 > +0100 > +++ gcc/testsuite/gcc.target/mips/mips.exp2012-08-29 19:50:50.141982450 > +0100 > @@ -767,6 +767,12 @@ proc mips-dg-init {} { > "-mno-smartmips", > #endif > > + #ifdef __mips_synci JFTR, I came up with something very similar locally, but without new builtin defines and with the invalid assumption of configuring with --with-synci=yes, hence "#if (__mips == 32 || __mips == 64) && __mips_isa_rev == 2 && !defined(__mips16)" brgds, H-P
[wwwdocs] SH 4.8 changes update
Hello, The new SH option -menable-tas has been renamed to -mtas in rev 190782. I have committed the attached patch to reflect this in the changes.html for 4.8. Cheers, Oleg ? sh_mtas_rename.patch Index: htdocs/gcc-4.8/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v retrieving revision 1.23 diff -u -r1.23 changes.html --- htdocs/gcc-4.8/changes.html 26 Aug 2012 21:48:50 - 1.23 +++ htdocs/gcc-4.8/changes.html 29 Aug 2012 19:21:06 - @@ -232,9 +232,9 @@ Minor improvements to code generated for software atomic sequences that are enabled by -msoft-atomic. - A new option -menable-tas will make the compiler - generate the tas.b instruction for the - __atomic_test_and_set built-in function. + A new option -mtas will make the compiler generate the + tas.b instruction for the __atomic_test_and_set + built-in function. The SH4A instructions movco.l and movli.l are now supported. They are used to implement some @@ -281,9 +281,9 @@ The behavior of the -mieee option has been fixed and the negative form -mno-ieee has been added to control the IEEE -conformance of floating point comparisons. By default-mieee is -now enabled and the option -ffinite-math-only implicitly sets --mno-ieee. +conformance of floating point comparisons. By default -mieee +is now enabled and the option -ffinite-math-only implicitly +sets -mno-ieee.
Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC
Segher Boessenkool writes: For things that do mftb with high frequency, maybe you should also add a builtin that does just an mftb, i.e. returns a 32-bit result on 32- bit implementations. Are you thinking in a function that returns only the TBL? On 32-bit, just TBL; on 64-bit, the whole TB (there is no machine instruction to read just TBL on 64-bit, so it doesn't make much sense to have it return a 32-bit number). OK. +(define_insn "get_timebase_ppc32" + [(set (match_operand:DI 0 "gpc_reg_operand" "=r") +(unspec_volatile:DI [(const_int 0)] UNSPECV_GETTB)) + (clobber (match_scratch:SI 1 "=r"))] + "TARGET_POWERPC && !TARGET_POWERPC64" +{ +return "mftbu %0\;" + "mftb %L0\;" + "mftbu %1\;" + "cmpw %0,%1\;" + "bne- $-16"; +}) This only works for WORDS_BIG_ENDIAN. Yes. Do you mean you are fixing it? :-) Yes. At least I'll try to. :-) Does mftb work on all supported assemblers? The machine instruction is phased out, but some assemblers translate it to mfspr. According to the Power ISA 2.06 they should translate it to mfspr. Yes, I realised that later. But then a binary compiled with an assembler that emits mfspr for mftb will not run on POWER3 or 601. I don't know what to do about that; maybe just document it. We can easily fix this at runtime, which isn't the case here. Thanks again, -- Tulio Magno
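For readers following the get_timebase_ppc32 pattern quoted above: the sequence reads the upper half, then the lower half, then the upper half again, and branches back if the two upper reads differ. The following is only a minimal sketch of that retry idea in portable C, assuming a simulated counter. The names sim_timebase, sim_read_upper, sim_read_lower and read_timebase_split are hypothetical and are not part of the patch or of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware time base: a 64-bit counter
   that advances by STEP every time one of its halves is read.  */
struct sim_timebase
{
  uint64_t value;
  uint64_t step;
};

/* Models mftbu: return the upper 32 bits, then let time advance.  */
static uint32_t
sim_read_upper (struct sim_timebase *tb)
{
  uint32_t hi = (uint32_t) (tb->value >> 32);
  tb->value += tb->step;
  return hi;
}

/* Models mftb: return the lower 32 bits, then let time advance.  */
static uint32_t
sim_read_lower (struct sim_timebase *tb)
{
  uint32_t lo = (uint32_t) tb->value;
  tb->value += tb->step;
  return lo;
}

/* Read upper, lower, upper again.  If the two upper reads differ, the
   lower half wrapped in between, so the pair may be torn: retry.  */
static uint64_t
read_timebase_split (struct sim_timebase *tb)
{
  uint32_t hi, lo, hi2;

  assert (tb);
  do
    {
      hi = sim_read_upper (tb);
      lo = sim_read_lower (tb);
      hi2 = sim_read_upper (tb);
    }
  while (hi != hi2);

  return ((uint64_t) hi << 32) | lo;
}
```

Starting the simulated counter just below a 32-bit boundary forces one retry, which is the case the cmpw/bne- pair in the pattern is there for.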
Re: [Fortran] PR37336 - FIINAL patch [1/n]: Implement the finalization wrapper subroutine
Dear all,

this is the revised version of the patch at
http://gcc.gnu.org/ml/fortran/2012-08/msg00095.html, taking the review
comments into account.

Reminder: This patch only generates the finalization wrapper, which is
in the virtual table.  It does not add the required calls; hence, it
still doesn't allow using finalization.

The patch consists of three parts:

a) The main patch, which implements the wrapper.
   I am asking for approval for that patch.

b) A patch which removes the gfc_error "not yet implemented".
   I suggest only removing the error after finalization calls have
   been added.

c) A patch which bumps the .mod version
   - or alternatively -
   a patch which disables the _final generation in the vtable.

I have built and regtested (on x86-64-linux) the patch with (a) and
(a)+(b) applied.

I would like to include the patch (c), as modifying the vtable changes
the ABI.  Bumping the .mod version is a reliable way to force
recompilation.  The alternative is to wait until the final FINAL patch
before bumping the .mod version (and disable the "_final" generation).
One possibility, if deemed useful, is to combine the .mod version bump
with backward compatible reading of .mod files, i.e., only error out
when BT_CLASS is encountered in an old .mod file.

Is the patch (a) OK for the trunk?  With which version of (c)?

(I am slightly inclined to do the .mod bump now.  As a follow up, one
can also commit Janus' proc-pointer patch,
http://gcc.gnu.org/ml/fortran/2012-04/msg00033.html, though I think
someone still has to review it.)

Tobias

PS: When doing the ABI change, I am going to document it in the release
notes / wiki.

2012-08-29  Alessandro Fanfarillo
	    Tobias Burnus

	PR fortran/37336
	* gfortran.h (symbol_attribute): Add artificial.
	* module.c (mio_symbol_attribute): Handle attr.artificial
	* class.c (gfc_build_class_symbol): Defer creation of the vtab
	if the DT has finalizers, mark generated symbols as
	attr.artificial.
(has_finalizer_component, finalize_component, finalization_scalarizer, generate_finalization_wrapper): New static functions. (gfc_find_derived_vtab): Add _final component and call generate_finalization_wrapper. * dump-parse-tree.c (show_f2k_derived): Use resolved proc_tree->n.sym rather than unresolved proc_sym. (show_attr): Handle attr.artificial. * resolve.c (gfc_resolve_finalizers): Ensure that the vtab exists. (resolve_fl_derived): Resolve finalizers before generating the vtab. (resolve_symbol): Also allow assumed-rank arrays with CONTIGUOUS; skip artificial symbols. (resolve_fl_derived0): Skip artificial symbols. 2012-08-29 Tobias Burnus PR fortran/51632 * gfortran.dg/coarray_class_1.f90: New. diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c index 21a91ba..9d58aab 100644 --- a/gcc/fortran/class.c +++ b/gcc/fortran/class.c @@ -34,7 +34,7 @@ along with GCC; see the file COPYING3. If not see declared type of the class variable and its attributes (pointer/allocatable/dimension/...). * _vptr: A pointer to the vtable entry (see below) of the dynamic type. - + For each derived type we set up a "vtable" entry, i.e. a structure with the following fields: * _hash: A hash value serving as a unique identifier for this type. @@ -42,6 +42,9 @@ along with GCC; see the file COPYING3. If not see * _extends: A pointer to the vtable entry of the parent derived type. * _def_init: A pointer to a default initialized variable of this type. * _copy: A procedure pointer to a copying procedure. +* _final:A procedure pointer to a wrapper function, which frees + allocatable components and calls FINAL subroutines. + After these follow procedure pointer components for the specific type-bound procedures. 
*/ @@ -572,7 +575,9 @@ gfc_build_class_symbol (gfc_typespec *ts, symbol_attribute *attr, if (gfc_add_component (fclass, "_vptr", &c) == FAILURE) return FAILURE; c->ts.type = BT_DERIVED; - if (delayed_vtab) + if (delayed_vtab + || (ts->u.derived->f2k_derived + && ts->u.derived->f2k_derived->finalizers)) c->ts.u.derived = NULL; else { @@ -689,6 +694,702 @@ copy_vtab_proc_comps (gfc_symbol *declared, gfc_symbol *vtype) } +/* Returns true if any of its nonpointer nonallocatable components or + their nonpointer nonallocatable subcomponents has a finalization + subroutine. */ + +static bool +has_finalizer_component (gfc_symbol *derived) +{ + gfc_component *c; + + for (c = derived->components; c; c = c->next) +{ + if (c->ts.type == BT_DERIVED && c->ts.u.derived->f2k_derived + && c->ts.u.derived->f2k_derived->finalizers) + return true; + + if (c->ts.type == BT_DERIVED + && !c->attr.pointer && !c->attr.allocatable + && has_finalizer_component (c->ts.u.derived)) + return true; +} + return false; +} + + +/* Call DEALLOCATE for the passed component if it is allocatable, if i
Re: remove dependency on cp/parser.h from cp/lang.c
On 29 August 2012 19:47, Marc Glisse wrote:
> On Wed, 29 Aug 2012, Aaron Gray wrote:
>
>> Just got my copyright assignment through, so here's my first GCC patch,
>
> Welcome!

Thanks Marc !

>> This is a one liner removing the unneeded dependency of cp-lang.c on
>> cp/parser.h. This has been tested on Linux.
>
> I think you need to attach a ChangeLog entry with every patch.

Okay

> Also, dependencies are often repeated in makefiles. Is there anything to
> update with your patch? (maybe not, just asking)

Yes there is a dependency in cp/Make-lang.in, I will resubmit the patch soon.

-- 
Aaron
[patch] Fix CFG dumping of blocks with no predecessors or successors
Will commit as obvious.

	* cfg.c (dump_bb_info): Print a newline if there were no edges to dump.

Index: cfg.c
===================================================================
--- cfg.c	(revision 190785)
+++ cfg.c	(working copy)
@@ -764,6 +764,8 @@ dump_bb_info (FILE *outf, basic_block bb, int inde
 	      dump_edge_info (outf, e, flags, 0);
 	      fputc ('\n', outf);
 	    }
+	  if (first)
+	    fputc ('\n', outf);
 	}
 
       if (do_footer)
@@ -784,6 +786,8 @@
 	      dump_edge_info (outf, e, flags, 1);
 	      fputc ('\n', outf);
 	    }
+	  if (first)
+	    fputc ('\n', outf);
 	}
     }
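The fix above relies on a common "first" flag idiom: the header of the line is printed before the loop, every edge terminates its own line, and so a block with no edges leaves the header line unterminated unless it is handled separately. The following is a simplified sketch of that idiom only. It is not the code of dump_bb_info; format_edge_list, append_text and the output layout are assumptions made for illustration, and it writes into a buffer rather than a FILE so the result is easy to inspect.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append TEXT to BUF (capacity SIZE, USED bytes already filled) and
   return the new fill level.  The caller supplies a large enough
   buffer; this is checked with an assertion.  */
static size_t
append_text (char *buf, size_t size, size_t used, const char *text)
{
  size_t len = strlen (text);

  assert (used + len < size);
  memcpy (buf + used, text, len + 1);
  return used + len;
}

/* Format an edge list into BUF.  The header is started before the
   loop and each edge ends its own line, so an empty edge list needs
   one extra newline to finish the header line.  */
static size_t
format_edge_list (char *buf, size_t size, const char *label,
                  const int *dests, size_t n_dests)
{
  size_t used = 0;
  size_t i;
  int first = 1;
  char num[32];

  used = append_text (buf, size, used, ";; ");
  used = append_text (buf, size, used, label);
  used = append_text (buf, size, used, ":");

  for (i = 0; i < n_dests; i++)
    {
      /* Continuation lines get their own prefix.  */
      if (!first)
        used = append_text (buf, size, used, ";;");
      first = 0;
      snprintf (num, sizeof num, " %d\n", dests[i]);
      used = append_text (buf, size, used, num);
    }

  /* No edges were dumped: still terminate the header line.  */
  if (first)
    used = append_text (buf, size, used, "\n");

  return used;
}
```

Without the final check, the text that follows an empty edge list would be glued onto the header line, which is the dump glitch the patch describes.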
[PATCH] Remove dependency of cp/cp-lang.c on cp/parser.h
Patch removing the dependency of cp/cp-lang.c on cp/parser.h.

This has been tested on Linux.

[gcc/cp]
2012-08-29  Aaron Gray

	* cp/cp-lang.c: removed #include "parser.h"
	* cp/Make-lang.in: removed dependency of cp/cp-lang.c on cp/parser.h

diff --git a/gcc/cp/cp-lang.c b/gcc/cp/cp-lang.c
index da7f1e1..5ca0b0a 100644
--- a/gcc/cp/cp-lang.c
+++ b/gcc/cp/cp-lang.c
@@ -32,7 +32,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "cp-objcp-common.h"
 #include "hashtab.h"
 #include "target.h"
-#include "parser.h"
 
 enum c_language_kind c_language = clk_cxx;
 static void cp_init_ts (void);
diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index 6233f06..78296ae 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -270,7 +270,7 @@ cp/lex.o: cp/lex.c $(CXX_TREE_H) $(TM_H) $(FLAGS_H) \
   c-family/c-objc.h
 cp/cp-lang.o: cp/cp-lang.c $(CXX_TREE_H) $(TM_H) debug.h langhooks.h \
   $(LANGHOOKS_DEF_H) $(C_COMMON_H) gtype-cp.h gt-cp-cp-lang.h \
-  cp/cp-objcp-common.h $(EXPR_H) $(TARGET_H) $(CXX_PARSER_H)
+  cp/cp-objcp-common.h $(EXPR_H) $(TARGET_H) tree.h c-family/c-pragma.h
 cp/decl.o: cp/decl.c $(CXX_TREE_H) $(TM_H) $(FLAGS_H) cp/decl.h \
   output.h toplev.h $(HASHTAB_H) $(RTL_H) \
   cp/operators.def $(TM_P_H) $(TREE_INLINE_H) $(DIAGNOSTIC_H) $(C_PRAGMA_H) \
Re: [PATCH] rs6000: Add a builtin to read the time base register on PowerPC
> Point being, for simulator environments, you may not want the
> loop that was suggested later.  On the other hand, that might
> not be an observable period, either.

I don't think looping a million times would be too slow for the
testsuite: there are many tests that do a lot more work than that,
already.  The worst case for hardware that I know of can take about
100 clock cycles for one timebase tick.

But how about this then, which only iterates much if the test fails:

int
main (void)
{
  uint64_t t = __builtin_ppc_get_timebase ();
  int j;

  for (j = 0; j < 100; j++)
    if (t != __builtin_ppc_get_timebase ())
      break;

  return (j == 100);
}


Segher
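The loop in this message polls until the counter value changes and gives up after a fixed number of tries, so it only spins for long when the test is about to fail. Below is a portable sketch of the same early-exit idea, assuming a simulated counter in place of __builtin_ppc_get_timebase. The names counter_fn, slow_counter, slow_counter_read and reads_until_change are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback type standing in for a counter read.  */
typedef uint64_t (*counter_fn) (void *state);

/* A simulated slow counter: its value ticks once every PERIOD reads.  */
struct slow_counter
{
  uint64_t reads;
  uint64_t period;
};

static uint64_t
slow_counter_read (void *state)
{
  struct slow_counter *c = (struct slow_counter *) state;

  assert (c->period > 0);
  return c->reads++ / c->period;
}

/* Take one reference read, then re-read up to MAX_TRIES times.
   Return the index of the first re-read that differs from the
   reference, or MAX_TRIES if the counter never changed.  */
static int
reads_until_change (counter_fn read, void *state, int max_tries)
{
  uint64_t t = read (state);
  int j;

  for (j = 0; j < max_tries; j++)
    if (t != read (state))
      break;

  return j;
}
```

A test in the style of the one above would report failure when the returned value equals max_tries, and on fast hardware it returns after very few iterations.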
Re: [PATCH] Add counter histogram to fdo summary (issue6465057)
On Wed, Aug 29, 2012 at 6:12 AM, Jan Hubicka wrote: >> Index: libgcc/libgcov.c >> === >> --- libgcc/libgcov.c (revision 190736) >> +++ libgcc/libgcov.c (working copy) >> @@ -276,6 +276,78 @@ gcov_version (struct gcov_info *ptr, gcov_unsigned >>return 1; >> } >> >> +/* Insert counter VALUE into HISTOGRAM. */ >> + >> +static void >> +gcov_histogram_insert(gcov_bucket_type *histogram, gcov_type value) >> +{ >> + unsigned i; >> + >> + i = gcov_histo_index(value); >> + gcc_assert (i < GCOV_HISTOGRAM_SIZE); > Does checking_assert work in libgcov? I do not think internal consistency > check > should go to --enable-checking=release libgcov. We want to maintain it as > lightweight as possible. (I see there are two existing gcc_asserts, since they > report file format corruption, I think they should give better diagnostic). gcc_checking_assert isn't available, since tsystem.h not system.h is included. I could probably just remove the assert (to be safe, silently return if i is out of bounds?). > > Inliner will do good job here, but perhaps explicit inline fits. >> + for (f_ix = 0; f_ix != gi_ptr->n_functions; f_ix++) >> +{ >> + gfi_ptr = gi_ptr->functions[f_ix]; >> + >> + if (!gfi_ptr || gfi_ptr->key != gi_ptr) >> +continue; >> + >> + ci_ptr = &gfi_ptr->ctrs[ctr_info_ix]; >> + for (ix = 0; ix < ci_ptr->num; ix++) >> +gcov_histogram_insert(cs_ptr->histogram, ci_ptr->values[ix]); > Space before (. Ok. >> +} >> +} >> +} >> + >> /* Dump the coverage counts. We merge with existing counts when >> possible, to avoid growing the .da files ad infinitum. 
We use this >> program's checksum to make sure we only accumulate whole program >> @@ -347,6 +419,7 @@ gcov_exit (void) >> } >> } >> } >> + gcov_compute_histogram (&this_prg); >> @@ -598,11 +669,18 @@ gcov_exit (void) >> if (gi_ptr->merge[t_ix]) >> { >> if (!cs_prg->runs++) >> - cs_prg->num = cs_tprg->num; >> +cs_prg->num = cs_tprg->num; >> + else if (cs_prg->num != cs_tprg->num) >> +goto read_mismatch; > > Doesn't think check that all the programs that contain this unit are the same? > I.e. will this survive profiledbootstrap where we interleave cc1 and cc1plus? Ok, removing that check and I am switching the histogram merging code to handle the case where there are different numbers of counters. It will end up with the same number of counters as in the summary we are merging into since that is the num we keep above when runs > 0 to start with. >> + /* Count number of non-zero histogram entries. The histogram is only >> + currently computed for arc counters. */ >> + csum = &summary->ctrs[GCOV_COUNTER_ARCS]; >> + for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++) >> +{ >> + if (csum->histogram[h_ix].num_counters > 0) >> +h_cnt++; >> +} >> + gcov_write_tag_length (tag, GCOV_TAG_SUMMARY_LENGTH(h_cnt)); >>gcov_write_unsigned (summary->checksum); >>for (csum = summary->ctrs, ix = GCOV_COUNTERS_SUMMABLE; ix--; csum++) >> { >> @@ -380,6 +388,21 @@ gcov_write_summary (gcov_unsigned_t tag, const str >>gcov_write_counter (csum->sum_all); >>gcov_write_counter (csum->run_max); >>gcov_write_counter (csum->sum_max); >> + if (ix != GCOV_COUNTER_ARCS) >> +{ >> + gcov_write_unsigned (0); >> + continue; >> +} >> + gcov_write_unsigned (h_cnt); >> + for (h_ix = 0; h_ix < GCOV_HISTOGRAM_SIZE; h_ix++) >> +{ >> + if (!csum->histogram[h_ix].num_counters) >> +continue; >> + gcov_write_unsigned (h_ix); > > It is kind of waste to write whole unsigned for each histogram index. > What about writting bitmap of non-zero entries followed by each entry? Sure, I will do that instead. 
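The suggestion agreed to just above is to write a bitmap of the non-zero histogram entries followed by only those entries, instead of an index word per entry. The following is a minimal sketch of such an encoding under simplifying assumptions: each bucket is a single 32-bit word, and HISTO_SIZE, BITMAP_WORDS, histo_encode and histo_decode are hypothetical names, not the gcov-io.c interface or the real GCOV_HISTOGRAM_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HISTO_SIZE 64                          /* hypothetical bucket count */
#define BITMAP_WORDS ((HISTO_SIZE + 31) / 32)  /* one bit per bucket */

/* Write BITMAP_WORDS words of bitmap followed by the non-zero buckets
   in index order.  Return the number of words written.  */
static size_t
histo_encode (const uint32_t *histo, uint32_t *out, size_t out_len)
{
  size_t pos = BITMAP_WORDS;
  size_t i;

  assert (out_len >= BITMAP_WORDS);
  for (i = 0; i < BITMAP_WORDS; i++)
    out[i] = 0;

  for (i = 0; i < HISTO_SIZE; i++)
    if (histo[i] != 0)
      {
        assert (pos < out_len);
        out[i / 32] |= (uint32_t) 1 << (i % 32);
        out[pos++] = histo[i];
      }

  return pos;
}

/* Inverse of histo_encode: rebuild all HISTO_SIZE buckets from the
   bitmap and the packed entries.  Return the number of words read.  */
static size_t
histo_decode (const uint32_t *in, size_t in_len, uint32_t *histo)
{
  size_t pos = BITMAP_WORDS;
  size_t i;

  assert (in_len >= BITMAP_WORDS);
  for (i = 0; i < HISTO_SIZE; i++)
    {
      if ((in[i / 32] >> (i % 32)) & 1)
        {
          assert (pos < in_len);
          histo[i] = in[pos++];
        }
      else
        histo[i] = 0;
    }

  return pos;
}
```

With this layout the fixed cost is BITMAP_WORDS words per histogram, and each non-zero bucket costs only its payload rather than payload plus index.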
>> +/* Merge SRC_HISTO into TGT_HISTO. */ > > Perhaps comment about overall concept of the merging routine would suit here. Ok. >> -#else /*!IN_GCOV */ >> -#define GCOV_TYPE_SIZE (LONG_LONG_TYPE_SIZE > 32 ? 64 : 32) > > Why do you need t omove this out of !libgcov? I do not thing this is correct > for all configurations. > i.e. gcov_type may be 16bit. >From my understanding of the mode attribute meanings, which I thought are defined in terms of the number of smallest addressable units, the code in gcov-io.h that sets up the gcov_type typedef will always end up with a gcov_type that is 32 or 64 bits? I.e. when BITS_PER_UNIT is 8 it will use either SI or DI which will end up either 32 or 64, and when BITS_PER_UNIT is 16 it would use either HI or SI which would again be either 32 or 64. Is that wrong and we can end up with a 16 bit gcov_type? The GCOV_TYPE_SIZE was being defined everywhere except when IN_GOV (so it was being defined IN_LIBGCOV), but I wanted it defined unconditionally because
Re: [PATCH 1/6] Thread pointer built-in functions, core parts
On 2012-08-28 01:13, Chung-Lin Tang wrote:
> +  icode = optab_handler (get_thread_pointer_optab, Pmode);

Until we decide there's no point in the distinction, this should be
spelled direct_optab_handler, to match OPTAB_D with which the optab
is declared.

Otherwise ok.


r~
Re: [PATCH 2/6] Thread pointer built-in functions, alpha
On 2012-08-28 01:13, Chung-Lin Tang wrote:
> Alpha patch updated to use MD pattern.

Ok.


r~
[PATCH] limited C++ parsing support for gengtype
First of two patches for class'ized cp/parser.c|h gives limited support for gengtype to parse C++ classes and enums as first class citizens. Patch to SVN HEAD 2012-08-30 Aaron Gray * gengtype-lex.l: Support for FILE Support for C++ single line Comments Support for classes Support for enums ignore 'static' ignore 'inline' ignore 'public:' ignore 'protected:' ignore 'private:' ignore 'friend' support for 'operator' token support for 'new' support for 'delete' added support for '+' as a token for summations in enum bodies * gengtype.h: added 'TYPE_ENUM' to 'enum typekind' added enum TYPE_ENUM to 'struct type' union added OPERATOR_KEYWORD and OPERATOR keywords to Token Code enum * gengtype-parser.c: updated 'token_names[]' (direct_declarator): support for parsing limited operators support for parsing constructors with no parameters support for parsing enums * gengtype.c: added 'type_p enums' to maintain list of enums (resolve_typedef): added support for stucture types and enums added 'new_enum()' diff --git a/gcc/gengtype-lex.l b/gcc/gengtype-lex.l index 5788a6a..af9696a 100644 --- a/gcc/gengtype-lex.l +++ b/gcc/gengtype-lex.l @@ -53,11 +53,11 @@ update_lineno (const char *l, size_t len) ID [[:alpha:]_][[:alnum:]_]* WS [[:space:]]+ HWS[ \t\r\v\f]* -IWORD short|long|(un)?signed|char|int|HOST_WIDE_INT|HOST_WIDEST_INT|bool|size_t|BOOL_BITFIELD|CPPCHAR_SIGNED_T|ino_t|dev_t|HARD_REG_SET +IWORD short|long|(un)?signed|char|int|HOST_WIDE_INT|HOST_WIDEST_INT|bool|size_t|BOOL_BITFIELD|CPPCHAR_SIGNED_T|ino_t|dev_t|HARD_REG_SET|FILE ITYPE {IWORD}({WS}{IWORD})* EOID [^[:alnum:]_] -%x in_struct in_struct_comment in_comment +%x in_struct in_struct_comment in_comment in_line_comment in_line_struct_comment %option warn noyywrap nounput nodefault perf-report %option 8bit never-interactive %% @@ -83,6 +83,14 @@ EOID [^[:alnum:]_] BEGIN(in_struct); return UNION; } +^{HWS}class/{EOID} { + BEGIN(in_struct); + return STRUCT; +} +^{HWS}enum/{EOID} { + BEGIN(in_struct); + return ENUM; +} 
^{HWS}extern/{EOID} { BEGIN(in_struct); return EXTERN; @@ -101,10 +109,20 @@ EOID [^[:alnum:]_] \\\n { lexer_line.line++; } "const"/{EOID} /* don't care */ +"static"/{EOID}/* don't care */ +"inline"/{EOID}/* don't care */ +"public:" /* don't care */ +"private:" /* don't care */ +"protected:" /* don't care */ +"operator"/{EOID} { return OPERATOR_KEYWORD; } +"new"/{EOID}{ *yylval = XDUPVAR (const char, yytext+1, yyleng-2, yyleng-1); return OPERATOR; } +"delete"/{EOID} { *yylval = XDUPVAR (const char, yytext+1, yyleng-2, yyleng-1); return OPERATOR; } +"friend"/{EOID} "GTY"/{EOID} { return GTY_TOKEN; } "VEC"/{EOID} { return VEC_TOKEN; } "union"/{EOID} { return UNION; } "struct"/{EOID}{ return STRUCT; } +"class"/{EOID} { return CLASS; } "enum"/{EOID} { return ENUM; } "ptr_alias"/{EOID} { return PTR_ALIAS; } "nested_ptr"/{EOID}{ return NESTED_PTR; } @@ -148,7 +166,7 @@ EOID[^[:alnum:]_] } "..." { return ELLIPSIS; } -[(){},*:<>;=%|-] { return yytext[0]; } +[(){},*:<>;=%|\-\+]{ return yytext[0]; } /* ignore pp-directives */ ^{HWS}"#"{HWS}[a-z_]+[^\n]*\n {lexer_line.line++;} @@ -159,6 +177,7 @@ EOID[^[:alnum:]_] } "/*" { BEGIN(in_comment); } +"//" { BEGIN(in_line_comment); } \n { lexer_line.line++; } {ID} | "'"("\\".|[^\\])"'"| @@ -172,8 +191,17 @@ EOID [^[:alnum:]_] [^*\n] /* do nothing */ "*"/[^/] /* do nothing */ } + +{ +[^*\n]{16} | +[^*\n] /* do nothing */ +"*"/[^/] /* do nothing */ +} + "*/" { BEGIN(INITIAL); } "*/"{ BEGIN(in_struct); } +\n{ lexer_line.line++; BEGIN(INITIAL); } +\n { lexer_line.line++; BEGIN(in_struct); } ["/] | "*" { diff --git a/gcc/gengtype-parse.c b/gcc/gengtype-parse.c index 03ee781..663db56 100644 --- a/gcc/gengtype-parse.c +++ b/gcc/gengtype-parse.c @@ -3,7 +3,7 @@ This file is part of GCC. 
- GCC is free software; you can redistribute it and/or modify it under + /GCC is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3, or (at your option) any later version. @@ -75,6 +75,7 @@ static const char *const token_names[] = { "static", "union", "struct
[middle-end] Add machine_mode to address_cost target hook
Hello,

While experimenting a little bit with an idea for an address mode
selection RTL pass for SH, I realized that SH's sh_address_cost function
is quite broken.  When trying to fix it, I ran into a wall, since the
mode of the MEM is not passed to the target hook function, as it is e.g.
in legitimate_address.  This circumstance makes it a bit difficult to
return useful answers in the address_cost hook.  For example, on SH,
displacement address modes for anything < SImode are considered slightly
more expensive due to increased pressure on R0.

Since everything in the middle-end already seems to pass the mode to the
'address_cost' function in rtlanal.c, I'd like to propose to forward the
mode arg to the target hook.  The change is quite obvious, as it only
adds one new (mostly) unused argument to the various address_cost
functions in the targets.

I went through all the targets' code and fixed the hook function.  It
seems some targets other than SH could also benefit from the mode wisdom
in their address_cost estimation.

There are a few peculiarities I ran across (respective target
maintainers CC'ed):

mn10300:
The function mn10300_address_cost calls itself recursively, so I added
a GET_MODE (x).  However, it never looks at the mode, so there should be
no problem.

iq2000:
Similar to mn10300.  Mode arg is passed to itself, but effectively never
used.  Should be no problem.

rs6000:
I've added the mode to the logging message.  I hope this is OK.

epiphany:
There's probably no need for the offset alignment workaround anymore.

arm:
In the function 'thumb1_size_rtx_costs' the 'case MEM' looks wrong.
I guess it is meant to look at XEXP (x, 0) when checking for SYMBOL_REF?
As it stands now, it seems that GET_CODE (x) == SYMBOL_REF will never be
true, because GET_CODE (x) == MEM.

microblaze:
The microblaze_address_cost takes the mode of the address rtx.
Maybe it is meant to take the mode of the MEM?
I've checked the patch only on my SH xgcc config with 'make all-gcc', but others should build fine since there are no functional changes. I hope I didn't miss anything. Feedback appreciated! Cheers, Oleg ChangeLog: * hooks.c (hook_int_rtx_mode_bool_0): New function. * hooks.h (hook_int_rtx_mode_bool_0): Declare it. * output.h (default_address_cost): Add machine_mode argument. * target.def (address_cost): Likewise. * rtlanal.c (address_cost): Pass mode to target hook. (default_address_cost): Add machine_mode argument. * doc/tm.texi: Regenerate. * config/alpha/alpha.c (TARGET_ADDRESS_COST): Use hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0. * config/arm/arm.c (arm_address_cost): Add machine_mode argument. * config/avr/avr.c (avr_address_cost): Likewise. * config/bfin/bfin.c (bfin_address_cost): Likewise. * config/cr16/cr16.c (cr16_address_cost): Likewise. * config/cris/cris.c (cris_address_cost): Likewise. * config/epiphany/epiphany.c (epiphany_address_cost): Likewise. * config/i386/i386.c (ix86_address_cost): Likewise. * config/ia64/ia64.c (TARGET_ADDRESS_COST): Use hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0. * config/iq2000/iq2000.c (iq2000_address_cost): Add machine_mode argument. Pass it on in recursive invocation. * config/lm32/lm32.c (TARGET_ADDRESS_COST): Use hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0. * config/m32c/m32c.c (m32c_address_cost): Add machine_mode argument. * config/m32r/m32r.c (TARGET_ADDRESS_COST): Use hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0. * config/mcore/mcore.c (TARGET_ADDRESS_COST): Likewise. * config/mep/mep.c (mep_address_cost): Add machine_mode argument. * config/microblaze/microblaze.c (microblaze_address_cost): Likewise. * config/mips/mips.c (mips_address_cost): Likewise. * config/mmix/mmix.c (TARGET_ADDRESS_COST): Use hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0. * config/mn10300/mn10300.c (mn10300_address_cost): Add machine_mode argument. 
Use GET_MODE (x) in recursive invocation. * config/pa/pa.c (hppa_address_cost): Add machine_mode argument. * config/rs6000/rs6000.c (rs6000_debug_address_cost): Add machine_mode argument and print it. (TARGET_ADDRESS_COST): Use hook_int_rtx_mode_bool_0 instead of hook_int_rtx_bool_0. * config/rx/rx.c (rx_address_cost): Add machine_mode argument. * config/s390/s390.c (s390_address_cost): Likewise. * config/score/score-protos.h (score_address_cost): Likewise. * config/score/score.c (score_address_cost): Likewise. * config/sh/sh.c (sh_address_cost): Likewise. * config/sparc/sparc.c (TARGET_ADDRESS_COST): Use hook_int_rtx_mode_bool_0 instead of
Re: [PATCH] C++'ization of cp/parser.c/h
On Wed, Aug 29, 2012 at 8:01 PM, Aaron Gray wrote: > Patch to SVN HEAD that initially C++'izes cp/parser.h and cp/parser.c > by class'izing the cp_lexer and cp_parser group of functions. > > For C programmers and for context all method calls are preceded by > 'this->' and static method calls by 'cp_parser::' or 'cp_lexer::'. > > I have made minimal non orthogonal changes to the code on purpose at > this stage. This is still a work in progress and I am not sure about > how to go about preparing a change log for this patch. > > There are a number of loose ends :- > > - struct's are used rather than classes for now as the whole file > gives encapsulation for now. > - const's need to be applied > - cp_parser_context_free_list is still static and not a member of > cp_parser yet. This also needs a gengtype change to support > GTY((deletable)) as a node and not just on gcroots. > - cp_token functions have not been class'ized yet. > - cp_debug functions are still in global space > - cp_unevaluated_opreand is still in global space > - cp_lexer::get_preprocessor_token() needs rationalizing > - there are still #define's associated with VEC operations that > should be moved to inline methods > - constructors and new methods are still functions as PCH call > ordering conflicts with them, this also allows keeping code changes > parallel and recording incremental changes in the code. > - no_parameters has been left in but is not used I think this is heading in the wrong direction. A class with lot of member functions is a manifestation of a poor C++ design. Some call that a "fat interface." Ideally a good class design should have very few observer (and mutation) functions. Those should form the computational basis of the class, out of which all other functions should be implemented -- as non-member functions. 
Have a look at

    http://liz.axiomatics.org/trac/browser/trunk/src/Parser.C

The parser there is defined from a very limited computational basis:

    http://liz.axiomatics.org/trac/browser/trunk/src/Parser.H#L79

As a matter of fact, I prefer the non-member functions defined as static
functions (i.e. with internal linkage) so that we get an unambiguous
message from the compiler when a function definition becomes dead code.

Do we need to have a separate parser.h file that contains code previously
defined in parser.c?  Why?

-- Gaby
RE: [Patch, test] Enable to prune warnings for tests defined in one exp file
> -Original Message- > From: Mike Stump [mailto:mikest...@comcast.net] > Sent: Tuesday, August 28, 2012 1:21 AM > To: Terry Guo > Cc: gcc-patches@gcc.gnu.org; Richard Guenther > Subject: Re: [Patch, test] Enable to prune warnings for tests defined > in one exp file > > On Aug 27, 2012, at 1:14 AM, Terry Guo wrote: > > This patch intends to provide a chance to prune common warning > messages for > > tests defined in an exp file. > > > Is it OK to trunk? > > Ok. > > If you can find where to document this... :-) That'd be nice. > I checked the texi files in gcc/doc folder, but can't find a suitable place. So I resort to README.gcc in gcc/testsuite which is claimed to list notes for those writing testcases and those writing expect scripts. Following is the patch. Is it OK? BR, Terry 2012-08-30 Terry Guo * README.gcc: Document new variable dg_runtest_extra_prunes. Index: gcc/testsuite/README.gcc === --- gcc/testsuite/README.gcc(revision 190795) +++ gcc/testsuite/README.gcc(working copy) @@ -79,6 +79,11 @@ If a test does not fit into the torture framework, use the dg framework. +If some tests in an exp file need to skip same warning messages, just define +variable dg_runtest_extra_prunes in this exp file and let it contain this warning +message pattern. This can avoid duplicating dg-prune in these cases. +Always remember to clear this variable when leave this exp file. + Copyright (C) 1997, 1998, 2004 Free Software Foundation, Inc.
Re: [PATCH] limited C++ parsing support for gengtype
Hi -

2012/8/30 Aaron Gray:
> First of two patches for class'ized cp/parser.c|h gives limited
> support for gengtype to parse C++ classes and enums as first class
> citizens.

Please sync with Diego to avoid duplicate work and/or conflicting designs.

Thanks,
-- 
Laurynas
Re: [PATCH] MIPS16 TLS support for GCC
On 2012/8/30 02:44 AM, Richard Sandiford wrote: > Chung-Lin Tang writes: >> On 2012/7/6 02:23 PM, Richard Sandiford wrote: >>> Richard Sandiford writes: > (3) Also related to libraries, I edited CRT_CALL_STATIC_FUNCTION to emit > a 32-bit code sequence under both MIPS/MIPS16 mode (under O32). > > As you can see in the original Feb. patch, I had changes to emit a > MIPS16 version of these static calls, but with the changes in (2) above, > they will not work with the usual situation of a 32-bit MIPS built /lib > (.init/.fini will have 32/16-bit code improperly concatenated). > > The CodeSourcery builds use an independent mips16 sysroot for this, so a > MIPS16 CRT_CALL_STATIC_FUNCTION works there. For the usual case, I think > making it 32-bit is the compatible choice. Yeah, I agree that sounds like the right call. Please do the same for the n32/n64 version (i.e. explicitly make it nomips16 rather than add the #error). >>> >>> BTW, doing this has removed my main concern about having dead code. >>> The original patch had a separate MIPS16 implementation that (as things >>> stood) could never be used by stock sources. That would make it difficult >>> to maintain. >>> >>> Now that the MIPS16 library support is purely adding nomips16 attributes >>> to code that is obviously nomips16, those parts are OK on their own, thanks. >>> (I.e. the mips.h change, the libgcc change, and the libgomp change.) >>> Feel free to drop the multilib thing if you don't want to implement >>> --with-multilib-list. >> >> Hi Richard, just FYI, I just committed the said approved parts. >> gcc/config/mips/t-linux64 had one additional change, adding >> ../lib/mips16 to the corresponding MULTILIB_OSDIRNAMES, or else we end >> with a weird option-named directory for the mips16 libraries. > > Sorry, but the t-linux64 stuff wasn't approved. It was just the mips.h > change, the libgcc change and the libgomp change. > > Please revert the patch to t-linux64. 
> My original objection to adding
> mips16 unconditionally still stands: it isn't correct for people who
> configure for processors that don't have the MIPS16 ASE (such as Octeon).

I have reverted that part.  Maybe a list of proper march=XXX/mips16
added to MULTILIB_EXCLUSIONS will do what you're mentioning, though I
haven't tried testing that for now.

Thanks,
Chung-Lin