Re: is LTO aimed for large programs?

2009-11-12 Thread Jan Hubicka
> > Perhaps the question is when not to use -flto and use -fwhopr instead? > > My rule of thumb is: Try -flto first, if it does not work (running out > of memory), try -fwhopr. I think the advantage of -flto is also that it > is better tested, while -fwhopr has known issues. -fwhopr is quite brok

Re: Whole program optimization and functions-only-called-once.

2009-11-12 Thread Jan Hubicka
> On Wed, Nov 4, 2009 at 1:20 PM, Toon Moene wrote: > > You don't happen to recall the bug number ? > > It might be related to PR 41735 which I noticed when looking at the > generated assembly and trying to compare 4.5 to 4.4. I fixed this bug today, so it might help. But it is related to COMDAT

Re: Build broken in libstdc++ on x86_64-linux

2009-11-12 Thread Jan Hubicka
> Hi, > > the build is currently, ie 154122, broken in libstdc++-v3: > > ./src/system_error.cc:95:1: internal compiler error: > Segmentation fault > > Version 154120 works fine for me. I am testing patch for that still. The current version is (updated per Joseph's comment about COMDAT ma

Re: Build broken in libstdc++ on x86_64-linux

2009-11-12 Thread Jan Hubicka
> Jan Hubicka wrote: > > I am testing patch for that still. > > The current version is (updated per Joseph's comment about COMDAT making > > sense > > on !PUBLIC functions). > > > Thanks Honza, I just built successfully r154128 Note that there are

Re: Whole program optimization and functions-only-called-once.

2009-11-12 Thread Jan Hubicka
Hi, this is WIP patch to deal with the unreachable clones problem. It basically renders the clones as unanalyzed cgraph nodes (but with still body in) so IPA passes don't see them. Honza Index: cgraph.c === --- cgraph.c(revision

Re: aliases without a _DECL?

2010-02-05 Thread Jan Hubicka
Hi, I have no idea what you would like to achieve by this? I assume that you want to add aliases to given declaration without actually creating alias DECLs, just assembler symbol names. But without the DECLs there would be absolutely no way to refer to these within current unit, so I guess cgrap

Re: aliases without a _DECL?

2010-02-05 Thread Jan Hubicka
> On 02/05/2010 05:56 PM, Jan Hubicka wrote: >> But without the DECLs there would be absolutely no way to refer to these >> within current unit, so I guess cgraph don't need to care about them much >> (i.e. they can just be some list assigned to node or decl). >

Re: Peculiar XPASS of gcc.dg/guality/inline-params.c

2010-03-29 Thread Jan Hubicka
> Hi, > > I have run the testcase with the early inliner disabled and noticed > that gcc.dg/guality/inline-params.c XPASSes with early inlining and > XFAILs without it. The reason for the (expected) failure is that > IPA-CP removes a parameter which is constant (but also unused?). I > reckon thi

Re: WHOPR bootstrap, when/how?

2010-04-08 Thread Jan Hubicka
> It was. Unfortunately, work on it stopped last year and it is unlikely > that I will be assigned to this again. I still have some personal > interest on the feature, but given time restrictions, we should make > contingency plans. > > Perhaps the easiest option is to remove the feature. WHOPR

Re: WHOPR bootstrap, when/how?

2010-04-08 Thread Jan Hubicka
> Well, I think this is independent. > It makes a lot of sense to make profiling to work in a way so instrumentation > happens at linktime with LTO and we can read stuff back. This is relatively > easy to do: we need to rewrite profiling pass to work on SSA (that is easy and > desirable anyway and

Re: WHOPR bootstrap, when/how?

2010-04-08 Thread Jan Hubicka
> On 4/8/10 14:10 , Jan Hubicka wrote: > > > So I think tying WHOPR and profile feedback too close together is a mistake. > > Sorry, I didn't mean that. My intent is to make whopr/lto use profiling > information if it is available. Much like we do with other optim

Re: WHOPR bootstrap, when/how?

2010-04-08 Thread Jan Hubicka
> > On 4/8/10 14:10 , Jan Hubicka wrote: > > > > > So I think tying WHOPR and profile feedback too close together is a > > > mistake. > > > > Sorry, I didn't mean that. My intent is to make whopr/lto use profiling > > information if it i

Re: WHOPR bootstrap, when/how?

2010-04-08 Thread Jan Hubicka
> On 4/8/10 14:30 , Jan Hubicka wrote: > >>> On 4/8/10 14:10 , Jan Hubicka wrote: > >>> > >>>> So I think tying WHOPR and profile feedback too close together is a > >>>> mistake. > >>> > >>> Sorry, I didn'

Re: WHOPR bootstrap, when/how?

2010-04-09 Thread Jan Hubicka
> On Thu, 8 Apr 2010, Jan Hubicka wrote: > > > :) We need debug info and hammer out all bugs of course! I would also like > > to > > see possibility to LTO bootstrap without gold and possibility to not generate > > assembly into LTO .o files. In the typical

Re: WHOPR bootstrap, when/how?

2010-04-09 Thread Jan Hubicka
> On Fri, 9 Apr 2010, Jan Hubicka wrote: > > > > On Thu, 8 Apr 2010, Jan Hubicka wrote: > > > > > > > :) We need debug info and hammer out all bugs of course! I would also > > > > like to > > > > see possibility to LTO boo

Re: branch probabilities on multiway branches

2010-04-13 Thread Jan Hubicka
> Hi All, > > The following bit of code in predict.c implies branch probabilities > are strictly evenly distributed for multiway branches at present. The > comment suggests it is possible to generate better estimates for more > generic cases, apart from being involved. Could anyone point me to > t

Re: branch probabilities on multiway branches

2010-04-15 Thread Jan Hubicka
> On Thu, Apr 15, 2010 at 1:11 PM, Rahul Kharche wrote: > > The calculate branch probabilities algorithm (1) in the Wu Larus paper > > also evenly distributes branch probabilities when number of outgoing > > edges is > 2, e.g. switch cases implemented as jump tables. > > > > Are they any known heu

New branch: pretty-ipa

2008-11-12 Thread Jan Hubicka
Hi, with LTO getting closer it is obvious that IPA infrastructure needs work and also is getting more interesting ;) I don't think it makes sense to do all the work on LTO branch that contains a lot of temporary stuff, so I've created pretty-ipa branch that unlike LTO branch is targeted to merge

Re: get_ref_base_and_extent() semantics

2008-12-12 Thread Jan Hubicka
> On Fri, Dec 12, 2008 at 12:29 AM, Martin Jambor wrote: > > Hi, > > > > today I have encountered an unpleasant problem with the function > > get_ref_base_and_extent() when it claimed a known and constant offset > > for the expression insn_4(D)->u.fld[arg.82_3].rt_rtvec. (arg being a > >

Re: Inline limits

2009-02-09 Thread Jan Hubicka
> On Thu, 5 Feb 2009, Paul Brook wrote: > > > For -Os it should be enough to set PARAM_STACK_FRAME_GROWTH > > > to zero. Inlining at -Os should already only happen if it decreases > > > (overall!) code-size. Thus, inlining a function that is called once and > > > that does not need to be emitted

Re: [lto] Pass ordering and the different lto1 personalities

2009-02-19 Thread Jan Hubicka
> > The problem here is that LTRANS will run the standard pipeline > > over a callgraph that hasn't been "settled" (i.e., no inlining > > decisions have been applied yet). Perhaps the first thing LTRANS > > should do is just call execute_all_ipa_transforms() and then > > proceed with the regular p

IPA-CP as IPA_PASS in Whopr

2009-02-19 Thread Jan Hubicka
> > > The problem here is that LTRANS will run the standard pipeline > > > over a callgraph that hasn't been "settled" (i.e., no inlining > > > decisions have been applied yet). Perhaps the first thing LTRANS > > > should do is just call execute_all_ipa_transforms() and then > > > proceed with the

Re: [RFC] Better debug info by substitution tracking for inliner (and other passes eliminating whole user variables)

2009-03-05 Thread Jan Hubicka
> On Thu, 5 Mar 2009, Jan Hubicka wrote: > > > Hi, > > this patch resulted from attempt to solve regression we have in > > gdb.opt/inline-locals.exp gdb testsuite and also problems with fact that > > when > > cloning function by ipa-cp we lose any informatio

Re: [RFC] Better debug info by substitution tracking for inliner (and other passes eliminating whole user variables)

2009-03-08 Thread Jan Hubicka
Hi, thanks for support ;) I have to look into the other things mentioned in the thread and decide how to proceed with this idea. > > we are however lost when we have pointer to those struct since there > > is no means describing "there is no memory location for this pointer, > > but it would be poi

Re: -mfpmath=sse,387 is experimental ?

2009-03-11 Thread Jan Hubicka
> Timothy Madden wrote: > > Hello > > > > Is -mfpmath=both for i386 and x86-64 still experimental in gcc 4.3, as > > the in the online manual page ? > > Yes. It might (*might*) be better in GCC 4.4 thanks to the new register > allocator, but it's unlikely that the manual page will be changed bef

GCC EH unwinding bug and libjava calling std::terminate ()

2009-03-27 Thread Jan Hubicka
Hi, current mainline is buggy in EH unwinding effectively ignoring MUST_NOT_THROW regions when reached via RESX from local handlers. See http://gcc.gnu.org/ml/gcc-patches/2009-03/msg01285.html for details. Unfortunately this patch causes bootstrap failure when building libjava, because std::termina

Re: GCC EH unwinding bug and libjava calling std::terminate ()

2009-03-27 Thread Jan Hubicka
> Jan Hubicka wrote: > > > current mainline is buggy in EH unwinding effectively ignoring > > MUST_NOT_THROW regions when reached via RESX from local handlers. > > See http://gcc.gnu.org/ml/gcc-patches/2009-03/msg01285.html for details. > > > > Unfortunately

Re: GCC EH unwinding bug and libjava calling std::terminate ()

2009-03-27 Thread Jan Hubicka
> Jan Hubicka wrote: > > > OK, pragma_java_exceptions variable is not there > > It's in mainline now. > > > does something like this work for you? > > Yes. OK, I will do full testing cycle (x86_64-linux) and commit it. Thanks! Honza > > Andrew.

Re: [lto] Mainline merge @145453

2009-04-06 Thread Jan Hubicka
> Unsurprisingly, this merge was quite painful. Particularly > adapting to all the new EH changes. The main changes I needed to > do: > > - The master_clone field is gone from cgraph_node. Some things > had to be handled differently when reading/writing cgraph > nodes. > > - There was a bu

My plans on EH infrastructure

2009-04-08 Thread Jan Hubicka
Hi, while looking into problems of current and pretty-ipa's inlining heuristics implementation, I noticed that we have relatively important problems with EH overhead that confuse inliner code size metrics. Looking deeper into EH problems, I think main issues are: - Inliner tends to produce expo

Re: My plans on EH infrastructure

2009-04-08 Thread Jan Hubicka
> On Wed, 8 Apr 2009, Richard Guenther wrote: > > > On Wed, 8 Apr 2009, Jan Hubicka wrote: > > > - The nature of code duplication in between cleanup at end of block and > > > cleanup in EH actually brings a lot of tail merging possibilities. > > >

Re: My plans on EH infrastructure

2009-04-08 Thread Jan Hubicka
> On Wed, 8 Apr 2009, Jan Hubicka wrote: > > > Some remaining issues: > > - FILTER_EXPR/OBJ_REF_EXPR is currently handled in quite dangerous way. > > Original Rth's code made them quite 100% volatile. Now we can PRE them. > > The FILTER_EXPR/OBJ_RE

Re: My plans on EH infrastructure

2009-04-08 Thread Jan Hubicka
> Sylvain Pion wrote: > >Naive user question : is this going to improve the efficiency > >of throwing exceptions, at least in the restricted cases of : There is little improvement already via EH cleanup: at least cleanups/catch regions that turns out to be empty are now eliminated and does not

Re: My plans on EH infrastructure

2009-04-08 Thread Jan Hubicka
> Jan Hubicka wrote: > >>Sylvain Pion wrote: > >>>Naive user question : is this going to improve the efficiency > >>>of throwing exceptions, at least in the restricted cases of : > > > >There is little improvement already via EH cleanup: at l

Re: My plans on EH infrastructure

2009-04-11 Thread Jan Hubicka
> 2009/4/8 Sylvain Pion : > > > Maybe, but for exceptions which are relatively local, say, inside a given > > library, the user can assume that GCC has switched to the "local ABI" with > > fast internal exceptions, since he may have compiled the library as one > > translation unit, so he may be ab

Re: [RFC] Massive recursion in tree_ssa_phiprop_1?

2009-04-19 Thread Jan Hubicka
> On Sat, Apr 18, 2009 at 7:01 PM, Dave Korn > wrote: > > > >    Hi, > > > >  I'm getting a stack overflow caused by *very* deep recursion while trying > > to > > build gnu/javax/swing/text/html/parser/HTML_401F.o in libjava, bootstrapping > > HEAD on i686-pc-cygwin. > > > >  With the default per

Re: [RFC] Massive recursion in tree_ssa_phiprop_1?

2009-04-19 Thread Jan Hubicka
> 2009/4/19 Jan Hubicka : > >> On Sat, Apr 18, 2009 at 7:01 PM, Dave Korn > >> wrote: > >> > > >> >    Hi, > >> > > >> >  I'm getting a stack overflow caused by *very* deep recursion while > >&g

Re: [RFC] Massive recursion in tree_ssa_phiprop_1?

2009-04-20 Thread Jan Hubicka
> On Mon, Apr 20, 2009 at 8:29 AM, Paolo Bonzini wrote: > > Dave Korn wrote: > >> Richard Guenther wrote: > >> > >>> Well ... in this case it's likely the problem that propagate_with_phi is > >>> inlined (single-use static function) and maybe other helpers of it too. > >> > >>   It is inlined.  I

Re: Emit jump insn in function prologue

2009-04-24 Thread Jan Hubicka
> Peter Leist writes: > > > can I use emit_cmp_and_jump_insns while creating the function > > prologue/epilogue? > > If I try, I always get an error at runtime > > > > func.c:33: internal compiler error: in make_edges, at cfgbuild.c:354 > > > > I think this is because the jump doesn't get an JUM

Re: [RFC] Thoughts on reordering the source tree

2009-05-01 Thread Jan Hubicka
> On Fri, May 1, 2009 at 2:05 PM, Steven Bosscher wrote: > > Hello, > > > > The GCC source tree is getting really big.  We currently have in gcc/: > > > > - 337 .c files > > - 171 .h files > > > > Personally, I think the source tree is quite a mess, the way it is > > now.  A long time ago (I can't

Empty loops removal (Was Re: Some extra decorations)

2009-05-03 Thread Jan Hubicka
> Jan Hubicka: > > >There are also few annotations that I would like to add to functions > >declared in headers where GCC don't know if they are finite so it is not > >sure it can remove call to them even if it can prove there are no side > >effects otherwise.

Re: Empty loops removal (Was Re: Some extra decorations)

2009-05-03 Thread Jan Hubicka
> 2009/5/4 Joseph S. Myers: > > On Mon, 4 May 2009, Jan Hubicka wrote: > > > >> On mainline I enabled infinite loop removal at > >> -funsafe-loop-optimizations.  I would suggest adding > >> -fempty-loops-terminate and make it default for C++? It does not

Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000

2009-05-13 Thread Jan Hubicka
> Paolo Bonzini wrote: > > > >Rather, we should seriously understand what caused the compilation time > >jump in 4.2, and whether those are still a problem. We made a good job > >in 4.0 and 4.3 offsetting the slowdowns from infrastructure changes with > >speedups from other changes; and 4.4 while

Re: can_throw_internal affected by inlining?

2009-07-11 Thread Jan Hubicka
> Re: http://gcc.gnu.org/ml/gcc-patches/2009-03/msg01404.html > > Do you have test cases for this? > > Changing can_throw_internal/external to depend on whether or not future > inlining is possible looks *very* wrong to me. Surely the only thing > that matters for new code that might appear "b

Re: can_throw_internal affected by inlining?

2009-07-11 Thread Jan Hubicka
> On 07/11/2009 05:59 AM, Jan Hubicka wrote: > >Well, we can either teach inlinable_call_p to handle your new indirect > >calls as "for sure uninlinable", make it conservative and consider all > >calls inlinable or we can stop doing the early removal of MUST_NOT_T

Re: can_throw_internal affected by inlining?

2009-07-11 Thread Jan Hubicka
> On 07/11/2009 10:59 AM, Jan Hubicka wrote: > >I would like to bring more of EH lowering to tree level (i.e. instead of > >relying on RTL to lower RESX instructions into gotos/calls/jumptables do > >this at gimple and keep to RTL world only job of constructing landing >

Re: LTO question

2010-04-28 Thread Jan Hubicka
> On 4/28/10 10:26 , Manuel López-Ibáñez wrote: > Not yet, I mistakenly thought -fwhole-program is the same as -fwhopr > and it is just for solving scaling issue of large program.(These two > options do look similar :-). I shall try next. > >>> > >>> Yep, -fwhopr is not ideal name, b

Re: LTO question

2010-04-28 Thread Jan Hubicka
> > On 4/28/10 10:26 , Manuel López-Ibáñez wrote: > > Not yet, I mistakenly thought -fwhole-program is the same as -fwhopr > > and it is just for solving scaling issue of large program.(These two > > options do look similar :-). I shall try next. > > >>> > > >>> Yep, -fwhopr is not i

Re: LTO question

2010-04-29 Thread Jan Hubicka
> 2010/4/29 Jan Hubicka : > >> > On 4/28/10 10:26 , Manuel López-Ibáñez wrote: > >> > >>>> Not yet, I mistakenly thought -fwhole-program is the same as -fwhopr > >> > >>>> and it is just for solving scaling issue of large program.(Thes

Re: LTO vs static library archives [was Re: lto1: internal compiler error: in lto_symtab_merge_decls_1, at lto-symtab.c:549]

2010-04-29 Thread Jan Hubicka
> Well, we'd then need to re-architect the symbol merging and > LTO unit read-in to properly honor linking semantics (drop > a LTO unit from an archive if it doesn't resolve any unresolved > symbols). I don't know how easy that will be, but it shouldn't > be impossible at least. We also should ke

Re: LTO vs static library archives [was Re: lto1: internal compiler error: in lto_symtab_merge_decls_1, at lto-symtab.c:549]

2010-04-29 Thread Jan Hubicka
> 2010/4/29 Jan Hubicka : > >> Well, we'd then need to re-architect the symbol merging and > >> LTO unit read-in to properly honor linking semantics (drop > >> a LTO unit from an archive if it doesn't resolve any unresolved > >> symbols).  I d

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
> GCC-4.5.0 and LLVM-2.7 were released recently. To understand > where we stand after releasing GCC-4.5.0 I benchmarked it on SPEC2000 > for x86/x86-64 and posted the comparison of it with the > previous GCC releases and LLVM-2.7. > > Even benchmarking SPEC2000 takes a lot of time on the fastest

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
> Thanks for the comments. FDO will probably improve SPEC2000 score. > Although it is not obvious for some tests because the train data sets > for them are different from the reference data sets and it might > actually mislead the compiler. There are several studies on the topic and it is

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
BTW we are also tracking SPEC2k6 with and without LTO (not FDO runs) http://gcc.opensuse.org/SPEC/CINT/sb-barbella.suse.de-ai-64/recent.html http://gcc.opensuse.org/SPEC/CINT/sb-barbella.suse.de-head-64-2006/recent.html not all 2k6 tests pass with LTO so it will need a bit care to compare results

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
> I noticed eon's peak options do not include FDO, is that intended? I think it is just bug in page header, but I will double check. Base and peak should match otherwise. Honza

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
seriously find time for that yet (well, hoping that submitting the thesis will make this easier). What are the LIPO's features that are missing in -flto -fprofile-use? Honza > > David > > On Thu, Apr 29, 2010 at 2:38 PM, Steven Bosscher > wrote: > > On Thu, Apr 29, 2010

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-29 Thread Jan Hubicka
> 2010/4/30 Jan Hubicka : > >> Thanks for the suggestion. Raksit currently is busy with merging trunk > >> changes back to lw-ipo branch which can be a daunting task. After that > >> this can be done.  (Our internal release is based on 4.4). > > > >

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-30 Thread Jan Hubicka
> In theory, LIPO should not generate better results than LTO+FDO. What > makes LIPO attractive is that it allows distributed build from the > beginning. Its integration with large distributed build system is also > easy. Another point is that LIPO can be decoupled from FDO as well. The integrati

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-04-30 Thread Jan Hubicka
> > > > Interesting.  My plan for profiling with LTO is to ultimately make it > > linktime > > transform.  This will be more difficult with WHOPR (i.e. instrumenting need > > function bodies that are not available at WPA time), but I believe it is > > solvable: just assign uids to the edges and do

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-05-01 Thread Jan Hubicka
> > Vortex needs -fno-strict-aliasing. It casts between two record types > with one record being a 'prefix' of another. So today runs are complete. Thanks to Richi who fixed ICE in symtab merging that affected perl and GCC. With vortex problem was that in addition to -fno-strict-aliasing it i

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-05-02 Thread Jan Hubicka
> On Sat, May 1, 2010 at 2:36 AM, Jan Hubicka wrote: > >> > >> Vortex needs -fno-strict-aliasing.  It casts between two record types > >> with one record being a 'prefix' of another. > > > > So today runs are complete.  Thanks to Richi who fix

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-05-04 Thread Jan Hubicka
> On Sun, May 2, 2010 at 6:45 AM, Jan Hubicka wrote: > That depends. The following cases exist in vortex: > > 1) the value is runtime constant -- it is read from input file but > never changed -- e.g.: QueBug. Nothing can be done by the compiler in > this case; > > 2)

Re: GIMPLE types merging in LTO compiler

2010-05-14 Thread Jan Hubicka
> On Fri, May 14, 2010 at 9:33 PM, Eric Botcazou wrote: > >> Ugh.  This presents a chicken-and-egg problem to symbol resolution > >> and type-merging. > >> > >> To be clear, the issue is sth like > >> > >> unit1 > >> - > >> int size; > >> int a[size]; > >> > >> unit2 > >> -- > >> extern in

Re: Does `-fwhole-program' make sense when compiling shared libraries?

2010-05-17 Thread Jan Hubicka
> On Mon, May 17, 2010 at 10:57:31AM -0700, Toon Moene wrote: > > On 05/17/2010 08:08 PM, Dave Korn wrote: > > > > > > Hi! > > > > > >PR42904 is a bug where, when compiling a windows DLL using > > > -fwhole-program, > > > the compiler optimises away the entire library body, because there'

Re: Does `-fwhole-program' make sense when compiling shared libraries?

2010-05-18 Thread Jan Hubicka
> [ hmf. This one got lost to an smtp error when I sent it yesterday. It > appears there's more or less agreement that at the moment you're supposed to > manually annotate all external entry points if you want to use -fwhole-program > on a library. On windows, where we often do that anyway, it l

Re: Where does the time go?

2010-05-21 Thread Jan Hubicka
> On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li > wrote: > > On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher > > wrote: > >> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li > >> wrote: > >>> stack variable overlay and stack slot assignments is here too. > >> > >> Yes, and for these

Re: Where does the time go?

2010-05-21 Thread Jan Hubicka
> 2010/5/21 Jan Hubicka : > >> On Thu, May 20, 2010 at 11:21 PM, Xinliang David Li > >> wrote: > >> > On Thu, May 20, 2010 at 2:18 PM, Steven Bosscher > >> > wrote: > >> >> On Thu, May 20, 2010 at 11:14 PM, Xinliang David Li

Re: externally_visible and resolution file

2010-05-27 Thread Jan Hubicka
> On Wed, May 26, 2010 at 5:53 PM, Bingfeng Mei wrote: > > Hi, Richard, > > With resolution file generated by GOLD (or I am going to hack gnu LD),  is > > externally_visible attribute still needed to annotate those symbols accessed > > from non-LTO objects when compiling with -fwhole-program. > >

Re: gcc compilation broken with --enable-checking=release

2010-05-27 Thread Jan Hubicka
Hi, I've committed the following fix. * cgraph.h (struct cgraph_node): Mark former_clone_of by GTY ((skip)). * cgraphunit.c (clone_of_p): Compile only when checking is enabled. Index: cgraph.h === *** cgraph.h(revi

Unused variables and functions and missing const decls in cc1 binary

2010-05-29 Thread Jan Hubicka
Hi, I do not have time to poke too much about this, but with whole-program build it is easy to see what functions end up being unused in final cc1 binary. Not all of those are unnecessary (and some are for future use, for debugging or used by other binaries), but it might serve as guideline to rem

Re: Issue with LTO/-fwhole-program

2010-06-11 Thread Jan Hubicka
> Ah, so the problem is the missing -flto in the second compilation > step? I think this is a bug in the compiler for not reporting this > somehow. Is there are PR open for this? Compiler can not report it because it does not see the other object files. It is really up to user to understand -fwhol

Re: Issue with LTO/-fwhole-program

2010-06-11 Thread Jan Hubicka
> On 11/06/2010 14:26, Jan Hubicka wrote: > > > Perhaps we can somehow poison the object names that are brought local with > > -fwhole-program > > so linking explode, but I am not sure there is way to do so. > > Could emit warning symbols, but, like the ot

Re: [RFC] Cleaning up the pass manager

2010-06-15 Thread Jan Hubicka
> I have been thinking about doing some cleanups to the pass manager. > The goal would be to have the pass manager be the central driver of > every action done by the compiler. In particular, the front ends > should make use of it and the callgraph manager, instead of the > twisted interactions we

Re: Massive performance regression from switching to gcc 4.5

2010-06-25 Thread Jan Hubicka
> On Fri, Jun 25, 2010 at 8:15 AM, Jonathan Adamczewski > wrote: > > On 25/06/10 06:39, Richard Guenther wrote: > >> There are btw. some bugs wrt accounting of functions called once > >> being inlined in 4.5 which were fixed on trunk which allow extra > >> inlining. > >> > > > > Are these changes

Re: Massive performance regression from switching to gcc 4.5

2010-06-25 Thread Jan Hubicka
> Hi, > > On Fri, 25 Jun 2010, Jan Hubicka wrote: > > > I would be also very interested to know how profile feedback works in this > > case > > (and why it does not work in previous releases). > > Profiling multi-threading programs needs -fprofile-correctio

Re: Massive performance regression from switching to gcc 4.5

2010-06-27 Thread Jan Hubicka
> On Fri, Jun 25, 2010 at 06:10:56AM -0700, Jan Hubicka wrote: > > When you compile with -Os, the inlining happens only when code size reduces. > > Thus we pretty much care about the code size metrics only. I suspect the > > problem here might be that normal C++ code needs

Re: Massive performance regression from switching to gcc 4.5

2010-06-27 Thread Jan Hubicka
> On Fri, 25 Jun 2010, it was written: > > On Thu, Jun 24, 2010 at 11:50:52AM -0700, Taras Glek wrote: > > > We switched gcc4.3 for gcc4.5 and our automated benchmarking > > > infrastructure reported 4-19% slowdown on most of our performance > > > metrics on 32 and 64bit Linux. > > > > Could you p

Re: Massive performance regression from switching to gcc 4.5

2010-06-27 Thread Jan Hubicka
> Jan Hubicka wrote: >>> On Fri, 25 Jun 2010, it was written: >>> There sure is something in 4.5. I've seen a 1-10% slowdown at the GiNaC >>> (a computer algebra library) benchmark suite after switching from 4.4 to >>> 4.5 on x86_64 when compiling

Re: Massive performance regression from switching to gcc 4.5

2010-06-27 Thread Jan Hubicka
> > Jan Hubicka wrote: > >>> On Fri, 25 Jun 2010, it was written: > >>> There sure is something in 4.5. I've seen a 1-10% slowdown at the GiNaC > >>> (a computer algebra library) benchmark suite after switching from 4.4 to > >>> 4.5

Re: Massive performance regression from switching to gcc 4.5

2010-06-27 Thread Jan Hubicka
> > > Jan Hubicka wrote: > > >>> On Fri, 25 Jun 2010, it was written: > > >>> There sure is something in 4.5. I've seen a 1-10% slowdown at the GiNaC > > >>> (a computer algebra library) benchmark suite after switching from 4.4 to >

Re: Massive performance regression from switching to gcc 4.5

2010-06-27 Thread Jan Hubicka
> > (it is regression at 4.5 branch, forgot to mention) PR44694 GiNaC indeed shows interesting behaviour. Just the first test on 4.3 is: timing commutative expansion and substitution size: 100 200 400 800 time/s: 0.064 0.30 1.4 6.2 for 4.5 timing commu

Re: role of executable_checksum & LTO?

2010-06-28 Thread Jan Hubicka
> Hello all, > > What is the role of executable_checksum from c-common.h & generated by > genchecksum. > > In particular, in the MELT runtime plugin (actually in the MELT branch) > I was supposing it was always defined. However, when referencing it from > melt-runtime.c I got an undefined symbol

Re: Massive performance regression from switching to gcc 4.5

2010-06-30 Thread Jan Hubicka
> On 06/30/2010 02:26 PM, Basile Starynkevitch wrote: >> On Wed, 2010-06-30 at 14:23 -0700, Taras Glek wrote: >> >>> I tried 4.5 -O2 and it's actually faster than 4.3 -Os. >>> >>> I am happy that -O2 performance is actually pretty good, but -Os >>> regression is going to hurt on mobile. >>>

Re: Plug-ins on Windows

2010-06-30 Thread Jan Hubicka
> On 06/30/2010 03:46 PM, Dave Korn wrote: > > Although we could build plugins as Windows DLLs and have GCC load them at > > runtime, if those DLLs needed to refer to anything in the main GCC > > executable, > > it would have to be specifically linked to import it - and imports on > > Windows >

Re: Massive performance regression from switching to gcc 4.5

2010-07-01 Thread Jan Hubicka
>> When you compile with -Os, the inlining happens only when code size reduces. >> Thus we pretty much care about the code size metrics only. I suspect the >> problem here might be that normal C++ code needs some inlining to make >> abstraction penalty go away. GCC -Os implementation is generally

Re: Crucial C++ inlining broken under -Os

2010-07-02 Thread Jan Hubicka
> Quoting Richard Guenther : > >> That is, we no longer optimistically assume that comdat functions >> can be eliminated if there are no callers in the local TU in 4.5 >> (but we did in previous releases). > > But if the function is very simple, the only reason to keep it would be > if its address

Re: Crucial C++ inlining broken under -Os

2010-07-02 Thread Jan Hubicka
> Quoting Jan Hubicka : > >> The behaviour change is about COMDAT functions that are larger than call >> overhead but either called just once or small enough so code growth caused >> by inlining is smaller than the function body size itself. In these cases >> we ma

Re: Massive performance regression from switching to gcc 4.5

2010-07-06 Thread Jan Hubicka
> > On 06/30/2010 02:26 PM, Basile Starynkevitch wrote: > >> On Wed, 2010-06-30 at 14:23 -0700, Taras Glek wrote: > >> > >>> I tried 4.5 -O2 and it's actually faster than 4.3 -Os. > >>> > >>> I am happy that -O2 performance is actually pretty good, but -Os > >>> regression is going to hurt on m

Re: Massive performance regression from switching to gcc 4.5

2010-07-06 Thread Jan Hubicka
... and time report Execution times (seconds) garbage collection: 12.48 ( 2%) usr 0.00 ( 0%) sys 12.50 ( 2%) wall 0 kB ( 0%) ggc callgraph optimization: 0.21 ( 0%) usr 0.00 ( 0%) sys 0.21 ( 0%) wall 2743 kB ( 0%) ggc varpool construction : 0.97 ( 0%) usr 0.02 ( 0%)

Re: GNU/Linux ABI documentation ? GCC supports SSSE3 in general purpose code generation ?

2010-07-12 Thread Jan Hubicka
> > Is there a document or standard (or group of standards) that define the > collective ABIs of GNU/Linux systems using ELF binary formats of various > CPU architectures, including at least: > IA32 (i386/i686/AMD64/EMT64/etc...) > ARM (v5, v5t, v7, etc...) x86_64 ABI is at www.x86-64.org/do

Re: Updating frequencies and dominators

2010-09-14 Thread Jan Hubicka
> So I get in stderr: > , > | g (nD.1176) > | { > | : > | Invalid sum of outgoing probabilities 0.0% > | goto ; > | > | Invalid sum of incoming frequencies 0, should be 4600 > | :; > | f (&"1"[0]); > | goto ; > | > | Invalid sum of incoming frequencies 0, should be 5400 > | :; > | f (

Re: LTO symtabs inconsistency

2010-10-17 Thread Jan Hubicka
> On 16/10/2010 21:20, Dave Korn wrote: > > >> U _libintl_bindtextdomain > >> U _libintl_gettext > >> U _libintl_textdomain > > >> 0070 9600 5f6c 6962696e 746c5f74 .._libintl_t > >> 0080 65787464 6f6d6169 6e02 extdomain... > >> 0090 000

Re: Is it possible to hack I386 backend to make all function calls to be indirect function calling?

2010-10-25 Thread Jan Hubicka
> redriver jiang writes: > > > I meet a requirement to make all function calls to be indirect > > function calling ( for I386 GCC compiler). I am not familiar with > > frontend, so my first idea is > > > > to hack it from backend, change the asm output for "call" and > > "call_value" insn pattern

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-11-15 Thread Jan Hubicka
> For peak, FDO is the most effective option. It can boost performance > by 7-10% depending on the program. The options you suggested probably > won't make too big a dent. -funroll-loops can hurt performance > without profiling. More aggressive inlining, ipa-cp, unswitching etc -funroll-loops ov

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-11-15 Thread Jan Hubicka
> I did some measurement (64bit).
>
> Experiment 1:
>
> -O2 -funroll-loops vs -O2
>
> It improves performance (geomean) by 0.56%, not too much:
>            O2    O2 unroll-loops
> 164.gzip   1324  1331   0.56%

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-11-15 Thread Jan Hubicka
> This means O3 level inlining should be turned on also for lto build by > default -- as -O2 lto performance is too unimpressive. I am just re-tuning the inliner and hope to get more speedups for smaller costs than we get right now. I however don't think we can reasonably enable it as it is at LT

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-11-15 Thread Jan Hubicka
> > This means O3 level inlining should be turned on also for lto build by > > default -- as -O2 lto performance is too unimpressive. > > I am just re-tuning the inliner and hope to get more speedups for smaller > costs than we get right now. I however don't think we can reasonably enable it > as

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-11-15 Thread Jan Hubicka
> On Mon, Nov 15, 2010 at 4:25 PM, Jan Hubicka wrote: > >> This means O3 level inlining should be turned on also for lto build by > >> default -- as -O2 lto performance is too unimpressive. > > > > I am just re-tuning the inliner and hope to get more speedups

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-11-15 Thread Jan Hubicka
> > Fortunately linker plugin solves the problem here and this is why I want to > > have it by default.  GCC then can do effectively -fwhole-program for > > binaries > > (since linker knows what will be bound elsewhere) and take advantage of > > visibility((hidden)) hints for shared libraries same

Re: GCC-4.5.0 comparison with previous releases and LLVM-2.7 on SPEC2000 for x86/x86_64

2010-11-15 Thread Jan Hubicka
> On Mon, Nov 15, 2010 at 5:39 PM, Jan Hubicka wrote: > >> > Fortunately linker plugin solves the problem here and this is why I want > >> > to > >> > have it by default.  GCC then can do effectively -fwhole-program for > >> > binaries >
