Re: gcc.gnu.org Bugzilla: Perl error Can't locate mro.pm in @INC
On 26. 01. 11 17:04, Frank Ch. Eigler wrote:
>> Can't locate mro.pm in @INC
>
> This may be fixed now, with a hand-made dummy mro.pm file.

I think I know what's wrong. I will paste what I wrote at https://bugzilla.mozilla.org/show_bug.cgi?id=675633#c2: email_in.pl requires Email::Reply, which requires Email::Abstract, which requires mro since 3.003. So if you have Email::Abstract 3.002 or older, you shouldn't get this error. If you have Email::Abstract 3.003 or newer, then this means MRO::Compat (which provides "mro") is not correctly installed.

Frédéric
Re: libgcc: strange optimization
Richard Guenther wrote:
> On Tue, Aug 2, 2011 at 3:23 PM, Ian Lance Taylor wrote:
>> Richard Guenther writes:
>>> I suggest to amend the documentation for local call-clobbered register
>>> variables to say that the only valid sequence using them is from a
>>> non-inlinable function that contains only direct initializations of the
>>> register variables from constants or parameters.
>>
>> Let's just implement those requirements in the compiler itself.
>
> Doesn't work for existing code, no?  And if thinking new code then
> I'd rather have explicit dependences (and a way to represent them).
> Thus, for example
>
>   asm ("scall" : : "asm("r0")" (10), ...)
>
> thus, why force new constraints when we already can figure out
> local register vars by register name?  Why not extend the constraint
> syntax somehow to allow specifying the same effect?

Maybe it would be possible to implement this while keeping the syntax of existing code by (re-)defining the semantics of register asm to basically say that:

  If a variable X is declared as register asm for register Y, and X is
  later on used as an operand to an inline asm, the register allocator
  will choose register Y to hold that asm operand.  (And this is the
  full specification of register asm semantics; nothing beyond this is
  guaranteed.)

It seems this semantics could be implemented very early on, probably in the frontend itself. The frontend would mark the *asm* statement as using the specified register (there would be no special handling of the *variable* as such after the frontend is done). The optimizers would then simply be required to pass the asm-statement register annotations through, much like today they pass constraints through. At the point where register allocation decisions are made, those register annotations would then be acted on.

Bye,
Ulrich

--
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com
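A minimal sketch (not from the thread) of the existing local-register-variable idiom that the proposed semantics would have to keep working; "scall", "r0" and "r1" are placeholders from the discussion, not a real target's calling convention.

  /* Placeholder system-call wrapper using local register variables.  */
  static inline long
  do_syscall_0 (long nr)
  {
    register long r0 asm ("r0") = nr;   /* call number goes in r0        */
    register long r1 asm ("r1");        /* result is expected back in r1 */

    /* Under the proposed rule, only this asm statement guarantees that
       r0 holds 'nr' and that the result is taken from r1; the variables
       carry no meaning anywhere else.  */
    asm volatile ("scall"
                  : "=r" (r1)
                  : "r" (r0)
                  : "memory");
    return r1;
  }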
Re: libgcc: strange optimization
Ulrich Weigand wrote:
> Richard Guenther wrote:
>> On Tue, Aug 2, 2011 at 3:23 PM, Ian Lance Taylor wrote:
>>> Richard Guenther writes:
>>>> I suggest to amend the documentation for local call-clobbered register
>>>> variables to say that the only valid sequence using them is from a
>>>> non-inlinable function that contains only direct initializations of the
>>>> register variables from constants or parameters.
>>>
>>> Let's just implement those requirements in the compiler itself.
>>
>> Doesn't work for existing code, no?  And if thinking new code then
>> I'd rather have explicit dependences (and a way to represent them).
>> Thus, for example
>>
>>   asm ("scall" : : "asm("r0")" (10), ...)
>>
>> thus, why force new constraints when we already can figure out
>> local register vars by register name?  Why not extend the constraint
>> syntax somehow to allow specifying the same effect?

Yes, this would be exactly equivalent to

  register int var asm ("r0") = 10;
  ...
  asm ("scall" : : "r" (var), ...)

> Maybe it would be possible to implement this while keeping the syntax
> of existing code by (re-)defining the semantics of register asm to
> basically say that:
>
>   If a variable X is declared as register asm for register Y, and X
>   is later on used as operand to an inline asm, the register allocator
>   will choose register Y to hold that asm operand.  (And this is the
>   full specification of register asm semantics, nothing beyond this
>   is guaranteed.)

Yes, that's reasonable.  As I understand the docs, in code like

  void foo ()
  {
    register int var asm ("r1") = 10;
    asm (";; use r1");
  }

there is nothing that connects var to the asm, and assuming that r1 holds 10 in the asm is a user error.

The only place where the asm attached to a variable needs to have an effect are the inline asm sequences that explicitly refer to the respective variables. If there is no inline asm referencing a local register variable, there is no difference to a non-register auto variable; there could even be a warning that in such a case

  register int var asm ("r1") = 10;

is equivalent to

  int var = 10;

> It seems this semantics could be implemented very early on, probably
> in the frontend itself.  The frontend would mark the *asm* statement
> as using the specified register (there would be no special handling
> of the *variable* as such, after the frontend is done).  The optimizers
> would then simply be required to pass the asm-statement register
> annotations through, much like today they pass constraints through.
> At the point where register allocation decisions are made, those
> register annotations would then be acted on.
>
> Bye,
> Ulrich

I wonder why it does not work like that in the current implementation. A local register variable is just like using a similar constraint (with the only difference that in general there is no such constraint, otherwise the developer would use it). A pass like .asmcons could take care of it just the same way it does for constraints, and no optimizer pass would have to bother whether a variable is a local register or not.

This would render local register variables even more functional, because no one would need to care whether there were implicit library calls or things like that.

Johann
Re: libgcc: strange optimization
On Wed, Aug 3, 2011 at 11:50 AM, Georg-Johann Lay wrote:
> Ulrich Weigand wrote:
>> Richard Guenther wrote:
>>> On Tue, Aug 2, 2011 at 3:23 PM, Ian Lance Taylor wrote:
>>>> Richard Guenther writes:
>>>>> I suggest to amend the documentation for local call-clobbered register
>>>>> variables to say that the only valid sequence using them is from a
>>>>> non-inlinable function that contains only direct initializations of the
>>>>> register variables from constants or parameters.
>>>>
>>>> Let's just implement those requirements in the compiler itself.
>>>
>>> Doesn't work for existing code, no?  And if thinking new code then
>>> I'd rather have explicit dependences (and a way to represent them).
>>> Thus, for example
>>>
>>>   asm ("scall" : : "asm("r0")" (10), ...)
>>>
>>> thus, why force new constraints when we already can figure out
>>> local register vars by register name?  Why not extend the constraint
>>> syntax somehow to allow specifying the same effect?
>
> Yes, this would be exactly equivalent to
>
>   register int var asm ("r0") = 10;
>   ...
>   asm ("scall" : : "r" (var), ...)
>
>> Maybe it would be possible to implement this while keeping the syntax
>> of existing code by (re-)defining the semantics of register asm to
>> basically say that:
>>
>>   If a variable X is declared as register asm for register Y, and X
>>   is later on used as operand to an inline asm, the register allocator
>>   will choose register Y to hold that asm operand.  (And this is the
>>   full specification of register asm semantics, nothing beyond this
>>   is guaranteed.)
>
> Yes, that's reasonable.  As I understand the docs, in code like
>
>   void foo ()
>   {
>     register int var asm ("r1") = 10;
>     asm (";; use r1");
>   }
>
> there is nothing that connects var to the asm, and assuming that
> r1 holds 10 in the asm is a user error.
>
> The only place where the asm attached to a variable needs to have an
> effect are the inline asm sequences that explicitly refer to the
> respective variables.  If there is no inline asm referencing a
> local register variable, there is no difference to a non-register
> auto variable; there could even be a warning that in such a case
>
>   register int var asm ("r1") = 10;
>
> is equivalent to
>
>   int var = 10;
>
>> It seems this semantics could be implemented very early on, probably
>> in the frontend itself.  The frontend would mark the *asm* statement
>> as using the specified register (there would be no special handling
>> of the *variable* as such, after the frontend is done).  The optimizers
>> would then simply be required to pass the asm-statement register
>> annotations through, much like today they pass constraints through.
>> At the point where register allocation decisions are made, those
>> register annotations would then be acted on.
>>
>> Bye,
>> Ulrich
>
> I wonder why it does not work like that in the current implementation.
> A local register variable is just like using a similar constraint
> (with the only difference that in general there is no such constraint,
> otherwise the developer would use it).  A pass like .asmcons could
> take care of it just the same way it does for constraints, and no
> optimizer pass would have to bother whether a variable is a local
> register or not.
>
> This would render local register variables even more functional,
> because no one would need to care whether there were implicit library
> calls or things like that.

Yes, I like that idea.

Richard.
Re: libgcc: strange optimization
Hi,

On Wed, 3 Aug 2011, Richard Guenther wrote:

>> Yes, that's reasonable.  As I understand the docs, in code like
>>
>>   void foo ()
>>   {
>>     register int var asm ("r1") = 10;
>>     asm (";; use r1");
>>   }
>>
>> there is nothing that connects var to the asm, and assuming that
>> r1 holds 10 in the asm is a user error.
>>
>> The only place where the asm attached to a variable needs to have an
>> effect are the inline asm sequences that explicitly refer to the
>> respective variables.  If there is no inline asm referencing a
>> local register variable, there is no difference to a non-register
>> auto variable; there could even be a warning that in such a case
>>
>>   register int var asm ("r1") = 10;
>>
>> is equivalent to
>>
>>   int var = 10;
>>
>> This would render local register variables even more functional,
>> because no one would need to care whether there were implicit library
>> calls or things like that.
>
> Yes, I like that idea.

I do too. Except it doesn't work :)

There's a common idiom of accessing registers read-only by declaring local register vars. E.g. to (*gasp*) read the stack pointer. There won't be a DEF for that register var, and hence at use-points we couldn't reload any sensible values into those registers (and we really shouldn't clobber the stack pointer in this way).

We could introduce that special semantics only for non-reserved registers, and require no writes to register vars for reserved registers.

Or we could simply do:

  if (any_local_reg_vars)
    optimize = 0;

But I already see people wanting to _do_ optimization also with local reg vars, "just not the wrong optimizations" ;-/

Ciao,
Michael.
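A sketch of the read-only idiom Michael describes: a local register variable that is never written, only read, to observe a reserved register. The register name is illustrative and differs per target.

  void *
  current_stack_pointer (void)
  {
    register void *sp asm ("sp");   /* never written: there is no DEF */
    return sp;                      /* just observe the reserved register */
  }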
Re: libgcc: strange optimization
On Wed, Aug 3, 2011 at 3:27 PM, Michael Matz wrote:
> Hi,
>
> On Wed, 3 Aug 2011, Richard Guenther wrote:
>
>>> Yes, that's reasonable.  As I understand the docs, in code like
>>>
>>>   void foo ()
>>>   {
>>>     register int var asm ("r1") = 10;
>>>     asm (";; use r1");
>>>   }
>>>
>>> there is nothing that connects var to the asm, and assuming that
>>> r1 holds 10 in the asm is a user error.
>>>
>>> The only place where the asm attached to a variable needs to have an
>>> effect are the inline asm sequences that explicitly refer to the
>>> respective variables.  If there is no inline asm referencing a
>>> local register variable, there is no difference to a non-register
>>> auto variable; there could even be a warning that in such a case
>>>
>>>   register int var asm ("r1") = 10;
>>>
>>> is equivalent to
>>>
>>>   int var = 10;
>>>
>>> This would render local register variables even more functional,
>>> because no one would need to care whether there were implicit library
>>> calls or things like that.
>>
>> Yes, I like that idea.
>
> I do too.  Except it doesn't work :)
>
> There's a common idiom of accessing registers read-only by declaring local
> register vars.  E.g. to (*gasp*) read the stack pointer.  There won't be a DEF
> for that register var, and hence at use-points we couldn't reload any
> sensible values into those registers (and we really shouldn't clobber the
> stack pointer in this way).
>
> We could introduce that special semantics only for non-reserved registers,
> and require no writes to register vars for reserved registers.
>
> Or we could simply do:
>
>   if (any_local_reg_vars)
>     optimize = 0;
>
> But I already see people wanting to _do_ optimization also with local reg
> vars, "just not the wrong optimizations" ;-/

I'd say we should start rejecting all these bogus constructs by default (maybe accepting them with -fpermissive and then, well, maybe generating some dwim code). That is, local register var decls are only valid with an initializer, and they are implicitly constant (you can't re-assign to them). Reserved registers are a no-go (like %esp), either global or local.

Richard.

> Ciao,
> Michael.
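A rough sketch of what the proposed rules would accept and reject; the register names and the "insn" template are placeholders, not a real target, and the rejections are the proposal, not current GCC behaviour.

  void
  example (int arg)
  {
    register int a asm ("r0") = arg;   /* OK under the proposal: has an initializer */
    register int b asm ("r1");         /* would be rejected: no initializer         */
    register void *sp asm ("sp");      /* would be rejected: reserved register      */

    asm volatile ("insn" : : "r" (a)); /* the only guaranteed use of 'a' in r0      */
    a = arg + 1;                       /* would be rejected: implicitly constant    */
    (void) b;
    (void) sp;
  }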
Re: libgcc: strange optimization
Richard Guenther wrote:
> On Wed, Aug 3, 2011 at 3:27 PM, Michael Matz wrote:
>> Hi,
>>
>> On Wed, 3 Aug 2011, Richard Guenther wrote:
>>
>>>> Yes, that's reasonable.  As I understand the docs, in code like
>>>>
>>>>   void foo ()
>>>>   {
>>>>     register int var asm ("r1") = 10;
>>>>     asm (";; use r1");
>>>>   }
>>>>
>>>> there is nothing that connects var to the asm, and assuming that
>>>> r1 holds 10 in the asm is a user error.
>>>>
>>>> The only place where the asm attached to a variable needs to have an
>>>> effect are the inline asm sequences that explicitly refer to the
>>>> respective variables.  If there is no inline asm referencing a
>>>> local register variable, there is no difference to a non-register
>>>> auto variable; there could even be a warning that in such a case
>>>>
>>>>   register int var asm ("r1") = 10;
>>>>
>>>> is equivalent to
>>>>
>>>>   int var = 10;
>>>>
>>>> This would render local register variables even more functional,
>>>> because no one would need to care whether there were implicit library
>>>> calls or things like that.
>>>
>>> Yes, I like that idea.
>>
>> I do too.  Except it doesn't work :)
>>
>> There's a common idiom of accessing registers read-only by declaring local
>> register vars.  E.g. to (*gasp*) read the stack pointer.  There won't be a DEF
>> for that register var, and hence at use-points we couldn't reload any
>> sensible values into those registers (and we really shouldn't clobber the
>> stack pointer in this way).
>>
>> We could introduce that special semantics only for non-reserved registers,
>> and require no writes to register vars for reserved registers.
>>
>> Or we could simply do:
>>
>>   if (any_local_reg_vars)
>>     optimize = 0;
>>
>> But I already see people wanting to _do_ optimization also with local reg
>> vars, "just not the wrong optimizations" ;-/

Definitely yes. As I wrote above, if you see asm it's not unlikely that it is a piece of performance-critical code.

> I'd say we should start rejecting all these bogus constructs by default
> (maybe accepting them with -fpermissive and then, well, maybe generating
> some dwim code).  That is, local register var decls are only valid
> with an initializer, and they are implicitly constant (you can't re-assign
> to them).  Reserved registers are a no-go (like %esp), either global or local.

Would that help? Like in code

  static inline void foo (int arg)
  {
    register const int reg asm ("r1") = arg;
    asm ("..." :: "r" (reg));
  }

And with output constraints like "=r,0" or "+r". Or in local blocks:

  static inline void foo (int arg)
  {
    register const int reg asm ("r1") = arg;
    ...
    {
      register const int reg2 asm ("r1") = reg;
      asm ("..." :: "r" (reg2));
    }
  }

Do the current optimizers shred inline asm with ordinary constraints but without local registers? If yes, there is a considerable problem in the optimizers and/or in GCC. If not, why can't local register variables work similarly, i.e. propagate the register information into the respective asms and forget about it for the variables?

Johann

> Richard.
>
>> Ciao,
>> Michael.
Re: libgcc: strange optimization
On 08/03/2011 07:02 AM, Richard Guenther wrote:
> Reserved registers are a no-go (like %esp), either global or local.

Local register variables referring to anything in fixed_regs are trivial to handle -- continue to treat them exactly as we currently do. They won't be clobbered by random code movement because they're fixed.

r~
2011 GCC Summit.
I wanted to let everyone know that the planning for the 2011 GCC and GNU Toolchain Developers' Summit is well underway, and I hope to have the dates and locations confirmed any time now. The aim is the same timing as 2010, in the 3rd week of October. Start thinking about the topics you're most interested in and about those paper proposals, as we'll be opening submissions very soon as well.

I've also set up a Twitter feed as gcc_summit, which I encourage you to follow; I will be sending regular updates there, with summaries going out on the announcement mailing list from time to time.

I'm very much looking forward to seeing everyone again, and if you've got some great ideas for this year please email me!
Re: Performance degradation on g++ 4.6
Scanning through the profile data you provided -- test functions such as test_constant<...> completely disappeared in 4.1's profile, which means they were inlined by gcc 4.1. They exist in 4.6's profile. For the unsigned short case, where neither version inlines the call, the 4.6 version is much faster.

David

On Mon, Aug 1, 2011 at 11:43 AM, Oleg Smolsky wrote:
> On 2011/7/29 14:07, Xinliang David Li wrote:
>>
>> Profiling tools are your best friend here.  If you don't have access to
>> any, the least you can do is to build the program with the -pg option and
>> use the gprof tool to find out differences.
>
> The test suite has a bunch of very basic C++ tests that are executed an
> enormous number of times.  I've built one with the obvious performance
> degradation and attached the source, output and reports.
>
> Here are some highlights:
>   v4.1: Total absolute time for int8_t constant folding: 30.42 sec
>   v4.6: Total absolute time for int8_t constant folding: 43.32 sec
>
> Every one of the tests in this section had degraded... the first half more
> than the second.  I am not sure how much further I can take this - the
> benchmarked code is very short and plain.  I can post disassembly for one
> (some?) of them if anyone is willing to take a look...
>
> Thanks,
> Oleg.
Re: [RFC] Remove -freorder-blocks-and-partition
Hi,

> The worst part is that test coverage for this feature is
> extremely poor.  It's very difficult to tell if any cleanup
> in this area is likely to introduce more bugs than it fixes.
>
> After 3 days fighting with this code, I had a bit of a
> cathartic whine on IRC.  I got two votes to just rip the
> whole thing out.

I am also not a fan of the code, given that I had several encounters with it and was bitten by it quite badly, too.

With ipa-split I implemented part of what is needed for outlining cold regions of functions into separate functions. This however is different from partitioning - i.e. the code sequence for getting into the offlined part is longer, since you need to actually pass stuff in function arguments, and it is hard to jump back and forth between hot and cold regions. Expecting the partitioning to be fully replaced by gimple-level offlining is thus not realistic.

So function partitioning still makes sense to me as an optimization, and in fact I was hoping to get it into shape so that it can be enabled with -fprofile-use by default and thus also tested by profiledbootstrap. It did not happen as I am busy with IPA/LTO tasks at the moment.

So I am unsure what we really want to do. Removing the feature seems a pity, but at the same time the code really needs a revamp. Since you apparently spent the most time on this issue, I won't object to your decision to rip out the code.

Honza

> Andrew Pinski points out that the feature could probably be
> equivalently implemented via outlining and function calls
> (I assume well back at the gimple level).  At which point we
> no longer have cross-segment jump_insns at the rtl level,
> which seems like a Really Big Win to me at this point.
> Not that I'm volunteering to actually do the work to implement
> any such scheme.
>
> Thoughts?
>
> r~
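A hand-written sketch of the kind of transformation ipa-split performs: the cold part of a function is outlined into its own function, at the cost of passing the needed state through arguments. The names and the example function are illustrative, not taken from the pass itself.

  #include <stdio.h>
  #include <stdlib.h>

  /* The outlined cold part: kept out of line and marked cold.  */
  static void __attribute__ ((noinline, cold))
  report_and_abort (const char *what, int code)
  {
    fprintf (stderr, "fatal: %s (%d)\n", what, code);
    abort ();
  }

  int
  checked_div (int a, int b)
  {
    if (__builtin_expect (b == 0, 0))
      report_and_abort ("division by zero", b);  /* cold path, outlined   */
    return a / b;                                /* hot part stays inline */
  }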
Re: [RFC] Remove -freorder-blocks-and-partition
> On 07/25/2011 06:42 AM, Xinliang David Li wrote:
>> FYI the performance impact of this option with SPEC06 (built with the
>> google_46 compiler and measured on a core2 box).  The base line number
>> is FDO, and the ref number is FDO + reorder_with_partitioning.
>>
>> xalancbmk improves > 3.5%
>> perlbench improves > 1.5%
>> dealII and bzip2 degrade about 1.4%.
>>
>> Note the partitioning scheme is not tuned at all -- there is not even
>> a tunable parameter to play with.

I looked at the bzip2 slowdown years ago, and back then it was a code layout issue: i.e. adding nops at the place where code was offlined actually brought the performance back. It was a couple of years back and thus definitely on a different CPU than what David used. Bzip2 has tight internal loops sorting the strings, so layout issues are however quite a likely explanation.

Honza
Re: [RFC] Remove -freorder-blocks-and-partition
> In xalancbmk, with the partition option, most of the object files have
> nonzero-size cold sections generated.  The text size of the binary is
> increased to 3572728 bytes from 3466790 bytes.  Profiling the program
> using the training input shows the following differences.  With
> partitioning, the number of executed branch instructions slightly
> increases, but itlb misses and icache load misses are significantly
> lower compared with the binary without partitioning.
>
> David
>
> With partition:
> ---------------
>   53654937239 branches
>     306751458 L1-icache-load-misses
>       8146112 iTLB-load-misses

Note that I was also planning for some time to introduce a notion of provably cold stuff into our branch prediction heuristics, i.e. code leading to aborts, EH, etc. that can then be offlined even w/o profile feedback and could perhaps help large apps. (Also, the whole pass should be more effective with larger testcases; SPEC2k6 is slowly becoming a small one.)

Honza
Re: [RFC] Remove -freorder-blocks-and-partition
On Wed, Aug 3, 2011 at 2:06 PM, Jan Hubicka wrote:
>> In xalancbmk, with the partition option, most of the object files have
>> nonzero-size cold sections generated.  The text size of the binary is
>> increased to 3572728 bytes from 3466790 bytes.  Profiling the program
>> using the training input shows the following differences.  With
>> partitioning, the number of executed branch instructions slightly
>> increases, but itlb misses and icache load misses are significantly
>> lower compared with the binary without partitioning.
>>
>> David
>>
>> With partition:
>> ---------------
>>   53654937239 branches
>>     306751458 L1-icache-load-misses
>>       8146112 iTLB-load-misses
>
> Note that I was also planning for some time to introduce a notion of provably
> cold stuff into our branch prediction heuristics, i.e. code leading to aborts,
> EH, etc.

The noreturn attribute is looked at by the static profile estimation pass. Is the attribute (definitely not returning) properly propagated to the callers (wrappers of exit, etc.)?

David

> that can then be offlined even w/o profile feedback and could perhaps help
> large apps.
> (Also, the whole pass should be more effective with larger testcases;
> SPEC2k6 is slowly becoming a small one.)
>
> Honza
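A sketch of the case David asks about: a wrapper that always calls exit but is not declared noreturn. Whether calls to such a wrapper are treated as cold by the static profile estimate depends on the attribute being inferred or declared; the names here are illustrative.

  #include <stdio.h>
  #include <stdlib.h>

  /* Always terminates, but carries no explicit noreturn attribute.  */
  void
  die (const char *msg)
  {
    fprintf (stderr, "%s\n", msg);
    exit (1);
  }

  /* Explicit form, if inference cannot be relied on:
     void die (const char *msg) __attribute__ ((noreturn));  */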
Re: libgcc: strange optimization
On Wed, 3 Aug 2011, Ulrich Weigand wrote:
> Richard Guenther wrote:
>>   asm ("scall" : : "asm("r0")" (10), ...)

> Maybe it would be possible to implement this while keeping the syntax
> of existing code by (re-)defining the semantics of register asm to
> basically say that:
>
>   If a variable X is declared as register asm for register Y, and X
>   is later on used as operand to an inline asm, the register allocator
>   will choose register Y to hold that asm operand.

"me too": Nice idea!

>   (And this is the full specification of register asm semantics,
>   nothing beyond this is guaranteed.)

You'd have to handle global registers differently, and local fixed registers not feeding into asms. For everything else, error or warning. That should be ok, because local asm registers are wonderfully already documented to have that restriction:

  "Local register variables in specific registers do not reserve the
  registers, except at the point where they are used as input or output
  operands in an @code{asm} statement and the @code{asm} statement itself
  is not deleted."

So, it's just a small matter of programming to make that happen for real. :-)

To make sure, it'd be nice if someone could perhaps grep an entire GNU/Linux-or-other distribution, including the kernel, for uses of asm-declared *local* registers that don't directly feed into asms and aren't the stack pointer? Or can we get away with just saying that local asm registers haven't had any other documented meaning for the last seven years?

> It seems this semantics could be implemented very early on, probably
> in the frontend itself.  The frontend would mark the *asm* statement
> as using the specified register (there would be no special handling
> of the *variable* as such, after the frontend is done).  The optimizers
> would then simply be required to pass the asm-statement register
> annotations through, much like today they pass constraints through.
> At the point where register allocation decisions are made, those
> register annotations would then be acted on.

People ask why it's not already like that, probably because they assume the ideal sequence of events. At least the quote above is a late addition (close to seven years now). IIUC, asms and register asms weren't originally tied together, and the current implementation with early register tying just happened to work well together, well, that is, until the SSA revolution. ;)

brgds, H-P
g++ 4.5.2 does not catch reference to local variable error.
Hi, "g++ -Wall -Wextra ..." should flag a warning on the following code but does not. std::pair get_XYZ_data() { XYZ result; return std::pair(1, result); } This is a violation of Scott Meyer's "Effective C++" Item 21 "Don't try to return a reference when you must return an object." GCC version 4.5.2 on Kubuntu 11.04 does not issue a warning. I apologize for not subscribing to the mailing list or submitting via GCC Buzilla. Regards, Fung Chai. -- FWIW: $\lnot \exists x \, {\rm Right} (x) \leftarrow \forall x \, {\rm Wrong} (x)$ \hfill -- Stephen Stills Freedom's just another word for nothin' left to lose -- Kris Kristofferson