Re: gcc.gnu.org Bugzilla: Perl error Can't locate mro.pm in @INC

2011-08-03 Thread Frédéric Buclin
On 26.01.11 17:04, Frank Ch. Eigler wrote:
>> Can't locate mro.pm in @INC
> 
> This may be fixed now, with a hand-made dummy mro.pm file.

I think I know what's wrong. I will paste what I wrote at
https://bugzilla.mozilla.org/show_bug.cgi?id=675633#c2:

email_in.pl requires Email::Reply which requires Email::Abstract which
requires mro since 3.003. So if you have Email::Abstract 3.002 or older,
you shouldn't get this error. If you have Email::Abstract 3.003 or
newer, then this means MRO::Compat (which has "mro") is not correctly
installed.

Frédéric


Re: libgcc: strange optimization

2011-08-03 Thread Ulrich Weigand
Richard Guenther wrote:
> On Tue, Aug 2, 2011 at 3:23 PM, Ian Lance Taylor  wrote:
> > Richard Guenther  writes:
> >> I suggest to amend the documentation for local call-clobbered register
> >> variables to say that the only valid sequence using them is from a
> >> non-inlinable function that contains only direct initializations of the
> >> register variables from constants or parameters.
> >
> > Let's just implement those requirements in the compiler itself.
> 
> Doesn't work for existing code, no?  And if thinking new code then
> I'd rather have explicit dependences (and a way to represent them).
> Thus, for example
> 
> asm ("scall" : : "asm("r0")" (10), ...)
> 
> thus, why force new constraints when we already can figure out
> local register vars by register name?  Why not extend the constraint
> syntax somehow to allow specifying the same effect?

Maybe it would be possible to implement this while keeping the syntax
of existing code by (re-)defining the semantics of register asm to
basically say that:

 If a variable X is declared as register asm for register Y, and X
 is later on used as operand to an inline asm, the register allocator
 will choose register Y to hold that asm operand.  (And this is the
 full specification of register asm semantics, nothing beyond this
 is guaranteed.)

It seems this semantics could be implemented very early on, probably
in the frontend itself.  The frontend would mark the *asm* statement
as using the specified register (there would be no special handling
of the *variable* as such, after the frontend is done).  The optimizers
would then simply be required to pass the asm-statement register
annotations through, much like today they pass constraints through.
At the point where register allocation decisions are made, those
register annotations would then be acted on.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


Re: libgcc: strange optimization

2011-08-03 Thread Georg-Johann Lay
Ulrich Weigand wrote:
> Richard Guenther wrote:
>> On Tue, Aug 2, 2011 at 3:23 PM, Ian Lance Taylor  wrote:
>>> Richard Guenther  writes:
>>>> I suggest to amend the documentation for local call-clobbered register
>>>> variables to say that the only valid sequence using them is from a
>>>> non-inlinable function that contains only direct initializations of the
>>>> register variables from constants or parameters.
>>> Let's just implement those requirements in the compiler itself.
>> Doesn't work for existing code, no?  And if thinking new code then
>> I'd rather have explicit dependences (and a way to represent them).
>> Thus, for example
>>
>> asm ("scall" : : "asm("r0")" (10), ...)
>>
>> thus, why force new constraints when we already can figure out
>> local register vars by register name?  Why not extend the constraint
>> syntax somehow to allow specifying the same effect?

Yes, this would be exactly equivalent to

  register int var asm ("r0") = 10;
  ...
  asm ("scall" : : "r" (var), ...)


> Maybe it would be possible to implement this while keeping the syntax
> of existing code by (re-)defining the semantics of register asm to
> basically say that:
> 
>  If a variable X is declared as register asm for register Y, and X
>  is later on used as operand to an inline asm, the register allocator
>  will choose register Y to hold that asm operand.  (And this is the
>  full specification of register asm semantics, nothing beyond this
>  is guaranteed.)

Yes, that's reasonable.  As I understand the docs, in code like

void foo ()
{
   register int var asm ("r1") = 10;
   asm (";; use r1");
}

there is nothing that connects var to the asm and assuming that
r1 holds 10 in the asm is a user error.

The only places where the asm attached to a variable needs to have
an effect are the inline asm sequences that explicitly refer to the
respective variables.  If there is no inline asm referencing a
local register variable, there is no difference from a non-register
auto variable; there could even be a warning in such a case that

   register int var asm ("r1") = 10;

is equivalent to

   int var = 10;

> It seems this semantics could be implemented very early on, probably
> in the frontend itself.  The frontend would mark the *asm* statement
> as using the specified register (there would be no special handling
> of the *variable* as such, after the frontend is done).  The optimizers
> would then simply be required to pass the asm-statement register
> annotations though, much like today they pass constraints through.
> At the point where register allocation decisions are made, those
> register annotations would then be acted on.
> 
> Bye,
> Ulrich

I wonder why it does not work like that in the current implementation.
A local register variable is just like using a similar constraint
(with the only difference that in general there is no such constraint,
otherwise the developer would use it). A pass like .asmcons could
take care of it just the same way it does for constraints, and no
optimizer pass would have to care whether a variable is a local register
or not.

This would render local register variables even more functional
because no one would need to care whether there were implicit library
calls or things like that.

Johann


Re: libgcc: strange optimization

2011-08-03 Thread Richard Guenther
On Wed, Aug 3, 2011 at 11:50 AM, Georg-Johann Lay  wrote:
> Ulrich Weigand wrote:
>> Richard Guenther wrote:
>>> On Tue, Aug 2, 2011 at 3:23 PM, Ian Lance Taylor  wrote:
>>>> Richard Guenther  writes:
>>>>> I suggest to amend the documentation for local call-clobbered register
>>>>> variables to say that the only valid sequence using them is from a
>>>>> non-inlinable function that contains only direct initializations of the
>>>>> register variables from constants or parameters.
>>>> Let's just implement those requirements in the compiler itself.
>>> Doesn't work for existing code, no?  And if thinking new code then
>>> I'd rather have explicit dependences (and a way to represent them).
>>> Thus, for example
>>>
>>> asm ("scall" : : "asm("r0")" (10), ...)
>>>
>>> thus, why force new constraints when we already can figure out
>>> local register vars by register name?  Why not extend the constraint
>>> syntax somehow to allow specifying the same effect?
>
> Yes this would be exact equivalence of
>
>  register int var asm ("r0") = 10;
>  ...
>  asm ("scall" : : "r" (var), ...)
>
>
>> Maybe it would be possible to implement this while keeping the syntax
>> of existing code by (re-)defining the semantics of register asm to
>> basically say that:
>>
>>  If a variable X is declared as register asm for register Y, and X
>>  is later on used as operand to an inline asm, the register allocator
>>  will choose register Y to hold that asm operand.  (And this is the
>>  full specification of register asm semantics, nothing beyond this
>>  is guaranteed.)
>
> Yes, that's reasonable.  As I understand the docs, in code like
>
> void foo ()
> {
>   register int var asm ("r1") = 10;
>   asm (";; use r1");
> }
>
> there is nothing that connects var to the asm and assuming that
> r1 holds 10 in the asm is a user error.
>
> The only place where the asm attached to a variable needs to have
> effect are the inline asm sequences that explicitly refer to
> respective variables.  If there is no inline asm referencing a
> local register variable, there is on difference to a non-register
> auto variable; there could even be a warning that in such a case
> that
>
>   register int var asm ("r1") = 10;
>
> is equivalent to
>
>   int var = 10;
>
>> It seems this semantics could be implemented very early on, probably
>> in the frontend itself.  The frontend would mark the *asm* statement
>> as using the specified register (there would be no special handling
>> of the *variable* as such, after the frontend is done).  The optimizers
>> would then simply be required to pass the asm-statement register
>> annotations though, much like today they pass constraints through.
>> At the point where register allocation decisions are made, those
>> register annotations would then be acted on.
>>
>> Bye,
>> Ulrich
>
> I wonder why it does not work like that in the current implementation.
> Local register variable is just like using a similar constraint
> (with the only difference that in general there is no such constraint,
> otherwise the developer would use it). A pass like .asmcons could
> take care of it just the same way it does for constraints and no
> optimizer passed would have to bother if a variable is a local register
> or not.
>
> This would render local register variables even more functional
> because no one needed to care if there were implicit library calls
> or things like that.

Yes, I like that idea.

Richard.


Re: libgcc: strange optimization

2011-08-03 Thread Michael Matz
Hi,

On Wed, 3 Aug 2011, Richard Guenther wrote:

> > Yes, that's reasonable.  As I understand the docs, in code like
> >
> > void foo ()
> > {
> >   register int var asm ("r1") = 10;
> >   asm (";; use r1");
> > }
> >
> > there is nothing that connects var to the asm and assuming that
> > r1 holds 10 in the asm is a user error.
> >
> > The only place where the asm attached to a variable needs to have
> > effect are the inline asm sequences that explicitly refer to
> > respective variables.  If there is no inline asm referencing a
> > local register variable, there is on difference to a non-register
> > auto variable; there could even be a warning that in such a case
> > that
> >
> >   register int var asm ("r1") = 10;
> >
> > is equivalent to
> >
> >   int var = 10;
> >
> > This would render local register variables even more functional 
> > because no one needed to care if there were implicit library calls or 
> > things like that.
> 
> Yes, I like that idea.

I do too.  Except it doesn't work :)

There's a common idiom of accessing registers read-only by declaring local 
register vars, e.g. to read (*gasp*) the stack pointer.  There won't be a DEF 
for that register var, and hence at use-points we couldn't reload any 
sensible values into those registers (and we really shouldn't clobber the 
stack pointer in this way).

We could introduce that special semantic only for non-reserved registers, 
and require no writes to register vars for reserved registers.

Or we could simply do:

  if (any_local_reg_vars)
    optimize = 0;

But I already see people wanting to _do_ optimization also with local reg 
vars, "just not the wrong optimizations" ;-/


Ciao,
Michael.

Re: libgcc: strange optimization

2011-08-03 Thread Richard Guenther
On Wed, Aug 3, 2011 at 3:27 PM, Michael Matz  wrote:
> Hi,
>
> On Wed, 3 Aug 2011, Richard Guenther wrote:
>
>> > Yes, that's reasonable.  As I understand the docs, in code like
>> >
>> > void foo ()
>> > {
>> >   register int var asm ("r1") = 10;
>> >   asm (";; use r1");
>> > }
>> >
>> > there is nothing that connects var to the asm and assuming that
>> > r1 holds 10 in the asm is a user error.
>> >
>> > The only place where the asm attached to a variable needs to have
>> > effect are the inline asm sequences that explicitly refer to
>> > respective variables.  If there is no inline asm referencing a
>> > local register variable, there is on difference to a non-register
>> > auto variable; there could even be a warning that in such a case
>> > that
>> >
>> >   register int var asm ("r1") = 10;
>> >
>> > is equivalent to
>> >
>> >   int var = 10;
>> >
>> > This would render local register variables even more functional
>> > because no one needed to care if there were implicit library calls or
>> > things like that.
>>
>> Yes, I like that idea.
>
> I do too.  Except it doesn't work :)
>
> There's a common idiom of accessing registers read-only by declaring local
> register vars.  E.g. to (*grasp*) the stack pointer.  There won't be a DEF
> for that register var, and hence at use-points we couldn't reload any
> sensible values into those registers (and we really shouldn't clobber the
> stack pointer in this way).
>
> We could introduce that special semantic only for non-reserved registers,
> and require no writes to register vars for reserved registers.
>
> Or we could simply do:
>
>  if (any_local_reg_vars)
>    optimize = 0;
>
> But I already see people wanting to _do_ optimization also with local reg
> vars, "just not the wrong optimizations" ;-/

I'd say we should start rejecting all these bogus constructs by default
(maybe accepting them with -fpermissive and then, well, maybe generate
some dwim code).  That is, local register var decls are only valid
with an initializer, and they are implicitly constant (you can't
re-assign to them).
Reserved registers are a no-go (like %esp), either global or local.

Richard.

>
> Ciao,
> Michael.


Re: libgcc: strange optimization

2011-08-03 Thread Georg-Johann Lay
Richard Guenther wrote:
> On Wed, Aug 3, 2011 at 3:27 PM, Michael Matz  wrote:
>> Hi,
>>
>> On Wed, 3 Aug 2011, Richard Guenther wrote:
>>
>>>> Yes, that's reasonable.  As I understand the docs, in code like
>>>>
>>>> void foo ()
>>>> {
>>>>   register int var asm ("r1") = 10;
>>>>   asm (";; use r1");
>>>> }
>>>>
>>>> there is nothing that connects var to the asm and assuming that
>>>> r1 holds 10 in the asm is a user error.
>>>>
>>>> The only place where the asm attached to a variable needs to have
>>>> effect are the inline asm sequences that explicitly refer to
>>>> respective variables.  If there is no inline asm referencing a
>>>> local register variable, there is on difference to a non-register
>>>> auto variable; there could even be a warning that in such a case
>>>> that
>>>>
>>>>   register int var asm ("r1") = 10;
>>>>
>>>> is equivalent to
>>>>
>>>>   int var = 10;
>>>>
>>>> This would render local register variables even more functional
>>>> because no one needed to care if there were implicit library calls or
>>>> things like that.
>>> Yes, I like that idea.
>> I do too.  Except it doesn't work :)
>>
>> There's a common idiom of accessing registers read-only by declaring local
>> register vars.  E.g. to (*grasp*) the stack pointer.  There won't be a DEF
>> for that register var, and hence at use-points we couldn't reload any
>> sensible values into those registers (and we really shouldn't clobber the
>> stack pointer in this way).
>>
>> We could introduce that special semantic only for non-reserved registers,
>> and require no writes to register vars for reserved registers.
>>
>> Or we could simply do:
>>
>>  if (any_local_reg_vars)
>>    optimize = 0;
>>
>> But I already see people wanting to _do_ optimization also with local reg
>> vars, "just not the wrong optimizations" ;-/

Definitely yes.  As I wrote above, if you see asm, it is not unlikely that
it is a piece of performance-critical code.

> I'd say we should start rejecting all these bogus constructs by default
> (maybe accepting them with -fpermissive and then, well, maybe generate
> some dwim code).  That is, local register var decls are only valid
> with an initializer, they are implicitly constant (you can't re-assign to 
> them).
> Reserved registers are a no-go (like %esp), either global or local.

Would that help?  For example, in code like

static inline void foo (int arg)
{
   register const int reg asm ("r1") = arg;
   asm ("..."::"r"(reg));
}

And with output constraints like "=r,0" or "+r".  Or in local blocks:

static inline void foo (int arg)
{
   register const int reg asm ("r1") = arg;

   ...
   {
   register const int reg2 asm ("r1") = reg;
   asm ("..."::"r"(reg2));
   }
}



Do the current optimizers shred inline asm with ordinary constraints
but without local registers?

If yes, there is a considerable problem in the optimizers and/or in GCC.

If not, why can't local register variables work similarly, i.e. propagate
the register information into respective asms and forget about it for
the variables?

Johann

> Richard.
> 
>> Ciao,
>> Michael.



Re: libgcc: strange optimization

2011-08-03 Thread Richard Henderson
On 08/03/2011 07:02 AM, Richard Guenther wrote:
> Reserved registers are a no-go (like %esp), either global or local.

Local register variables referring to anything in fixed_regs
are trivial to handle -- continue to treat them exactly as we
currently do.  They won't be clobbered by random code movement
because they're fixed.


r~


2011 GCC Summit.

2011-08-03 Thread Andrew J. Hutton
I wanted to let everyone know that the planning for the 2011 GCC and GNU 
Toolchain Developers' Summit is well underway and I hope to have the 
dates and locations confirmed any time now.  The aim is the same timing 
as 2010 in the 3rd week of October.


Start thinking about the topics you're most interested in and about 
those paper proposals as we'll be opening submissions very soon as well.


I've also set up a Twitter feed, gcc_summit, which I encourage you to 
follow; I will be sending my regular updates there, with summaries going 
out on the announcement mailing list from time to time.


I'm very much looking forward to seeing everyone again and if you've got 
some great ideas for this year please email me!


Re: Performance degradation on g++ 4.6

2011-08-03 Thread Xinliang David Li
Scanning through the profile data you provided -- test functions such
as test_constant ...>
completely disappeared in 4.1's profile which means they are inlined
by gcc4.1. They exist in 4.6's profile. For the unsigned short case
where neither version inlines the call, 4.6 version is much faster.

David

On Mon, Aug 1, 2011 at 11:43 AM, Oleg Smolsky  wrote:
> On 2011/7/29 14:07, Xinliang David Li wrote:
>>
>> Profiling tools are your best friend here. If you don't have access to
>> any, the least you can do is to build the program with -pg option and
>> use gprof tool to find out differences.
>
> The test suite has a bunch of very basic C++ tests that are executed an
> enormous number of times. I've built one with the obvious performance
> degradation and attached the source, output and reports.
>
> Here are some highlights:
>    v4.1:    Total absolute time for int8_t constant folding: 30.42 sec
>    v4.6:    Total absolute time for int8_t constant folding: 43.32 sec
>
> Every one of the tests in this section had degraded... the first half more
> than the second. I am not sure how much further I can take this - the
> benchmarked code is very short and plain. I can post disassembly for one
> (some?) of them if anyone is willing to take a look...
>
> Thanks,
> Oleg.
>


Re: [RFC] Remove -freorder-blocks-and-partition

2011-08-03 Thread Jan Hubicka
Hi,
> The worst part is that test coverage for this feature is
> extremely poor.  It's very difficult to tell if any cleanup
> in this area is likely to introduce more bugs than it fixes.
> 
> After 3 days fighting with this code, I had a bit of a 
> cathartic whine on IRC.  I got two votes to just rip the 
> whole thing out.

I am also not a fan of the code, given that I had several encounters with it
and was bitten by it quite badly, too.

With ipa-split I implemented part of what is needed for outlining cold
regions of functions into separate functions.  This, however, is different
from partitioning, i.e. the code sequence for getting into the offlined part
is longer since you need to actually pass stuff in function arguments, and it
is hard to jump back and forth between hot and cold regions.
Expecting the partitioning to be fully replaced by gimple-level offlining is
thus not realistic.

So function partitioning still makes sense to me as an optimization, and in
fact I was hoping to get it into a shape where it can be enabled with
-fprofile-use by default and thus also tested by profiledbootstrap.  It did
not happen, as I am busy with IPA/LTO tasks at the moment.

So I am unsure what we really want to do.  Removing the feature seems a pity,
but at the same time the code really needs a revamp.  Since you apparently
spent the most time on this issue, I won't object to your decision to rip out
the code.

Honza
> 
> Andrew Pinski points out that the feature could probably be
> equivalently implemented via outlining and function calls
> (I assume well back at the gimple level).  At which point we
> no longer have cross-segment jump_insns at the rtl level,
> which seems like a Really Big Win to me at this point.
> Not that I'm volunteering to actually do the work to implement
> any such scheme.
> 
> Thoughts?
> 
> 
> r~


Re: [RFC] Remove -freorder-blocks-and-partition

2011-08-03 Thread Jan Hubicka
> On 07/25/2011 06:42 AM, Xinliang David Li wrote:
>> FYI  the performance impact of this option with SPEC06 (built with
>> google_46 compiler and measured on a core2 box).  The base line number
>> is FDO, and ref number is FDO + reorder_with_partitioning.
>>
>> xalancbmk improves>  3.5%
>> perlbench improves>  1.5%
>> dealII and bzip2 degrades about 1.4%.
>>
>> Note the partitioning scheme is not tuned at all -- there is not even
>> a tunable parameter to play with.
>

I looked at the bzip2 slowdown years ago and back then it was a code layout
issue, i.e. adding nops at the place where code was offlined actually
restored the performance.
That was a couple of years back and thus definitely on a different CPU than
the one David uses.
Bzip2 has tight internal loops sorting the strings, so layout issues are,
however, quite a likely explanation.

Honza


Re: [RFC] Remove -freorder-blocks-and-partition

2011-08-03 Thread Jan Hubicka
> In xalancbmk, with the partition option, most of object files have
> nonzero size cold sections generated. The text size of the binary is
> increased to 3572728 bytes from 3466790 bytes.  Profiling the program
> using the training input shows the following differences. With
> partitioning, number of executed branch instructions slightly
> increases, but itlb misses and icache load misses are significantly
> lower compared with the binary without partitioning.
> 
> 
> David
> 
> With partition:
> -
>53654937239  branches
>   306751458  L1-icache-load-misses
> 8146112  iTLB-load-misses

Note that I have also been planning for some time to introduce a notion of
provably cold stuff into our branch prediction heuristics, i.e. code leading
to aborts, EH etc. that can then be offlined even w/o profile feedback and
could perhaps help large apps.
(Also, the whole pass should be more effective with larger testcases; SPEC2k6
is slowly becoming a small one.)

Honza


Re: [RFC] Remove -freorder-blocks-and-partition

2011-08-03 Thread Xinliang David Li
On Wed, Aug 3, 2011 at 2:06 PM, Jan Hubicka  wrote:
>> In xalancbmk, with the partition option, most of object files have
>> nonzero size cold sections generated. The text size of the binary is
>> increased to 3572728 bytes from 3466790 bytes.  Profiling the program
>> using the training input shows the following differences. With
>> partitioning, number of executed branch instructions slightly
>> increases, but itlb misses and icache load misses are significantly
>> lower compared with the binary without partitioning.
>>
>>
>> David
>>
>> With partition:
>> -
>>    53654937239  branches
>>       306751458  L1-icache-load-misses
>>         8146112  iTLB-load-misses
>
> Note that I was also planning for some time to introduce notion of provably cold
> stuff into our branch prediction heurstics. I.e. code leading to aborts, eh etc

The noreturn attribute is looked at by the static profile estimation pass.
Is the attribute (definitely not returning) properly propagated to the
callers (wrappers of exit, etc.)?

David

> that can be then offlined even w/o profile feedback and could perhaps help
> to large apps.
> (also the whole pass should be more effective with larger testcases, SPEC2k6 is slowly
> becoming a small one)
>
> Honza
>


Re: libgcc: strange optimization

2011-08-03 Thread Hans-Peter Nilsson
On Wed, 3 Aug 2011, Ulrich Weigand wrote:
> Richard Guenther wrote:
> > asm ("scall" : : "asm("r0")" (10), ...)
> Maybe it would be possible to implement this while keeping the syntax
> of existing code by (re-)defining the semantics of register asm to
> basically say that:
>
>  If a variable X is declared as register asm for register Y, and X
>  is later on used as operand to an inline asm, the register allocator
>  will choose register Y to hold that asm operand.

"me too": Nice idea!

>  (And this is the
>  full specification of register asm semantics, nothing beyond this
>  is guaranteed.)

You'd have to handle global registers differently, and local
fixed registers not feeding into asms.  For everything else,
error or warning.  That should be ok, because local asm
registers are wonderfully already documented to have that
restriction: "Local register variables in specific registers do
not reserve the registers, except at the point where they are
used as input or output operands in an @code{asm} statement and
the @code{asm} statement itself is not deleted."

So, it's just a small matter of programming to make that happen
for real. :-)

To make sure, it'd be nice if someone could perhaps grep an
entire GNU/Linux-or-other distribution, including the kernel, for
uses of asm-declared *local* registers that don't directly feed
into asms and are not the stack pointer.  Or can we get away
with just saying that local asm registers haven't had any other
documented meaning for the last seven years?

> It seems this semantics could be implemented very early on, probably
> in the frontend itself.  The frontend would mark the *asm* statement
> as using the specified register (there would be no special handling
> of the *variable* as such, after the frontend is done).  The optimizers
> would then simply be required to pass the asm-statement register
> annotations though, much like today they pass constraints through.
> At the point where register allocation decisions are made, those
> register annotations would then be acted on.

People ask why it's not already like that, probably because they
assume the ideal sequence of events.  At least the quote above
is a late addition (close to seven years now).  IIUC, asms and
register asms weren't originally tied together and the current
implementation with early register tying just happened to work
well together, well, that is until the SSA revolution. ;)

brgds, H-P


g++ 4.5.2 does not catch reference to local variable error.

2011-08-03 Thread LIM Fung-Chai
Hi,

"g++ -Wall -Wextra ..." should flag a warning on the following code
but does not.

std::pair
get_XYZ_data()
{
XYZ result;
return std::pair(1, result);
}

This is a violation of Scott Meyers' "Effective C++" Item 21, "Don't
try to return a reference when you must return an object."  GCC
version 4.5.2 on Kubuntu 11.04 does not issue a warning.

I apologize for not subscribing to the mailing list or submitting via
GCC Bugzilla.

Regards,
Fung Chai.

--
FWIW: $\lnot \exists x \, {\rm Right} (x) \leftarrow \forall x \, {\rm
Wrong} (x)$ \hfill -- Stephen Stills

Freedom's just another word for nothin' left to lose -- Kris Kristofferson