date:20101207

Re: PATCH: 2 stage BFD linker for LTO plugin

2010-12-07 Thread Tristan Gingold

On Dec 6, 2010, at 6:23 PM, Dave Korn wrote:
>  Tristan, sorry, you must be sick of hearing from me by now,

No, not really :-)

> but I notice the
> branch was still labile a couple of hours ago... it would be really good if we
> could get HJ's patch approved and backported before you spin the release.

The issue is that this patch isn't yet approved for the trunk and looks 
slightly controversial.
And I'd like to make the release soon (ie this week).

Unless this patch is quickly approved (and back-ported), I plan to do the 
release this week.  If it is accepted after,
this is not a real issue as it can be part of binutils 2.21.1 (which should be 
released before March - or within gcc 4.6)

Tristan.

Re: PATCH: 2 stage BFD linker for LTO plugin

2010-12-07 Thread Dave Korn

On 07/12/2010 08:33, Tristan Gingold wrote:
> On Dec 6, 2010, at 6:23 PM, Dave Korn wrote:
>> Tristan, sorry, you must be sick of hearing from me by now,
> 
> No, not really :-)
> 
>> but I notice the branch was still labile a couple of hours ago... it
>> would be really good if we could get HJ's patch approved and backported
>> before you spin the release.
> 
> The issue is that this patch isn't yet approved for the trunk and looks
> slightly controversial. And I'd like to make the release soon (ie this
> week).
> 
> Unless this patch is quickly approved (and back-ported), I plan to do the
> release this week.  If it is accepted after, this is not a real issue as it
> can be part of binutils 2.21.1 (which should be released before March - or
> within gcc 4.6)

  Yeah, we clearly aren't going to arrive at any sudden consensus, so don't
delay it.  It can wait for .1, it's not needed until GCC releases.

cheers,
  DaveK

The secondary reload

2010-12-07 Thread Paulo J. Matos

Hi,

On GCC4.3 I am facing a problem due to a reload error: unable to find
register to spill in class 'CHIP_REGS'.

This happens on a really nasty set of rules that involve expands, splits
and a TARGET_SECONDARY_RELOAD. Since this code has been brought through
some older GCC versions, I am trying to get around to refactor it and
hopefully in the end, get rid of the register spill failure.

So, I would like it if someone could give a couple of hints on how the
splits fit together with the rest.

So, at expansion time we generate the RTL and during register allocation
and reload we have loads of matching. However, what happens when we have
a define_insn where the output template is "#"? Will this trigger the
split at expansion time too?

During reload we have TARGET_SECONDARY_RELOAD being massively called
with all kinds of expressions and reload modes. One interesting thing I
noticed is that in gcc4.3 i386 has no TARGET_SECONDARY_RELOAD defined
and it made me wonder if the secondary reload is really something of a
hack that is best avoided by writing clearer/better backend rules or if
there are specific cases which it _has_ to be defined for reload to
work.

Cheers,

-- 
PMatos

Re: combine two load insns

2010-12-07 Thread Jeff Law


On 12/06/10 15:07, Ian Lance Taylor wrote:
> roy rosen  writes:
>
>> If I have two load SI insns, is there any way to combine them into one
>> load DI insn?
>> Not using peephole which can catch only this limited case of being
>> sequential insns.
>> I have seen something done in ARM (*arith_adjacentmem) but it is very
>> awkward and would not be realistic if the DI is being used by many
>> different intrinsics.
>
> As far as I know there is no general pass which does this at present.
> So it would currently have to be a combine pattern like arith_adjacentmem,
> or a peephole, or a machine specific pass.
>
> On many processors the alignment requirements of DImode and SImode loads
> are different, so it would be hard to do this as a fully general pass.

Given the two loads don't have a def-use data dependency combine won't 
ever get the opportunity to do anything with them.  In general there is 
no pass which combines insns without a true data dependency and targets 
which have such insns have had to handle those combinations in machine 
dependent reorg.  In fact, it was the combination of independent insns 
which led to the introduction of the machine dependent reorg pass eons ago.


I've speculated that this kind of optimization could be done in the 
scheduler.  The basic idea is to first realize that the memory 
references can be combined if they can be issued at the same time; 
otherwise there's some kind of dependency that gets in the way.  So the 
thing to do is see if an insn moving to the ready queue can be combined 
with other insns already in the ready queue.


Of course you'd need the machine checks to verify the combination, but 
that wouldn't be terribly hard to handle.


jeff

Re: The secondary reload

2010-12-07 Thread Ian Lance Taylor

pocma...@gmail.com (Paulo J. Matos) writes:

> This happens on a really nasty set of rules that involve expands, splits
> and a TARGET_SECONDARY_RELOAD. Since this code has been brought through
> some older GCC versions, I am trying to get around to refactor it and
> hopefully in the end, get rid of the register spill failure.
>
> So, I would like it if someone could give a couple of hints on how the
> splits fit together with the rest.
>
> So, at expansion time we generate the RTL and during register allocation
> and reload we have loads of matching. However, what happens when we have
> a define_insn where the output template is "#". Will this trigger the
> split at expansion time too?

Splits don't happen at expansion time.  They happen later during the
processing.  Splits happen before register allocation, and then again
after register allocation.  After the second split, no output template
should still be "#".

> During reload we have TARGET_SECONDARY_RELOAD being massively called
> with all kinds of expressions and reload modes. One interesting thing I
> noticed is that in gcc4.3 i386 has no TARGET_SECONDARY_RELOAD defined
> and it made me wonder if the secondary reload is really something of a
> hack that is best avoided by writing clearer/better backend rules or if
> there are specific cases which it _has_ to be defined for reload to
> work.

There are cases where a processor needs a secondary reload.  You need it
when you can not move between registers of two classes in a single
instruction, and you need it when you can't load and store registers of
any class directly to/from memory.

Ian

Re: The secondary reload

2010-12-07 Thread Paulo J. Matos

Ian Lance Taylor  writes:

> [snip]
> after register allocation.  After the second split, no output template
> should still be "#".
>

What do you mean by your last sentence? It somehow makes me think that
the splits work at some preprocessing level replacing/rewriting the output
template of instructions. Is that what happens?

> There are cases where a processor needs a secondary reload.  You need it
> when you can not move between registers of two classes in a single
> instruction, 
>

I assume that by 'instruction' here you mean a define_insn and not a single
RTL or assembler instruction. 

So, assume I have two classes M_REGS and Y_REGS and I cannot move 
between them except if I go through an intermediary in C_REGS. 
Do I need a secondary reload? 

I wouldn't expect so cause I could write a rule that has a scratch from
C_REGS. Then I move the value from SOURCE to C_REGS and from C_REGS to
DEST. Have I misunderstood what you said? 

> and you need it when you can't load and store registers of
> any class directly to/from memory.

That's interesting but if you can't store/load registers of any class to
and from memory how do you do it with a secondary reload anyway?

Cheers,
-- 
PMatos

-flto, remove unused code from output

2010-12-07 Thread Klaus Rudolph

Hi all,

I am experimenting with LTO. As far as I can see, some functions are inlined 
at link time, which is the expected result. But the code of functions that 
are always inlined is not removed from the output file, which makes the 
output file larger than necessary.

Is there an additional option to use with gcc during compile or link?

I am using gcc-4.5.1 for avr target on linux-x86 host.

Regards
 Klaus


Re: The secondary reload

2010-12-07 Thread Paul Koning

On Dec 7, 2010, at 9:51 AM, Paulo J. Matos wrote:

> Ian Lance Taylor  writes:
> 
>> [snip]
>> after register allocation.  After the second split, no output template
>> should still be "#".
>> 
> 
> What do you mean by your last sentence? It somehow makes me think that
> the splits work at some preprocessing level replacing/rewriting the output
> template of instructions. Is that what happens?

I assume that means that the insn stream you have at this point will only match 
templates that don't have # in them.  In other words, the # insns all have 
patterns that no longer match (because of the splits etc. that have taken place).
> 
> 
>> There are cases where a processor needs a secondary reload.  You need it
>> when you can not move between registers of two classes in a single
>> instruction, 
>> 
> 
> I assume you by 'instruction' here mean a define_insn and not a single
> RTL or assembler instruction. 
> 
> So, assume I have two classes M_REGS and Y_REGS and I cannot move 
> between them except if I go through an intermediary in C_REGS. 
> Do I need a secondary reload? 

Yes
> 
> I wouldn't expect so cause I could write a rule that has a scratch from
> C_REGS. Then I move the value from SOURCE to C_REGS and from C_REGS to
> DEST. Have I misunderstood what you said? 
> 
>> and you need it when you can't load and store registers of
>> any class directly to/from memory.
> 
> That's interesting but if you can't store/load registers of any class to
> and from memory how do you do it with a secondary reload anyway?

"Directly" is the key word.  If you have a register that can't be stored at all 
then it's not useable.  But if it can be loaded/stored by way of another 
register, then  you describe this with a secondary reload. 

Read the documentation in gccint about TARGET_SECONDARY_RELOAD, it's pretty 
clear, better than older documentation on the subject.  That's the target hook 
that defines how to move things between register classes, or between register 
and memory, if it takes an intermediate step.  The other one you might need is 
SECONDARY_MEMORY_NEEDED, which applies if you can't get from class X to class Y 
except by way of memory. 

You can find examples of both in the pdp11 target.  TARGET_SECONDARY_RELOAD is 
needed because some of the floating point registers can't be loaded/stored 
directly (the "NO_LOAD_FPU_REGS" class).  And SECONDARY_MEMORY_NEEDED is defined 
because you can't transfer directly from FPU to general registers.  I think 
that second one is not actually needed because HARD_REGNO_MODE_OK confines 
values to either one or the other class of register, but it could still serve 
as an example for what you're looking for.

paul

Re: -flto, remove unused code from output

2010-12-07 Thread Richard Guenther

On Tue, Dec 7, 2010 at 4:02 PM, Klaus Rudolph  wrote:
> Hi all,
>
> I play a bit with lto optimisation. As I see, some functions will be inlined 
> during link stage which is the expected result. But the function code which 
> is always inlined is not removed from the output file which will result in 
> larger output files.
>
> Any additional option to use with gcc during compile or link?

-fwhole-program.  That will make all functions (but not main) have
static linkage, so unused functions can be optimized out.

Richard.

> I am using gcc-4.5.1 for avr target on linux-x86 host.
>
> Regards
>  Klaus
>

Re: The secondary reload

2010-12-07 Thread Paulo J. Matos

Paul Koning  writes:

>> I assume you by 'instruction' here mean a define_insn and not a single
>> RTL or assembler instruction. 
>> 
>> So, assume I have two classes M_REGS and Y_REGS and I cannot move 
>> between them except if I go through an intermediary in C_REGS. 
>> Do I need a secondary reload? 
>
> Yes
>> 
>> I wouldn't expect so cause I could write a rule that has a scratch from
>> C_REGS. Then I move the value from SOURCE to C_REGS and from C_REGS to
>> DEST. Have I misunderstood what you said? 
>>

What about the above case?
Couldn't you have something like I described above:
(define_insn "transfer"
   [
 (set (match_operand 0 "register_operand" "m")
  (match_operand 1 "register_operand" "y"))
 (clobber (match_scratch 2 "c"))
   ]
 ""
{
   move from 1 to 2;
   move from 2 to 0;
})

Why would something like this not work and force you to have a secondary
reload hook?
 
-- 
PMatos

Re: The secondary reload

2010-12-07 Thread Paul Koning


On Dec 7, 2010, at 10:30 AM, Paulo J. Matos wrote:

> Paul Koning  writes:
> 
>>> I assume you by 'instruction' here mean a define_insn and not a single
>>> RTL or assembler instruction. 
>>> 
>>> So, assume I have two classes M_REGS and Y_REGS and I cannot move 
>>> between them except if I go through an intermediary in C_REGS. 
>>> Do I need a secondary reload? 
>> 
>> Yes
>>> 
>>> I wouldn't expect so cause I could write a rule that has a scratch from
>>> C_REGS. Then I move the value from SOURCE to C_REGS and from C_REGS to
>>> DEST. Have I misunderstood what you said? 
>>> 
> 
> What about the above case?
> Couldn't you have something like I described above:
> (define_insn "transfer"
>   [
> (set (match_operand 0 "register_operand" "m")
>  (match_operand 1 "register_operand" "y"))
> (clobber (match_scratch 2 "c"))
>   ]
> ""
> {
>   move from 1 to 2;
>   move from 2 to 0;
> })
> 
> Why would something like this not work and force you to have a secondary
> reload hook?

I don't know enough to answer that.  But I do know that the secondary reload 
stuff works great, and is reasonably well documented, and it takes only a few 
lines to put into effect.  Why not give it a try?

paul

Re: The secondary reload

2010-12-07 Thread Paulo J. Matos

Paul Koning  writes:

>
> I don't know enough to answer that.  But I do know that the secondary
> reload stuff works great, and is reasonably well documented, and it
> takes only a few lines to put into effect.  Why not give it a try?

Thanks for your input Paul.
Actually the problem is exactly that. I have inherited code with a
secondary reload which 'forwards' the code to an insn that requires a
split, and I have a gut feeling that something here is not quite right
and could be simplified. My initial thought was to remove the secondary
reload and try to go without it, because I haven't understood why we
need it. There's always one register class that is able to load/store
from memory, and even though not all register classes can move between
themselves, I think the solution I showed using a scratch register would
make sense for those cases.

In the meantime, you might know the following. Sometimes the secondary
reload hook is called with something already in sri->icode, meaning it
is not always CODE_FOR_nothing. Which value is this? The manual doesn't
mention any default for sri->icode.

-- 
PMatos

Re: The secondary reload

2010-12-07 Thread Jeff Law


On 12/07/10 08:30, Paulo J. Matos wrote:
> Paul Koning  writes:
>
>>> I assume you by 'instruction' here mean a define_insn and not a single
>>> RTL or assembler instruction.
>>>
>>> So, assume I have two classes M_REGS and Y_REGS and I cannot move
>>> between them except if I go through an intermediary in C_REGS.
>>> Do I need a secondary reload?
>>
>> Yes
>>
>>> I wouldn't expect so cause I could write a rule that has a scratch from
>>> C_REGS. Then I move the value from SOURCE to C_REGS and from C_REGS to
>>> DEST. Have I misunderstood what you said?
>
> What about the above case?
> Couldn't you have something like I described above:
> (define_insn "transfer"
>   [
>     (set (match_operand 0 "register_operand" "m")
>          (match_operand 1 "register_operand" "y"))
>     (clobber (match_scratch 2 "c"))
>   ]
>   ""
> {
>   move from 1 to 2;
>   move from 2 to 0;
> })
>
> Why would something like this not work and force you to have a secondary
> reload hook?

That's not going to work for a variety of reasons.  The most obvious of 
which is when is that pattern ever going to match any generated insn?  
And even if it did, the constraints aren't applied until reload, so 
every reg->reg copy would need to have a scratch register, which 
effectively boils down to reserving a register.


You can often avoid secondary reloads by removing a register from the 
set of registers available to use during allocation and reserving it for 
these special cases; that practice is generally frowned upon as it 
penalizes the majority of code which doesn't need secondary reloads for 
the oddball cases that do need secondary reloads.  It also fails in 
cases where you need multiple secondary reload registers.  And yes, 
there are cases where multiple secondary reload registers are needed, 
along with secondary memory.


You're better off taking the time to understand how secondary reloads 
work.  In addition to your port working better, the knowledge you gain 
will help you with other maintenance burdens with your port.




jeff

Re: The secondary reload

2010-12-07 Thread Ian Lance Taylor

pocma...@gmail.com (Paulo J. Matos) writes:

> Paul Koning  writes:
>
>>> I assume you by 'instruction' here mean a define_insn and not a single
>>> RTL or assembler instruction. 
>>> 
>>> So, assume I have two classes M_REGS and Y_REGS and I cannot move 
>>> between them except if I go through an intermediary in C_REGS. 
>>> Do I need a secondary reload? 
>>
>> Yes
>>> 
>>> I wouldn't expect so cause I could write a rule that has a scratch from
>>> C_REGS. Then I move the value from SOURCE to C_REGS and from C_REGS to
>>> DEST. Have I misunderstood what you said? 
>>>
>
> What about the above case?
> Couldn't you have something like I described above:
> (define_insn "transfer"
>[
>  (set (match_operand 0 "register_operand" "m")
>   (match_operand 1 "register_operand" "y"))
>  (clobber (match_scratch 2 "c"))
>]
>  ""
> {
>move from 1 to 2;
>move from 2 to 0;
> })
>
> Why would something like this not work and force you to have a secondary
> reload hook?

First I'll note that in gcc the movMODE insn patterns are special.  They
must be able to handle any simple move.  So you can't use a separate
insn as above to move the instructions.

Second, at reload time you can't generate a new insn which uses a
match_scratch, because reload runs after register allocation and it's
too late to create a new scratch register.  You have to use a secondary
reload to get a scratch register at reload time.  Now, reload does only
require the ability to load and store from memory, so for a lot of code
you can get away without using a secondary reload to transfer a value
from one register class to another.  However, in my experience it is
always possible to write code that uses an unusual typecast to cause gcc
to want to move the value directly from a register in one class to a
register in another class.  And then you need the secondary reload.

Ian

Re: The secondary reload

2010-12-07 Thread Paulo J. Matos

Jeff Law  writes:

> You're better off taking the time to understand how secondary reloads
> work.  In addition to your port working better, the knowledge you gain
> will help you with other maintenance burdens with your port.

Yes, I think you're right. It seems to be a powerful tool that I have
been underestimating. (actually I have been assuming that it is a
workaround for when define_insn are not enough). :)

-- 
PMatos

Re: The secondary reload

2010-12-07 Thread Paulo J. Matos


Actually after your explanation below, a lot of things make total
sense. Thanks, that cleared up several things for me.

However, there's a specific secondary reload question still bothering me: 
- I have seen cases where secondary reload is called and sri->icode !=
CODE_FOR_nothing. In which cases does this happen and should I touch it if
the reload required no setting for sri->icode?

Now, I will look at some code examples of using secondary reload with
moves that _sometimes_ require a scratch register. Any tips on where to
look?

Cheers and once again thanks,

Paulo Matos

Ian Lance Taylor  writes:

>
> First I'll note that in gcc the movMODE insn patterns are special.  They
> must be able to handle any simple move.  So you can't use a separate
> insn as above to move the instructions.
>
> Second, at reload time you can't generate a new insn which uses a
> match_scratch, because reload runs after register allocation and it's
> too late to create a new scratch register.  You have to use a secondary
> reload to get a scratch register at reload time.  Now, reload does only
> require the ability to load and store from memory, so for a lot of code
> you can get away without using a secondary reload to transfer a value
> from one register class to another.  However, in my experience it is
> always possible to write code that uses an unusual typecast to cause gcc
> to want to move the value directly from a register in one class to a
> register in another class.  And then you need the secondary reload.
>
> Ian
>

-- 
PMatos

Re: The secondary reload

2010-12-07 Thread Jeff Law


On 12/07/10 09:30, Paulo J. Matos wrote:
> Jeff Law  writes:
>
>> You're better off taking the time to understand how secondary reloads
>> work.  In addition to your port working better, the knowledge you gain
>> will help you with other maintenance burdens with your port.
>
> Yes, I think you're right. It seems to be a powerful tool that I have
> been underestimating. (actually I have been assuming that it is a
> workaround for when define_insn are not enough). :)

The best way to think about it is it's a way to get another register in 
cases where it wasn't apparent until reload that an additional register 
was necessary.


It's fairly complex and a source of numerous questions from people 
maintaining their own ports.  Even those of us who have done significant 
port work forget cases that need to be handled by the secondary reload 
mechanisms.  I believe most ports have secondary reloads of one form or 
another that you can refer to.  You might want to review


mn10300/mn10300.c::mn10300_secondary_reload_class
pa/pa.c::emit_move_sequence
pa/pa.c::secondary_reload
i386/i386.c::ix86_secondary_reload
m68k/m68k.c::m68k_secondary_reload_class



Note that defining secondary reloads when none was necessary can lead to 
poor code generation; so do your best to define the precise set of 
circumstances when they're needed.


Jeff

Re: The secondary reload

2010-12-07 Thread Ian Lance Taylor

pocma...@gmail.com (Paulo J. Matos) writes:

> However, there's a specific secondary reload question still bothering me: 
> - I have seen cases where secondary reload is called and sri->icode !=
> CODE_FOR_nothing. In which cases does this happen and should I touch it if
> the reload required no setting for sri->icode?
>
> Now, I will look at some code examples of using secondary reload with
> moves that _sometimes_ require a scratch register. Any tips on where to
> look?

The TARGET_SECONDARY_RELOAD hook should set sri->icode to some value if
there is a specific insn pattern which will do the actual secondary
reload.

A good example for complex secondary reloads is sh_secondary_reload in
config/sh/sh.c.

Ian

about the gcc compiler

2010-12-07 Thread Gene Michaelson

Hi GNU,

I'm trying to write my own game engine, and I've got it to work for my
Wii and for my Xbox360, but getting it to work on PlayStation is
difficult.

A buddy of mine told me about the gcc compiler and that people have used
it to build games for the PS3 systems. He even said that PlayStation has a
PS Move plug-in for the gcc compiler.

Is that true and is gcc open source?

Let me know, and thanx ;D

~Gene Michaelson

Re: about the gcc compiler

2010-12-07 Thread Ian Lance Taylor

"Gene Michaelson"  writes:

> I'm trying to write my own game engine and, I've got it to work for my
> Wii, and for my Xbox360, but to get it to work on PlayStation is
> difficult.
>
> A buddy of mine told me about the gcc compiler and that people have used
> to build games for the PS3 systems. He even said that PlayStation has a PS
> Move plug-in for the gcc compiler.
>
> Is that true and is gcc open source?

This question is not appropriate for the mailing list gcc@gcc.gnu.org,
which is about the development of gcc.  It would be appropriate for the
mailing list gcc-h...@gcc.gnu.org.  Please take any followups to
gcc-help.  Thanks.

Yes, the gcc compiler can be used to build games for the PS3.  I don't
personally know the details.  A quick web search turned up a few
different web pages discussing it.

The gcc compiler is free software.

Ian

Re: The secondary reload

2010-12-07 Thread Paulo J. Matos

Ian Lance Taylor  writes:

>
> The TARGET_SECONDARY_RELOAD hook should set sri->icode to some value if
> there is a specific insn pattern which will do the actual secondary
> reload.
>
> A good example for complex secondary reloads is sh_secondary_reload in
> config/sh/sh.c.
>

Thanks Ian!

-- 
PMatos

Re: combine two load insns

2010-12-07 Thread Frédéric RISS

Le mardi 07 décembre 2010 à 06:18 -0700, Jeff Law a écrit :
> On 12/06/10 15:07, Ian Lance Taylor wrote:
> Given the two loads don't have a def-use data dependency combine won't 
> ever get the opportunity to do anything with them.  In general there is 
> no pass which combines insns without a true data dependency and targets 
> which have such insns have had to handle those combinations in machine 
> dependent reorg.  In fact, it was the combination of independent insns 
> which led to the introduction of the machine dependent reorg pass eons ago.

The issue with this approach is that reorg runs very late. I suppose
that if one wants to combine 2 SI loads into a DI load, it needs to be
done before IRA to satisfy the generated register constraints.

Fred

question about alias-analysis in gcc 4.5

2010-12-07 Thread Eugen Wagner

Hi,
Is any kind of flow-dependent points-to analysis computed on GIMPLE
in SSA form?
If so, in which pass?


regards,
Eugen

Re: PATCH: 2 stage BFD linker for LTO plugin

2010-12-07 Thread H.J. Lu

On Mon, Dec 6, 2010 at 9:20 AM, H.J. Lu  wrote:
> On Mon, Dec 6, 2010 at 9:23 AM, Dave Korn  wrote:
>> On 06/12/2010 02:20, H.J. Lu wrote:
>>
>>> BTW, the new linker passed bootstrap-lto with all default languages.
>>> I am planning to include this patch in the next Linux binutils.
>>>
>> I missed the IR object in an archive:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42690#c34
>>
>> This updated patch fixed it.  OK for trunk?
>>
> We shouldn't clear SEC_EXCLUDE if BFD_PLUGIN is set:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42690#c38
>
> This updated patch fixed it.  OK for trunk?
>
 It turns out that my patch also fixes:

 http://sourceware.org/bugzilla/show_bug.cgi?id=12277

>>>
>>> Updated patch, adjusted for plugin ELF symbol visibility bug fix.
>>> OK for trunk?
>>
>>  Well, I reckon this patch is great (but don't have the approval rights).
>> It's passed an lto-bootstrap of gcc on i686-pc-cygwin and the tests are well
>> underway without anything abnormal showing up.
>>
>>> +       /* Free the old already linked table and create a new one.  */
>>> +       bfd_section_already_linked_table_free ();
>>> +       if (!bfd_section_already_linked_table_init ())
>>> +         einfo (_("%P%F: Failed to create hash table\n"));
>>> +
>>> +       /* Free the old hash table and create a new one.  */
>>> +       bfd_link_hash_table_free (link_info.output_bfd,
>>> +                                 link_info.hash);
>>> +       link_info.hash
>>> +         = bfd_link_hash_table_create (link_info.output_bfd);
>>> +       if (link_info.hash == NULL)
>>> +         einfo (_("%P%F: can not create hash table: %E\n"));
>>
>>  If I had known that there was really this little stored state to be unwound
>> and regenerated, I would have wanted to do it this way in the first place.
>>
>>> +typedef struct cmdlin_header_struct
>>
>>  Typo there.
>
> Fixed.
>

cmdline_set_next_claimed_output took the address of a stack
variable:

http://www.sourceware.org/bugzilla/show_bug.cgi?id=12293

Fixed in this updated patch.

-- 
H.J.
---
bfd/

2010-12-07  H.J. Lu  

PR ld/12248
PR ld/12277
* bfd.c (BFD_PLUGIN): New.
(BFD_FLAGS_SAVED): Add BFD_PLUGIN.
(BFD_FLAGS_FOR_BFD_USE_MASK): Likewise.

* bfd-in2.h: Regenerated.

ld/

2010-12-07  H.J. Lu  

PR ld/12248
PR ld/12277
* ldfile.c (ldfile_try_open_bfd): Set BFD_PLUGIN for plugin. Set
stage1.

* ldlang.c (cmdline_list): New.
(cmdline_next_claimed_output): Likewise.
(cmdline_list_init): Likewise.
(cmdline_get_stage2_input_files): Likewise.
(debug_cmdline_list): Likewise.
(cmdline_list_append): Likewise.
(cmdline_set_next_claimed_output): Likewise.
(cmdline_list_insert_claimed_output): Likewise.
(new_afile): Set stage1 to FALSE.
(lang_init): Call cmdline_list_init.
(lang_gc_sections): Don't clear SEC_EXCLUDE if BFD_PLUGIN is
set.
(lang_process): Call plugin_active_plugins_p to check plugin
support.  Check cmdline_next_claimed_output before opening
stage 2 input.  Call debug_cmdline_list if trace_file_tries
is set.  Call cmdline_get_stage2_input_files to get stage 2
input files.

* ldlang.h (lang_input_statement_struct): Add stage1.
(cmdline_enum_type): New.
(cmdline_header_type): Likewise.
(cmdline_input_statement_type): Likewise.
(cmdline_claimed_output_type): Likewise.
(cmdline_union_type): Likewise.
(cmdline_list_type): Likewise.
(cmdline_list_append): Likewise.
(cmdline_list_insert_claimed_output): Likewise.
(cmdline_set_next_claimed_output): Likewise.

* ldmain.c (add_archive_element): Call
cmdline_set_next_claimed_output with archive BFD.  Set
BFD_PLUGIN for plugin.

* lexsup.c (parse_args): Call cmdline_list_append if needed.

* plugin.c (plugin_opt_plugin_arg): Ignore -pass-through=.
(add_input_file): Replace lang_add_input_file with
cmdline_list_insert_claimed_output.
(add_input_library): Likewise.

ld/testsuite/

2010-12-07  H.J. Lu  

PR ld/12248
PR ld/12277
* ld-plugin/func1p.c: New.
* ld-plugin/func2.c: Likewise.
* ld-plugin/func2i.c: Likewise.
* ld-plugin/func3h.c: Likewise.

* ld-plugin/plugin.exp: Add object files for symbols claimed
or created by testplugin.
* ld-plugin/plugin-7.d: Updated.
* ld-plugin/plugin-8.d: Likewise.
* ld-plugin/plugin-9.d: Likewise.

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread H.J. Lu

On Mon, Dec 6, 2010 at 4:05 PM, H.J. Lu  wrote:
>>
>> Without slim lto you never know if a duplicate symbol is a mistake
>> of the programmer or just the "fat lto" copy. Also ELF semantics
>> like weak are hard if you have multiple copies.
>>
>
> It isn't easy, but doable.
>

Here is my proposal.  Any comments?

Thanks.


-- 
H.J.
---
• 2 kinds of object files
    ○ non-IR object file has
        § non-IR sections
    ○ IR object file has
        § IR sections
        § non-IR sections
• The output of "ld -r" with mixed IR/non-IR objects should work with:
    ○ Compilers/linkers with IR support.
    ○ Compilers/linkers without IR support.
• Add the mixed object file, which has
    ○ IR sections
    ○ non-IR sections:
        § Object code from IR sections.
        § Object code from non-IR object files.
    ○ Object-only section:
        § Section name won't be generated by any tools, something like
          ".objectonly\004".
        § Contains non-IR object file.
        § Input is discarded after link.
• Linker action:
    ○ Classify each input object file:
        □ If there is a ".objectonly\004" section, it is a mixed object file.
        □ If there is an IR section, it is an IR object file.
        □ Otherwise, it is a non-IR object file.
    ○ Relocatable link:
        § Prepare for an object-only output.
        § Prepare for a regular output.
        § For each mixed object file:
            □ Add IR and non-IR sections to the regular output.
            □ For object-only section:
                ® Extract the object-only file.
                ® Add it to the object-only output.
                ® Discard object-only section.
        § For each IR object file:
            □ Add IR and non-IR sections to the regular output.
        § For each non-IR object file:
            □ Add non-IR sections to the regular output.
            □ Add non-IR sections to the object-only output.
        § Final output:
            □ If there are IR objects, non-IR objects and the object-only
              output isn't empty:
                ® Put the object-only output into the object-only section.
                ® Add the object-only section to the regular output.
            □ Remove the object-only output.
    ○ Normal link:
        § Prepare for output.
        § For each mixed object file:
            □ Compile and add IR sections to the output.
            □ For object-only section:
                ® Extract the object-only file.
                ® Add it to the output.
                ® Discard object-only section.
        § For each IR object file:
            □ Compile and add IR sections to the output.
        § For each non-IR object file:
            □ Add non-IR sections to the output.
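
[Editorial sketch] The "Classify each input object file" step in the proposal
above could look roughly like the following. This is only an illustration, not
code from ld: the function name is hypothetical, the object-only name is the
".objectonly\004" example from the proposal, and the ".gnu.lto_" prefix is
assumed here as the marker for IR sections.

```python
# Sketch only: classify an input object by its section names, following the
# three-way rule in the proposal. The names below are assumptions for
# illustration and are not taken from the linker sources.

OBJECT_ONLY_SECTION = ".objectonly\004"  # example magic name from the proposal
IR_SECTION_PREFIX = ".gnu.lto_"          # assumed marker for IR sections


def classify(section_names):
    """Return 'mixed', 'ir', or 'non-ir' for an iterable of section names."""
    names = list(section_names)
    # Check for the object-only section first: a mixed file also carries
    # IR sections, so the order of the tests matters.
    if OBJECT_ONLY_SECTION in names:
        return "mixed"
    if any(name.startswith(IR_SECTION_PREFIX) for name in names):
        return "ir"
    return "non-ir"
```

The order of the tests mirrors the order of the three bullets: a file that has
both IR sections and an object-only section must be reported as mixed.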

gcc-4.4-20101207 is now available

2010-12-07 Thread gccadmin

Snapshot gcc-4.4-20101207 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20101207/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.4 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_4-branch 
revision 167573

You'll find:

 gcc-4.4-20101207.tar.bz2 Complete GCC (includes all of below)

  MD5=3f6748a42f2189bb760e901ef7611df8
  SHA1=8adec189fb2445a1146dc94dea50f6bba95249c7

 gcc-core-4.4-20101207.tar.bz2 C front end and core compiler

  MD5=fedd0f1cfe8c48e9bf0a6406ef0f3ace
  SHA1=29b773ca1c72e6f21d2bdf47785843d59364d54e

 gcc-ada-4.4-20101207.tar.bz2 Ada front end and runtime

  MD5=377ed77e52bd228cada609dce25d21b4
  SHA1=60801a7be14c70c35351f495531e06b1072e5e40

 gcc-fortran-4.4-20101207.tar.bz2 Fortran front end and runtime

  MD5=788de1f1fcdeefedebd74993f06acc9d
  SHA1=d43d7ac3ac7998405c778916f67b66189cd46ff9

 gcc-g++-4.4-20101207.tar.bz2 C++ front end and runtime

  MD5=973b2143cd587f485f265292daea76bc
  SHA1=8280ab625e0855ae597b4c6b6e295af74baa8feb

 gcc-java-4.4-20101207.tar.bz2 Java front end and runtime

  MD5=d4b061c78392daacfbc6882ccd4ef21a
  SHA1=e28dd2d2a19abe8762ca80e92187f4fa5178b0a6

 gcc-objc-4.4-20101207.tar.bz2 Objective-C front end and runtime

  MD5=7841dac2b02b5c48955a5822b9a5a80b
  SHA1=249bf8ee7dc580858a03dba73fd7d918cd17de5c

 gcc-testsuite-4.4-20101207.tar.bz2   The GCC testsuite

  MD5=f2bbec032a2cb788e2c8d1f3039b74ba
  SHA1=db2a58fb12a7ac45df805ba4fa75a39fe6e17df4

Diffs from 4.4-20101130 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

Re: combine two load insns

2010-12-07 Thread Jeff Law


On 12/07/10 12:29, Frédéric RISS wrote:
> On Tuesday 07 December 2010 at 06:18 -0700, Jeff Law wrote:
>> On 12/06/10 15:07, Ian Lance Taylor wrote:
>> Given the two loads don't have a def-use data dependency combine won't
>> ever get the opportunity to do anything with them.  In general there is
>> no pass which combines insns without a true data dependency and targets
>> which have such insns have had to handle those combinations in machine
>> dependent reorg.  In fact, it was the combination of independent insns
>> which led to the introduction of the machine dependent reorg pass eons ago.
>
> The issue with this approach is that reorg runs very late. I suppose
> that if one wants to combine 2 SI loads into a DI load, it needs to be
> done before IRA to satisfy the generated register constraints.

Constraints aren't checked until after register allocation is complete
-- they're going to be of no help in performing this optimization.
Right now the machine dependent reorg pass or a peephole are the only
places this optimization can be performed.  However, I believe it
would be possible to make the scheduler perform this optimization with
some work.


You could also make reorg.c do the job, but that's too gross to 
contemplate (particularly since I did it over 15 years ago and the 
result was so ugly that we came up with the machine dependent reorg hook).
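
[Editorial sketch] To illustrate the kind of pattern match a peephole or a
machine dependent reorg pass would perform after register allocation, here is
a toy model in Python. It is not GCC RTL and not any target's code: the Load
record, the function name, and the requirement that a DI load targets an
even/odd register pair are all assumptions made for the example.

```python
from typing import NamedTuple


class Load(NamedTuple):
    """Toy load instruction: dst_reg = mem[base_reg + offset]."""
    mode: str    # "SI" (32-bit) or "DI" (64-bit)
    dst: int     # destination hard register number
    base: int    # base address register number
    offset: int  # byte offset from the base register


def combine_adjacent_loads(insns):
    """Merge neighbouring SI loads into one DI load where it is safe."""
    out = []
    i = 0
    while i < len(insns):
        a = insns[i]
        b = insns[i + 1] if i + 1 < len(insns) else None
        if (b is not None
                and a.mode == "SI" and b.mode == "SI"
                and a.base == b.base            # same base register
                and b.offset == a.offset + 4    # consecutive words
                and b.dst == a.dst + 1          # consecutive registers
                and a.dst % 2 == 0              # assumed even/odd pair rule
                and a.dst != a.base):           # first load must not clobber base
            out.append(Load("DI", a.dst, a.base, a.offset))
            i += 2
        else:
            out.append(a)
            i += 1
    return out
```

Because the match depends on the hard registers already chosen, it only fires
when the allocator happened to pick a suitable pair, which is the limitation
discussed in the message above.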



jeff

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread Cary Coutant

> Here is my proposal.  Any comments?

We talked about ld -r a while back during the WHOPR project, and the
two ways that the linker could work: (1) combine all the .o files and
use the plugin to run LTRANS on the IR files, producing a pure,
optimized, object file; and (2) combine the non-IR object files as ld
-r normally would, and combine that result somehow with the IR from
the other files, for later optimization. If I remember correctly,
there was support for both modes of operation. The first mode is
easily handled with the current design (untested as far as I know --
there are probably bugs, and I'm not sure if we get the symbol
visibility correct in those cases).

The second mode corresponds with your proposal here. It's complicated
by the fact that it's difficult to tell, once the objects are
combined, which compiled code came without corresponding IR. For this,
I've got a suggestion that seems a bit simpler than your
".objectonly\004" section, based on an idea for something completely
unrelated[1] that I've been pondering over for a while. Instead of
embedding the non-IR objects into the mixed object file, let's instead
produce an archive file with several members: one that contains the
result of running ld -r on the non-IR objects in the link, and one
member for each of the IR files (alternatively, exactly one member
that contains the result of running ld -r on all of the IR objects).
In order to make the archive such that a subsequent link loads all of
the members unconditionally, I propose to add a special symbol
".FORCE" into the archive symbol table for each member; when the
linker sees that symbol in the archive symbol table, it will load the
corresponding member unconditionally.
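
[Editorial sketch] The ".FORCE" idea can be modelled as a small change to the
usual archive member selection. The following is a simplified illustration
with hypothetical names: it makes a single pass over the archive symbol table
and does not rescan for symbols that newly loaded members leave undefined, as
a real linker would.

```python
FORCE_SYMBOL = ".FORCE"  # special symbol name suggested in the message above


def members_to_load(archive_symtab, undefined):
    """Pick archive members to load.

    archive_symtab: list of (symbol, member) pairs in archive order.
    undefined: set of symbol names the link still needs.
    Returns member names in first-seen order, without duplicates.
    """
    loaded = []
    for symbol, member in archive_symtab:
        if member in loaded:
            continue
        # A member is pulled in either because it defines a needed symbol
        # (normal archive semantics) or because it is marked with .FORCE.
        if symbol == FORCE_SYMBOL or symbol in undefined:
            loaded.append(member)
    return loaded
```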

>   ○ Object-only section:
>   § Section name won't be generated by any tools, something like
>".objectonly\004".
>   § Contains non-IR object file.
>   § Input is discarded after link.

Please -- use a special section type, not a magic name.

-cary


[1] My unrelated idea is about "__attribute__ (( used ))" -- when a
symbol is marked as used, it should not only suppress unused warnings
in the compiler, but it should also force the resulting object module
to be linked from an archive library. I've been thinking about a
proposal to mark any object file that contains a used symbol, have ar
recognize that mark and add the ".FORCE" symbol to the archive symbol
table for that object, then have the linker recognize the ".FORCE"
symbol and load the member unconditionally.

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread Dave Korn

On 07/12/2010 23:15, Cary Coutant wrote:

>>   ○ Object-only section:
>>   § Section name won't be generated by any tools, something like
>> ".objectonly\004".
>>   § Contains non-IR object file.
>>   § Input is discarded after link.
> 
> Please -- use a special section type, not a magic name.

  We're still gonna have to use a magic name on non-ELF platforms.

cheers,
  DaveK

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread H.J. Lu

On Tue, Dec 7, 2010 at 3:15 PM, Cary Coutant  wrote:
>> Here is my proposal.  Any comments?
>
> We talked about ld -r a while back during the WHOPR project, and the
> two ways that the linker could work: (1) combine all the .o files and
> use the plugin to run LTRANS on the IR files, producing a pure,
> optimized, object file; and (2) combine the non-IR object files as ld
> -r normally would, and combine that result somehow with the IR from
> the other files, for later optimization. If I remember correctly,
> there was support for both modes of operation. The first mode is
> easily handled with the current design (untested as far as I know --
> there are probably bugs, and I'm not sure if we get the symbol
> visibility correct in those cases).

We considered it.  The problem is that LTO performs best when
generating the final executable/DSO.  That is, we want the full IR in the
output of "ld -r".

> The second mode corresponds with your proposal here. It's complicated
> by the fact that it's difficult to tell, once the objects are
> combined, which compiled code came without corresponding IR. For this,
> I've got a suggestion that seems a bit simpler than your
> ".objectonly\004" section, based on an idea for something completely
> unrelated[1] that I've been pondering over for a while. Instead of
> embedding the non-IR objects into the mixed object file, let's instead
> produce an archive file with several members: one that contains the
> result of running ld -r on the non-IR objects in the link, and one
> member for each of the IR files (alternatively, exactly one member
> that contains the result of running ld -r on all of the IR objects).
> In order to make the archive such that a subsequent link loads all of
> the members unconditionally, I propose to add a special symbol
> ".FORCE" into the archive symbol table for each member; when the
> linker sees that symbol in the archive symbol table, it will load the
> corresponding member unconditionally.
>
>>       ○ Object-only section:
>>               § Section name won't be generated by any tools, something like
>>".objectonly\004".
>>               § Contains non-IR object file.
>>               § Input is discarded after link.
>
> Please -- use a special section type, not a magic name.
>

As Dave pointed out, we need the magic section name for non-ELF
platforms.  One main feature of my proposal is transparency:

# ld -r -o foo.o foo1.o foo2.o foo3.o ...
# ld -r -o bar.o bar1.o bar2.o bar3.o ...
...
# ld -r -o new.o foo.o bar.o ...

where foo.o, bar.o ... are mixed object files.  That is more user-friendly.
Projects like the Linux kernel can take advantage of LTO with simple changes
to their Makefiles.


-- 
H.J.

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread Andrew Pinski

On Tue, Dec 7, 2010 at 3:53 PM, H.J. Lu  wrote:
> On Tue, Dec 7, 2010 at 3:15 PM, Cary Coutant  wrote:
>>> Here is my proposal.  Any comments?
>>
>> We talked about ld -r a while back during the WHOPR project, and the
>> two ways that the linker could work: (1) combine all the .o files and
>> use the plugin to run LTRANS on the IR files, producing a pure,
>> optimized, object file; and (2) combine the non-IR object files as ld
>> -r normally would, and combine that result somehow with the IR from
>> the other files, for later optimization. If I remember correctly,
>> there was support for both modes of operation. The first mode is
>> easily handled with the current design (untested as far as I know --
>> there are probably bugs, and I'm not sure if we get the symbol
>> visibility correct in those cases).
>
> We considered it.  The problem is LTO performs the best when
> generating the final executable/DSO.  That is we want the full IR in the
> output of "ld -r".

What happens when ld -r is the final link?  Think loadable linux
kernel modules and some other stuff that abuse elf relocatable
objects?

Thanks,
Andrew Pinski

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread H.J. Lu

On Tue, Dec 7, 2010 at 3:57 PM, Andrew Pinski  wrote:
> On Tue, Dec 7, 2010 at 3:53 PM, H.J. Lu  wrote:
>> On Tue, Dec 7, 2010 at 3:15 PM, Cary Coutant  wrote:
>>>> Here is my proposal.  Any comments?
>>>
>>> We talked about ld -r a while back during the WHOPR project, and the
>>> two ways that the linker could work: (1) combine all the .o files and
>>> use the plugin to run LTRANS on the IR files, producing a pure,
>>> optimized, object file; and (2) combine the non-IR object files as ld
>>> -r normally would, and combine that result somehow with the IR from
>>> the other files, for later optimization. If I remember correctly,
>>> there was support for both modes of operation. The first mode is
>>> easily handled with the current design (untested as far as I know --
>>> there are probably bugs, and I'm not sure if we get the symbol
>>> visibility correct in those cases).
>>
>> We considered it.  The problem is LTO performs the best when
>> generating the final executable/DSO.  That is we want the full IR in the
>> output of "ld -r".
>
> What happens when ld -r is the final link?  Think loadable linux
> kernel modules and some other stuff that abuse elf relocatable
> objects?
>

"ld -plugin ... -r" will be treated as final link.


-- 
H.J.

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread H.J. Lu

On Tue, Dec 7, 2010 at 3:58 PM, H.J. Lu  wrote:
> On Tue, Dec 7, 2010 at 3:57 PM, Andrew Pinski  wrote:
>> On Tue, Dec 7, 2010 at 3:53 PM, H.J. Lu  wrote:
>>> On Tue, Dec 7, 2010 at 3:15 PM, Cary Coutant  wrote:
>>>>> Here is my proposal.  Any comments?
>>>>
>>>> We talked about ld -r a while back during the WHOPR project, and the
>>>> two ways that the linker could work: (1) combine all the .o files and
>>>> use the plugin to run LTRANS on the IR files, producing a pure,
>>>> optimized, object file; and (2) combine the non-IR object files as ld
>>>> -r normally would, and combine that result somehow with the IR from
>>>> the other files, for later optimization. If I remember correctly,
>>>> there was support for both modes of operation. The first mode is
>>>> easily handled with the current design (untested as far as I know --
>>>> there are probably bugs, and I'm not sure if we get the symbol
>>>> visibility correct in those cases).
>>>
>>> We considered it.  The problem is LTO performs the best when
>>> generating the final executable/DSO.  That is we want the full IR in the
>>> output of "ld -r".
>>
>> What happens when ld -r is the final link?  Think loadable linux
>> kernel modules and some other stuff that abuse elf relocatable
>> objects?
>>
>
> "ld -plugin ... -r" will be treated as final link.
>

Here is the updated proposal.


-- 
H.J.
---
• 2 kinds of object files
    ○ non-IR object file has
        § non-IR sections
    ○ IR object file has
        § IR sections
        § non-IR sections
• The output of "ld -r" with mixed IR/non-IR objects should work with:
    ○ Compilers/linkers with IR support.
    ○ Compilers/linkers without IR support.
• Add the mixed object file, which has
    ○ IR sections
    ○ non-IR sections:
        § Object code from IR sections.
        § Object code from non-IR object files.
    ○ Object-only section:
        § Section name won't be generated by any tools, something like
          ".objectonly\004".
        § Contains non-IR object file.
        § Input is discarded after link.
• Linker action:
    ○ Classify each input object file:
        □ If there is a ".objectonly\004" section, it is a mixed object file.
        □ If there is an IR section, it is an IR object file.
        □ Otherwise, it is a non-IR object file.
    ○ Relocatable non-IR link:
        § Prepare for an object-only output.
        § Prepare for a regular output.
        § For each mixed object file:
            □ Add IR and non-IR sections to the regular output.
            □ For object-only section:
                ® Extract the object-only file.
                ® Add it to the object-only output.
                ® Discard object-only section.
        § For each IR object file:
            □ Add IR and non-IR sections to the regular output.
        § For each non-IR object file:
            □ Add non-IR sections to the regular output.
            □ Add non-IR sections to the object-only output.
        § Final output:
            □ If there are IR objects, non-IR objects and the object-only
              output isn't empty:
                ® Put the object-only output into the object-only section.
                ® Add the object-only section to the regular output.
            □ Remove the object-only output.
    ○ Normal link and relocatable IR link:
        § Prepare for output.
        § IR link:
            □ For each mixed object file:
                ® Compile and add IR sections to the output.
                ® Discard non-IR sections.
                ® Object-only section:
                    ◊ Extract the object-only file.
                    ◊ Add it to the output.
                    ◊ Discard object-only section.
            □ For each IR object file:
                ® Compile and add IR sections to the output.
                ® Discard non-IR sections.
            □ For each non-IR object file:
                ® Add non-IR sections to the output.
        § Non-IR link:
            □ For each mixed object file:
                ® Add non-IR sections to the output.
                ® Discard IR sections and object-only section.
            □ For each IR object file:
                ® Add non-IR sections to the output.
                ® Discard IR sections.
            □ For each non-IR object file:
                ® Add non-IR sections to the output.
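
[Editorial sketch] The "Relocatable non-IR link" steps above can be read as a
small bookkeeping routine over already classified inputs. The sketch below is
an illustration under stated assumptions, not linker code: the function name
and the tuple layout are hypothetical, and a mixed input is assumed to count
as contributing both IR and non-IR objects when the "Final output" condition
is evaluated.

```python
def plan_relocatable_link(inputs):
    """Plan a relocatable non-IR link.

    inputs: list of (name, kind) with kind in {'mixed', 'ir', 'non-ir'}.
    Returns (regular, object_only, embed): what goes into each output, and
    whether the object-only output is embedded as the object-only section.
    """
    regular, object_only = [], []
    saw_ir = saw_non_ir = False
    for name, kind in inputs:
        if kind == "mixed":
            regular.append((name, "IR and non-IR sections"))
            object_only.append((name, "extracted object-only file"))
            # Assumption: a mixed file counts as both IR and non-IR input.
            saw_ir = saw_non_ir = True
        elif kind == "ir":
            regular.append((name, "IR and non-IR sections"))
            saw_ir = True
        elif kind == "non-ir":
            regular.append((name, "non-IR sections"))
            object_only.append((name, "non-IR sections"))
            saw_non_ir = True
        else:
            raise ValueError("unknown kind: %r" % (kind,))
    # Final output rule: embed only when IR and non-IR inputs were both seen
    # and the object-only output is not empty.
    embed = saw_ir and saw_non_ir and bool(object_only)
    return regular, object_only, embed
```

With only IR inputs or only non-IR inputs, embed is False and the result is an
ordinary relocatable object, which matches the goal that the output keeps
working with tools that have no IR support.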

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread Andi Kleen

>> Here is my proposal.  Any comments?
>
> We talked about ld -r a while back during the WHOPR project, and the
> two ways that the linker could work: (1) combine all the .o files and
> use the plugin to run LTRANS on the IR files, producing a pure,
> optimized, object file; and (2) combine the non-IR object files as ld
> -r normally would, and combine that result somehow with the IR from
> the other files, for later optimization. If I remember correctly,
> there was support for both modes of operation. The first mode is
> easily handled with the current design (untested as far as I know --
> there are probably bugs, and I'm not sure if we get the symbol
> visibility correct in those cases).

The first mode is IMHO useless because you'll never get whole-program
optimizations this way. I tested it some time ago and it worked in a
limited way (there were some problems where gcc crashed if you didn't
specify -fwhole-program, which would obviously be a lie, but those might
be fixed now). But it won't give you the LTO advantages in any case.

I implemented (2) by giving the sections appropriate names so they don't
get messed up. This works today with gcc mainline, as long as all objects
in the combined object are LTO.

The only problem left is mixing of LTO and non-LTO objects. This is not
handled right now. IMHO the best way to handle it is still to use slim LTO
and then simply link the "leftovers" separately after deleting the LTO
objects. This can actually be done with objcopy (with some limitations);
it doesn't even need linker support.
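
[Editorial sketch] One reading of the objcopy approach is to copy the combined
object while dropping its IR sections, leaving only ordinary code to link
separately. The helper below just builds such a command line and does not run
it. The helper name is hypothetical and the ".gnu.lto_" prefix is assumed as
the IR section marker; --remove-section is the standard objcopy option.

```python
def strip_ir_command(src, dst, section_names, ir_prefix=".gnu.lto_"):
    """Build an objcopy argv that copies src to dst without IR sections.

    section_names: names of the sections present in src (for example as
    listed by a tool such as objdump or readelf).
    """
    argv = ["objcopy"]
    for name in section_names:
        if name.startswith(ir_prefix):
            # One explicit option per section keeps the sketch independent
            # of wildcard support in the installed objcopy.
            argv.append("--remove-section=" + name)
    argv += [src, dst]
    return argv
```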

-Andi

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread H. Peter Anvin

On 12/07/2010 04:20 PM, Andi Kleen wrote:
> 
> The only problem left is mixing of lto and non lto objects. this right
> now is not handled. IMHO still the best way to handle it is to use
> slim lto and then simply separate link the "left overs" after deleting
> the LTO objects. This can be actually done with objcopy (with some
> limitations), doesn't even need linker support.
> 

Quite possibly a better way to deal with that is to provide a mechanism
for encapsulating arbitrary binary code objects inside the LTO IR.

-hpa

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread H.J. Lu

On Tue, Dec 7, 2010 at 4:20 PM, Andi Kleen  wrote:
>>> Here is my proposal.  Any comments?
>>
>> We talked about ld -r a while back during the WHOPR project, and the
>> two ways that the linker could work: (1) combine all the .o files and
>> use the plugin to run LTRANS on the IR files, producing a pure,
>> optimized, object file; and (2) combine the non-IR object files as ld
>> -r normally would, and combine that result somehow with the IR from
>> the other files, for later optimization. If I remember correctly,
>> there was support for both modes of operation. The first mode is
>> easily handled with the current design (untested as far as I know --
>> there are probably bugs, and I'm not sure if we get the symbol
>> visibility correct in those cases).
>
> the first mode is imho useless because you'll never get whole program
> optimizations this way. I tested it some time ago and it worked in a
> limited way
> (there were some problems that gcc crashed if you didn't specify
> -fwhole-program which would be obviously a lie, but those might be fixed
> now)
> But it won't give you the LTO advantages in any case.
>
> I implemented (2) by giving the sections appropiate names
> so they don't get messed up. this works today with gcc mainline, as
> long as all objects in the combined object are LTO.
>
> The only problem left is mixing of lto and non lto objects. this right
> now is not handled. IMHO still the best way to handle it is to use
> slim lto and then simply separate link the "left overs" after deleting
> the LTO objects. This can be actually done with objcopy (with some
> limitations), doesn't even need linker support.
>

My proposal should address mixing of lto and non lto objects.


-- 
H.J.

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread H.J. Lu

On Tue, Dec 7, 2010 at 4:24 PM, H. Peter Anvin  wrote:
> On 12/07/2010 04:20 PM, Andi Kleen wrote:
>>
>> The only problem left is mixing of lto and non lto objects. this right
>> now is not handled. IMHO still the best way to handle it is to use
>> slim lto and then simply separate link the "left overs" after deleting
>> the LTO objects. This can be actually done with objcopy (with some
>> limitations), doesn't even need linker support.
>>
>
> Quite possibly a better way to deal with that is to provide a mechanism
> for encapsulating arbitrary binary code objects inside the LTO IR.
>

If the IR supports it, we can use it instead of a magic section name.

-- 
H.J.

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread H. Peter Anvin

On 12/07/2010 03:58 PM, Dave Korn wrote:
> On 07/12/2010 23:15, Cary Coutant wrote:
> 
>>>   ○ Object-only section:
>>>   § Section name won't be generated by any tools, something like
>>> ".objectonly\004".
>>>   § Contains non-IR object file.
>>>   § Input is discarded after link.
>>
>> Please -- use a special section type, not a magic name.
> 
>   We're still gonna have to use a magic name on non-ELF platforms.
> 

Yes, but it probably should still be a special section type on ELF.

-hpa

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread Ian Lance Taylor

"H. Peter Anvin"  writes:

> On 12/07/2010 04:20 PM, Andi Kleen wrote:
>> 
>> The only problem left is mixing of lto and non lto objects. this right
>> now is not handled. IMHO still the best way to handle it is to use
>> slim lto and then simply separate link the "left overs" after deleting
>> the LTO objects. This can be actually done with objcopy (with some
>> limitations), doesn't even need linker support.
>> 
>
> Quite possibly a better way to deal with that is to provide a mechanism
> for encapsulating arbitrary binary code objects inside the LTO IR.

And when we use a special section name, using an unprintable name is
needlessly painful and will make it hard to play convenient games with
objcopy.  Just use a printable name starting with .gnu.

(This assumes that we do need a special section, rather than, say, a
note.)

Ian

Re: "ld -r" on mixed IR/non-IR objects (

2010-12-07 Thread H.J. Lu

On Tue, Dec 7, 2010 at 5:39 PM, Ian Lance Taylor  wrote:
> "H. Peter Anvin"  writes:
>
>> On 12/07/2010 04:20 PM, Andi Kleen wrote:
>>>
>>> The only problem left is mixing of lto and non lto objects. this right
>>> now is not handled. IMHO still the best way to handle it is to use
>>> slim lto and then simply separate link the "left overs" after deleting
>>> the LTO objects. This can be actually done with objcopy (with some
>>> limitations), doesn't even need linker support.
>>>
>>
>> Quite possibly a better way to deal with that is to provide a mechanism
>> for encapsulating arbitrary binary code objects inside the LTO IR.
>
> And when we use a special section name, using an unprintable name is
> needlessly painful and will make it hard to play convenient games with
> objcopy.  Just use a printable name starting with .gnu.
>
> (This assumes that we do need a special section, rather than, say, a
> note.)

A section works on non-ELF systems. How about a .gnu_object_only section,
with the SHT_GNU_OBJECT_ONLY type on ELF?
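
[Editorial sketch] The compromise suggested here, a section type on ELF and a
printable name elsewhere, amounts to a two-way test. In the sketch below the
function name is hypothetical and the numeric value given to
SHT_GNU_OBJECT_ONLY is only a placeholder in the OS-specific section type
range; the thread does not assign a value.

```python
OBJECT_ONLY_NAME = ".gnu_object_only"  # name suggested in the message above
SHT_GNU_OBJECT_ONLY = 0x6FFFFFF8       # placeholder value, assumed for this sketch


def is_object_only_section(name, sh_type=None):
    """Decide whether a section is the object-only section.

    sh_type is the ELF section type, or None for non-ELF object formats,
    which have no section types and must fall back to the name.
    """
    if sh_type is not None:
        return sh_type == SHT_GNU_OBJECT_ONLY
    return name == OBJECT_ONLY_NAME
```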


-- 
H.J.

Re: PATCH: 2 stage BFD linker for LTO plugin

2010-12-07 Thread H.J. Lu

On Tue, Dec 7, 2010 at 12:12 PM, H.J. Lu  wrote:
> On Mon, Dec 6, 2010 at 9:20 AM, H.J. Lu  wrote:
>> On Mon, Dec 6, 2010 at 9:23 AM, Dave Korn  wrote:
>>> On 06/12/2010 02:20, H.J. Lu wrote:
>>>
>>>>>>>> BTW, the new linker passed bootstrap-lto with all default languages.
>>>>>>>> I am planning to include this patch in the next Linux binutils.
>>>>>>>>
>>>>>>> I missed the IR object in an archive:
>>>>>>>
>>>>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42690#c34
>>>>>>>
>>>>>>> This updated patch fixed it.  OK for trunk?
>>>>>>>
>>>>>> We shouldn't clear SEC_EXCLUDE if BFD_PLUGIN is set:
>>>>>>
>>>>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42690#c38
>>>>>>
>>>>>> This updated patch fixed it.  OK for trunk?
>>>>>>
>>>>> It turns out that my patch also fixes:
>>>>>
>>>>> http://sourceware.org/bugzilla/show_bug.cgi?id=12277
>>>>>
>>>>
>>>> Updated patch, adjusted for plugin ELF symbol visibility bug fix.
>>>> OK for trunk?
>>>
>>>  Well, I reckon this patch is great (but don't have the approval rights).
>>> It's passed an lto-bootstrap of gcc on i686-pc-cygwin and the tests are well
>>> underway without anything abnormal showing up.
>>>
>>>> +       /* Free the old already linked table and create a new one.  */
>>>> +       bfd_section_already_linked_table_free ();
>>>> +       if (!bfd_section_already_linked_table_init ())
>>>> +         einfo (_("%P%F: Failed to create hash table\n"));
>>>> +
>>>> +       /* Free the old hash table and create a new one.  */
>>>> +       bfd_link_hash_table_free (link_info.output_bfd,
>>>> +                                 link_info.hash);
>>>> +       link_info.hash
>>>> +         = bfd_link_hash_table_create (link_info.output_bfd);
>>>> +       if (link_info.hash == NULL)
>>>> +         einfo (_("%P%F: can not create hash table: %E\n"));
>>>
>>>  If I had known that there was really this little stored state to be unwound
>>> and regenerated, I would have wanted to do it this way in the first place.
>>>
>>>> +typedef struct cmdlin_header_struct
>>>
>>>  Typo there.
>>
>> Fixed.
>>
>
> cmdline_set_next_claimed_output took the address of an stack
> variable:
>
> http://www.sourceware.org/bugzilla/show_bug.cgi?id=12293
>
> Fixed in this updated patch.
>

I fixed:

http://sourceware.org/bugzilla/show_bug.cgi?id=12295

on hjl/lto branch with commit

c7374491324a96819584e0692aaf9c8aab6f2241

http://git.kernel.org/?p=devel/binutils/hjl/x86.git;a=summary


H.J.
-- 
H.J.
