Daniel Berlin wrote:
On Sun, Jul 27, 2008 at 3:10 PM, Mark Mitchell <[EMAIL PROTECTED]> wrote:
Daniel Berlin wrote:
I agree that, at least in principle, it should be possible to emit the debug
info (whether the format is DWARF, Stabs, etc.) once.
No, you can't.
You would at least have to emit the variables separate from the types
(IE emit debug info twice).
Yes, of course; that's what everyone is talking about, I think. "Emit" here
may also mean "cache in memory some place", rather than "write to a file".
It could mean, for example, fill in the data structures we already use for
types in dwarf2out.c early, and then throw away the front-end type information.
Okay, then let us go through the options, and you tell me which you
are suggesting:
If you assume LTO does not have access to the front ends, your options
look something like this:
When you first compile each file:
  Emit type debug info
  Emit LTO
When you LTO them all together:
  Do LTO
  Emit variable debug info
Under this option, "Emit variable info" requires being able to
reference the types. If you've lowered the types, this is quite
problematic. So you get to store label names for the already
output type debug info with the variables (so you can still reference
the type you output properly when you read it back in). This is
fairly fragile, to be honest.
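
As a rough illustration of why the label approach is fragile, here is a
minimal sketch. Everything in it is made up for the example (the per-unit
table, the label names, and `resolve_type`); it is not how GCC stores
anything. The point is only that a variable's type reference is a bare
string that has to match whatever was emitted at first-compile time.

```python
# Hypothetical table: for each object file, the labels that were emitted
# with its type debug info, and the type each label stands for.
type_labels = {
    "a.o": {".Ltype1": "struct point", ".Ltype2": "int"},
    "b.o": {".Ltype1": "struct point"},
}


def resolve_type(unit, label):
    """Find the type a variable refers to via the label stored with it."""
    try:
        return type_labels[unit][label]
    except KeyError:
        # The label is only a name: if the earlier output changed, or the
        # wrong unit is consulted, there is nothing else to fall back on.
        raise LookupError(f"{unit}: no type debug info under label {label}")
```

Note that `a.o` and `b.o` both carry a `.Ltype1`, and nothing in the labels
themselves says whether the two refer to the same type.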
Another downside of this is that you can't eliminate duplicate types
between units because you don't know which types are really the same
in the debug info. You have to let the
Another option is:
When you first compile each file:
  Emit type debug info
  Emit partial variable debug info (IE add pointers to outputted types
  but not to locations)
  Emit LTO
When you LTO them all together:
  Do LTO
  Parse and update variable debug info to have locations
  Emit variable debug info
This requires parsing the debug info (in some format, be it DWARF or
some generic format we've made up) so that you can update the variable
info's location.
As a plus, you can easily update the types where you need to.
Unlike the first option, because you understand the debug info, you
can now remove all the duplicate types between units without having to
have the linker do it for you.
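
A minimal sketch of this second option, assuming the debug info has already
been parsed back into simple records. The `TypeInfo` and `VarInfo` classes,
`merge_units`, and `assign_locations` are all hypothetical names, and
treating two types as identical when their name and fields match is a
simplifying assumption (real type merging has to be more careful).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TypeInfo:
    name: str
    fields: tuple  # tuple of (field_name, field_type_name) pairs


@dataclass
class VarInfo:
    name: str
    type_id: int                    # index into the owning unit's type table
    location: Optional[str] = None  # unknown until LTO has run


def merge_units(units):
    """Merge per-unit (types, variables) pairs, dropping duplicate types."""
    merged_types, index_of, merged_vars = [], {}, []
    for types, variables in units:
        remap = {}  # unit-local type index -> merged type index
        for local_id, t in enumerate(types):
            if t not in index_of:
                index_of[t] = len(merged_types)
                merged_types.append(t)
            remap[local_id] = index_of[t]
        for v in variables:
            merged_vars.append(VarInfo(v.name, remap[v.type_id], v.location))
    return merged_types, merged_vars


def assign_locations(variables, locations):
    """Fill in locations once LTO has decided where each variable lives."""
    for v in variables:
        v.location = locations.get(v.name)
```

Because the variable records point at parsed types rather than at labels,
remapping them onto the merged type table is a plain index rewrite.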
Unless you link in every single frontend to LTO1 (Or move a lot to
the middle end), there is no way to do the following:
When you first compile each file:
  Emit LTO
When you LTO them all together:
  Emit type debug info
  Do LTO
  Emit variable debug info
If you don't want to link the frontends, you could also get away with
moving a lot of junk to the middle end (everything from being able to
distinguish between class and struct to namespaces, the context of
lexical blocks) because debug info outputting uses language specific
nodes all over the place right now.
Unless I've missed something, our least fragile and, IMHO, best option
requires parsing back in debug info.
It is certainly *possible* to get debug info without parsing the debug
info back in.
Then again, I also don't see what the big deal about adding a debug
info parser is.
It's not like they are all that large.
[EMAIL PROTECTED]:/home/dannyb/util/debuginfo]> wc -l bytereader.* bytereader-inl.h dwarf2enums.h dwarf2reader*
40 bytereader.cc
110 bytereader.h
118 bytereader-inl.h
465 dwarf2enums.h
797 dwarf2reader.cc
373 dwarf2reader.h
1903 total
(This includes both a callback style reader that simply hands you
things you tell it to, as well as something that can read back into a
format much like we use during debug info output)
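
To give a feel for what a callback style reader looks like, here is a small
sketch. It is not the code counted above. The record layout (a ULEB128 tag,
a ULEB128 name length, then the name bytes) is a toy format invented for the
example, not real DWARF; only the ULEB128 decoding itself follows the
encoding DWARF uses. `read_records` and the handler signature are
assumptions.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:              # high bit clear marks the last byte
            return result, pos
        shift += 7


def read_records(data, handler):
    """Walk the toy record stream, handing each (tag, name) to the handler."""
    pos = 0
    while pos < len(data):
        tag, pos = read_uleb128(data, pos)
        length, pos = read_uleb128(data, pos)
        name = data[pos:pos + length].decode("ascii")
        pos += length
        handler(tag, name)
```

A caller that only cares about some records can filter on the tag inside its
handler and ignore the rest, which is what keeps this style of reader small.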
You may of course be right, and this is what we will end up doing, but
the implications for whopr are not good. The parser is going to have
to work in lockstep with the type merger, and all of the debug sections
for all of the .o files are going to have to be parsed in lto1. My
prediction is that this is going to be a bottleneck.
kenny