> On Feb 11, 2016, at 6:56 PM, Zachary Turner <[email protected]> wrote:
>
>
>
> On Thu, Feb 11, 2016 at 5:35 PM Greg Clayton <[email protected]> wrote:
>
> > On Feb 11, 2016, at 3:41 PM, Zachary Turner via lldb-dev
> > <[email protected]> wrote:
> >
> > Hi,
> >
> > I want to make a new symbol provider to teach LLDB to understand Microsoft
> > PDB files. I've been looking over the various symbol APIs, and I have a
> > few questions.
> >
> > 1. Under what circumstances do I need a custom SymbolVendor? The way pdb
> > works is that generally there is 1 file that contains all the debug info
> > needed for a single binary (so or executable). Given a list of paths, we
> > can then determine if there is a matching PDB in one of those paths. Is it
> > better to do this in the CalculateAbilities() function of the symbol file
> > plugin (by just returning 0 if we don't find a match) or do we need to do
> > something more complicated?
>
> I would suggest making a SymbolVendorPDB that only enables itself if you are
> able to find the PDB files for your COFF file. So look at your COFF file; I
> presume somewhere in there is a pointer to one or more PDB files?
> CalculateAbilities is the correct place to check whether a COFF file has
> pointers to PDB files and to make sure those files exist before you say that
> you can provide any abilities.
>
> Currently we use the operating system to query the PDBs. This could change
> in the future, but for now that's how we're doing it. The operating system
> does all the work of finding, matching, and loading the PDB for us, and it
> does it all in one call. So if we put this in the symbol vendor, there's no
> way to say "is there a PDB" without also saying "actually load all the data
> from the PDB" at the same time. So I'm not sure if there's a solution to
> this in there, because obviously I don't want to load it twice.

Interesting. If you are on Windows and you have a COFF file, you might just
want to make a SymbolVendorCOFF. Does PDB info always and only get created for
COFF files?
>
> One question I had about SymbolVendor, is that I looked at
> SymbolVendorELF.cpp and it seems to boil down to this notion of "symbol file
> representations". All the logic in SymbolVendorELF exists just to add some
> object file representations. What is this supposed to represent? I've got
> an exe or something, what other "representation" is there other than the exe
> itself?

In SymbolVendorMacOSX, we have the executable and then the DWARF debug info in
a standalone dSYM bundle. So on MacOSX you have a.out as the main ObjectFile
for a Module, but the symbols are in a different ObjectFile (a.out.dSYM). For
ELF I believe there is information in the ELF file that _might_ point to a
separate debug info file, but it also might just contain the DWARF in the
executable. So for ELF you have one file (an executable ELF that contains
DWARF) or two files (an executable ELF with no DWARF plus a debug info ELF
with DWARF).

A symbol vendor's only job is to take an executable and then use it, plus any
other files (locating these extra debug files is part of its job), to make a
single coherent view of the symbols for a lldb_private::Module. So
SymbolVendor::FindTypes(...) might look into the executable file and one or
more other files to get the information. The information must be retrieved from
one or more SymbolFile instances. A SymbolFile uses one ObjectFile to do its
job, so there is a one-to-one mapping between SymbolFile and ObjectFile
instances. The SymbolFile can use the same ObjectFile as the main executable if
the data is in there. The SymbolVendor is the one that figures this out.

Some mappings might help show this. The addresses before the object names are
the addresses of the objects in the LLDB address space. For a simple a.out ELF
file that contains DWARF we would have:

0x1000: Module ("/tmp/a.out")
        m_obj_file = 0x2000
0x2000: ObjectFile ("/tmp/a.out")
0x3000: SymbolVendorELF
        m_sym_file = 0x4000
0x4000: SymbolFile
        m_obj_file = 0x2000

For an a.out ELF file that has an external debug file "/var/debug/a.out":

0x1000: Module ("/tmp/a.out")
        m_obj_file = 0x2000
0x2000: ObjectFile ("/tmp/a.out")
0x2200: ObjectFile ("/var/debug/a.out")
0x3000: SymbolVendorELF
        m_sym_file = 0x4000
0x4000: SymbolFile
        m_obj_file = 0x2200

Same goes for MacOSX where we have "a.out" and "a.out.dSYM" except the
SymbolVendorMacOSX is used since it knows how to locate the dSYM files.
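
To make the two layouts above concrete, here is a minimal C++ sketch of the
ownership relationships. The structs are simplified stand-ins and not the real
LLDB classes, and MakeModule is a hypothetical helper invented for this
illustration; the only thing it is meant to show is which ObjectFile the
SymbolFile ends up pointing at.

```cpp
#include <cassert>
#include <memory>
#include <string>

namespace sketch {

// Simplified stand-ins for the LLDB classes (assumption: the real classes
// carry far more state and have different constructors).
struct ObjectFile {
    std::string path;
};

struct SymbolFile {
    std::shared_ptr<ObjectFile> obj_file;  // exactly one ObjectFile per SymbolFile
};

struct SymbolVendor {
    std::shared_ptr<SymbolFile> sym_file;
};

struct Module {
    std::shared_ptr<ObjectFile> obj_file;  // always the main executable
    SymbolVendor vendor;
};

// Hypothetical helper: if the vendor located an external debug file, the
// SymbolFile reads from it; otherwise it shares the executable's ObjectFile.
inline Module MakeModule(std::shared_ptr<ObjectFile> exe,
                         std::shared_ptr<ObjectFile> external_debug = nullptr) {
    Module m;
    m.obj_file = exe;
    std::shared_ptr<ObjectFile> source = external_debug ? external_debug : exe;
    m.vendor.sym_file = std::make_shared<SymbolFile>(SymbolFile{source});
    return m;
}

}  // namespace sketch
```

In the first layout the SymbolFile and the Module point at the same ObjectFile;
in the second the Module still points at "/tmp/a.out" while the SymbolFile
points at "/var/debug/a.out".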

If there are multiple ObjectFile objects that represent the debug info, they
must share the same section list. So ObjectFiles and SymbolFiles work to make a
single section list within lldb_private::Module that is used for all objects
used to represent the symbol and debug info. That way the ObjectFiles at 0x2000
and 0x2200 above both use the same section for ".text", ".data", etc. If one
ObjectFile has sections (like .debug_info for DWARF) that the other ObjectFile
doesn't, then each ObjectFile adds sections as needed. Also, if the executable
object file has no symbols, or a reduced number of symbols because it has
been stripped, the two ObjectFiles can combine their symbol tables to make a
better symbol table. On MacOSX, if we strip a.out and it has no symbols, we can
get the symbols from the dSYM file (if we find one), since dSYM files always
have fully unstripped symbol tables.
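
As a rough illustration of the shared section list and the combined symbol
table, here is a small C++ sketch. SectionList, Section, SymbolTable and
MergeSymbols are all names made up for this example and are much simpler than
what LLDB actually does; the sketch assumes sections are identified by name and
symbols by name and address only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

namespace sketch {

struct Section {
    std::string name;
    std::uint64_t size;
};

// One list per Module (assumption: keyed by section name). Every ObjectFile
// contributes to it, so ".text" exists once no matter how many files have it.
class SectionList {
public:
    // Adds the section only if no section with that name exists yet.
    // Returns true if the section was added.
    bool AddIfMissing(const Section& s) {
        return by_name_.emplace(s.name, s).second;
    }
    bool Contains(const std::string& name) const {
        return by_name_.count(name) != 0;
    }
    std::size_t Size() const { return by_name_.size(); }

private:
    std::map<std::string, Section> by_name_;
};

// Symbol name -> address (a deliberately tiny model of a symbol table).
using SymbolTable = std::map<std::string, std::uint64_t>;

// Keeps every symbol from the executable and adds those that only the debug
// file knows about, which is what helps when the executable was stripped.
inline SymbolTable MergeSymbols(const SymbolTable& exe, const SymbolTable& debug) {
    SymbolTable merged = exe;
    merged.insert(debug.begin(), debug.end());  // existing names are left as-is
    return merged;
}

}  // namespace sketch
```

The executable's ObjectFile would add ".text" and ".data" first; the debug
ObjectFile then finds those already present and only adds what is new, such as
".debug_info".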

So think of SymbolVendor as the class that knows how to locate the symbol file
for a given executable (possibly even fetch the symbols from your build
system!!!) and put together one or more files to provide a coherent view of the
debug info (grab debug info from the executable itself or a standalone file)
and object file (combine symbol tables from one or more object files, combine
all sections from all ObjectFiles used for a Module/SymbolVendor). The user
never needs to worry about the underlying details: clients just ask the
module for stuff and we provide it to them.
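
The "locate" half of that job could look something like the C++ sketch below.
It is not how any existing SymbolVendor is implemented: LocateDebugFile and
BaseName are hypothetical helpers, the rule of trying "<dir>/<basename>" is an
assumption made for the example, and the existence check is passed in as a
callback so the search logic stays independent of the file system.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

namespace sketch {

// Returns the final path component, e.g. "/tmp/a.out" -> "a.out".
inline std::string BaseName(const std::string& path) {
    const std::string::size_type slash = path.find_last_of('/');
    return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Hypothetical lookup: tries "<dir>/<basename>" for each search directory in
// order and returns the first candidate that exists, or an empty string if
// none does. A real vendor would also verify that the file matches the
// executable (for example by UUID) before using it.
inline std::string LocateDebugFile(
    const std::string& exe_path,
    const std::vector<std::string>& search_dirs,
    const std::function<bool(const std::string&)>& exists) {
    const std::string name = BaseName(exe_path);
    for (const std::string& dir : search_dirs) {
        const std::string candidate = dir + "/" + name;
        if (exists(candidate)) {
            return candidate;
        }
    }
    return std::string();
}

}  // namespace sketch
```

An empty result would mean the vendor falls back to whatever debug info is in
the executable itself.
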
Greg