On 29/11/2018 23:34, Greg Clayton wrote:
On Nov 29, 2018, at 10:55 AM, Pavel Labath via Phabricator
<revi...@reviews.llvm.org> wrote:
labath added a comment.
I've recently started looking at adding a new symbol file format (breakpad
symbols). While researching the best way to achieve that, I started comparing
the operation of PDB and DWARF symbol files. I noticed a very important
difference there, and I think that is the cause of our problems here. In the
DWARF implementation, a symbol file is an overlay on top of an object file - it
takes the data contained by the object file and presents it in a more
structured way.
However, that is not the case with PDB (both implementations). These take the
debug information from a completely different file, which is not backed by an
ObjectFile instance, and then present that. Since the SymbolFile interface
requires them to be backed by an object file, they both pretend they are backed
by the original EXE file, but in reality the data comes from elsewhere.
If we had an ObjectFilePDB (which is also not ideal, though in a way it is a
better fit to the current lldb organization), then this could expose the PDB
symtab via the existing ObjectFile interface and we could reuse the existing
mechanism for merging symtabs from two object files.
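
To make the symtab-merging idea concrete, here is a minimal sketch. It assumes heavily simplified stand-in types (ObjectFileLike, FakeExeFile, FakePdbFile, MergeSymtabs are all hypothetical names), not the real lldb ObjectFile or Symtab classes, and the "primary file wins on address clashes" rule is just one plausible policy, not a description of what lldb does today.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the real lldb classes.
struct Symbol {
  std::string name;
  uint64_t address;
};

using Symtab = std::vector<Symbol>;

// Anything that can vend a symbol table through one common interface.
class ObjectFileLike {
public:
  virtual ~ObjectFileLike() = default;
  virtual Symtab GetSymtab() const = 0;
};

// Stand-in for the executable, which knows only a few symbols.
class FakeExeFile : public ObjectFileLike {
public:
  Symtab GetSymtab() const override { return {Symbol{"main", 0x1000}}; }
};

// Stand-in for the imagined ObjectFilePDB, which knows more symbols.
class FakePdbFile : public ObjectFileLike {
public:
  Symtab GetSymtab() const override {
    return {Symbol{"main", 0x1000}, Symbol{"helper", 0x1040}};
  }
};

// Merge two symtabs, keeping one entry per address.
// Assumed policy: entries from the primary file win on a clash.
Symtab MergeSymtabs(const ObjectFileLike &primary,
                    const ObjectFileLike &debug) {
  std::map<uint64_t, Symbol> by_address;
  for (const Symbol &s : primary.GetSymtab())
    by_address.emplace(s.address, s);
  for (const Symbol &s : debug.GetSymtab())
    by_address.emplace(s.address, s); // no-op if the address is already taken
  Symtab merged;
  for (const auto &entry : by_address)
    merged.push_back(entry.second);
  return merged; // sorted by address because std::map is ordered
}

// Usage: "main" appears once, "helper" comes from the PDB stand-in.
inline std::size_t MergedCountExample() {
  Symtab merged = MergeSymtabs(FakeExeFile(), FakePdbFile());
  assert(merged.size() == 2);
  return merged.size();
}
```

The point of the sketch is only that the merge code never needs to know which side is a PDB: both sides answer the same GetSymtab() question.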
I am asking this because now I am facing a choice in how to implement breakpad
symbols. I could go the PDB way, and read the symbols without an intervening
object file, or I could create an ObjectFileBreakpad and then (possibly) a
SymbolFileBreakpad sitting on top of that.
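
For illustration, the second option (an object file layer with a symbol file layered on top) might look roughly like the sketch below. The class names BreakpadObjectFileSketch and BreakpadSymbolFileSketch are hypothetical, this is not the code in D55214, and it uses none of the real lldb interfaces. It assumes the Breakpad text format's PUBLIC records ("PUBLIC [m] address parameter_size name", address in hex) and ignores every other record type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <iterator>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; NOT the real lldb classes.
struct Symbol {
  std::string name;
  uint64_t address;
};

static bool ByAddress(const Symbol &a, const Symbol &b) {
  return a.address < b.address;
}

// "Object file" layer: owns the raw Breakpad text and vends a flat symtab.
class BreakpadObjectFileSketch {
public:
  explicit BreakpadObjectFileSketch(std::string text)
      : m_text(std::move(text)) {}

  // Parses only PUBLIC records; malformed lines are skipped.
  std::vector<Symbol> GetSymtab() const {
    std::vector<Symbol> symtab;
    std::istringstream lines(m_text);
    std::string line;
    while (std::getline(lines, line)) {
      std::istringstream fields(line);
      std::string kind, addr, param_size, name;
      if (!(fields >> kind) || kind != "PUBLIC")
        continue;
      if (!(fields >> addr))
        continue;
      if (addr == "m" && !(fields >> addr)) // optional "m" flag
        continue;
      char *end = nullptr;
      uint64_t address = std::strtoull(addr.c_str(), &end, 16);
      if (end == addr.c_str() || *end != '\0')
        continue; // not a valid hex address
      if (!(fields >> param_size))
        continue;
      // The name is the rest of the line and may contain spaces.
      if (!std::getline(fields >> std::ws, name))
        continue;
      symtab.push_back(Symbol{name, address});
    }
    return symtab;
  }

private:
  std::string m_text;
};

// "Symbol file" layer: an overlay that presents the object file's data
// in a more structured way (here: address-to-name lookup).
class BreakpadSymbolFileSketch {
public:
  explicit BreakpadSymbolFileSketch(const BreakpadObjectFileSketch &obj)
      : m_symbols(obj.GetSymtab()) {
    std::sort(m_symbols.begin(), m_symbols.end(), ByAddress);
  }

  // Returns the nearest symbol at or below addr, or "" if there is none.
  std::string Symbolicate(uint64_t addr) const {
    assert(std::is_sorted(m_symbols.begin(), m_symbols.end(), ByAddress));
    auto it = std::upper_bound(
        m_symbols.begin(), m_symbols.end(), addr,
        [](uint64_t value, const Symbol &s) { return value < s.address; });
    if (it == m_symbols.begin())
      return "";
    return std::prev(it)->name;
  }

private:
  std::vector<Symbol> m_symbols;
};
```

In this shape the symbol file never touches the file contents directly, so whatever locates the Breakpad file only has to produce the object file layer. PUBLIC records carry no size, so the lookup above simply picks the closest preceding symbol.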
The drawbacks of the PDB approach I see are:
- I lose the ability to do matching of the (real) object file via symbol
vendors. The PDB symbol files now basically implement their own little symbol
vendors inside them, which is mostly fine if you just need to find the PDB next
to the exe file. However, things could get a bit messy if you wanted to
implement some more complex searching on multiple paths, or downloading them
from the internet.
- I'll hit issues when attempting to unwind (which is the real meat of the
breakpad symbols), because unwind info is currently provided via the ObjectFile
interface (ObjectFile::GetUnwindTable).
The drawbacks of the ObjectFile approach are:
- more code - it needs a new ObjectFile and a new SymbolFile class (possibly
also a SymbolVendor)
- it will probably look a bit weird because Breakpad files (and PDBs) aren't
really object files
I'd like to hear your thoughts on this, if you have any.
Since the Breakpad files contain both symbol and debug info, I would go the
ObjectFileBreakpad and SymbolFileBreakpad route. We might want to just load a
Breakpad file and be able to symbolicate with it. Each symbol vendor will need
to be able to use a Breakpad file (since Breakpad produces files for all
architectures), so we can build some shared functionality into the SymbolVendor
base class that the vendor for any file format (ELF, Mach-O, COFF) can use.
Just to close this, I've decided to try going the ObjectFile route (in
D55214). It shouldn't take me long to reach the point where I need to
vend symtab and unwind information to the rest of lldb, at which point
we can decide whether that was the right move.
cheers,
pl
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits