clayborg added a comment.
I like this idea of this, but I would like to see this be a bit more complete.
One idea is to remove the ObjectFileJSON::Symbol definition and just use
lldb_private::Symbol objects and allow these objects to construct encode and
decode from JSON. Then we have the ability to re-create any symbols we need in
full fidelity. But that not be the goal in your case, but I don't think it
would hurt to use the lldb_private::Symbol as the object we serialize and
deserialize from JSON as ObjectFileJSON::Symbol just can't reproduce the full
depth of the symbols we want.
From reading this it looks like your main use case is to supply additional
symbols to an existing stripped binary. Is that correct? Are you aware that
each dSYM file has a full copy of the unstripped symbol table already? It
removes the debug map entries, but each dSYM copies all non debug map symbols
inside of it already. So the same thing from this patch can already be
accomplished by stripping a dSYM file of all debug info sections and leaving
the symbol table alone, and then using this this minimal dSYM file just to get
the symbols.
Any idea on where the JSON file will live if it is just a companion file for
another mach-o or ELF executable? Will it always be next to the mach-o
executable? Will we enable a Spotlight importer to find it like we do for dSYM
files?
================
Comment at: lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp:133
+void ObjectFileJSON::ParseSymtab(Symtab &symtab) {
+ // Nothing to do for JSON files, all information is parsed as debug info.
+}
----------------
Don't we want to create a Symtab from the ObjectFileJSON::Symbol vector here?
Symbols are not considered debug info. Debug info for functions is supposed to
be more rich than just address and name.
================
Comment at: lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.h:92-95
+ struct Header {
+ std::string triple;
+ std::string uuid;
+ };
----------------
It would be great to be able to also define sections somewhere. I can think of
this being really handy for core file serialization where we generate a core
file and then store extra metadata that describes the object file in this JSON
format. It would be great to have sections for this, and we really need
sections to make ObjectFileJSON be able to represent something that actually
looks like a real object file. See Section.h for what we expect section to
have, but we can just store uint64_t values instead of lldb_private::Address
values for the start address of the range for the section.
================
Comment at: lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.h:97-100
+ struct Symbol {
+ uint64_t addr;
+ std::string name;
+ };
----------------
A few things to think about:
- some symbols have a byte size (like in ELF or mach-o symbols in the debug map
that have pairs)
- some symbol have absolute values that are not actually addresses, there is no
way to represent this currently here
- some symbols have symbol types. We could allow symbols to specify a
lldb::SymbolType by name?
- If we have symbol types other than just symbols with addresses, some might
refer to a sibling symbol by ID or index. It might be good to allow symbols to
have integer ids
One suggestion would be to define Symbol as:
```
struct Symbol {
uint64_t value;
std::optional<uint64_t> size;
std::optional<uint64_t> id;
std::optional<SymbolType> type;
std::string name;
bool value_is_address; // Set to true if "value" is an address, false if
"value" is just an integer with a different meaning
};
```
================
Comment at: lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.cpp:87
+
+void SymbolFileJSON::AddSymbols(Symtab &symtab) {
+ auto json_object_file = dyn_cast_or_null<ObjectFileJSON>(m_objfile_sp.get());
----------------
If we have no debug info i SymbolFileJSON, then there is really no need to add
the symbols here, we can just avoid this whole SymbolFileJSON class and have
ObjectFileJSON parse the JSON symbols and add them to the Symtab in:
```
void ObjectFileJSON::ParseSymtab(Symtab &symtab);
```
So if we are not going to add JSON debug info to the ObjectFileJSON schema,
then I would vote to remove this class and just do everything from
ObjectFileJSON.
================
Comment at: lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.cpp:112
+ symtab.AddSymbol(Symbol(
+ /*symID*/ 0, Mangled(symbol.name), eSymbolTypeCode,
+ /*is_global*/ true, /*is_debug*/ false,
----------------
JDevlieghere wrote:
> mib wrote:
> > Should we increment the `symID` here ?
> The IDs are arbitrary, and if we start at zero, we'll have conflicts with the
> ones already in the symbol table (e.g. the lldb_unnamed_symbols for stripped
> binaries). We could check the size of the symtab and continue counting from
> there. Or we can use 0 like we did here to indicate that these values are
> "special". I went with the latter approach mostly because that's what
> SymbolFileBreakpad does too.
regardless of where the code goes that converts ObjectFileJSON::Symbol to
lldb_private::Symbol, I would vote that we include enough information in
ObjectFileJSON::Symbol so that we don't have to make up symbols IDs, or set all
symbol IDs to zero. LLDB relies on having unique symbol IDs in each
lldb_private::Symbol instance.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D145180/new/
https://reviews.llvm.org/D145180
_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits