[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)
https://github.com/YungRaj created https://github.com/llvm/llvm-project/pull/101062

This is a dummy pull request to demonstrate the changes I made to get symbolication working using JSON Object/Symbol files.

https://discourse.llvm.org/t/lldb-support-renaming-symbols-by-address-using-a-symbol-file-provided-by-json-or-other-formatted-data-to-assist-reverse-engineering-e-g-protobuf-plist-xml-etc/80355/8

>From 47af2b7229eaa6712fe5812e3d4dbea44bbb212b Mon Sep 17 00:00:00 2001
From: Ilhan Raja
Date: Mon, 29 Jul 2024 11:43:47 -0700
Subject: [PATCH] ObjectFileJSON and Section changes to support section.address field in JSON object symmbol files

---
 lldb/source/Core/Section.cpp | 2 +-
 lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lldb/source/Core/Section.cpp b/lldb/source/Core/Section.cpp
index 0763e88d4608f..f138d62fb356d 100644
--- a/lldb/source/Core/Section.cpp
+++ b/lldb/source/Core/Section.cpp
@@ -685,7 +685,7 @@ bool fromJSON(const llvm::json::Value &value,
               lldb_private::JSONSection &section, llvm::json::Path path) {
   llvm::json::ObjectMapper o(value, path);
   return o && o.map("name", section.name) && o.map("type", section.type) &&
-         o.map("size", section.address) && o.map("size", section.size);
+         o.map("address", section.address) && o.map("size", section.size);
 }
 
 bool fromJSON(const llvm::json::Value &value, lldb::SectionType &type,
diff --git a/lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp b/lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp
index ffbd87714242c..0ee827355f060 100644
--- a/lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp
+++ b/lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp
@@ -183,7 +183,7 @@ void ObjectFileJSON::CreateSections(SectionList &unified_section_list) {
   for (const auto &section : m_sections) {
     auto section_sp = std::make_shared<Section>(
         GetModule(), this, id++, ConstString(section.name),
-        section.type.value_or(eSectionTypeCode), 0, section.size.value_or(0), 0,
+        section.type.value_or(eSectionTypeCode), section.address.value_or(0), section.size.value_or(0), 0,
         section.size.value_or(0), /*log2align*/ 0, /*flags*/ 0);
     m_sections_up->AddSection(section_sp);
     unified_section_list.AddSection(section_sp);

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
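The effect of the first hunk can be seen in isolation with a small stand-in. The sketch below does not use `llvm::json`; `Fields`, `MockSection`, `mapOptional`, `parseBeforePatch`, and `parseAfterPatch` are hypothetical names invented for illustration, and the only thing taken from the patch is which key each member is read from.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a parsed JSON object: key -> integer value.
using Fields = std::map<std::string, std::uint64_t>;

// Simplified mirror of the two members the patch touches.
// This is an assumption for the sketch, not LLDB's JSONSection.
struct MockSection {
  std::optional<std::uint64_t> address;
  std::optional<std::uint64_t> size;
};

// Copies fields[key] into out when the key exists; a missing key leaves out empty.
void mapOptional(const Fields &fields, const std::string &key,
                 std::optional<std::uint64_t> &out) {
  auto it = fields.find(key);
  if (it != fields.end())
    out = it->second;
}

// Pre-patch mapping: "size" is read twice, so address takes the size value.
MockSection parseBeforePatch(const Fields &fields) {
  MockSection s;
  mapOptional(fields, "size", s.address);
  mapOptional(fields, "size", s.size);
  return s;
}

// Post-patch mapping: each member is read from its own key.
MockSection parseAfterPatch(const Fields &fields) {
  MockSection s;
  mapOptional(fields, "address", s.address);
  mapOptional(fields, "size", s.size);
  return s;
}
```

With an input whose `address` differs from its `size`, the pre-patch mapping reports the size as the address, while the post-patch mapping keeps the two apart.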
[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)
YungRaj wrote:

> The change itself looks good modulo formatting. Can you please update one of
> the existing tests to cover this use case?

This doesn't get symbolication fully working yet. We still need to prevent the object file from being read when inspecting symbolicated addresses, whether by disassembling instructions or by reading the contents of that memory. I can fix all the unit tests once that part is working. What are your recommendations for getting it working? Is this an issue with `Target::ReadMemory()` and the like?
[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)
YungRaj wrote:

> This patch fixes the deserialization of the "address" field which surely can
> be tested in isolation. The existing test (`TestObjectFileJSON.py`) has
> `address` set to zero, which is why this happens to work today. It should be
> possible to either update the test or add a new one that has a non-zero
> section address, which would fail prior to this patch, but pass with it.

I was hoping to fix everything in one pull request so that it at least becomes usable once this merges. I can handle the unit test changes, but I'm unsure how to fix the other memory-reading issue that shows up while debugging the target. I describe it briefly in my other comment. Basically, reading memory at a symbolicated address reads from the JSON object file instead of from live memory.

```
ilhanraja@ilhanrajas-Virtual-Machine build % bin/lldb
(lldb) target create /Users/ilhanraja/Downloads/symbols.json
Current executable set to '/Users/ilhanraja/Downloads/symbols.json' (arm64e).
(lldb) gdb-remote 10.11.1.2:4000
Kernel UUID: 894FC3F0-4DF9-3A70-A4EE-75646151C8A7
Load Address: 0xfff00700c000
Process 1 stopped
* thread #1, stop reason = signal SIGINT
    frame #0: 0xfff007d56080
error: 0x can't be resolved
(lldb) bt
* thread #1, stop reason = signal SIGINT
  * frame #0: 0xfff007d56080
    frame #1: 0xfff007ecafe8
    frame #2: 0xfff007dc2d50
    frame #3: 0xfff007dc2fb8
(lldb) image lookup -r -s ipc
74 symbols match the regular expression 'ipc' in /Users/ilhanraja/Downloads/symbols.json:
        Address: symbols.json[0xfff007d62124] (symbols.json.com.apple.kernel:__text + 106788)
        Summary: symbols.json`ipc_hash_delete
        Address: symbols.json[0xfff007d6264c] (symbols.json.com.apple.kernel:__text + 108108)
        Summary: symbols.json`ipc_importance_task_check_transition
        Address: symbols.json[0xfff007d62774] (symbols.json.com.apple.kernel:__text + 108404)
        Summary: symbols.json`ipc_importance_task_propagate_assertion_locked
        Address: symbols.json[0xfff007d66b80] (symbols.json.com.apple.kernel:__text + 125824)
        Summary: symbols.json`ipc_kmsg_alloc
        Address: symbols.json[0xfff007d66e30] (symbols.json.com.apple.kernel:__text + 126512)
        Summary: symbols.json`ipc_kmsg_alloc_uext_reply
        Address: symbols.json[0xfff007d671c4] (symbols.json.com.apple.kernel:__text + 127428)
        Summary: symbols.json`ipc_kmsg_enqueue_qos
        Address: symbols.json[0xfff007d675f8] (symbols.json.com.apple.kernel:__text + 128504)
        Summary: symbols.json`ipc_kmsg_clean_body
        Address: symbols.json[0xfff007d6af48] (symbols.json.com.apple.kernel:__text + 143176)
        Summary: symbols.json`_ipc_kmsg_option_check
        Address: symbols.json[0xfff007d6b0fc] (symbols.json.com.apple.kernel:__text + 143612)
        Summary: symbols.json`ipc_kmsg_validate_reply_port_locked
        Address: symbols.json[0xfff007d6bf78] (symbols.json.com.apple.kernel:__text + 147320)
        Summary: symbols.json`ipc_kmsg_link_reply_context_locked
        Address: symbols.json[0xfff007d6cf5c] (symbols.json.com.apple.kernel:__text + 151388)
        Summary: symbols.json`ipc_kmsg_get_thread_group
        Address: symbols.json[0xfff007d6e65c] (symbols.json.com.apple.kernel:__text + 157276)
        Summary: symbols.json`ipc_mqueue_destroy_locked
        Address: symbols.json[0xfff007d6e6fc] (symbols.json.com.apple.kernel:__text + 157436)
        Summary: symbols.json`ipc_mqueue_set_qlimit_locked
        Address: symbols.json[0xfff007d6e87c] (symbols.json.com.apple.kernel:__text + 157820)
        Summary: symbols.json`ipc_notify_port_deleted
        Address: symbols.json[0xfff007d6e8e0] (symbols.json.com.apple.kernel:__text + 157920)
        Summary: symbols.json`ipc_notify_no_senders_prepare
        Address: symbols.json[0xfff007d6ea50] (symbols.json.com.apple.kernel:__text + 158288)
        Summary: symbols.json`ipc_notify_send_once_and_unlock
        Address: symbols.json[0xfff007d6eb6c] (symbols.json.com.apple.kernel:__text + 158572)
        Summary: symbols.json`ipc_object_deallocate_queue_invoke
        Address: symbols.json[0xfff007d6ec7c] (symbols.json.com.apple.kernel:__text + 158844)
        Summary: symbols.json`ipc_object_release
        Address: symbols.json[0xfff007d6f300] (symbols.json.com.apple.kernel:__text + 160512)
        Summary: symbols.json`ipc_object_alloc
        Address: symbols.json[0xfff007d6f400] (symbols.json.com.apple.kernel:__text + 160768)
        Summary: symbols.json`ipc_object_alloc_name
        Address: symbols.json[0xfff007d70844] (symbols.json.com.apple.kernel:__text + 165956)
        Summary: symbols.json`ipc_object_lock_allow_invalid
        Address: symbols.json[0xfff007d70970] (symbols.json.com.apple.kernel:__text + 166256)
        Summary: symbols.json`ipc_port_reference
        Address: symbols.
```
[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)
YungRaj wrote:

I got it working, although symbolicating backtraces is still broken. Setting breakpoints by address, reading memory, disassembling memory, etc. now work.

https://github.com/llvm/llvm-project/blob/40b4fd7a3e81d32b29364a1b15337bcf817659c0/lldb/source/Core/Section.cpp#L229

In `Section::GetBaseLoadAddress()`, the default return value is `LLDB_INVALID_ADDRESS`, but that should only be the result when the current segment's file address is also invalid, which is not the case for the sections described by `symbols.json`.
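The fallback described above might look roughly like the following. This is a sketch of the proposed behavior only, not LLDB's implementation: `MockSection`, `kInvalidAddress`, and `baseLoadAddressWithFallback` are hypothetical names, and the real `Section` class resolves load addresses through the target rather than a stored field.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Local stand-in for LLDB_INVALID_ADDRESS (all bits set).
constexpr std::uint64_t kInvalidAddress = UINT64_MAX;

// Hypothetical, simplified section used only for this sketch.
struct MockSection {
  // Address recorded in the symbol file, or kInvalidAddress if none.
  std::uint64_t file_address = kInvalidAddress;
  // Set once the debugger has mapped the section into the process.
  std::optional<std::uint64_t> load_address;
};

// Prefer the resolved load address. If there is none, fall back to the
// file address, which is itself kInvalidAddress when the file gave none.
std::uint64_t baseLoadAddressWithFallback(const MockSection &section) {
  if (section.load_address)
    return *section.load_address;
  return section.file_address;
}
```

Under this assumption, a section that only carries a file address (as with a JSON symbol file describing an already-loaded kernel) still yields a usable address, and the invalid value is returned only when neither address is known.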
[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)
YungRaj wrote:

> > I was hoping to fix everything in one Pull Request so that it at least
> > becomes usable once this merges.
>
> The LLVM project generally
> [prefers](https://llvm.org/docs/CodeReview.html#code-reviews-speed-and-reciprocity)
> smaller patches as they're easier to review. We'll definitely want to fix
> the end-to-end issue and have a test, but the deserialization issue can stand
> on its own and deserves its own PR.

Sounds good. I will split this into multiple pull requests.

I tried to get symbolicated backtraces working, but this is more challenging, because LLDB has no good way to infer the bounds of a function from its start to its end. It's not that addresses fail to symbolicate; rather, many addresses are attributed to an earlier function because the function that actually contains them has no symbol.

@JDevlieghere do you believe there is a way to fix LLDB's inference of function bounds based on function prologues (e.g. `BTI`, `PACIBSP`, `STP X29, X30, [SP, #-offset]!` on arm64) and epilogues (`RET`, `RETAB`, etc. on arm64)?
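To make the prologue/epilogue idea concrete, here is a rough sketch of such a heuristic over raw arm64 instruction words. It is an illustration under strong assumptions, not something LLDB provides: the function names are hypothetical, only the fixed encodings of `PACIBSP`, `BTI c`, `RET` (via `X30`), and `RETAB` are matched, and a candidate start is accepted only at offset zero or directly after a return, so padding between functions, tail calls, and functions without these markers would all be missed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed A64 encodings for the markers mentioned in the thread.
constexpr std::uint32_t kPacibsp = 0xD503237F; // PACIBSP
constexpr std::uint32_t kBtiC = 0xD503245F;    // BTI c
constexpr std::uint32_t kRet = 0xD65F03C0;     // RET (X30)
constexpr std::uint32_t kRetab = 0xD65F0FFF;   // RETAB

// STP X29, X30, [SP, #imm]! with the 7-bit immediate (bits 21..15) masked out.
bool isFramePush(std::uint32_t insn) {
  return (insn & 0xFFC07FFFu) == 0xA9807BFDu;
}

bool looksLikePrologue(std::uint32_t insn) {
  return insn == kPacibsp || insn == kBtiC || isFramePush(insn);
}

bool looksLikeReturn(std::uint32_t insn) {
  return insn == kRet || insn == kRetab;
}

// Returns byte offsets of instructions that plausibly begin a function:
// a prologue marker at offset 0 or immediately following a return.
std::vector<std::size_t>
guessFunctionStarts(const std::vector<std::uint32_t> &code) {
  std::vector<std::size_t> starts;
  for (std::size_t i = 0; i < code.size(); ++i) {
    bool after_return = (i == 0) || looksLikeReturn(code[i - 1]);
    if (after_return && looksLikePrologue(code[i]))
      starts.push_back(i * sizeof(std::uint32_t));
  }
  return starts;
}
```

A real implementation would more likely lean on a disassembler and control-flow analysis, which is what decompilers do; the sketch only shows why pattern matching alone is fragile.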
[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)
YungRaj wrote:

> > > I was hoping to fix everything in one Pull Request so that it at least
> > > becomes usable once this merges.
> >
> > The LLVM project generally
> > [prefers](https://llvm.org/docs/CodeReview.html#code-reviews-speed-and-reciprocity)
> > smaller patches as they're easier to review. We'll definitely want to fix
> > the end-to-end issue and have a test, but the deserialization issue can
> > stand on its own and deserves its own PR.
>
> Sounds good. Will divide the pull requests into many of them.
>
> So I tried to get symbolicating backtraces working, however, this is a bit
> more challenging, because LLDB doesn't have a good intuition of building an
> `AddressRange` e.g. the start to finish of a function. It's not that I
> couldn't get addresses to symbolicate, but many functions that get
> symbolicated are actually from functions before it that had the subsequent
> function unsymbolicated. Thus the symbols clash and the backtraces are low
> accuracy.
>
> @JDevlieghere do you believe there is a way to fix LLDB's intuition of the
> proper bounds of functions based on function prologues (e.g. `BTI`,
> `PACIBSP`, `STP X29, X30, [SP, #-offset]!` on arm64) and epilogues (`RET`,
> `RETAB`, etc on arm64)? Technically, decompilers are able to effectively
> build this intuition, but the metadata for that doesn't get exported into the
> linker map files. Is this non-trivial in general? Could lifting via IR be
> effective here (even though I doubt the LLDB source supports that)?
>
> If this is not achievable, what would it take to get type information
> supported by `ObjectFileJSON`? This isn't a strict requirement of mine, but
> it'd be nice to reduce the quantity of things that I would need to do by
> hand.

It looks like the best way to do this is to have the decompiler emit each function's end address into the symbol file. Linker map files do not carry this information, but it can be produced with an IDAPython script.
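The difference an explicit end address makes can be sketched with a small lookup table. This is an illustrative model, not LLDB's symbol table: `SymbolEntry` and `lookupSymbol` are hypothetical names, and the assumption is that a symbol without an end address is treated as extending to the next known symbol, which is what produces the misattributed frames described in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical symbol record for the sketch.
struct SymbolEntry {
  std::uint64_t start;
  std::optional<std::uint64_t> end; // exclusive; empty when the file gives none
  std::string name;
};

// Looks up the symbol covering pc. `symbols` must be sorted by start address.
// Without an end address the nearest preceding symbol always wins; with one,
// addresses past the end are reported as unknown instead.
std::optional<std::string> lookupSymbol(const std::vector<SymbolEntry> &symbols,
                                        std::uint64_t pc) {
  auto it = std::upper_bound(
      symbols.begin(), symbols.end(), pc,
      [](std::uint64_t value, const SymbolEntry &s) { return value < s.start; });
  if (it == symbols.begin())
    return std::nullopt; // pc precedes every known symbol
  const SymbolEntry &candidate = *(it - 1);
  if (candidate.end && pc >= *candidate.end)
    return std::nullopt; // pc lies in a gap after the candidate
  return candidate.name;
}
```

For example, if `func_a` starts at `0x1000`, an unnamed function occupies `0x1100` onward, and `func_c` starts at `0x3000`, then `0x2000` resolves to `func_a` when no end is recorded, and to no symbol once `func_a` carries an end of `0x1100`.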