[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)

2024-07-29 Thread Ilhan Raja via lldb-commits

https://github.com/YungRaj created 
https://github.com/llvm/llvm-project/pull/101062

This is a dummy pull request to demonstrate the changes I made to get 
symbolication working using JSON object/symbol files.

https://discourse.llvm.org/t/lldb-support-renaming-symbols-by-address-using-a-symbol-file-provided-by-json-or-other-formatted-data-to-assist-reverse-engineering-e-g-protobuf-plist-xml-etc/80355/8


>From 47af2b7229eaa6712fe5812e3d4dbea44bbb212b Mon Sep 17 00:00:00 2001
From: Ilhan Raja 
Date: Mon, 29 Jul 2024 11:43:47 -0700
Subject: [PATCH] ObjectFileJSON and Section changes to support section.address
 field in JSON object symbol files

---
 lldb/source/Core/Section.cpp   | 2 +-
 lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lldb/source/Core/Section.cpp b/lldb/source/Core/Section.cpp
index 0763e88d4608f..f138d62fb356d 100644
--- a/lldb/source/Core/Section.cpp
+++ b/lldb/source/Core/Section.cpp
@@ -685,7 +685,7 @@ bool fromJSON(const llvm::json::Value &value,
   lldb_private::JSONSection &section, llvm::json::Path path) {
   llvm::json::ObjectMapper o(value, path);
   return o && o.map("name", section.name) && o.map("type", section.type) &&
- o.map("size", section.address) && o.map("size", section.size);
+ o.map("address", section.address) && o.map("size", section.size);
 }
 
 bool fromJSON(const llvm::json::Value &value, lldb::SectionType &type,
diff --git a/lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp b/lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp
index ffbd87714242c..0ee827355f060 100644
--- a/lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp
+++ b/lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp
@@ -183,7 +183,7 @@ void ObjectFileJSON::CreateSections(SectionList &unified_section_list) {
   for (const auto &section : m_sections) {
     auto section_sp = std::make_shared<Section>(
         GetModule(), this, id++, ConstString(section.name),
-        section.type.value_or(eSectionTypeCode), 0, section.size.value_or(0), 0,
+        section.type.value_or(eSectionTypeCode), section.address.value_or(0), section.size.value_or(0), 0,
         section.size.value_or(0), /*log2align*/ 0, /*flags*/ 0);
     m_sections_up->AddSection(section_sp);
     unified_section_list.AddSection(section_sp);

___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)

2024-07-29 Thread Ilhan Raja via lldb-commits

https://github.com/YungRaj edited 
https://github.com/llvm/llvm-project/pull/101062


[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)

2024-07-29 Thread Ilhan Raja via lldb-commits

YungRaj wrote:

> The change itself looks good modulo formatting. Can you please update one of 
> the existing tests to cover this use case?

This doesn't get symbolication fully working yet. We still need to keep LLDB 
from reading the object file when it inspects symbolicated addresses, whether 
by disassembling instructions or by reading the contents of that memory.

I can fix all the unit tests once that part works end to end. What do you 
recommend for getting it working? Is this an issue with 
`Target::ReadMemory()` or something similar?

https://github.com/llvm/llvm-project/pull/101062


[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)

2024-07-29 Thread Ilhan Raja via lldb-commits

YungRaj wrote:

> This patch fixes the deserialization of the "address" field which surely can 
> be tested in isolation. The existing test (`TestObjectFileJSON.py`) has 
> `address` set to zero, which is why this happens to work today. It should be 
> possible to either update the test or add a new one that has a non-zero 
> section address, which would fail prior to this patch, but pass with it.

I was hoping to fix everything in one pull request so that the feature at 
least becomes usable once this merges. I can handle the unit test changes, but 
I'm unsure how to fix the other memory-reading issue while debugging the 
target. I describe it briefly in my other comment.
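
To make the reviewer's point about the test concrete, here is a minimal Python sketch (not LLDB code) of why a section at address zero cannot expose the bug. The helper `load_section` and its `honor_address` flag are hypothetical names for illustration; the section keys `name`, `type`, `address`, and `size` are the ones the patch maps.

```python
import json

def load_section(obj, honor_address):
    """Return (file_address, size) for one entry of the "sections" array."""
    size = obj.get("size", 0)
    # With honor_address=False this models the pre-patch behavior, where
    # the section is always created at file address 0.
    address = obj.get("address", 0) if honor_address else 0
    return address, size

zero_based = json.loads(
    '{"name": "__TEXT", "type": "code", "address": 0, "size": 546}')
relocated = json.loads(
    '{"name": "__TEXT", "type": "code", "address": 4096, "size": 546}')

# A zero address gives the same result before and after the patch.
same_for_zero = load_section(zero_based, False) == load_section(zero_based, True)
# A non-zero address only survives once the field is honored.
differs_for_nonzero = load_section(relocated, False) != load_section(relocated, True)
```

A test input modeled on `relocated` would fail before the patch and pass with it, which is the property the reviewer asks for.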

Basically, reading memory at a symbolicated address reads from the JSON 
object file instead of from live memory.

```
ilhanraja@ilhanrajas-Virtual-Machine build % bin/lldb
(lldb) target create /Users/ilhanraja/Downloads/symbols.json 
Current executable set to '/Users/ilhanraja/Downloads/symbols.json' (arm64e).
(lldb) gdb-remote 10.11.1.2:4000
Kernel UUID: 894FC3F0-4DF9-3A70-A4EE-75646151C8A7
Load Address: 0xfff00700c000
Process 1 stopped
* thread #1, stop reason = signal SIGINT
    frame #0: 0xfff007d56080
error: 0x can't be resolved
(lldb) bt
* thread #1, stop reason = signal SIGINT
  * frame #0: 0xfff007d56080
    frame #1: 0xfff007ecafe8
    frame #2: 0xfff007dc2d50
    frame #3: 0xfff007dc2fb8
(lldb) image lookup -r -s ipc
74 symbols match the regular expression 'ipc' in /Users/ilhanraja/Downloads/symbols.json:
        Address: symbols.json[0xfff007d62124] (symbols.json.com.apple.kernel:__text + 106788)
        Summary: symbols.json`ipc_hash_delete
        Address: symbols.json[0xfff007d6264c] (symbols.json.com.apple.kernel:__text + 108108)
        Summary: symbols.json`ipc_importance_task_check_transition
        Address: symbols.json[0xfff007d62774] (symbols.json.com.apple.kernel:__text + 108404)
        Summary: symbols.json`ipc_importance_task_propagate_assertion_locked
        Address: symbols.json[0xfff007d66b80] (symbols.json.com.apple.kernel:__text + 125824)
        Summary: symbols.json`ipc_kmsg_alloc
        Address: symbols.json[0xfff007d66e30] (symbols.json.com.apple.kernel:__text + 126512)
        Summary: symbols.json`ipc_kmsg_alloc_uext_reply
        Address: symbols.json[0xfff007d671c4] (symbols.json.com.apple.kernel:__text + 127428)
        Summary: symbols.json`ipc_kmsg_enqueue_qos
        Address: symbols.json[0xfff007d675f8] (symbols.json.com.apple.kernel:__text + 128504)
        Summary: symbols.json`ipc_kmsg_clean_body
        Address: symbols.json[0xfff007d6af48] (symbols.json.com.apple.kernel:__text + 143176)
        Summary: symbols.json`_ipc_kmsg_option_check
        Address: symbols.json[0xfff007d6b0fc] (symbols.json.com.apple.kernel:__text + 143612)
        Summary: symbols.json`ipc_kmsg_validate_reply_port_locked
        Address: symbols.json[0xfff007d6bf78] (symbols.json.com.apple.kernel:__text + 147320)
        Summary: symbols.json`ipc_kmsg_link_reply_context_locked
        Address: symbols.json[0xfff007d6cf5c] (symbols.json.com.apple.kernel:__text + 151388)
        Summary: symbols.json`ipc_kmsg_get_thread_group
        Address: symbols.json[0xfff007d6e65c] (symbols.json.com.apple.kernel:__text + 157276)
        Summary: symbols.json`ipc_mqueue_destroy_locked
        Address: symbols.json[0xfff007d6e6fc] (symbols.json.com.apple.kernel:__text + 157436)
        Summary: symbols.json`ipc_mqueue_set_qlimit_locked
        Address: symbols.json[0xfff007d6e87c] (symbols.json.com.apple.kernel:__text + 157820)
        Summary: symbols.json`ipc_notify_port_deleted
        Address: symbols.json[0xfff007d6e8e0] (symbols.json.com.apple.kernel:__text + 157920)
        Summary: symbols.json`ipc_notify_no_senders_prepare
        Address: symbols.json[0xfff007d6ea50] (symbols.json.com.apple.kernel:__text + 158288)
        Summary: symbols.json`ipc_notify_send_once_and_unlock
        Address: symbols.json[0xfff007d6eb6c] (symbols.json.com.apple.kernel:__text + 158572)
        Summary: symbols.json`ipc_object_deallocate_queue_invoke
        Address: symbols.json[0xfff007d6ec7c] (symbols.json.com.apple.kernel:__text + 158844)
        Summary: symbols.json`ipc_object_release
        Address: symbols.json[0xfff007d6f300] (symbols.json.com.apple.kernel:__text + 160512)
        Summary: symbols.json`ipc_object_alloc
        Address: symbols.json[0xfff007d6f400] (symbols.json.com.apple.kernel:__text + 160768)
        Summary: symbols.json`ipc_object_alloc_name
        Address: symbols.json[0xfff007d70844] (symbols.json.com.apple.kernel:__text + 165956)
        Summary: symbols.json`ipc_object_lock_allow_invalid
        Address: symbols.json[0xfff007d70970] (symbols.json.com.apple.kernel:__text + 166256)
        Summary: symbols.json`ipc_port_reference
        Address: symbols.
```

[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)

2024-07-29 Thread Ilhan Raja via lldb-commits

YungRaj wrote:

I got it working.

Symbolicating backtraces is still broken, but setting breakpoints by address, 
reading memory, disassembling memory, and so on now work.

https://github.com/llvm/llvm-project/blob/40b4fd7a3e81d32b29364a1b15337bcf817659c0/lldb/source/Core/Section.cpp#L229


In `Section::GetBaseLoadAddress()`, the default return value is 
`LLDB_INVALID_ADDRESS`, but that should be the default only if the current 
segment's file address is invalid, which is not the case for the values in 
`symbols.json`.
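
The fallback described above can be sketched in Python so that it runs on its own; this is a model of the intended behavior, not the LLDB source. The function name and the use of `None` for "section not loaded" are assumptions for illustration. `LLDB_INVALID_ADDRESS` mirrors LLDB's all-ones 64-bit sentinel.

```python
LLDB_INVALID_ADDRESS = 0xFFFFFFFFFFFFFFFF  # LLDB's sentinel (UINT64_MAX)

def base_load_address(loaded_at, file_address):
    """Pick a base load address for a section.

    loaded_at:    address the section was loaded at, or None if unknown
                  (hypothetical convention for this sketch).
    file_address: the address recorded in the symbol file.
    """
    if loaded_at is not None:
        return loaded_at
    # Proposed behavior: fall back to a valid file address instead of
    # reporting the section as having no load address at all.
    if file_address != LLDB_INVALID_ADDRESS:
        return file_address
    return LLDB_INVALID_ADDRESS
```

With this fallback, a section whose only known address comes from `symbols.json` still resolves to that address, and the invalid sentinel is returned only when nothing usable is known.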

https://github.com/llvm/llvm-project/pull/101062


[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)

2024-07-31 Thread Ilhan Raja via lldb-commits

YungRaj wrote:

> > I was hoping to fix everything in one Pull Request so that it at least 
> > becomes usable once this merges.
> 
> The LLVM project generally 
> [prefers](https://llvm.org/docs/CodeReview.html#code-reviews-speed-and-reciprocity)
>  smaller patches as they're easier to review. We'll definitely want to fix 
> the end-to-end issue and have a test, but the deserialization issue can stand 
> on its own and deserves its own PR.

Sounds good. I will split this into multiple pull requests.

I tried to get backtrace symbolication working, but it is more challenging 
because LLDB has no good way to infer the proper bounds of a function from 
start to finish. Addresses do symbolicate, but many of them are attributed to 
a preceding function because the function that actually contains them has no 
symbol.
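
A small Python sketch of the failure mode, using a made-up symbol table (the names and addresses are hypothetical). A lookup that only knows start addresses attributes any address to the nearest preceding symbol, while a lookup that also knows each function's size can report that an address belongs to no known symbol.

```python
import bisect

# (start, size, name), sorted by start. The function between func_a and
# func_c has no symbol, as in the situation described above.
symbols = [(0x1000, 0x40, "func_a"), (0x2000, 0x80, "func_c")]
starts = [start for start, _, _ in symbols]

def lookup_nearest(addr):
    """Nearest preceding symbol, ignoring where the function ends."""
    i = bisect.bisect_right(starts, addr) - 1
    return symbols[i][2] if i >= 0 else None

def lookup_bounded(addr):
    """Same search, but reject addresses past the end of the function."""
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, size, name = symbols[i]
    return name if addr < start + size else None
```

Here `lookup_nearest(0x1800)` names `func_a` even though the address lies well past its end, whereas `lookup_bounded(0x1800)` returns `None`.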


@JDevlieghere do you believe there is a way to fix LLDB's intuition of the 
proper bounds of functions based on function prologues (e.g. `BTI`, `PACIBSP`, 
`STP X29, X30, [SP, #-offset]!` on arm64) and epilogues (`RET`, `RETAB`, etc. 
on arm64)?
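
As a rough illustration of the kind of heuristic being asked about, the Python sketch below scans little-endian arm64 instruction words, treating `PACIBSP` as a function start and the next `RET` or `RETAB` as its end. This is an assumption-laden toy, not something LLDB implements: real functions can have several returns, tail calls, or no `PACIBSP` at all, and the function name is hypothetical.

```python
import struct

# Fixed A64 encodings for the instructions used by this toy heuristic.
PACIBSP = 0xD503237F
RET = 0xD65F03C0      # ret (x30)
RETAB = 0xD65F0FFF

def guess_function_ranges(code, base):
    """Return [(start, end)] guesses; end is one past the return."""
    ranges = []
    start = None
    usable = len(code) - len(code) % 4  # ignore a trailing partial word
    for off in range(0, usable, 4):
        (word,) = struct.unpack_from("<I", code, off)
        if word == PACIBSP and start is None:
            start = base + off
        elif word in (RET, RETAB) and start is not None:
            ranges.append((start, base + off + 4))
            start = None
    return ranges
```

Ranges guessed this way would at best supplement real symbol sizes, since the first return instruction is not necessarily the end of the function.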



https://github.com/llvm/llvm-project/pull/101062


[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)

2024-08-01 Thread Ilhan Raja via lldb-commits

https://github.com/YungRaj edited 
https://github.com/llvm/llvm-project/pull/101062


[Lldb-commits] [lldb] ObjectFileJSON and Section changes to support section.address field i… (PR #101062)

2024-08-02 Thread Ilhan Raja via lldb-commits

YungRaj wrote:

> > > I was hoping to fix everything in one Pull Request so that it at least 
> > > becomes usable once this merges.
> > 
> > 
> > The LLVM project generally 
> > [prefers](https://llvm.org/docs/CodeReview.html#code-reviews-speed-and-reciprocity)
> >  smaller patches as they're easier to review. We'll definitely want to fix 
> > the end-to-end issue and have a test, but the deserialization issue can 
> > stand on its own and deserves its own PR.
> 
> Sounds good. Will divide the pull requests into many of them.
> 
> So I tried to get symbolicating backtraces working, however, this is a bit 
> more challenging, because LLDB doesn't have a good intuition for building an 
> `AddressRange`, i.e. the start to finish of a function. It's not that I 
> couldn't get addresses to symbolicate, but many functions that get 
> symbolicated are actually from functions before it that had the subsequent 
> function unsymbolicated. Thus the symbols clash and the backtraces are low 
> accuracy.
> 
> @JDevlieghere do you believe there is a way to fix LLDB's intuition of the 
> proper bounds of functions based on function prologues (e.g. `BTI`, 
> `PACIBSP`, `STP X29, X30, [SP, #-offset]!` on arm64) and epilogues (`RET`, 
> `RETAB`, etc on arm64) ? Technically, decompilers are able to effectively 
> build this intuition, but the metadata for that doesn't get exported into the 
> linker map files. Is this non-trivial in general? Could lifting via IR be 
> effective here (even though I doubt the LLDB source supports that)?
> 
> If this is not achievable, what would it take to get type information 
> supported by `ObjectFileJSON`? This isn't a strict requirement of mine, but 
> it'd be nice to reduce the number of things that I would need to do by 
> hand.

It looks like the best way to do this is to force the decompiler to produce 
the end address of each function in the symbol file. Linker map files do not 
carry this information, but it can be obtained with an IDAPython script.
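
A minimal sketch of the export step, in plain Python so it runs without IDA. It assumes the function list has already been collected as `(name, start, end)` tuples; how a disassembler script gathers them is out of scope here. The helper name is hypothetical, and the symbol keys `name`, `type`, `address`, and `size` are assumed to match what `ObjectFileJSON` reads.

```python
import json

def symbols_with_sizes(functions):
    """Turn (name, start, end) tuples into symbol entries with a size."""
    entries = []
    for name, start, end in functions:
        if end <= start:
            continue  # skip empty or malformed ranges
        entries.append({
            "name": name,
            "type": "code",
            "address": start,
            "size": end - start,  # end is exclusive
        })
    return entries

# Hypothetical input, as a disassembler script might produce it.
functions = [("func_a", 0x1000, 0x1040), ("broken", 0x2000, 0x2000)]
symbols_json = json.dumps({"symbols": symbols_with_sizes(functions)}, indent=2)
```

Emitting an explicit size gives each symbol a bounded range, so an address past the end of one function is not attributed to it.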

https://github.com/llvm/llvm-project/pull/101062