[Lldb-commits] [lldb] Support disassembling RISC-V proprietary instructions (PR #145793)

2025-06-26 Thread Sam Elliott via lldb-commits

lenary wrote:

To also respond to something earlier in the thread, where there is a little 
complexity:

> The missing part is knowing how to split up that encoding value isn't it. For 
> AArch64 you'd just print it because we only have 32-bit, Intel you would roll 
> dice to randomly decide what to do and RISC-V we have these 2/3 formats.

One "weird" bit of the approach is that we actually still rely on LLVM's 
MC-layer to understand the length of the instruction. RISC-V currently has only 
2 ratified lengths (16 and 32-bit), but describes an encoding scheme for longer 
instructions which both GNU objdump and LLVM's MC-layer understand when 
disassembling. RISC-V does not, at the moment, define a maximum instruction 
length, but our callback only implements the scheme for instructions up to 176 
bits long. On the assembler side, we can only assemble instructions of up to 64 
bits, so we ensure our teams keep to this lower limit.
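
As an illustration of the length scheme mentioned above, here is a minimal standalone sketch of how the length can be read off the lowest 16-bit parcel. It follows the encoding described in the RISC-V unprivileged spec as I understand it; the function name `riscvInstLengthInBits` is made up for this example and this is not the code in LLVM's MC layer. Only the 16- and 32-bit cases are ratified.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM's implementation): returns the instruction
// length in bits implied by the lowest 16-bit parcel, or 0 when the encoding
// is reserved for instructions of 192 bits or more.
// Only the 16- and 32-bit cases are ratified; the longer ones may change.
unsigned riscvInstLengthInBits(uint16_t FirstParcel) {
  if ((FirstParcel & 0x3) != 0x3)
    return 16; // bits [1:0] != 0b11
  if ((FirstParcel & 0x1c) != 0x1c)
    return 32; // bits [4:2] != 0b111
  if ((FirstParcel & 0x3f) == 0x1f)
    return 48; // bits [5:0] == 0b011111
  if ((FirstParcel & 0x7f) == 0x3f)
    return 64; // bits [6:0] == 0b0111111
  // bits [6:0] == 0b1111111: length is 80 + 16 * nnn, with nnn = bits [14:12]
  unsigned NNN = (FirstParcel >> 12) & 0x7;
  if (NNN != 0x7)
    return 80 + 16 * NNN; // 80 to 176 bits
  return 0;               // nnn == 0b111 is reserved
}
```

The largest value this can return is 176, which is where the upper bound mentioned above comes from.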

There are two relevant callbacks on MC's `MCDisassembler` interface:
- `MCDisassembler::getInstruction`, which is the main interface. Its 
`uint64_t &Size` out-parameter is meaningful whether or not an instruction was 
decoded. This is the only callback RISC-V implements.
- `MCDisassembler::suggestBytesToSkip`, which the Arm/AArch64 backends use for 
realigning the disassembly flow. We maybe should implement this given we know 
the instruction alignment in RISC-V is either 2 or 4, but we don't at the 
moment.
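
To make the second bullet concrete, this is a rough sketch of what such a realignment policy could look like for RISC-V. It is a standalone function rather than an actual `MCDisassembler` override, the name `riscvSuggestBytesToSkip` and the `HasCompressed` parameter are hypothetical, and it assumes the alignment is 2 bytes when compressed instructions are available and 4 bytes otherwise.

```cpp
#include <cstdint>

// Hypothetical stand-in for a RISC-V realignment policy; not code from LLVM.
// Assumption: instructions are 2-byte aligned when compressed instructions
// are available, and 4-byte aligned otherwise.
uint64_t riscvSuggestBytesToSkip(uint64_t Address, bool HasCompressed) {
  const uint64_t Align = HasCompressed ? 2 : 4;
  // Skip to the next aligned address; from an already aligned address this
  // skips one full alignment unit.
  return Align - (Address % Align);
}
```

The idea is that after a failed decode the disassembler resumes at the next address where an instruction could legally start, instead of advancing one byte at a time.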



https://github.com/llvm/llvm-project/pull/145793
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [lldb] Support disassembling RISC-V proprietary instructions (PR #145793)

2025-07-01 Thread Sam Elliott via lldb-commits

lenary wrote:

> Doesn't seem the ideal format given that we have a known size, today most 
> often 16/32/64, and I guess 48 for funsies.

Standard instructions are right now only 16/32, but custom instructions can be 
any multiple of 16. This was the change to llvm-objdump to group bytes like gnu 
objdump does: 
https://github.com/llvm/llvm-project/commit/b27f86b40b20942c0e809128214b43d6edde365a
 which was only a bit over a year ago.

https://github.com/llvm/llvm-project/pull/145793


[Lldb-commits] [lldb] [lldb] Support disassembling RISC-V proprietary instructions (PR #145793)

2025-07-01 Thread Sam Elliott via lldb-commits

lenary wrote:

> I didn't realize that the riscv instructions had a scheme for indicating 
> their lengths, very convenient.

It "doesn't". LLVM objdump implements the scheme described in the spec, but for 
instructions longer than 32 bits that scheme is not ratified, so it could 
change in the future (the note about this is there but easy to miss). That 
said, Qualcomm's custom instructions have adopted the unratified scheme for 
48-bit instructions.


> ... is this something that could be formatted from the SBInstructions in 
> `fdis`? ...
> I haven't looked at the contents of the SBInstruction to see if this is 
> straightforward or if there are things like the comment field that are 
> missing, but it's my first thought for accomplishing this.

IIRC, there's nothing that provides the raw encoding bytes to pass off to 
another disassembler. Maybe I missed it, or it's not documented? That said, it 
would be great not to have to re-implement all of the disassemble command's 
argument handling, which looks fairly complex to me.



https://github.com/llvm/llvm-project/pull/145793


[Lldb-commits] [lldb] [llvm] [lldb] Support disassembling RISC-V proprietary instructions (PR #145793)

2025-07-11 Thread Sam Elliott via lldb-commits

lenary wrote:

> We don't have any big endian riscv ArchSpec entry today or I'd add a big 
> endian riscv instruction decoding test too.

We're starting to get Big Endian support for RISC-V in LLVM, but I've asked 
that those patches wait until after the branch as there's a lot of work to make 
big-endian work, and it would be better if it all came in one release not 
slowly over several (especially as the big-endian ABI is not yet completed).

Once we get big-endian support, testing should be fairly easy, as RISC-V 
instructions are always little-endian.
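
A small sketch of what that means for turning raw bytes into an encoding value: the bytes are always combined least significant byte first, whatever the data endianness of the target. The helper name `riscvEncodingFromBytes` is invented for this example, and it only handles encodings of up to 64 bits.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: assembles up to 8 instruction bytes into one value,
// least significant byte first, independent of host or data endianness.
// Bytes beyond the first 8 are ignored in this sketch.
uint64_t riscvEncodingFromBytes(const uint8_t *Bytes, size_t NumBytes) {
  uint64_t Value = 0;
  for (size_t I = 0; I < NumBytes && I < 8; ++I)
    Value |= static_cast<uint64_t>(Bytes[I]) << (8 * I);
  return Value;
}
```

For example, the bytes `93 00 10 00` in memory give the 32-bit encoding `0x00100093` (`addi x1, x0, 1`), and a big-endian target would be expected to produce the same value from the same instruction bytes.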

https://github.com/llvm/llvm-project/pull/145793