jrtc27 added a comment.

In D64146#2591830 <https://reviews.llvm.org/D64146#2591830>, @jrtc27 wrote:
> In D64146#2591732 <https://reviews.llvm.org/D64146#2591732>, @jrtc27 wrote:
>
>> In D64146#2567710 <https://reviews.llvm.org/D64146#2567710>, @nand wrote:
>>
>>> CodePtr points into the bytecode emitted by the byte code compiler. In some 
>>> instances, pointers to auxiliary data structures are embedded into the byte 
>>> code, such as functions or AST nodes which contain information relevant to 
>>> the execution of the instruction.
>>>
>>> Would it help if instead of encoding pointers, the byte code encoded some 
>>> integers mapped to the original objects?
>>
>> I've read through the code and have slightly more understanding now. It 
>> seems there are several options:
>>
>> 1. Keep the pointers somewhere on the side and put an integer in the byte 
>> code, like you suggest
>>
>> 2. Pad values in the byte code to their natural alignment in general (and 
>> ensure the underlying std::vector<char> gets its storage allocated at an 
>> aligned boundary / use a different container), though this can get a little 
>> weird as the amount of padding between consecutive arguments varies 
>> depending on where you are (unless you force realignment to the max 
>> alignment at the start of a new opcode)
>>
>> 3. Make the byte code an array of uintptr_t instead of packing it as is 
>> done currently, with care needed on ILP32: either use uint64_t instead and 
>> declare CHERI unsupported for 32-bit architectures (unlikely to be a 
>> problem, as you probably want a 64-bit virtual address space if you are 
>> doubling the pointer size, with 64-bit CHERI capabilities on 32-bit VA 
>> systems being only for embedded use), or split 64-bit integers into two 
>> 32-bit integers and treat them as two arguments
>>
>> 1 works but feels ugly. 2 or 3 would be my preference, and mirror how 
>> "normal" interpreters work, though those might split the code and data so 
>> they can keep the opcodes as, say, 32-bit integers, but the stack full of 
>> native word/pointer slots; my inclination is that 3 is the best option as it 
>> looks like the simplest. How do you feel about each of those? Is memory 
>> overhead from not packing values a concern?
>
> Hm, though I see the "store an ID" pattern is common for dynamic things and 
> this should be quite rare, so maybe that is indeed the right approach, 
> mirroring something like getOrCreateGlobal?

https://reviews.llvm.org/D97606 implements this.
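
For readers following the thread, here is a minimal sketch of the "store an ID" 
pattern discussed above. It is not the code from D97606; Node, PointerTable, 
getOrCreateId, emitId and readId are hypothetical names chosen for 
illustration. The idea is that native pointers live in a side table with 
ordinary alignment, and only a fixed-width integer is packed into the 
std::vector<char> byte stream, where it can be copied in and out at any offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a function or AST node referenced by an opcode.
struct Node {
  int Value;
};

// Hypothetical side table: pointers are stored here, naturally aligned,
// and the byte code only ever sees the small integer ID.
class PointerTable {
  std::vector<const void *> Ptrs;
  std::unordered_map<const void *, uint32_t> Ids;

public:
  // Return the existing ID for P, or assign the next free one.
  uint32_t getOrCreateId(const void *P) {
    auto It = Ids.find(P);
    if (It != Ids.end())
      return It->second;
    uint32_t Id = static_cast<uint32_t>(Ptrs.size());
    Ptrs.push_back(P);
    Ids.emplace(P, Id);
    return Id;
  }

  // Map an ID read from the byte code back to the original pointer.
  const void *get(uint32_t Id) const {
    assert(Id < Ptrs.size() && "ID was never handed out by this table");
    return Ptrs[Id];
  }
};

// Append a 32-bit ID to the packed byte stream; no alignment is required.
void emitId(std::vector<char> &Code, uint32_t Id) {
  const char *Bytes = reinterpret_cast<const char *>(&Id);
  Code.insert(Code.end(), Bytes, Bytes + sizeof(Id));
}

// Read a 32-bit ID at Offset via memcpy (safe at any offset) and advance.
uint32_t readId(const std::vector<char> &Code, std::size_t &Offset) {
  uint32_t Id;
  std::memcpy(&Id, Code.data() + Offset, sizeof(Id));
  Offset += sizeof(Id);
  return Id;
}
```

An emitter would call emitId(Code, Table.getOrCreateId(Ptr)) where it 
previously wrote the pointer itself, and the interpreter loop would call 
Table.get(readId(Code, Offset)) and cast the result back to the expected type. 
The cost is one extra indirection per pointer operand, which should be 
acceptable if such operands are as rare as suggested above.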


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64146/new/

https://reviews.llvm.org/D64146

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
