On Thu, 12 Sep 2024, Evgeny Karpov wrote:
Thursday, September 12, 2024
Martin Storsjö <mar...@martin.st> wrote:
This looks very reasonable - I presume this will make sure that you only
use the other code form if the offset actually is larger than 1 MB.
For the case when the offset actually is larger than 1 MB, I guess this
also ends up generating some other instruction sequence than just an "add
x0, x0, #imm", as the #imm is limited to < 4096. From reading the code,
it looks like it generates something like "mov x16, #imm; add x0, x0,
x16"? That's probably quite reasonable.
The generated code will stay unchanged for offsets less than 1 MB:
adrp x0, symbol + offset
add x0, x0, :lo12:symbol + offset
When the offset is >= 1MB:
adrp x0, symbol + offset % (1 << 20) // it prevents relocation overflow in IMAGE_REL_ARM64_PAGEBASE_REL21
add x0, x0, (offset & ~0xfffff) >> 12, lsl #12 // a workaround to support 4GB offset
add x0, x0, :lo12:symbol + offset % (1 << 20)
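The split used in the sequence above can be checked as plain arithmetic.
This is only a minimal Python sketch, not the compiler's code: the helper
name split_offset is hypothetical, and offsets are assumed to be
non-negative.

```python
MB = 1 << 20

def split_offset(offset: int) -> tuple[int, int]:
    """Split an offset the way the three-instruction sequence does.

    Hypothetical helper for illustration; assumes offset >= 0.
    """
    low = offset % MB          # part left in the adrp and :lo12: relocations
    high = offset & ~0xFFFFF   # part added by the extra "add ..., lsl #12"
    return high, low

high, low = split_offset(0x234567)
# The extra add uses (high >> 12) as its immediate, shifted left by 12.
print(hex(high), hex(low), hex(high >> 12))
```

The two parts always sum back to the original offset, and the low part
stays below 1 MB, which is what keeps the adrp relocation in range.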
Ah, I see. Yeah, that works.
That won't get you up to a full 4 GB offset from your symbol though; I
think that'll get you up to 16 MB offsets. In the "add x0, x0, #imm, lsl
#12" case, the immediate is a 12 bit immediate, shifted left by 12, so you
effectively have a 24 bit range there. But clearly this works a bit
further than 1 MB at least.
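The 16 MB figure can be sanity-checked with a few lines of Python. This is
a rough sketch under the assumption that the extra add takes
(offset & ~0xfffff) >> 12 as a 12 bit unsigned immediate; the function name
high_part_encodable is made up for illustration.

```python
def high_part_encodable(offset: int) -> bool:
    """Return True if the high part of offset fits "add ..., #imm, lsl #12".

    Hypothetical check; assumes a 12 bit unsigned immediate (0..0xFFF)
    and a non-negative offset.
    """
    imm = (offset & ~0xFFFFF) >> 12
    return 0 <= imm <= 0xFFF

print(high_part_encodable((16 << 20) - 1))  # largest offset below 16 MB
print(high_part_encodable(16 << 20))        # exactly 16 MB
```

Under that assumption, offsets below 16 MB fit and an offset of exactly
16 MB no longer does, since its immediate would be 0x1000.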
// Martin