On Thu, 12 Sep 2024, Evgeny Karpov wrote:

Thursday, September 12, 2024
Martin Storsjö <mar...@martin.st> wrote:

This looks very reasonable - I presume this will make sure that you only
use the other code form if the offset actually is larger than 1 MB.

For the case when the offset actually is larger than 1 MB, I guess this
also ends up generating some instruction sequence other than just an "add
x0, x0, #imm", as the #imm is limited to < 4096 (a 12-bit immediate). From
reading the code, it looks like it generates something like "mov x16, #imm;
add x0, x0, x16"? That's probably quite reasonable.

The generated code stays unchanged for offsets less than 1 MB:

adrp x0, symbol + offset
add x0, x0, :lo12:symbol + offset

When the offset is >= 1MB:

adrp x0, symbol + offset % (1 << 20) // it prevents relocation overflow in IMAGE_REL_ARM64_PAGEBASE_REL21
add x0, x0, (offset & ~0xfffff) >> 12, lsl #12 // a workaround to support 4GB offset
add x0, x0, :lo12:symbol + offset % (1 << 20)
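
The arithmetic behind that split can be checked with a small sketch. This is only an illustration of the expressions quoted in the sequence above, not the code from the patch; the helper names split_offset and recombine are hypothetical, and offsets are assumed to be non-negative.

```python
ONE_MB = 1 << 20


def split_offset(offset):
    """Split a non-negative offset into the two parts used above.

    low      -> stays in the adrp / :lo12: relocations (always < 1 MB)
    high_imm -> immediate for the extra "add x0, x0, #imm, lsl #12"
    """
    low = offset % ONE_MB
    high_imm = (offset & ~0xFFFFF) >> 12
    return low, high_imm


def recombine(low, high_imm):
    """Undo the split: the add contributes high_imm << 12."""
    return (high_imm << 12) + low


for offset in (0x100000, 0x123456, 0xFFFFFF):
    low, high_imm = split_offset(offset)
    assert low < ONE_MB
    assert recombine(low, high_imm) == offset
```

Since the low 20 bits are masked off before the shift, high_imm is always a multiple of 0x100, and the part left in the relocations never reaches 1 MB.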

Ah, I see. Yeah, that works.

That won't get you up to a full 4 GB offset from your symbol, though; I think it'll get you up to 16 MB offsets. In the "add x0, x0, #imm, lsl #12" case, the immediate is a 12-bit immediate shifted left by 12, so you effectively have a 24-bit range there. But clearly this works a bit further than 1 MB at least.
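
The 16 MB limit can be seen with a short check. This is a sketch under the assumption that the high part is computed as (offset & ~0xfffff) >> 12 and must fit an unsigned 12-bit immediate; the helper name high_part_encodable is hypothetical, and only non-negative offsets are considered.

```python
IMM12_MAX = 0xFFF  # largest unsigned 12-bit immediate


def high_part_encodable(offset):
    """True if the high part of the offset fits "add ..., #imm, lsl #12"."""
    return ((offset & ~0xFFFFF) >> 12) <= IMM12_MAX


SIXTEEN_MB = 1 << 24

# The last offset below 16 MB still fits: its high part is 0xF00.
assert high_part_encodable(SIXTEEN_MB - 1)
# At exactly 16 MB the high part becomes 0x1000, which needs 13 bits.
assert not high_part_encodable(SIXTEEN_MB)
```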

// Martin
