https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386
--- Comment #39 from Ard Biesheuvel <ardb at kernel dot org> ---
(In reply to Alexander Monakov from comment #38)
> (In reply to Ard Biesheuvel from comment #37)
> > Yes, we can drop -mcmodel=kernel, and use -mcmodel=small instead. This is
> > why I'm not keen on relying on that - it is ill-defined and there is really
> > no need to have this special case. In the kernel, we are trying to move away
> > from all the special sauce in the toolchain - x86 especially is affected by
> > this, whereas arm64 and other architectures just use -mcmodel=small. The
> > primary sticking point is the relative cost of RIP-relative LEA vs 32-bit
> > absolute MOV but that gap appears to have been closing in recent designs.
>
> There's a couple places where GCC restricts offsets differently for
> -mcmodel=kernel vs. -mcmodel=small in the x86 backend. It's been determined
> it doesn't matter? How so?
> https://gcc.gnu.org/cgit/gcc/tree/gcc/config/i386/predicates.md#n239
>

PIC code can run anywhere, so it can also run in the top 2 GB of the 64-bit
address space, which is what the kernel code model is limited to.

> > I'm not sure what that would solve. When linking the kernel, all
> > R_X86_64_PLT32 can be resolved directly, and so there is never the need for
> > a PLT in practice. The compiler does not have to care about this
> > distinction. Relaxing a CALL via a PLT into a direct one is much easier than
> > relaxing a GOT based data reference into a direct one.
>
> It's not just about the calls. On x86-64 it's less pronounced, but on arm64
> telling the compiler up front that everything ends up in the final
> executable can improve codegen when referencing extern variables, for
> instance:
>
> //__attribute__((visibility("hidden")))
> extern int a[];
>
> int f(void)
> {
>     return a[1];
> }
>
> gcc -O2 -fpie gets you
>
> f:
>         adrp    x0, 0 <_GLOBAL_OFFSET_TABLE_>
>                 R_AARCH64_ADR_PREL_PG_HI21      _GLOBAL_OFFSET_TABLE_
>         ldr     x0, [x0]
>                 R_AARCH64_LD64_GOTPAGE_LO15     a
>         ldr     w0, [x0, #4]
>         ret
>
> and with the attribute uncommented, emulating what -fstatic-pie would do:
>
> f:
>         adrp    x0, 0 <a>
>                 R_AARCH64_ADR_PREL_PG_HI21      a+0x4
>         ldr     w0, [x0]
>                 R_AARCH64_LDST32_ABS_LO12_NC    a+0x4
>         ret

In Linux, we don't even bother with PIC codegen, even though we link with -pie.
The non-PIC AArch64 small code model uses PC-relative references for code and
data.

I do agree that it would be better for this behavior to be explicit, so I'd
switch Linux to it if it ever appeared. But only to keep the existing behavior.

We do use hidden visibility in Linux (using #pragma) in some places, when
building PIC code that must not ever use absolute references (which makes the
use of a GOT impossible)