On Thu, Aug 21, 2025 at 11:46:17AM -0700, Kees Cook wrote:
> On Thu, Aug 21, 2025 at 11:29:35AM +0200, Peter Zijlstra wrote:

> > The current kernel FineIBT code hard assumes r11 for now.
> 
> Oh, it looked like it wasn't always r11. Does clang force the call
> register to be r11?

Yes. I'm not sure why, but that's what it does, unconditionally r11.

(Funny extra detail: when using retpolines, clang sometimes generates
conditional tail-calls; it merges a Jcc and a JMP __x86_indirect_thunk_r11
into a single Jcc __x86_indirect_thunk_r11. I've never seen GCC do this.)

> I only do that here if the call expression isn't a
> register (similar to -mindirect-branch-register). Looking at the retpoline
> implementation, I see __x86_indirect_thunk_* being generated for all the
> general registers. 

Yeah, generating the whole set was easiest. The BPF JIT and custom asm
also contain retpolines, e.g.:

arch/x86/kernel/ftrace_64.S:    CALL_NOSPEC r8
arch/x86/platform/efi/efi_stub_64.S:    CALL_NOSPEC rdi

> Hm, but in looking now I see all the hard-coded r11 use
> in the fineibt alternatives. I wonder if my boot testing is somehow not
> triggering the FineIBT alternatives patching? I will investigate more...

Right, GCC typically prefers rax (by a wide margin), but pretty much
every register gets used on a big enough code base.

Random defconfig of the day gets me:

  15227 __x86_indirect_thunk_rax
    417 __x86_indirect_thunk_rdx
    205 __x86_indirect_thunk_rcx
    110 __x86_indirect_thunk_r13
    108 __x86_indirect_thunk_r12
     98 __x86_indirect_thunk_r8
     97 __x86_indirect_thunk_rbp
     96 __x86_indirect_thunk_r10
     85 __x86_indirect_thunk_r14
     38 __x86_indirect_thunk_r15
     36 __x86_indirect_thunk_r9
     34 __x86_indirect_thunk_rbx
     28 __x86_indirect_thunk_r11
     16 __x86_indirect_thunk_rsi
      1 __x86_indirect_thunk_rdi
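
For anyone wanting to reproduce a tally like the one above, here is a
minimal sketch. It is not necessarily how these numbers were produced:
it assumes the input is disassembly text with relocations (for example
the output of `objdump -dr` on a kernel object), and the helper names
count_thunks() and format_histogram() are made up for illustration.

```python
import re
from collections import Counter

# Match per-register thunk symbols such as __x86_indirect_thunk_rax or
# __x86_indirect_thunk_r11. Requiring an 'r' followed by one or two
# characters keeps other symbols sharing the prefix out of the tally.
THUNK_RE = re.compile(r"__x86_indirect_thunk_r[a-z0-9]{1,2}\b")


def count_thunks(disassembly: str) -> Counter:
    """Count references to each per-register thunk in disassembly text."""
    return Counter(m.group(0) for m in THUNK_RE.finditer(disassembly))


def format_histogram(counts: Counter) -> str:
    """Render counts in 'sort | uniq -c | sort -rn' style, largest first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{n:7d} {name}" for name, n in ordered)


if __name__ == "__main__":
    import sys

    # Assumed usage: objdump -dr some_object.o | python3 this_script.py
    print(format_histogram(count_thunks(sys.stdin.read())))
```

Note that this counts every textual reference, so the thunk definitions
themselves would be included if they are part of the input.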

IIRC you clobber r10 for the hash, so you won't generate indirect
calls through it. Other than that, the code seems to keep whichever
register already holds the address, and otherwise loads it into r11.

Anyway, I might be able to deal with the indirect call not being r11,
but it'll take a bit of prodding. It will also shatter my plans to move
the hash to eax to save a few bytes in instruction encoding. Let me go
poke around with that UDB patch and see what's possible.
