https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899
--- Comment #15 from Maxim Egorushkin <maxim.yegorushkin at gmail dot com> ---
(In reply to Maxim Egorushkin from comment #14)
> (In reply to Marco Elver from comment #0)
> > On X86-64 the callee preserves all general purpose registers, except for
> > R11. R11 can be used as a scratch register.
>
> R11 is reserved for use in those lazy-resolved PLT stubs, invented for
> OpenOffice to load faster than users giving up waiting on it and buying MS
> Office instead.
>
> In other words, R11 is assumed to be wiped by any/every call instruction on
> its own, since any call instruction may need to resolve a PLT stub on its
> way.
>
> You may like to compile your code with `-fno-plt` to bypass calling those
> pesky PLT stubs invented for OpenOffice. But that doesn't affect the machine
> code of any 3rd-party libraries you invoke, they still do their calls
> through PLT and, hence, may clobber R11.
>
> R10 is also clobbered by some nebulous features no user asked for, and,
> hence, is generally unavailable, except in tail functions making no calls.

I am quite irked by X86-64 calls preserving only a few of the GP registers and
none of the XMM registers, causing massive stack spills whenever a call
instruction is emitted.

Registers R10 and R11 being generally unavailable, except in a bottom-most
self-reliant function that makes no calls, is the most disappointing outcome.

X86-64 doubled the number of the scarce x86 GP registers, from 8 to 16, to
reduce the stack spills which hindered X86 performance the most. Yet, some
deep mind decided that 2 of these extra GP registers won't be generally
available in the majority of circumstances because of a few unhealthy and
obscure use-cases.

Users having to give up 2 GP registers in each and every application because
of a few bad apples is the worst idea, in my opinion. I may be totally wrong,
and I defer to `perf top`.
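
To make the spill complaint concrete, here is a minimal sketch, not a
benchmark. The names `opaque`, `sum_across_call` and `calls` are hypothetical
and exist only for this illustration. It keeps seven values live across one
real call, while the System V x86-64 ABI leaves only six callee-saved GP
registers (RBX, RBP, R12-R15). Whether a given compiler actually spills here
depends on its version and flags, so treat the generated assembly as the
authority rather than this comment.

```c
/* Hypothetical side effect, so the call cannot be treated as pure and
   reordered around the volatile loads in the caller. */
static volatile long calls;

/* noinline keeps a real call instruction in the caller. Under the System V
   x86-64 ABI the caller must then assume RAX, RCX, RDX, RSI, RDI, R8-R11
   and every XMM register are clobbered. */
__attribute__((noinline)) long opaque(long x) {
    calls = calls + 1;
    return x + 1;
}

/* The volatile loads happen before the call, so b..h (seven values) are
   live across it. They have to sit in callee-saved registers or on the
   stack until the call returns. */
long sum_across_call(const volatile long *v) {
    long a = v[0], b = v[1], c = v[2], d = v[3];
    long e = v[4], f = v[5], g = v[6], h = v[7];
    long r = opaque(a);
    return r + b + c + d + e + f + g + h;
}
```

Compiling this with `gcc -O2 -S`, with and without `-fno-plt`, and looking
for stores to the stack around the `call` shows what the register allocator
had to do on a particular toolchain.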