https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899

--- Comment #15 from Maxim Egorushkin <maxim.yegorushkin at gmail dot com> ---
(In reply to Maxim Egorushkin from comment #14)
> (In reply to Marco Elver from comment #0)
> > On X86-64 the callee preserves all general purpose registers, except for
> > R11. R11 can be used as a scratch register. 
> 
> R11 is reserved for use in those lazy-resolved PLT stubs, invented for
> OpenOffice to load faster than users giving up waiting on it and buying MS
> Office instead. 
> 
> In other words, R11 is assumed to be wiped by any/every call instruction on
> its own, since any call instruction may need to resolve a PLT stub on its
> way.
> 
> You may like to compile your code with `-fno-plt` to bypass calling those
> pesky PLT stubs invented for OpenOffice. But that doesn't affect the machine
> code of any 3rd-party libraries you invoke, they still do their calls
> through PLT and, hence, may clobber R11.
> 
> R10 is also clobbered by some nebulous features no user asked for, and,
> hence, is generally unavailable, except in tail functions making no calls.

I am quite irked that the X86-64 calling convention preserves only a few of the
GP registers and none of the XMM registers across calls, causing massive stack
spills whenever a call instruction is emitted. Registers R10 and R11 being
generally unavailable, except in a leaf function that makes no calls and does
everything on its own, is the most disappointing outcome.
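
A minimal sketch of the spill pattern, assuming the System V x86-64 calling
convention (only RBX, RBP and R12-R15 are callee-saved; every XMM register is
caller-saved). The names `opaque` and `live_across_call` are hypothetical and
are not taken from the bug report:

```c
/* Hypothetical callee; noinline keeps a real call instruction in the caller. */
__attribute__((noinline)) long opaque(long x)
{
    return x + 1;
}

/* Seven values (b..h) are loaded before the call and used after it, so they
 * are live across the call. The caller can only keep them in callee-saved
 * registers (which it must itself save and restore) or in stack slots. */
long live_across_call(const long *v)
{
    long a = v[0], b = v[1], c = v[2], d = v[3];
    long e = v[4], f = v[5], g = v[6], h = v[7];
    long r = opaque(a);
    return r + b + c + d + e + f + g + h;
}
```

Compiling this with `gcc -O2 -S` and reading the assembly around the call shows
how the live values are handled. The exact output depends on the compiler
version and flags, and the compiler may choose to reload from memory after the
call instead of spilling.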

X86-64 doubled the number of scarce x86 GP registers, from 8 to 16, to reduce
the stack spills which hindered x86 performance the most. Yet some deep mind
decided that 2 of these extra GP registers won't be generally available in the
majority of circumstances because of a few unhealthy and obscure use-cases.
Users having to give up 2 GP registers in each and every application because of
a few bad apples is the worst idea, in my opinion. I may be totally wrong, and
I defer to `perf top`.
