Hi, elves (people who work on ELF)!
Motivated by a recent rant from Linus Torvalds about the performance problems
of shared objects, and by an earlier post reporting a 1.3x CPython speedup with
-fno-semantic-interposition, I have recently been thinking about an ELF world
that defaults to a STB_GLOBAL variant of -Bsymbolic-functions, and I have filed
some GCC/binutils feature requests.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112
-fno-direct-access-external-data for -fno-pic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100593 -fno-pic: Use GOT to take
address of an external default visibility function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100618
-fsemantic-interposition=variable
https://sourceware.org/bugzilla/show_bug.cgi?id=27871 ld: Add
-Bsymbolic-global-functions which only applies to STB_GLOBAL STT_FUNC
Pending patch: remove HAVE_LD_PIE_COPYRELOC:
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
I have written down my thoughts and a plan. I'll hereby humbly refer you to
https://maskray.me/blog/2021-05-16-elf-interposition-and-bsymbolic#the-last-alliance-of-elf-and-men
For your convenience, I have included that section below.
I wish that distributions would default to a function-only variant of
-fno-semantic-interposition and (in the long term) a STB_GLOBAL variant of
-Wl,-Bsymbolic-functions, regaining performance that has been lost for decades.
macOS (Mach-O), Windows (PE-COFF), and Solaris (ELF) direct bindings have set a
precedent, so there is a good chance that most portable software is already in
a good state.
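
To make the cost concrete, here is a minimal sketch of a shared-object source
file. The names demo_scale and demo_compute are made up for illustration. With
-fpic and GCC's default -fsemantic-interposition, the call inside demo_compute
has to assume that demo_scale may be replaced at run time, so GCC does not
inline it and on x86-64 typically calls it through the PLT. With
-fno-semantic-interposition the compiler is free to inline the call or bind it
locally.

```c
/* libdemo.c: a sketch; demo_scale and demo_compute are hypothetical names.
 * Two builds worth comparing (inspect the result with objdump -d):
 *   gcc -O2 -fpic -shared libdemo.c -o libdemo.so
 *   gcc -O2 -fpic -fno-semantic-interposition -shared libdemo.c -o libdemo.so
 */

/* Default visibility, STB_GLOBAL: interposable under the ELF rules. */
int demo_scale(int x) { return 3 * x; }

/* The call to demo_scale is the one that interposition rules constrain. */
int demo_compute(int x) { return demo_scale(x) + 1; }
```

The observable result is the same in both builds when nothing interposes
demo_scale; only the generated code for the call differs.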
However, some work is still needed to annotate software which cannot be built
with -fsemantic-interposition=variable or -Wl,-Bsymbolic-global-functions.
Distributions will have to invest resources. In return, I estimate that many
pieces of software may become 5% to 20% faster (CPython is 1.3x faster) and a
few percent smaller in size.
There is a trade-off: the downside is that using LD_PRELOAD to replace a
function defined in a shared object will become a non-default choice. Users who
need it can rebuild the software themselves. (Note: malloc replacement and
fakeroot are still supported.)
We need an option (say, -fsemantic-interposition=variable) to disable
interposition for functions but enable interposition for variables, because we
want to be compatible with copy relocations, which will require years to fix.
GCC feature request: PR100618.
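
The copy relocation constraint can be sketched as follows, again with
hypothetical names. If an executable built with -fno-pic refers to demo_counter
directly, the linker typically emits a copy relocation (R_X86_64_COPY on
x86-64) and the canonical copy of the variable then lives in the executable.
The shared object has to keep reaching demo_counter through the GOT so that
both sides see the same copy, which is why interposition must stay enabled for
variables.

```c
/* libcounter.c: sketch of the shared-object side; names are hypothetical. */

/* Exported data. A -fno-pic executable that declares
 *   extern int demo_counter;
 * and accesses it directly may trigger a copy relocation, which moves the
 * canonical copy of the variable into the executable. */
int demo_counter = 5;

/* In -fpic code this access goes through the GOT. Binding it locally would
 * make the library update a different copy than the executable reads. */
int demo_next(void) { return ++demo_counter; }
```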
We may need a configure-time option for default
-fsemantic-interposition=variable, like GCC's --enable-default-pie.
We need a -Bsymbolic-functions variant which only applies to STB_GLOBAL symbols
(i.e. STB_WEAK symbols are excluded), say -Bsymbolic-global-functions.
Binutils feature request: PR27871.
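
As an illustration of the STB_GLOBAL versus STB_WEAK distinction, here is a
sketch with made-up names. Under the proposed option, the reference to
demo_fast (STB_GLOBAL) would be bound within the shared object, while the
reference to demo_hook (STB_WEAK) would remain preemptible. The
__attribute__((weak)) spelling is the GCC/Clang extension.

```c
/* libhooks.c: sketch; all names are hypothetical. */

/* STB_WEAK definition: would stay interposable under the proposal. */
__attribute__((weak)) int demo_hook(void) { return 0; }

/* STB_GLOBAL definition: references from this shared object would be
 * bound locally at link time. */
int demo_fast(void) { return 41; }

/* Calls one function of each binding. */
int demo_run(void) { return demo_fast() + demo_hook() + 1; }
```

The binding of each symbol can be checked with readelf -s on the resulting
object (look at the Bind column).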
We need a linker option to cancel default -Bsymbolic-global-functions. I have
added -Bno-symbolic to GNU ld and gold (binutils 2.37; PR27834) and ld.lld 13.
(From Peter Smith) The linker can introduce a debugging option for executables
to catch accidental interposition, say, --warn-interposition: "Warning symbol S
of type STT_FUNC is defined in executable A and shared objects B and C, using
definition in A."
GCC -fno-pic should be fixed to use the GOT when taking the address of an
external default visibility function. GCC feature request: PR100593.
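
The function-address case can be sketched like this (demo_puts_address and
demo_puts_fn are hypothetical names). With -fno-pic on x86-64, GCC typically
materializes the address of puts with an absolute relocation, which makes the
linker create a canonical PLT entry in the executable so that the address
compares equal everywhere. Loading the address from the GOT instead would
avoid that.

```c
/* addr.c: sketch for the executable side; names are hypothetical.
 * Compare the generated assembly of:
 *   gcc -O2 -fno-pic -S addr.c
 *   gcc -O2 -fpie -S addr.c
 */
#include <stdio.h>

typedef int (*demo_puts_fn)(const char *);

/* Takes the address of an external default-visibility function. */
demo_puts_fn demo_puts_address(void) { return puts; }
```

Whichever code sequence is used, the returned pointer must compare equal to
the address of puts taken anywhere else in the program.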