https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113874
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
E.g. https://sourceware.org/legacy-ml/binutils/2005-09/msg00184.html says:

The functions defined above use custom calling conventions that require them to preserve any registers they modify. This penalizes the case that requires dynamic TLS, since it must preserve all call-clobbered registers before calling __tls_get_addr(), but it is optimized for the most common case of static TLS, and also for the case in which the code generated by the compiler can be relaxed by the linker to a more efficient access model: being able to assume no registers are clobbered by the call tends to improve register allocation. Also, the function that handles the dynamic TLS case will most often be able to avoid calling __tls_get_addr(), thus potentially avoiding the need for preserving registers.
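
To make the quoted trade-off concrete, here is a rough model in portable C of the two resolver flavours. It is only a sketch under stated assumptions: real TLS descriptor resolvers are written in assembly precisely because the register-preserving calling convention cannot be expressed in C, and every name below (struct tls_desc, resolve_static, resolve_dynamic, mock_tls_get_addr, tlsdesc_demo) is hypothetical and not taken from glibc or binutils. What it does show is the control flow: the static resolver only returns an offset, while the dynamic one reaches the expensive __tls_get_addr()-like call only when its fast path misses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a TLS descriptor: a resolver plus its argument. */
struct tls_desc {
    ptrdiff_t (*entry)(struct tls_desc *);
    uintptr_t arg;
};

static char thread_block[64];   /* stand-in for the static TLS block at the thread pointer */
static char dynamic_block[64];  /* stand-in for a dynamically allocated module block */
static char *dynamic_cache;     /* NULL until the slow path has run once */
static int slow_path_calls;     /* counts calls to the mock slow path */

/* Mock of __tls_get_addr(): the path that would force the real resolver
   to save and restore every call-clobbered register around the call. */
static char *mock_tls_get_addr(void) {
    slow_path_calls++;
    return dynamic_block;
}

/* Static TLS: the offset from the thread pointer is already known,
   so the resolver just returns it and touches nothing else. */
static ptrdiff_t resolve_static(struct tls_desc *d) {
    return (ptrdiff_t)d->arg;
}

/* Dynamic TLS: take the fast path when the block is already known,
   and fall back to the expensive call only on the first access. */
static ptrdiff_t resolve_dynamic(struct tls_desc *d) {
    if (dynamic_cache == NULL)
        dynamic_cache = mock_tls_get_addr();
    return (ptrdiff_t)((uintptr_t)(dynamic_cache + d->arg) -
                       (uintptr_t)thread_block);
}

/* What compiler-generated code does: call through the descriptor,
   then add the returned offset to the thread pointer. */
static void *tls_address(struct tls_desc *d) {
    ptrdiff_t off = d->entry(d);
    return (void *)((uintptr_t)thread_block + (uintptr_t)off);
}

/* Resolves one static and one dynamic variable (the latter twice) and
   returns how many times the slow path was taken. */
int tlsdesc_demo(void) {
    struct tls_desc st = { resolve_static, 8 };
    struct tls_desc dyn = { resolve_dynamic, 4 };
    void *a = tls_address(&st);
    void *b = tls_address(&dyn);
    void *c = tls_address(&dyn);
    assert(a == (void *)(thread_block + 8));
    assert(b == (void *)(dynamic_block + 4));
    assert(c == b);
    return slow_path_calls;
}
```

In this model the second dynamic lookup never reaches mock_tls_get_addr(), which mirrors the claim that the dynamic resolver can usually avoid the call and therefore the register saves.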