On Wed, Apr 22, 2015 at 3:15 PM, Ramana Radhakrishnan
<ramana....@googlemail.com> wrote:
> On Wed, Apr 22, 2015 at 5:34 PM, H.J. Lu <hongjiu...@intel.com> wrote:
>> Normally, with PIE, GCC accesses globals that are extern to the module
>> using GOT. This is two instructions, one to get the address of the global
>> from GOT and the other to get the value. Examples:
>>
>> ---
>> extern int a_glob;
>> int
>> main ()
>> {
>>   return a_glob;
>> }
>> ---
>>
>> With PIE, the generated code accesses global via GOT using two memory
>> loads:
>>
>>         movq    a_glob@GOTPCREL(%rip), %rax
>>         movl    (%rax), %eax
>>
>> for 64-bit or
>>
>>         movl    a_glob@GOT(%ecx), %eax
>>         movl    (%eax), %eax
>>
>> for 32-bit.
>>
>> Some experiments on google and SPEC CPU benchmarks show that the extra
>> instruction affects performance by 1% to 5%.
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that
>> the global will be defined in the executable. For globals that are
>> truly extern (come from shared objects), the linker will create copy
>> relocations and have them defined in the executable. Result is that
>> no global access needs to go through GOT and hence improves performance.
>> We can generate
>>
>>         movl    a_glob(%rip), %eax
>>
>> for 64-bit and
>>
>>         movl    a_glob@GOTOFF(%eax), %eax
>>
>> for 32-bit. This optimization only applies to undefined non-weak
>> non-TLS global data. Undefined weak global or TLS data access still
>> must go through GOT.
>>
>> This patch reverts legitimate_pic_address_disp_p change made in revision
>> 218397, which only applies to x86-64. Instead, this patch updates
>> targetm.binds_local_p to indicate if undefined non-weak non-TLS global
>> data is defined locally in PIE. It also introduces a new target hook,
>> binds_tls_local_p to distinguish TLS variable from non-TLS variable. By
>> default, binds_tls_local_p is the same as binds_local_p.
>>
>> This patch checks if 32-bit and 64-bit linkers support PIE with copy
>> reloc at configure time. 64-bit linker is enabled in binutils 2.25
>> and 32-bit linker is enabled in binutils 2.26. This optimization
>> is enabled only if the linker support is available.
>>
>> Tested on Linux/x86-64 with -m32 and -m64, using linkers with and without
>> support for copy relocation in PIE. OK for trunk?
>>
>> Thanks.
>
>
> Looking at this my first reaction was that surely most (if not all ? )
> targets that use ELF and had copy relocs would benefit from this ?
> Couldn't we find a simpler way for targets to have this support ? I
> don't have a more constructive suggestion to make at the minute but
> getting this to work just from the targetm.binds_local_p (decl)
> interface would probably be better ?

default_binds_local_p_3 is a global function that the x86 backend uses
to implement targetm.binds_local_p. Any backend can use it to optimize
for copy relocations.

--
H.J.
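
For illustration only, the rule described in the quoted patch summary
(undefined non-weak non-TLS global data may be treated as local in PIE,
but only when the linker supports copy relocations) can be sketched as a
small standalone predicate. This is a toy model and not GCC code: the
function name toy_extern_data_binds_local_p and its three parameters are
hypothetical, and it does not reflect the actual signature or logic of
default_binds_local_p_3.

```c
#include <stdbool.h>

/* Toy model, NOT GCC's implementation.  All names here are hypothetical.

   Answers: "when compiling a PIE, may an undefined (extern) global data
   symbol be treated as if it were defined in the executable?"

   is_weak               - the symbol is declared weak
   is_tls                - the symbol is thread-local
   ld_has_pie_copyreloc  - models the configure-time check that the
                           linker supports copy relocations in PIE  */
static bool
toy_extern_data_binds_local_p (bool is_weak, bool is_tls,
                               bool ld_has_pie_copyreloc)
{
  /* Without linker support the access must keep going through the GOT.  */
  if (!ld_has_pie_copyreloc)
    return false;

  /* Undefined weak or TLS data still must go through the GOT.  */
  if (is_weak || is_tls)
    return false;

  /* Undefined non-weak non-TLS data: the linker can create a copy
     relocation, so the symbol ends up defined in the executable.  */
  return true;
}
```

In this model, a_glob from the example (non-weak, non-TLS) is treated as
local only when the linker check succeeds, which corresponds to the
single-instruction access; every other combination keeps the GOT load.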