On Mon, May 11, 2015 at 8:52 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
>
> I will clarify in the spec language.  Yes, that is the intention for both
> R_X86_64_RELAX_PC32 and R_X86_64_RELAX_PLT32.  That is what
> is implemented on users/hjl/relax branch.
>
Here is the updated proposal.  I changed the nop prefix from 0x48 to 0x67
and clarified how foo@GOTPCREL(%rip) should be resolved.

--
H.J.
----
To remove one direct branch to PLT for external function calls:

https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00001.html

I am proposing to add 2 new relocations, R_X86_64_RELAX_PC32 and
R_X86_64_RELAX_PLT32:

1. They can only be used on 32-bit direct call/jmp instructions.
2. The call/jmp instructions must have a 0x67 prefix, which is the
   address size prefix and is ignored by 32-bit direct call/jmp
   instructions.
3. The linker can treat them as R_X86_64_PC32 and R_X86_64_PLT32,
   respectively.
4. Optionally, the linker can convert 0x67 call/jmp foo[@PLT] to
   call/jmp *foo@GOTPCREL(%rip) when function foo is defined in a
   shared library.  If there is no GOT slot allocated for symbol foo,
   the linker should resolve foo@GOTPCREL(%rip) to its PLT slot
   address + 6, which is the push instruction, to support lazy binding.
   Otherwise, the linker should resolve it to its PLT slot address.
   If foo is defined locally, the linker will generate

   0x67 call/jmp foo

R_X86_64_RELAX_PC32 is defined as 39, which was the deprecated
R_X86_64_PC32_BND for

bnd call/jmp foo

R_X86_64_RELAX_PLT32 is defined as 40, which was the deprecated
R_X86_64_PLT32_BND for

bnd call/jmp foo@PLT

Since the current linkers treat R_X86_64_PC32_BND and
R_X86_64_PLT32_BND as R_X86_64_PC32 and R_X86_64_PLT32, respectively,
they can handle the new ones correctly.
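
To illustrate the optional conversion in item 4, here is a minimal Python sketch of the byte-level rewrite.  It is not code from the users/hjl/relax branch; the function name and constant tables are hypothetical.  It relies on the standard x86-64 encodings (E8/E9 for call/jmp rel32, FF 15 / FF 25 for call/jmp *disp32(%rip)) and assumes the relocation offset points at the 4-byte displacement.  With the 0x67 prefix, the direct form is 6 bytes, the same length as the indirect form, so the rewrite can be done in place.  Filling in the displacement for foo's GOT slot is left out.

```python
# Hypothetical sketch; none of these names come from a real linker.
R_X86_64_RELAX_PC32 = 39   # values as given in the proposal
R_X86_64_RELAX_PLT32 = 40

ADDR_SIZE_PREFIX = 0x67
DIRECT_OPCODES = {0xE8: "call", 0xE9: "jmp"}   # call/jmp rel32
INDIRECT_MODRM = {"call": 0x15, "jmp": 0x25}   # FF /2 and FF /4, RIP-relative


def relax_to_indirect(code: bytearray, reloc_offset: int) -> str:
    """Rewrite `0x67 call/jmp rel32` into `call/jmp *disp32(%rip)` in place.

    reloc_offset is assumed to be the offset of the 4-byte displacement,
    so the opcode sits at reloc_offset - 1 and the prefix at reloc_offset - 2.
    Returns the mnemonic that was rewritten.
    """
    if reloc_offset < 2 or reloc_offset + 4 > len(code):
        raise ValueError("relocation offset out of range")
    prefix = code[reloc_offset - 2]
    opcode = code[reloc_offset - 1]
    if prefix != ADDR_SIZE_PREFIX or opcode not in DIRECT_OPCODES:
        raise ValueError("not a 0x67-prefixed direct call/jmp")
    kind = DIRECT_OPCODES[opcode]
    # Same total length (6 bytes), so only the first two bytes change.
    code[reloc_offset - 2] = 0xFF
    code[reloc_offset - 1] = INDIRECT_MODRM[kind]
    return kind
```

A linker that does not implement the conversion would skip this step and process the relocation as R_X86_64_PC32 or R_X86_64_PLT32, as item 3 allows.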