On Thu, Mar 15, 2018 at 1:41 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> On Sun, Mar 11, 2018 at 7:40 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> > On Mon, Mar 5, 2018 at 4:20 AM, H.J. Lu <hjl.to...@gmail.com> wrote:
>> >> On Tue, Feb 27, 2018 at 11:39 AM, H.J. Lu <hongjiu...@intel.com> wrote:
>> >>> For x86 targets, when -fno-plt is used, external functions are called
>> >>> via GOT slot, in 64-bit mode:
>> >>>
>> >>>         [bnd] call/jmp *foo@GOTPCREL(%rip)
>> >>>
>> >>> and in 32-bit mode:
>> >>>
>> >>>         [bnd] call/jmp *foo@GOT[(%reg)]
>> >>>
>> >>> With -mindirect-branch=, they are converted to, in 64-bit mode:
>> >>>
>> >>>         pushq          foo@GOTPCREL(%rip)
>> >>>         [bnd] jmp      __x86_indirect_thunk[_bnd]
>> >>>
>> >>> and in 32-bit mode:
>> >>>
>> >>>         pushl          foo@GOT[(%reg)]
>> >>>         [bnd] jmp      __x86_indirect_thunk[_bnd]
>> >>>
>> >>> which were incompatible with CFI.  In 64-bit mode, since R11 is a scratch
>> >>> register, we generate:
>> >>>
>> >>>         movq           foo@GOTPCREL(%rip), %r11
>> >>>         [bnd] call/jmp __x86_indirect_thunk_[bnd_]r11
>> >>>
>> >>> instead.  We do it in ix86_output_indirect_branch so that we can use
>> >>> the newly proposed R_X86_64_THUNK_GOTPCRELX relocation:
>> >>>
>> >>> https://groups.google.com/forum/#!topic/x86-64-abi/eED5lzn3_Mg
>> >>>
>> >>>         movq           foo@GOTPCREL_THUNK(%rip), %r11
>> >>>         [bnd] call/jmp __x86_indirect_thunk_[bnd_]r11
>> >>>
>> >>> to load the GOT slot into R11.  If foo is defined locally, the linker can
>> >>> convert
>> >>>
>> >>>         movq           foo@GOTPCREL_THUNK(%rip), %reg
>> >>>         call/jmp       __x86_indirect_thunk_reg
>> >>>
>> >>> to
>> >>>
>> >>>         call/jmp       foo
>> >>>         nop            0L(%rax)
>> >>>
>> >>> In 32-bit mode, since all caller-saved registers, EAX, EDX and ECX, may be
>> >>> used to pass function parameters, there is no scratch register available.
>> >>> For -fno-plt -fno-pic -mindirect-branch=, we expand an external function
>> >>> call to:
>> >>>
>> >>>         movl           foo@GOT, %reg
>> >>>         [bnd] call/jmp *%reg
>> >>>
>> >>> so that it can be converted to
>> >>>
>> >>>         movl           foo@GOT, %reg
>> >>>         [bnd] call/jmp __x86_indirect_thunk_[bnd_]reg
>> >>>
>> >>> in ix86_output_indirect_branch.  Since this is performed during RTL
>> >>> expansion, other instructions may be inserted between movl and call/jmp.
>> >>> Linker optimization isn't always possible.
>
> I suppose we can just combine those into patterns if we want to prevent gcc from

I will look into it.

> interleaving this with other instructions.  However since this affects ABI and
> not only return thunk, did you discuss the changes with LLVM folks as well?

This doesn't change the calling convention.  The new R_X86_64_THUNK_GOTPCRELX
relocation is an optimization.  It can be safely treated as
R_X86_64_GOTPCRELX.
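
As a rough illustration of that fallback, a linker that does not implement
the thunk relaxation could simply map the new relocation type onto the
existing one before processing it.  This is only a sketch, not code from any
linker: the function name canonical_reloc_type is made up for the example,
and the numeric value used for R_X86_64_THUNK_GOTPCRELX is a placeholder,
since the proposal's actual number is not given in this thread.  The value
41 for R_X86_64_GOTPCRELX is the one defined by the x86-64 psABI.

```c
/* Defined by the x86-64 psABI.  */
#define R_X86_64_GOTPCRELX 41

/* PLACEHOLDER value, assumed for illustration only; the number assigned
   to the proposed relocation is not stated in this thread.  */
#define R_X86_64_THUNK_GOTPCRELX 0x1000

/* Hypothetical helper: return the relocation type that a linker without
   the thunk relaxation would actually process.  The new type degrades
   to plain R_X86_64_GOTPCRELX; every other type is left unchanged.  */
static unsigned int
canonical_reloc_type (unsigned int r_type)
{
  if (r_type == R_X86_64_THUNK_GOTPCRELX)
    return R_X86_64_GOTPCRELX;
  return r_type;
}
```

With such a mapping, the movq still loads the GOT slot into the register and
the call/jmp still goes through the thunk; only the optional rewrite to a
direct call/jmp is skipped.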

> It would be nice to not have diverging solutions.
>

That is why I posted the new relocation to x86-64 psABI group.

-- 
H.J.
