On Thu, Jan 11, 2018 at 2:46 PM, Jeff Law <[email protected]> wrote:
> On 01/07/2018 03:59 PM, H.J. Lu wrote:
>> Add -mindirect-branch= option to convert indirect call and jump to call
>> and return thunks. The default is 'keep', which keeps indirect call and
>> jump unmodified. 'thunk' converts indirect call and jump to call and
>> return thunk. 'thunk-inline' converts indirect call and jump to inlined
>> call and return thunk. 'thunk-extern' converts indirect call and jump to
>> external call and return thunk provided in a separate object file. You
>> can control this behavior for a specific function by using the function
>> attribute indirect_branch.
>>
>> 2 kinds of thunks are geneated. Memory thunk where the function address
>> is at the top of the stack:
>>
>> __x86_indirect_thunk:
>> call L2
>> L1:
>> lfence
>> jmp L1
>> L2:
>> lea 8(%rsp), %rsp|lea 4(%esp), %esp
>> ret
>>
>> Indirect jmp via memory, "jmp mem", is converted to
>>
>> push memory
>> jmp __x86_indirect_thunk
>>
>> Indirect call via memory, "call mem", is converted to
>>
>> jmp L2
>> L1:
>> push [mem]
>> jmp __x86_indirect_thunk
>> L2:
>> call L1
>>
>> Register thunk where the function address is in a register, reg:
>>
>> __x86_indirect_thunk_reg:
>> call L2
>> L1:
>> lfence
>> jmp L1
>> L2:
>> movq %reg, (%rsp)|movl %reg, (%esp)
>> ret
>>
>> where reg is one of (r|e)ax, (r|e)dx, (r|e)cx, (r|e)bx, (r|e)si, (r|e)di,
>> (r|e)bp, r8, r9, r10, r11, r12, r13, r14 and r15.
>>
>> Indirect jmp via register, "jmp reg", is converted to
>>
>> jmp __x86_indirect_thunk_reg
>>
>> Indirect call via register, "call reg", is converted to
>>
>> call __x86_indirect_thunk_reg
>>
>> gcc/
>>
>> * config/i386/i386-opts.h (indirect_branch): New.
>> * config/i386/i386-protos.h (ix86_output_indirect_jmp): Likewise.
>> * config/i386/i386.c (ix86_using_red_zone): Disallow red-zone
>> with local indirect jump when converting indirect call and jump.
>> (ix86_set_indirect_branch_type): New.
>> (ix86_set_current_function): Call ix86_set_indirect_branch_type.
>> (indirectlabelno): New.
>> (indirect_thunk_needed): Likewise.
>> (indirect_thunk_bnd_needed): Likewise.
>> (indirect_thunks_used): Likewise.
>> (indirect_thunks_bnd_used): Likewise.
>> (INDIRECT_LABEL): Likewise.
>> (indirect_thunk_name): Likewise.
>> (output_indirect_thunk): Likewise.
>> (output_indirect_thunk_function): Likewise.
>> (ix86_output_indirect_branch): Likewise.
>> (ix86_output_indirect_jmp): Likewise.
>> (ix86_code_end): Call output_indirect_thunk_function if needed.
>> (ix86_output_call_insn): Call ix86_output_indirect_branch if
>> needed.
>> (ix86_handle_fndecl_attribute): Handle indirect_branch.
>> (ix86_attribute_table): Add indirect_branch.
>> * config/i386/i386.h (machine_function): Add indirect_branch_type
>> and has_local_indirect_jump.
>> * config/i386/i386.md (indirect_jump): Set has_local_indirect_jump
>> to true.
>> (tablejump): Likewise.
>> (*indirect_jump): Use ix86_output_indirect_jmp.
>> (*tablejump_1): Likewise.
>> (simple_return_indirect_internal): Likewise.
>> * config/i386/i386.opt (mindirect-branch=): New option.
>> (indirect_branch): New.
>> (keep): Likewise.
>> (thunk): Likewise.
>> (thunk-inline): Likewise.
>> (thunk-extern): Likewise.
>> * doc/extend.texi: Document indirect_branch function attribute.
>> * doc/invoke.texi: Document -mindirect-branch= option.
> Note I'm expecting Uros to chime in. So please do not consider this
> ack'd until you hear from Uros.
>
> At a high level is there really that much value in having thunks in the
> object file? Why not put the full set of thunks into libgcc and just
> allow selection between inline sequences and external thunks
> (thunk-inline and thunk-external)? It's not a huge simplification, but
> if there isn't a compelling reason, let's drop the in-object-file thunks.
I prefer to leave it in the object file just in case that
-mindirect-branch-loop=
is needed in the future.
>> +
>> +/* Fills in the label name that should be used for the indirect thunk. */
>> +
>> +static void
>> +indirect_thunk_name (char name[32], int regno, bool need_bnd_p)
> Please document each argument in the function's comment.
Will do.
>
>
>> +
>> +static void
>> +output_indirect_thunk (bool need_bnd_p, int regno)
> Needs a function comment.
Will do.
>
>
>> +
>> +static void
>> +output_indirect_thunk_function (bool need_bnd_p, int regno)
> Needs a function comment.
>
Will do.
>
>> @@ -28119,12 +28357,182 @@ ix86_nopic_noplt_attribute_p (rtx call_op)
>> return false;
>> }
>>
>> +static void
>> +ix86_output_indirect_branch (rtx call_op, const char *xasm,
>> + bool sibcall_p)
> Needs a function comment.
>
Will do.
> I'd probably break this into a few smaller functions. It's a lot of
> inlined code.
>
That function has 142 lines. Unless there is a compelling need,
I prefer to leave it ASIS.
>
>
>
>
>
>> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>> index 3f587806407..a7573c468ae 100644
>> --- a/gcc/config/i386/i386.md
>> +++ b/gcc/config/i386/i386.md
>> @@ -12313,12 +12313,13 @@
>> {
>> if (TARGET_X32)
>> operands[0] = convert_memory_address (word_mode, operands[0]);
>> + cfun->machine->has_local_indirect_jump = true;
> Note this is not ideal in that it's set at expansion time and thus would
> not be accurate if the RTL optimizers were able to simply things enough
> such that the indirect jump has a known target.
>
> But I wouldn't expect that to happen much in the RTL optimizers much as
> the gimple optimizers are likely much better at doing that kind of
> thing. So I won't object to doing things this way as long as they
> gracefully handle this case.
>
>
>> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
>> index 09aaa97c2fc..22c806206e4 100644
>> --- a/gcc/config/i386/i386.opt
>> +++ b/gcc/config/i386/i386.opt
>> @@ -1021,3 +1021,23 @@ indirect jump.
>> mforce-indirect-call
>> Target Report Var(flag_force_indirect_call) Init(0)
>> Make all function calls indirect.
>> +
>> +mindirect-branch=
>> +Target Report RejectNegative Joined Enum(indirect_branch)
>> Var(ix86_indirect_branch) Init(indirect_branch_keep)
>> +Convert indirect call and jump.
> Convert to what? I realize there's an enum of the choices below, but
> this doesn't read well.
I will update.
> Do you want to mention that CET and retpolines are inherently
I will document it.
> incompatible? Should an attempt to use them together generate a
> compile-time error?
>
Compile-time error sounds a good idea.
Thanks.
--
H.J.