Re: [PATCH] [AArch64] support -mfentry feature for arm64

Szabolcs Nagy Wed, 20 Apr 2016 09:46:44 -0700

On 20/04/16 02:25, AKASHI Takahiro wrote:
> On Tue, Apr 19, 2016 at 09:39:39AM +0300, Alexander Monakov wrote:
>> On Tue, 19 Apr 2016, AKASHI Takahiro wrote:
>>>>> But if Szabolcs' two-instruction 
>>>>> sequence in the adjacent subthread is sufficient, this is moot.
>>>>
>>>> .  It can also be solved by having just one NOP after the function label, 
>>>> and a number of them before, then no thread can be in the nop pad.  That 
>>>> seems to indicate that GCC should not try to be too clever and simply 
>>>> leave the specified number of nops before and after the function label, 
>>>> leaving safety measures to the patching infrastructure.
>>>
>>> I don't get this idea very well.
>>> How can the instructions *before* a function label be executed
>>> after branching into this function?
>>
>> The single nop after the function label is changed to a short backwards 
>> branch
>> to the instructions just before the function label.
>>
>> As a result, the last instruction in the pad would have to become a short
>> forward branch jumping over the backwards branch described above, to the 
>> first
>> real instruction of the function.
> 
> So you mean something like:
> 1:
>   str x30, [sp, #-8]!
>   bl _tracefunc
>   ldr x30, [sp], #8
>   b 2f
> .global <function label>
>   b 1b
> 2:
>   <function prologue/body>
>   ...
> (We will not have to use x9 or else to preserve x30 here.)
> 
> Interesting.
> Livepatch code in the kernel has an assumption that the address of
> "bl _tracefunc" be equal to <function label>, but a recent patch for
> power pc to support livepatch tries to ease this restriction [1],
> and so hopefully it won't be an issue.
> (I will have to dig into the kernel code to be sure that there is
> no other issues though.)
>


i think ldr x30,[sp],#8 after the _tracefunc is not ok for
livepatching, since _tracefunc will change the return
address to the new function to hijack the call, which will
not restore the stack (this can be solved if the new
function can be instrumented, but fiddly).
and sp has to be 16 byte aligned, so the options are

  str x30,[sp,#-16]!
  bl _tracefunc

or

  mov x9,x30
  bl _tracefunc

where _tracefunc is responsible for restoring x30 and
sp, and this sequence can come before or after the
function symbol.

if it's before then

1:
  <save x30>
  bl _tracefunc
  b 2f
func:
  b 1b
2:
  <prologue>

the trace disabled case is better (only one nop), but i
think it would mean more kernel work (the current
code assumes bl _tracefunc is nopped, so whenever
tracing is enabled a different tracefunc target may be
used in the atomic update, i don't know if this is
necessary though).

it is probably only worth inventing something new for
aarch64 in gcc if the kernel can use that consistently
across targets or if that can cover other significant
use cases, but it's not clear if the various flexible nop
padding solutions can be more useful than the simple
two instruction sequence which kernel tools can already
deal with.

so it seems to me that

func:
  mov x9, x30
  bl __fentry__
  <prologue>

is still the best option with a new -mfentry option for
aarch64 (then we can keep the default -pg behaviour
for backward compatibility and work similarly to x86
with -mfentry) it does not solve the more general
instrumentation problem, but that would require more
analysis.

(on x86, gcc also provides -mrecord-mcount and
-mnop-mcount to record the noped out mcount call
sites,  but the kernel seems to use its own tool
to do that by looking for the mcount/fentry call
relocs so they are probably not needed).

> Thanks,
> -Takahiro AKASHI
> 
> [1] http://lkml.iu.edu//hypermail/linux/kernel/1604.1/04111.html and
>     http://lkml.iu.edu//hypermail/linux/kernel/1604.1/04112.html
> 
>> Alexander
>

Re: [PATCH] [AArch64] support -mfentry feature for arm64

Reply via email to