On 14/04/16 14:15, Andrew Pinski wrote:
> On Thu, Apr 14, 2016 at 9:08 PM, Maxim Kuvyrkov
> <maxim.kuvyr...@linaro.org> wrote:
>> On Mar 14, 2016, at 11:14 AM, Li Bin <huawei.li...@huawei.com> wrote:
>>>
>>> As ARM64 enters the enterprise world, machines in some critical
>>> production environments cannot be stopped, so live patching, one of
>>> the RAS features, is becoming increasingly important for the ARM64 arch.
>>>
>>> Now, the mainstream live patch implementation which has been merged in
>>> Linux kernel (x86/s390) is based on the 'ftrace with regs' feature, and
>>> this feature needs the help of gcc.
>>>
>>> This patch proposes a generic solution for arm64 gcc called -mfentry,
>>> following the example of x86, mips, s390, etc. On these archs, this
>>> feature has been used to implement the ftrace feature 'ftrace with regs'
>>> to support live patch.
>>>
>>> There is also another solution from linaro [1], which proposes to
>>> implement a new option -fprolog-pad=N that generates a pad of N nops at the
>>> beginning of each function. This solution is an arch-independent way for gcc,
>>> but there may be some not yet recognized limitations for the Linux
>>> kernel in adapting to this solution, besides the discussion in [2]
>>
>> It appears that implementing -fprolog-pad=N option in GCC will not enable 
>> kernel live-patching support for AArch64.  The proposal for the option was 
>> to make GCC output a given number of NOPs at the beginning of each function, 
>> and then the kernel could use that NOP pad to insert whatever instructions 
>> it needs.  The modification of kernel instruction stream needs to be done 
>> atomically, and, unfortunately, it seems the kernel can use only 
>> architecture-provided atomicity primitives -- i.e., changing at most 8 bytes 
>> at a time.
>>
> 
> Can't we add a 16-byte atomic primitive for ARM64 to the kernel?
> Though you would need to align all functions to a 16-byte boundary if
> -fprolog-pad=N is to be used.  Do you know the size that needs
> to be modified?  It does seem to be either 12 or 16 bytes.
> 

Looking at [2], I don't see why

func:
  mov x9, x30
  bl _tracefunc
  <function body>

is not good for the kernel.

mov x9, x30 is effectively a nop at function entry, so in
theory a 4-byte atomic write should be enough
to enable/disable tracing.
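
A minimal C sketch of that idea, under stated assumptions: it works on an
in-memory model of the two-instruction entry sequence, not on live kernel
text. The helper names (a64_encode_bl, set_tracing) and the addresses are
hypothetical, and the sketch assumes the disabled state replaces the bl with
a nop. Real patching would also need instruction-cache maintenance, which is
not modeled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed A64 encodings. */
#define A64_NOP        0xd503201fu /* nop */
#define A64_MOV_X9_X30 0xaa1e03e9u /* mov x9, x30 (alias of orr x9, xzr, x30) */

/* Encode "bl target" for a bl located at byte address pc.
 * The offset is a signed 26-bit count of 4-byte words (+/-128 MiB). */
static uint32_t a64_encode_bl(uint64_t pc, uint64_t target)
{
    int64_t offset = (int64_t)(target - pc);

    assert((offset & 3) == 0);
    assert(offset >= -(INT64_C(1) << 27) && offset < (INT64_C(1) << 27));
    return 0x94000000u | (uint32_t)(((uint64_t)offset >> 2) & 0x03ffffffu);
}

/* Hypothetical helper: toggle tracing for one function.
 * entry[0] holds "mov x9, x30" and is never rewritten; only the second
 * slot changes, with a single aligned 32-bit store. */
static void set_tracing(_Atomic uint32_t *entry, uint64_t entry_pc,
                        uint64_t tracefunc_pc, bool enable)
{
    uint32_t insn = enable ? a64_encode_bl(entry_pc + 4, tracefunc_pc)
                           : A64_NOP;

    atomic_store_explicit(&entry[1], insn, memory_order_release);
}
```

Because only one naturally aligned 32-bit word changes, a concurrent reader
of this model sees either the old or the new instruction, never a mix.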

>> From the kernel discussion thread it appears that the pad needs to be more 
>> than 8 bytes, and that the kernel can't update that atomically.  However if 
>> -mfentry approach is used, then we need to update only 4 (or 8) bytes of the 
>> pad, and we avoid the atomicity problem.
> 
> I think you are incorrect; you could add a 16-byte atomic primitive if needed.
> 
>>
>> Therefore, [unless there is a clever multi-stage update process to 
>> atomically change NOPs to whatever we need,] I think we have to go with Li's 
>> -mfentry approach.
> 
> Please consider the above: if 16-byte (128-bit) atomic
> instructions were available, would that be enough?
> 
> Thanks,
> Andrew
> 
>>
>> Comments?
>>
>> --
>> Maxim Kuvyrkov
>> www.linaro.org
>>
>>
>>> , typically
>>> for powerpc archs. Furthermore, I think there are no good reasons to push
>>> the other archs (such as x86) which have implemented the 'ftrace with regs'
>>> feature to replace their current method with the new option, which may
>>> require heavily target-dependent code adaptation. As a result it becomes an
>>> arm64-dedicated solution, leaving the kernel with two different forms of
>>> implementation.
>>>
>>> [1] https://gcc.gnu.org/ml/gcc/2015-10/msg00090.html
>>> [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401854.html
>>
> 
