On Mon, Apr 18, 2016 at 06:54:33PM +0300, Alexander Monakov wrote:
> On Mon, 18 Apr 2016, Szabolcs Nagy wrote:
> > On 18/04/16 14:26, Alexander Monakov wrote:
> > > On Thu, 14 Apr 2016, Szabolcs Nagy wrote:
> > >> looking at [2] i don't see why
> > >>
> > >> func:
> > >>   mov x9, x30
> > >>   bl _tracefunc
> > >>   <function body>
> > >>
> > >> is not good for the kernel.
> > >>
> > >> mov x9, x30 is a nop at function entry, so in
> > >> theory 4 byte atomic write should be enough
> > >> to enable/disable tracing.
> > >
> > > Overwriting x9 can be problematic because GCC has gained the ability to
> > > track register usage interprocedurally: if foo() calls bar(), and GCC has
> > > already emitted code for bar() and knows that it cannot change x9, it can
> > > use that knowledge to avoid saving/restoring x9 in foo() around calls to
> > > bar(). See option '-fipa-ra'.
> > >
> > > If there's no register that can be safely used in place of x9 here, then
> > > the backend should emit the entry/pad appropriately (e.g. with an unspec
> > > that clobbers the possibly-overwritten register).
> > >
> >
> > (1) nop padded function can be assumed to clobber all temp regs
>
> This may be undesirable if the nop pad is expected to be left untouched
> most of the time, because it would penalize the common case. If only
> sufficiently complex functions (e.g. making other calls anyway) are expected
> to be padded, it's moot.

Almost all of the "C" functions in the kernel will be compiled with
-mfentry, and later on, we can dynamically turn tracing on and off
per-function.

> > (2) or _tracefunc must save/restore all temp regs, not just arg regs.
>
> This doesn't work: when _tracefunc starts executing, old value of x9 is
> already unrecoverable.

Yeah.
We may, instead, be able to preserve the LR value on the stack,
but obviously with a performance penalty.

I wondered whether we could partially stop "instruction scheduling"
and always generate a fixed sequence of instructions like
    save x29, x30, [sp, #-XX]!
    mov x29, x30
    bl _mcount
    <function body>
but Maxim said no :)

Thanks,
-Takahiro AKASHI

> > on x86_64, glibc and linux _mcount and __fentry__ don't
> > save %r11 (temp reg), only the arg regs, so i think nop
> > padding should behave the same way (1).
>
> That makes sense (modulo what I said above about penalizing tiny functions).
>
> Heh, I started wondering if on x86 this is handled correctly when the calls
> are nopped out, and it turns out -pg disables -fipa-ra (in toplev.c)! :)
>
> Alexander