On Fri, Jan 08, 2016 at 02:36:32PM +0900, AKASHI Takahiro wrote:
> On 01/07/2016 11:56 PM, Richard Earnshaw (lists) wrote:
> >On 07/01/16 14:22, Will Deacon wrote:
> >>On Thu, Dec 24, 2015 at 04:57:54PM +0900, AKASHI Takahiro wrote:
> >>>So I'd like to introduce a function prologue analyzer to determine
> >>>a size allocated by a function's prologue and deduce it from "Depth".
> >>>My implementation of this analyzer has been submitted to
> >>>linux-arm-kernel mailing list[1].
> >>>I borrowed some ideas from gdb's analyzer[2], especially a loop of
> >>>instruction decoding as well as stop of decoding at exiting a basic block,
> >>>but implemented my own simplified one because gdb version seems to do
> >>>a bit more than what we expect here.
> >>>Anyhow, since it is somewhat heuristic (and may not be maintainable for
> >>>a long term), could you review it from a broader viewpoint of toolchain,
> >>>please?
> >>>
> >>My main issue with this is that we cannot rely on the frame layout
> >>generated by the compiler and there's little point in asking for
> >>commitment here. Therefore, the heuristics will need updating as and
> >>when we identify new frames that we can't handle. That's pretty fragile
> >>and puts us on the back foot when faced with newer compilers. This might
> >>be sustainable if we don't expect to encounter much variation, but even
> >>that would require some sort of "buy-in" from the various toolchain
> >>communities.
> >>
> >>GCC already has an option (-fstack-usage) to determine the stack usage
> >>on a per-function basis and produce a report at build time. Why can't
> >>we use that to provide the information we need, rather than attempt to
> >>compute it at runtime based on your analyser?
> >>
> >>If -fstack-usage is not sufficient, understanding why might allow us to
> >>propose a better option.
> >
> >Can you not use the dwarf frame unwind data? That's always sufficient
> >to recover the CFA (canonical frame address - the value in SP when
> >executing the first instruction in a function). It seems to me it's
> >unlikely you're going to need something that's an exceedingly high
> >performance operation.
>
> Thank you for your comment.
> Yeah, but we need some utility routines to handle unwind data(.debug_frame).
> In fact, some guy has already attempted to merge (part of) libunwind into
> the kernel[1], but it was rejected by the kernel community (including Linus
> if I correctly remember). It seems that they thought the code was still buggy.

The ARC guys seem to have sneaked something in for their architecture:

  http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arc/kernel/unwind.c

so it might not be impossible if we don't require all the bells and
whistles of libunwind.

> That is one of reasons that I wanted to implement my own analyzer.

I still don't understand why you can't use fstack-usage. Can you please
tell me why that doesn't work? Am I missing something?

Will
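
For reference, here is a minimal sketch of how the build-time report from
-fstack-usage could be consumed. It assumes the .su layout described in the
GCC manual: one line per function, with three tab-separated fields (function
identifier, size in bytes, qualifiers). The names parse_su, StackUsage and
has_compile_time_bound, and the sample data, are hypothetical and exist only
for illustration; nothing here comes from the patches discussed in this
thread.

```python
from typing import List, NamedTuple, Tuple


class StackUsage(NamedTuple):
    function: str                 # identifier field, kept as GCC wrote it
    size: int                     # stack bytes reported for the function
    qualifiers: Tuple[str, ...]   # e.g. ("static",) or ("dynamic", "bounded")


def parse_su(text: str) -> List[StackUsage]:
    """Parse the contents of a .su file produced by gcc -fstack-usage."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split from the right so an identifier containing odd characters
        # (e.g. a C++ signature) stays in one piece.
        ident, size, quals = line.rsplit("\t", 2)
        entries.append(
            StackUsage(ident, int(size), tuple(quals.strip().split(",")))
        )
    return entries


def has_compile_time_bound(entry: StackUsage) -> bool:
    """True if the reported size is the whole frame or an upper bound on it.

    A function marked "dynamic" without "bounded" adjusts the stack by an
    amount unknown at compile time, so its size field covers only the
    static part of the frame.
    """
    return "static" in entry.qualifiers or "bounded" in entry.qualifiers


# Hypothetical sample data, not taken from a real kernel build.
SAMPLE = (
    "foo.c:3:5:leaf\t16\tstatic\n"
    "foo.c:10:5:uses_alloca\t48\tdynamic\n"
    "foo.c:20:5:uses_vla\t64\tdynamic,bounded\n"
)

if __name__ == "__main__":
    for e in parse_su(SAMPLE):
        note = "" if has_compile_time_bound(e) else " (static part only)"
        print(f"{e.function}: {e.size} bytes{note}")
```

The plain "dynamic" case is where a build-time report alone cannot give the
full frame size, which may be relevant to the question of whether
-fstack-usage is sufficient.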