https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92172
--- Comment #5 from Seth LaForge <sethml at ofb dot net> ---
Richard:
> No it doesn't.  The AAPCS for AArch32 makes no reference to a frame pointer,
> so there is no portable way defined for walking a frame other than by using
> dwarf records or C++ unwinding descriptions.  The latter are preferred, but
> only support unwinding from 'synchronous' unwind points (after the prologue
> and before the epilogue).

...in other words, neither is suitable for generating stack traces in an embedded context, which is a genuinely useful feature.

You're right, AAPCS does not mention frame pointers. ATPCS does - I'm not sure if it's still normative. However, it says the thumb frame pointer is any of r4-r7, and dictates a frame pointer even *higher* on the stack - just above the saved LR. That's not what any compiler I know of does.

At this point it's entirely an argument from consistency:
- GCC ARM and Clang ARM use R11 for frame pointer, pointing to the stacked R11. Useful.
- Clang Thumb uses R7 for frame pointer, pointing to the stacked R7. Useful.
- GCC Thumb uses R7 for the frame pointer, pointing to an arbitrary location. Useless for stack traces.

Stack traces are a genuinely useful thing. Many language runtimes do them automatically all the time (e.g. Python). Many C/C++ development environments do them automatically on a crash, either via a debugger or something like libunwind. Many embedded devices would like to do them on a crash - they often have very little storage to store debugging information and relay it to some server, and something like libunwind is just too much for them.

> Compilers are, of course, free to use frame pointers internally, within a
> frame, but there is no frame chain that can be walked.

With clang, there is. With GCC and ARM mode, there is. I'm promoting making thumb mode work the same as ARM mode, thus making stack tracing possible.
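To make "walkable frame chain" concrete, here is a minimal sketch of the kind of walker an embedded crash handler could use. It assumes the layout produced by the Clang prologue `push {r7, lr}; mov r7, sp`: the frame pointer points at the stacked caller frame pointer, with the stacked LR in the next word. The names `frame_record`, `walk_frames` and `demo_walk` are hypothetical, and the demo runs against a fake stack in host memory rather than a real Thumb stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed frame record: on 32-bit ARM this is the two words written by
   "push {r7, lr}", with the frame pointer aimed at the first of them. */
struct frame_record {
    const struct frame_record *caller_fp; /* stacked r7 (r11 in ARM mode) */
    uintptr_t return_addr;                /* stacked lr */
};

/* Hypothetical helper: follow the chain from fp and store each return
   address in out[]. The walk stops at max_depth, at a pointer outside
   [stack_lo, stack_hi), at a misaligned pointer, or when the chain fails
   to move toward higher addresses (the stack grows down, so every caller
   frame must sit above its callee). */
size_t walk_frames(const struct frame_record *fp,
                   uintptr_t stack_lo, uintptr_t stack_hi,
                   uintptr_t *out, size_t max_depth)
{
    size_t depth = 0;
    while (depth < max_depth) {
        uintptr_t addr = (uintptr_t)fp;
        if (addr < stack_lo || addr >= stack_hi ||
            stack_hi - addr < sizeof *fp)
            break;                      /* record not fully inside the stack */
        if (addr % sizeof(uintptr_t) != 0)
            break;                      /* misaligned, likely corrupt */
        out[depth++] = fp->return_addr;
        if ((uintptr_t)fp->caller_fp <= addr)
            break;                      /* null or non-increasing: end of chain */
        fp = fp->caller_fp;
    }
    return depth;
}

/* Host-side demonstration: three fake frames, innermost at the lowest
   address. The return addresses are made-up values. */
size_t demo_walk(uintptr_t *out, size_t max_depth)
{
    static struct frame_record stack[3];
    stack[0].caller_fp = &stack[1]; stack[0].return_addr = 0x1001;
    stack[1].caller_fp = &stack[2]; stack[1].return_addr = 0x2001;
    stack[2].caller_fp = NULL;      stack[2].return_addr = 0x3001;

    size_t n = walk_frames(&stack[0], (uintptr_t)&stack[0],
                           (uintptr_t)&stack[3], out, max_depth);
    assert(n <= max_depth);             /* walker never overruns out[] */
    return n;
}
```

On a real target the starting pointer would presumably come from the frame pointer register (for example via `__builtin_frame_address(0)`), and the stack bounds from the linker script or RTOS. The bounds and monotonicity checks are there because a crash handler may be walking a corrupted stack. None of this works if the frame pointer points at an arbitrary location within the frame, which is the GCC Thumb behaviour described above.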

Wilco:
> On GCC10 the codesize overhead of -fno-omit-frame-pointer is 4.1% for Arm and
> 4.8% for Thumb-2 (measured on SPEC2006). That's already a large overhead,
> especially since this feature doesn't do anything useful besides adding
> overhead...

Well, that's basically my point: as implemented, gcc frame pointers are useless on Thumb. There's no reason to enable them. With a small adjustment to behave the same as clang they are quite useful: software can create stack traces easily. Adding a small amount of overhead to a useless feature in order to make it useful seems like a very worthwhile tradeoff to me.

> The key is that GCC uses the frame pointer for every stack access, and thus
> the placement of the frame pointer within a frame matters.

It does? Why?!? The SP register is a better register to offset from in every case I can think of that doesn't involve alloca() or variable-size arrays, which should be rare. Clang, when using frame pointers, uses SP to access local variables in most cases - compare the implementation of AccessLocal():
https://godbolt.org/z/3o4TlD

    int AccessLocal(int a) {
        volatile int b = a;
        SimpleLeaf();
        return b;
    }

GCC 8:
    push {r7, lr}
    sub  sp, sp, #8
    add  r7, sp, #0
    str  r0, [r7, #4]
    ...

Clang 9:
    push {r7, lr}
    mov  r7, sp
    sub  sp, #8
    str  r0, [sp, #4]
    ...

Same number of instructions, same code size, same performance, but the clang version has an unwindable/traceable frame pointer.

> Thanks for posting actual numbers, but GCC 4.7?!? It might be time to try
> GCC9...

There are, sadly, compelling historical reasons. We're putting our effort into moving to clang instead.