https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92172

--- Comment #5 from Seth LaForge <sethml at ofb dot net> ---
Richard:
> No it doesn't.  The AAPCS for AArch32 makes no reference to a frame pointer,
> so there is no portable way defined for walking a frame other than by using
> dwarf records or C++ unwinding descriptions.  The latter are preferred, but
> only support unwinding from 'synchronous' unwind points (after the prologue
> and before the epilogue).

...in other words, neither is suitable for generating stack traces in an
embedded context, which is a genuinely useful feature.

You're right, AAPCS does not mention frame pointers. ATPCS does - I'm not sure
if it's still normative. However, it says the thumb frame pointer is any of
r4-r7, and dictates a frame pointer even *higher* on the stack - just above the
saved LR. That's not what any compiler I know of does.

At this point it's entirely an argument from consistency:
- GCC ARM and Clang ARM use R11 for frame pointer, pointing to the stacked R11.
Useful.
- Clang Thumb uses R7 for frame pointer, pointing to the stacked R7. Useful.
- GCC Thumb uses R7 for the frame pointer, pointing to an arbitrary location.
Useless for stack traces.

Stack traces are a genuinely useful thing. Many language runtimes do them
automatically all the time (e.g. Python). Many C/C++ development environments
do them automatically on a crash, either via a debugger or something like
libunwind. Many embedded devices would like to do them on a crash - they often
have very little storage to store debugging information and relay it to some
server, and something like libunwind is just too much for them.

> Compilers are, of course, free to use frame pointers internally, within a
> frame, but there is no frame chain that can be walked.

With clang, there is. With GCC and ARM mode, there is. I'm promoting making
thumb mode work the same as ARM mode, thus making stack tracing possible.

Wilco:
> On GCC10 the codesize overhead of -fno-omit-frame-pointer is 4.1% for Arm and
> 4.8% for Thumb-2 (measured on SPEC2006). That's already a large overhead,
> especially since this feature doesn't do anything useful besides adding
> overhead...

Well, that's basically my point: as implemented, gcc frame pointers are useless
on Thumb. There's no reason to enable them. With a small adjustment to behave
the same as clang they are quite useful: software can create stack traces
easily. Adding a small amount of overhead to a useless feature in order to make
it useful seems like a very worthwhile tradeoff to me.

> The key is that GCC uses the frame pointer for every stack access, and thus
> the placement of the frame pointer within a frame matters.

It does? Why?!? The SP register is a better register to offset from in every
case I can think of that doesn't involve alloca() or variable-length arrays,
which should be rare. Clang, when using frame pointers, uses SP to access local
variables in most cases - compare the implementation of AccessLocal():

https://godbolt.org/z/3o4TlD

void SimpleLeaf(void);  /* declared so the snippet compiles on its own */

int AccessLocal(int a) {
    volatile int b = a;
    SimpleLeaf();
    return b;
}

GCC 8:
        push    {r7, lr}
        sub     sp, sp, #8
        add     r7, sp, #0
        str     r0, [r7, #4]
        ...

Clang 9:
        push    {r7, lr}
        mov     r7, sp
        sub     sp, #8
        str     r0, [sp, #4]
        ...

Same number of instructions, same code size, same performance, but the clang
version has an unwindable/traceable frame pointer.
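
To make "traceable" concrete, here is a rough sketch of the kind of walker
that the Clang prologue above permits. It assumes the layout implied by
"push {r7, lr}; mov r7, sp": the word at r7 holds the caller's r7 and the next
word up holds the saved LR. The function name walk_frames and its parameters
are made up for illustration, and the stack is modeled as a plain array of
32-bit words so the sketch can run on a host rather than on a target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from any compiler or library.
 * Walks a frame chain where [fp] = caller's fp and [fp + 4] = return address.
 * `stack` models target memory starting at address `base`, `words` long.
 * Stores up to `max_out` return addresses in `out`; returns how many. */
size_t walk_frames(const uint32_t *stack, size_t words, uint32_t base,
                   uint32_t fp, uint32_t *out, size_t max_out)
{
    size_t n = 0;

    assert(stack != NULL && out != NULL);

    while (n < max_out) {
        /* Reject frame pointers that are misaligned or below the region. */
        if (fp < base || (fp & 3u) != 0)
            break;

        size_t idx = (fp - base) / 4;
        /* Both the saved fp and the saved lr must lie inside the region. */
        if (idx + 1 >= words)
            break;

        uint32_t next_fp = stack[idx];
        out[n++] = stack[idx + 1];

        /* The stack grows down, so a caller's frame must sit at a strictly
         * higher address. This also stops on a zero terminator and guards
         * against loops in a corrupted chain. */
        if (next_fp <= fp)
            break;
        fp = next_fp;
    }
    return n;
}
```

On a real target the starting fp would be read from r7 in a fault handler and
the bounds would come from the linker script's stack region; both of those
details are assumptions here, not something this sketch provides.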

> Thanks for posting actual numbers, but GCC 4.7?!? It might be time to try
> GCC9...

There are, sadly, compelling historical reasons. We're putting our effort into
moving to clang instead.
