When generating relatively trivial leaf functions that contain local branch instructions (but do not call other functions), the compiler generates unnecessary PUSH/POP instructions to save LR on the stack.
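For illustration, here is a minimal sketch of the kind of function affected. It is not the attached test.c: the name wcstrlen_sketch and the 16-bit character type are assumptions made for this example. The function has a loop (a local branch) but makes no calls, so LR is never overwritten and should not need to be saved.

```c
#include <stddef.h>

/* Hypothetical leaf function, not taken from the attached test.c.
   It contains a local loop branch but calls no other function,
   so the return address in LR stays intact for the whole body. */
size_t wcstrlen_sketch(const unsigned short *s)
{
    const unsigned short *p = s;

    while (*p != 0)   /* local branch only, no function call */
        p++;

    return (size_t)(p - s);   /* number of characters before the terminator */
}
```

Compiling a function like this with the command line given in this report should show whether the prologue and epilogue still save and restore LR.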
Richard Earnshaw [richard.earns...@arm.com] has confirmed that this is a bug and requested that I raise it in bugzilla.

In the attached example (test.c), it can be seen in the generated assembly file (test-prologue-thumb.s) that all of the wcstrlenN() functions have an unnecessary 'push {lr}' / 'pop {pc}' pair, which could be replaced by just doing 'bx lr' in the epilog.

Command line was:

arm-none-eabi-gcc -mthumb -mno-thumb-interwork test.c -Os -S -o test-prologue-thumb.s

--
Summary: GCC/THUMB generates sub-optimal prolog/epilog
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: daniel dot sherwood at sepura dot com
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-mingw32
GCC target triplet: arm-none-eabi

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38570