https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63718
--- Comment #1 from Joey Ye <joey.ye at arm dot com> ---
It is challenging to reduce this to a small test case, as inlining affects
optimization behavior. I will try to describe the problem as clearly as possible.
Problematic generated code:
mov r0, r10
mov r1, r3
mov r2, r8
str r3, [sp, #8]
bl _ZL17bmp_iter_set_init
mov r3, r9
ldr r4, [r0, #12] <--- r0 is used after the call, which clobbers it implicitly
_ZL17bmp_iter_set_init:
...
pop {r4}
pop {r0} <--- clobbers r0; this is only implicit from the RTL point of view
bx r0
Prototype of bmp_iter_set_init:
static inline void <--- return void
bmp_iter_set_init (bitmap_iterator *bi, const_bitmap map,
unsigned start_bit, unsigned *bit_no)
1. thumb1 (arch=armv4t) sometimes clobbers r0-r3 on return, with logic in
arm.c:thumb_exit
2. The behavior in 1 is implicit to the RTL. A typical thumb1 return RTL is
(jump_insn (unspec_volatile [
(return)
] VUNSPEC_EPILOGUE))
3. Other RTL in the caller does not modify all of r0-r3
4. After 216365, copyprop_hardreg_forward_1 believes r0-r3 are not clobbered
during the call. Bang!
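To illustrate points 2-4, here is a minimal C sketch of the failure mode. It is
a toy model, not GCC's actual implementation: the struct, the function names and
both clobber masks are hypothetical, and the "visible" mask (r1-r3) is only
assumed for illustration. The model tracks which hard register is known to be a
copy of another, forgets that knowledge for every register a call is believed
to clobber, and shows that the r0 = r10 copy wrongly survives the call when the
epilogue's pop into r0 is missing from the clobber set.

```c
#include <assert.h>
#include <stdbool.h>

enum { NUM_REGS = 16, NO_COPY = -1 };

/* Hypothetical clobber masks, one bit per hard register.
   CLOBBER_VISIBLE: what the callee's RTL appears to write (assumed: r1-r3).
   CLOBBER_ACTUAL:  the same set plus r0, which the epilogue pops into. */
enum {
    CLOBBER_VISIBLE = (1 << 1) | (1 << 2) | (1 << 3),
    CLOBBER_ACTUAL  = CLOBBER_VISIBLE | (1 << 0)
};

/* copy_of[r] == s means "r is known to hold the same value as s". */
struct copy_state {
    int copy_of[NUM_REGS];
};

/* Forget everything known about `reg`, in both directions. */
static void kill_reg(struct copy_state *st, int reg)
{
    assert(reg >= 0 && reg < NUM_REGS);
    st->copy_of[reg] = NO_COPY;
    for (int r = 0; r < NUM_REGS; r++)
        if (st->copy_of[r] == reg)
            st->copy_of[r] = NO_COPY;
}

/* Model `mov dst, src`: dst gets a new value, then becomes a copy of src. */
static void record_copy(struct copy_state *st, int dst, int src)
{
    assert(src >= 0 && src < NUM_REGS);
    kill_reg(st, dst);
    st->copy_of[dst] = src;
}

/* Model a call: kill every register whose bit is set in clobber_mask. */
static void apply_call(struct copy_state *st, unsigned clobber_mask)
{
    for (int r = 0; r < NUM_REGS; r++)
        if (clobber_mask & (1u << r))
            kill_reg(st, r);
}

/* Replay `mov r0, r10; bl callee` and report whether the model still
   believes r0 == r10 afterwards. */
static bool r0_copy_survives_call(unsigned clobber_mask)
{
    struct copy_state st;
    for (int r = 0; r < NUM_REGS; r++)
        st.copy_of[r] = NO_COPY;
    record_copy(&st, 0, 10);       /* mov r0, r10 */
    apply_call(&st, clobber_mask); /* bl callee   */
    return st.copy_of[0] == 10;
}
```

r0_copy_survives_call(CLOBBER_VISIBLE) returns true, so a pass relying on that
mask would feel free to rewrite a later use of r10 as r0.
r0_copy_survives_call(CLOBBER_ACTUAL) returns false, which blocks the rewrite.
This matches the shape of the bad ldr above, assuming it originally addressed
through r10.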
Attached are reduced RTL dumps of cprop_hardreg and the previous pass.