https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84039

Bug ID: 84039
Summary: x86 retpolines and CFI
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: fw at gcc dot gnu.org
Target Milestone: ---
Target: x86-64

I tried this:

    struct C {
      virtual ~C();
      virtual void f();
    };

    void
    f (C *p)
    {
      p->f();
      p->f();
    }

with r256939 and -mindirect-branch=thunk -O2 on x86-64 GNU/Linux, and got this:

    _Z1fP1C:
    .LFB0:
            .cfi_startproc
            pushq   %rbx
            .cfi_def_cfa_offset 16
            .cfi_offset 3, -16
            movq    (%rdi), %rax
            movq    %rdi, %rbx
            jmp     .LIND1
    .LIND0:
            pushq   16(%rax)
            jmp     __x86_indirect_thunk
    .LIND1:
            call    .LIND0
            movq    (%rbx), %rax
            movq    %rbx, %rdi
            popq    %rbx
            .cfi_def_cfa_offset 8
            movq    16(%rax), %rax
            jmp     __x86_indirect_thunk_rax
            .cfi_endproc

This doesn't look quite right. x86-64 is supposed to have asynchronous unwind tables by default, but there is nothing that reflects the change in the (relative) frame address after .LIND0 (the pushq there moves %rsp without a matching CFI directive). I think that region really has to be moved outside of the .cfi_startproc/.cfi_endproc bracket.

There is a different issue with the thunk itself.

    __x86_indirect_thunk_rax:
    .LFB2:
            .cfi_startproc
            call    .LIND5
    .LIND4:
            pause
            lfence
            jmp     .LIND4
    .LIND5:
            mov     %rax, (%rsp)
            ret
            .cfi_endproc

If a signal is delivered after the mov has executed, the unwinder will eventually unwind through the signal frame and hit __x86_indirect_thunk_rax. It does not treat it as a signal frame, so the return address on the stack is decremented by one, in an attempt to obtain a program counter value which is within the call instruction. However, in this scenario, the return address is actually the start of the target function, and subtracting one moves the program counter out of the unwind region for that function.

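
The off-by-one lookup described above can be modelled with a small sketch. This is not the libgcc unwinder; `Fde`, `find_fde`, `lookup_caller` and all addresses are hypothetical names and values made up for illustration. The only assumption carried over from the report is that the return address is decremented by one before the lookup unless the frame is a signal frame.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one unwind-table entry: a half-open PC range
// [begin, end) covered by a function's unwind information.
struct Fde {
    std::uintptr_t begin;
    std::uintptr_t end;
    std::string name;
};

// Return the entry covering pc, or nullptr if no entry covers it.
const Fde* find_fde(const std::vector<Fde>& table, std::uintptr_t pc) {
    for (const Fde& f : table)
        if (pc >= f.begin && pc < f.end)
            return &f;
    return nullptr;
}

// Model of the lookup for a caller frame: for an ordinary frame the return
// address is decremented by one so that the PC falls inside the call
// instruction; for a signal frame it is used unchanged.
const Fde* lookup_caller(const std::vector<Fde>& table,
                         std::uintptr_t return_address, bool signal_frame) {
    const std::uintptr_t pc = signal_frame ? return_address : return_address - 1;
    return find_fde(table, pc);
}

void self_check() {
    // Made-up layout with a gap between the two functions.
    const std::vector<Fde> table = {
        {0x1000, 0x1040, "caller"},
        {0x2000, 0x2020, "target"},
    };

    // Ordinary case: the return address points just past a call inside
    // "caller", so return_address - 1 is still inside "caller".
    const Fde* ordinary = lookup_caller(table, 0x1010, false);
    assert(ordinary != nullptr && ordinary->name == "caller");

    // Thunk case: the slot holds the *start* of "target", so subtracting
    // one leaves the range covered by "target" and the lookup fails.
    assert(lookup_caller(table, 0x2000, false) == nullptr);

    // Without the subtraction the same address is found, which shows that
    // the adjustment is what breaks the lookup in this model.
    const Fde* unadjusted = lookup_caller(table, 0x2000, true);
    assert(unadjusted != nullptr && unadjusted->name == "target");
}
```

In a real binary the byte before the target function usually belongs to padding or to an unrelated function, so the lookup may fail outright or pick up the wrong unwind data; the model only shows the first case.
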
It should be possible to fix the thunk function by changing the CFA offset (using “.cfi_def_cfa_offset 16”) before the target address at the top of the stack is overwritten, so that the unwinder never tries to unwind through a function which has not yet started running (which is what causes the off-by-one issue described above).

Both issues are visible in GDB if you set breakpoints in the proper places because the frame information used for debugging is incorrect as well.

Mailing list thread: https://gcc.gnu.org/ml/gcc/2018-01/msg00160.html

This may impact the kernel after all. We have a report of Systemtap not working, which could be related: https://sourceware.org/ml/systemtap/2018-q1/msg00008.html
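
To illustrate the CFA-offset change proposed for the thunk above, here is a toy model of where the unwinder would read the return address. It assumes the usual x86-64 rule that the return address is stored at CFA - 8; the `Stack` type, the function names and the addresses are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stack: element i is the 8-byte word at address rsp + 8*i,
// so element 0 is the top of the stack.
using Stack = std::vector<std::uint64_t>;

// With the rule "CFA = rsp + cfa_offset" and the return address stored at
// CFA - 8, the unwinder reads the word at rsp + cfa_offset - 8.
std::uint64_t unwound_return_address(const Stack& stack,
                                     std::uint64_t cfa_offset) {
    return stack.at((cfa_offset - 8) / 8);
}

void thunk_check() {
    const std::uint64_t caller_ra = 0x1010;  // made-up: return address into the thunk's caller
    const std::uint64_t target = 0x2000;     // made-up: start of the indirect-branch target

    // State after "call .LIND5" and "mov %rax, (%rsp)": the top slot has
    // been overwritten with the target, the next slot still holds the
    // return address of whoever entered the thunk.
    const Stack stack = {target, caller_ra};

    // CFA offset 8 (no CFI update after the call): the unwinder takes the
    // target's start address as a return address.
    assert(unwound_return_address(stack, 8) == target);

    // CFA offset 16 (as proposed): the unwinder skips the overwritten slot
    // and continues in the thunk's caller.
    assert(unwound_return_address(stack, 16) == caller_ra);
}
```

With offset 16 the overwritten slot is never interpreted as a return address, so the function that has not yet started running never appears in the unwound call chain.
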