https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84039
Bug ID: 84039
Summary: x86 retpolines and CFI
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: fw at gcc dot gnu.org
Target Milestone: ---
Target: x86-64
I tried this:
struct C {
  virtual ~C();
  virtual void f();
};
void
f (C *p)
{
  p->f();
  p->f();
}
with r256939 and -mindirect-branch=thunk -O2 on x86-64 GNU/Linux, and got this:
_Z1fP1C:
.LFB0:
        .cfi_startproc
        pushq %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        movq (%rdi), %rax
        movq %rdi, %rbx
        jmp .LIND1
.LIND0:
        pushq 16(%rax)
        jmp __x86_indirect_thunk
.LIND1:
        call .LIND0
        movq (%rbx), %rax
        movq %rbx, %rdi
        popq %rbx
        .cfi_def_cfa_offset 8
        movq 16(%rax), %rax
        jmp __x86_indirect_thunk_rax
        .cfi_endproc
This doesn't look quite right. x86-64 is supposed to have asynchronous unwind
tables by default, but there is nothing that reflects the change in the
(relative) frame address after .LIND0. I think that region really has to be
moved outside of the .cfi_startproc/.cfi_endproc bracket.
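To make the mismatch concrete, here is a small C++ sketch that replays the listing above in execution order and compares the CFA offset declared by the emitted directives with the offset implied by the pushes. The Step type, the kTrace table and the helper names are hypothetical and only model the listing; they are not part of GCC or of any unwinder.

```cpp
#include <cassert>   // assert, for run-time checks by callers
#include <cstddef>

// Hypothetical model, not GCC code: one row per instruction of _Z1fP1C,
// listed in execution order up to the first jump into the thunk.
struct Step {
    const char* insn;    // instruction about to execute
    int slots_pushed;    // 8-byte stack slots pushed since function entry
    int declared_cfa;    // CFA offset from %rsp according to the emitted CFI
};

// At entry to an x86-64 function only the return address is on the stack,
// so CFA = %rsp + 8.
constexpr int kEntryCfa = 8;

constexpr int actual_cfa(const Step& s) {
    return kEntryCfa + 8 * s.slots_pushed;
}

constexpr Step kTrace[] = {
    {"pushq %rbx",               0,  8},
    {"movq (%rdi), %rax",        1, 16},
    {"movq %rdi, %rbx",          1, 16},
    {"jmp .LIND1",               1, 16},
    {"call .LIND0",              1, 16},
    {"pushq 16(%rax)",           2, 16},  // .LIND0: the call pushed a return address
    {"jmp __x86_indirect_thunk", 3, 16},  // the branch target was pushed as well
};
constexpr std::size_t kTraceLen = sizeof(kTrace) / sizeof(kTrace[0]);

// Number of rows whose declared CFA offset disagrees with the real one.
constexpr int count_mismatches(const Step* steps, std::size_t n) {
    return n == 0 ? 0
                  : (actual_cfa(steps[0]) != steps[0].declared_cfa ? 1 : 0)
                        + count_mismatches(steps + 1, n - 1);
}

// Both instructions in the .LIND0 region run with a stale CFA offset.
static_assert(count_mismatches(kTrace, kTraceLen) == 2,
              "the .LIND0 region is not covered by correct CFI");
```

In this model the two instructions after .LIND0 execute with real CFA offsets of 24 and 32, while the unwind table still claims 16.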
There is a different issue with the thunk itself.
__x86_indirect_thunk_rax:
.LFB2:
        .cfi_startproc
        call .LIND5
.LIND4:
        pause
        lfence
        jmp .LIND4
.LIND5:
        mov %rax, (%rsp)
        ret
        .cfi_endproc
If a signal is delivered after the mov has executed, the unwinder will
eventually unwind through the signal frame and hit __x86_indirect_thunk_rax.
It does not treat it as a signal frame, so the return address on the stack is
decremented by one, in an attempt to obtain a program counter value which is
within the call instruction. However, in this scenario, the return address is
actually the start of the function, and subtracting one moves the program
counter out of the unwind region for that function.
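The off-by-one can be illustrated with a minimal C++ sketch of the lookup step. FdeRange, lookup_pc and finds_fde are hypothetical names, and the addresses are made up; this only models the "subtract one unless it is a signal frame" rule described above, not the actual unwinder implementation.

```cpp
#include <cassert>   // assert, for run-time checks by callers
#include <cstdint>

// Hypothetical model of the unwind region of one function: the half-open
// address range [start, end) covered by its unwind information.
struct FdeRange {
    std::uintptr_t start;
    std::uintptr_t end;
    constexpr bool covers(std::uintptr_t pc) const {
        return pc >= start && pc < end;
    }
};

// For an ordinary frame the unwinder searches for return_address - 1, so
// that the lookup lands inside the call instruction. For a signal frame
// the saved program counter is used unchanged.
constexpr std::uintptr_t lookup_pc(std::uintptr_t return_address,
                                   bool signal_frame) {
    return signal_frame ? return_address : return_address - 1;
}

constexpr bool finds_fde(FdeRange fde, std::uintptr_t return_address,
                         bool signal_frame) {
    return fde.covers(lookup_pc(return_address, signal_frame));
}

// Made-up address range for the function the thunk is about to enter.
constexpr FdeRange kTarget{0x401000, 0x401040};

// A real return address inside the function is found as expected.
static_assert(finds_fde(kTarget, 0x401010, false), "ordinary call site");
// The "return address" left by the thunk is the function's first byte;
// subtracting one falls just before the unwind region.
static_assert(!finds_fde(kTarget, 0x401000, false), "lookup misses");
```

With the same inputs, treating the frame as a signal frame (no decrement) would find the region, which is why the decrement is what breaks this case.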
It should be possible to fix the thunk function by changing the CFA offset
(using “.cfi_def_cfa_offset 16”) before the target address at the top of the
stack is overwritten, so that the unwinder never tries to unwind through a
function which has not yet started running (which is what causes the off-by-one
issue described above).
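A small C++ model shows what the proposed directive would change for the unwinder. It assumes the return address is read from CFA - 8, as on x86-64; ThunkStack and unwound_return_address are hypothetical names and the addresses are invented for illustration.

```cpp
#include <cassert>   // assert, for run-time checks by callers
#include <cstdint>

// Hypothetical snapshot of the stack at the `ret` in
// __x86_indirect_thunk_rax: slot[0] is (%rsp), slot[1] is 8(%rsp).
struct ThunkStack {
    std::uintptr_t slot[2];
};

// The unwinder computes CFA = %rsp + cfa_offset and reads the return
// address from CFA - 8.
constexpr std::uintptr_t unwound_return_address(const ThunkStack& s,
                                                int cfa_offset) {
    return s.slot[(cfa_offset - 8) / 8];
}

// Made-up addresses: the start of the branch target, and the return
// address in the function that originally called the thunk.
constexpr std::uintptr_t kTargetStart  = 0x402000;
constexpr std::uintptr_t kCallerReturn = 0x401234;

// After `mov %rax, (%rsp)` the top of the stack holds the branch target.
constexpr ThunkStack kAfterMov{{kTargetStart, kCallerReturn}};

// With the CFA offset left at 8, the unwinder picks up the start of a
// function that has not begun running.
static_assert(unwound_return_address(kAfterMov, 8) == kTargetStart,
              "stale CFA offset");
// With a CFA offset of 16 it skips that slot and finds the real caller.
static_assert(unwound_return_address(kAfterMov, 16) == kCallerReturn,
              "adjusted CFA offset");
```

Under these assumptions the adjusted offset makes the unwinder continue in the real caller, whose return address does lie after a call instruction, so the decrement by one is harmless again.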
Both issues are visible in GDB if you set breakpoints in the proper places
because the frame information used for debugging is incorrect as well.
Mailing list thread:
https://gcc.gnu.org/ml/gcc/2018-01/msg00160.html
This may impact the kernel after all. We have a report of Systemtap not
working, which could be related:
https://sourceware.org/ml/systemtap/2018-q1/msg00008.html