On 4/11/25 4:22 AM, Richard Biener wrote:
On Fri, Apr 11, 2025 at 12:18 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:
Richard Biener <richard.guent...@gmail.com> writes:
On Fri, Apr 11, 2025 at 11:10 AM Richard Sandiford
<richard.sandif...@arm.com> wrote:
Richard Biener <richard.guent...@gmail.com> writes:
On Thu, Apr 10, 2025 at 10:10 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:
PR119610 is about incorrect CFI output for a stack probe when that
probe is not the initial allocation. The main aarch64 stack probe
function, aarch64_allocate_and_probe_stack_space, implicitly assumed
that the incoming stack pointer pointed to the top of the frame,
and thus held the CFA.
aarch64_save_callee_saves and aarch64_restore_callee_saves use a
parameter called bytes_below_sp to track how far the stack pointer
is above the base of the static frame. This patch does the same
thing for aarch64_allocate_and_probe_stack_space.
Also, I noticed that the SVE path was attaching the first CFA note
to the wrong instruction: it was attaching the note to the calculation
of the stack size, rather than to the r11<-sp copy.
Bootstrapped & regression-tested on aarch64-linux-gnu. I'll push on
Monday if there are no comments before then. I'd appreciate a second
pair of eyes though, since this is a sensitive area.
Do you happen to know if the backports to older branches you provided for
the change that triggered this issue (in particular to GCC 7) are also affected?
GCC 7 and GCC 8 should be ok. The bug relies on stack protection being
enabled (-fstack-protector-strong for the testcase in the PR, but just
-fstack-protector for others) and that was added in GCC 9.
Hmm, I see -fstack-protector[-strong] working with GCC 7 on aarch64.
I think it's just being silently accepted as a nop. At least:
Specifically
the testcase produces for foo():
.text
.align 2
.global _Z3foov
.type _Z3foov, %function
_Z3foov:
.LFB0:
.cfi_startproc
stp x29, x30, [sp, -16]!
.cfi_def_cfa_offset 16
.cfi_offset 29, -16
.cfi_offset 30, -8
add x29, sp, 0
.cfi_def_cfa_register 29
sub sp, sp, #16
sub sp, sp, #524288
...the function isn't probing the stack here. It ought to be doing
something like the code that Alex quoted in the PR. Compare the
GCC 8 and GCC 9 output in https://godbolt.org/z/cWbsG3eGb .
Yes, I thought that's the -fstack-clash-protection support being sub-par
(not enforced for VLA allocations). So I wrongly assumed this is
about -fstack-protector (the __stack_chk_guard thing?)
?!? stack clash protection does support VLAs since those go through the
standard routines to allocate stack in explow.cc IIRC.
jeff