https://gcc.gnu.org/g:c65046ff2ef0a9a46e59bc0b3369b2d226f6a239
commit r14-9606-gc65046ff2ef0a9a46e59bc0b3369b2d226f6a239 Author: Jeff Law <j...@ventanamicro.com> Date: Thu Mar 21 20:41:59 2024 -0600 [committed] Fix RISC-V missing stack tie As some of you know, Raphael has been working on stack-clash support for the RISC-V port. A little while ago Florian reached out to us with an issue where glibc was failing its smoke test due to referencing an unallocated stack slot. Without diving into the code in detail I (incorrectly) concluded it was a problem with the fallback of using Ada's stack-check paths due to not having stack-clash support. Once enough stack-clash bits were ready I had Raphael review the code generated for Florian's test and we concluded the the original case from Florian was just wrong irrespective of stack clash/stack check. While Raphael's stack-clash work will indirectly fix Florian's case, it really should also work without stack-clash. In particular this code was called out by valgrind: > 000000000003cb5e <realpath@@GLIBC_2.27>: > __GI___realpath(): > 3cb5e: 81010113 addi sp,sp,-2032 > 3cb62: 7d313423 sd s3,1992(sp) > 3cb66: 79fd lui s3,0xfffff > 3cb68: 7e813023 sd s0,2016(sp) > 3cb6c: 7c913c23 sd s1,2008(sp) > 3cb70: 7f010413 addi s0,sp,2032 > 3cb74: 35098793 addi a5,s3,848 # fffffffffffff350 <__libc_initial+0xffffffffffe8946a> > 3cb78: 74fd lui s1,0xfffff > 3cb7a: 008789b3 add s3,a5,s0 > 3cb7e: f9048793 addi a5,s1,-112 # ffffffffffffef90 <__libc_initial+0xffffffffffe890aa> > 3cb82: 008784b3 add s1,a5,s0 > 3cb86: 77fd lui a5,0xfffff > 3cb88: 7d413023 sd s4,1984(sp) > 3cb8c: 7b513c23 sd s5,1976(sp) > 3cb90: 7e113423 sd ra,2024(sp) > 3cb94: 7d213823 sd s2,2000(sp) > 3cb98: 7b613823 sd s6,1968(sp) > 3cb9c: 7b713423 sd s7,1960(sp) > 3cba0: 7b813023 sd s8,1952(sp) > 3cba4: 79913c23 sd s9,1944(sp) > 3cba8: 79a13823 sd s10,1936(sp) > 3cbac: 79b13423 sd s11,1928(sp) > 3cbb0: 34878793 addi a5,a5,840 # fffffffffffff348 <__libc_initial+0xffffffffffe89462> > 3cbb4: 40000713 li a4,1024 > 3cbb8: 00132a17 auipc s4,0x132 > 3cbbc: ae0a3a03 ld s4,-1312(s4) # 16e698 <__stack_chk_guard> > 3cbc0: 01098893 addi a7,s3,16 > 3cbc4: 42098693 addi a3,s3,1056 > 3cbc8: b8040a93 addi s5,s0,-1152 > 3cbcc: 97a2 add a5,a5,s0 > 3cbce: 000a3603 ld a2,0(s4) > 3cbd2: f8c43423 sd a2,-120(s0) > 3cbd6: 4601 li a2,0 > 3cbd8: 3d14b023 sd a7,960(s1) > 3cbdc: 3ce4b423 sd a4,968(s1) > 3cbe0: 7cd4b823 sd a3,2000(s1) > 3cbe4: 7ce4bc23 sd a4,2008(s1) > 3cbe8: b7543823 sd s5,-1168(s0) > 3cbec: b6e43c23 sd a4,-1160(s0) > 3cbf0: e38c sd a1,0(a5) > 3cbf2: b0010113 addi sp,sp,-1280 In particular note the store at 0x3cbd8. That's hitting (s1 + 960). If you chase the values around, you'll find it's a bit more than 1k into unallocated stack space. It's also worth noting the final stack adjustment at 0x3cbf2. While I haven't reproduced Florian's code exactly, I was able to get reasonably close and verify my suspicion that everything was fine before sched2 and incorrect after sched2. It was also obvious at that point what had gone wrong -- we were missing a stack tie after the final stack pointer adjustment. This patch adds the missing stack tie. While not technically a regression, I shudder at the thought of chasing one of these issues down again in the wild. Been there, done that. Regression tested on rv64gc. Verified the scheduler no longer mucked up realpath by hand. Pushing to the trunk. gcc/ * config/riscv/riscv.cc (riscv_expand_prologue): Add missing stack tie for scalable and final stack adjustment if needed. Co-authored-by: Raphael Zinsly <rzin...@ventanamicro.com> Diff: --- gcc/config/riscv/riscv.cc | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 97350b8305e..1358c243898 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -7365,7 +7365,15 @@ riscv_expand_prologue (void) /* Second step for constant frame. */ HOST_WIDE_INT constant_frame = remaining_size.to_constant (); if (constant_frame == 0) - return; + { + /* We must have allocated stack space for the scalable frame. + Emit a stack tie if we have a frame pointer so that the + allocation is ordered WRT fp setup and subsequent writes + into the frame. */ + if (frame_pointer_needed) + riscv_emit_stack_tie (); + return; + } if (SMALL_OPERAND (-constant_frame)) { @@ -7385,6 +7393,13 @@ riscv_expand_prologue (void) insn = gen_rtx_SET (stack_pointer_rtx, insn); riscv_set_frame_expr (insn); } + + /* We must have allocated the remainder of the stack frame. + Emit a stack tie if we have a frame pointer so that the + allocation is ordered WRT fp setup and subsequent writes + into the frame. */ + if (frame_pointer_needed) + riscv_emit_stack_tie (); } }