https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112510
Bug ID: 112510
Summary: Regression: ASAN code injection breaks alignment of stack variables
Product: gcc
Version: 13.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: sadko4u at gmail dot com
Target Milestone: ---

Created attachment 56568
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56568&action=edit
Code snippet that can cause the problem

Hello!

I ran into a problem on GitHub CI while introducing AVX-512 support to my DSP code with ASAN enabled, and I was able to find out what is happening only after a few runs.

The problem run starts here:
https://github.com/lsp-plugins/lsp-plugins-beat-breather/actions/runs/6843768144/job/18606721058#step:12:49

The disassembly at the crash site is:

```
=> 0x7f3bdb82df53 <_ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm+247>: vmovaps %zmm0,0x120(%rbx)
   0x7f3bdb82df5d <_ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm+257>: vmovaps %zmm1,0x160(%rbx)
   0x7f3bdb82df67 <_ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm+267>: vmovaps %zmm2,0x1a0(%rbx)
   0x7f3bdb82df71 <_ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm+277>: vmovaps %zmm3,0x1e0(%rbx)
   0x7f3bdb82df7b <_ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm+287>: vmovaps %zmm4,0x220(%rbx)
   0x7f3bdb82df85 <_ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm+297>: vmovaps %zmm5,0x260(%rbx)
   0x7f3bdb82df8f <_ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm+307>: vmovaps %zmm6,0x2a0(%rbx)
   0x7f3bdb82df99 <_ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm+317>: vmovaps %zmm7,0x2e0(%rbx)
```

As we can see, the offsets from %rbx are not 64-byte aligned, while %rbx itself is:

```
rbx 0x7f3bd9d20400 139895034217472
```

If we disassemble the beginning of the function, we see:

```
(gdb) Dump of assembler code for function _ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm:
   0x00007f3bdb82de5c <+0>:   push %rbp
   0x00007f3bdb82de5d <+1>:   mov %rsp,%rbp
   0x00007f3bdb82de60 <+4>:   push %r15
   0x00007f3bdb82de62 <+6>:   push %r14
   0x00007f3bdb82de64 <+8>:   push %r13
   0x00007f3bdb82de66 <+10>:  push %r12
   0x00007f3bdb82de68 <+12>:  push %rbx
   0x00007f3bdb82de69 <+13>:  and $0xffffffffffffffc0,%rsp
   0x00007f3bdb82de6d <+17>:  sub $0x3c0,%rsp
   0x00007f3bdb82de74 <+24>:  mov %rdi,%r12
   0x00007f3bdb82de77 <+27>:  mov %rsi,%r13
   0x00007f3bdb82de7a <+30>:  mov %rdx,%r14
   0x00007f3bdb82de7d <+33>:  mov %rcx,0x18(%rsp)
   0x00007f3bdb82de82 <+38>:  lea 0x20(%rsp),%rbx
   0x00007f3bdb82de87 <+43>:  mov %rbx,%r15
   0x00007f3bdb82de8a <+46>:  mov 0x15f13f(%rip),%rax # 0x7f3bdb98cfd0
   0x00007f3bdb82de91 <+53>:  cmpl $0x0,(%rax)
   0x00007f3bdb82de94 <+56>:  jne 0x7f3bdb82eba0 <_ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm+3396>
   0x00007f3bdb82de9a <+62>:  movq $0x41b58ab3,(%rbx)
   0x00007f3bdb82dea1 <+69>:  lea 0x95bb0(%rip),%rax # 0x7f3bdb8c3a58
   0x00007f3bdb82dea8 <+76>:  mov %rax,0x8(%rbx)
   0x00007f3bdb82deac <+80>:  lea -0x57(%rip),%rax # 0x7f3bdb82de5c <_ZN3lsp6avx51213gate_x1_curveEPfPKfPKNS_3dsp11gate_knee_tEm>
   0x00007f3bdb82deb3 <+87>:  mov %rax,0x10(%rbx)
   0x00007f3bdb82deb7 <+91>:  mov %rbx,%rax
   0x00007f3bdb82deba <+94>:  shr $0x3,%rax
   0x00007f3bdb82debe <+98>:  movl $0xf1f1f1f1,0x7fff8000(%rax)
   0x00007f3bdb82dec8 <+108>: movl $0xf2f2f2f2,0x7fff8008(%rax)
   0x00007f3bdb82ded2 <+118>: movl $0xf2f2f2f2,0x7fff801c(%rax)
   0x00007f3bdb82dedc <+128>: movl $0xf2f2f2f2,0x7fff8020(%rax)
   0x00007f3bdb82dee6 <+138>: movl $0xf3f3f3f3,0x7fff8064(%rax)
   0x00007f3bdb82def0 <+148>: movl $0xf3f3f3f3,0x7fff8068(%rax)
   0x00007f3bdb82defa <+158>: movl $0xf3f3f3f3,0x7fff806c(%rax)
   0x00007f3bdb82df04 <+168>: mov %fs:0x28,%rdx
   0x00007f3bdb82df0d <+177>: mov %rdx,0x3b8(%rsp)
   0x00007f3bdb82df15 <+185>: xor %edx,%edx
   0x00007f3bdb82df17 <+187>: mov 0x18(%rsp),%rdx
   0x00007f3bdb82df1c <+192>: vbroadcastss (%r14),%zmm0
   0x00007f3bdb82df22 <+198>: vbroadcastss 0x4(%r14),%zmm1
   0x00007f3bdb82df29 <+205>: vbroadcastss 0x8(%r14),%zmm2
   0x00007f3bdb82df30 <+212>: vbroadcastss 0xc(%r14),%zmm3
   0x00007f3bdb82df37 <+219>: vbroadcastss 0x10(%r14),%zmm4
   0x00007f3bdb82df3e <+226>: vbroadcastss 0x14(%r14),%zmm5
   0x00007f3bdb82df45 <+233>: vbroadcastss 0x18(%r14),%zmm6
   0x00007f3bdb82df4c <+240>: vbroadcastss 0x1c(%r14),%zmm7
=> 0x00007f3bdb82df53 <+247>: vmovaps %zmm0,0x120(%rbx)
   0x00007f3bdb82df5d <+257>: vmovaps %zmm1,0x160(%rbx)
   0x00007f3bdb82df67 <+267>: vmovaps %zmm2,0x1a0(%rbx)
```

The function aligns the stack pointer properly at the beginning:

```
0x00007f3bdb82de69 <+13>: and $0xffffffffffffffc0,%rsp
```

But after some ASAN magic it now uses %rbx as the base pointer instead of %rsp (which the non-ASAN build uses) and generates invalid offsets that are aligned to a 32-byte boundary instead of a 64-byte boundary.

This bug was hard for me to reproduce, since it happens only on particular machines and I can trigger it only on a specific GitHub CI runner. It is not reproducible on systems other than Arch Linux with an older compiler, because that compiler generates proper code.

The compiler version:

```
gcc (GCC) 13.2.1 20230801
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```

I think the reason is in this code that loads a 64-byte aligned address into %rbx:

```
0x00007f4ef6295f3e <+197>: mov 0x18(%rsp),%rdx
```

while the generated offsets for the `vmovaps` instruction expect the address to be a multiple of 0x20 but not of 0x40.

The code snippet that MAY cause the problem is in the attachment. But I was really only able to reproduce the problem on specific CPUs on the GitHub CI.
For example, here is my GDB session of the test snippet on an AVX512 computer:

```
B+> 0x5555555552dc <_Z13gate_x1_curvePfPKfPKN3dsp11gate_knee_tEm+243> vmovaps %zmm0,0x120(%rbx)
    0x5555555552e6 <_Z13gate_x1_curvePfPKfPKN3dsp11gate_knee_tEm+253> vmovaps %zmm1,0x160(%rbx)
    0x5555555552f0 <_Z13gate_x1_curvePfPKfPKN3dsp11gate_knee_tEm+263> vmovaps %zmm2,0x1a0(%rbx)
    0x5555555552fa <_Z13gate_x1_curvePfPKfPKN3dsp11gate_knee_tEm+273> vmovaps %zmm3,0x1e0(%rbx)
    0x555555555304 <_Z13gate_x1_curvePfPKfPKN3dsp11gate_knee_tEm+283> vmovaps %zmm4,0x220(%rbx)
    0x55555555530e <_Z13gate_x1_curvePfPKfPKN3dsp11gate_knee_tEm+293> vmovaps %zmm5,0x260(%rbx)
    0x555555555318 <_Z13gate_x1_curvePfPKfPKN3dsp11gate_knee_tEm+303> vmovaps %zmm6,0x2a0(%rbx)
    0x555555555322 <_Z13gate_x1_curvePfPKfPKN3dsp11gate_knee_tEm+313> vmovaps %zmm7,0x2e0(%rbx)
```

While %rbx is:

```
rbx 0x7fffffffde20 140737488346656
```