https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92190
--- Comment #8 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Liu Hao from comment #7)
> MSDN says 'the upper portions of YMM0-15 and ZMM0-15 are considered volatile
> and must be considered destroyed on function calls' explicitly [1].
>
> I am not clear about the cause of OP's ICE, but I think it should conform to
> MSABI to emit VZEROUPPER in the epilog, followed by restoring XMM6 - XMM15,
> destroying their upper halves. Similar with the prolog.

The insertion of vzeroupper is no longer "invisible" to the stack frame
management code, since vzeroupper is now defined as:

(insn 738 619 434 2 (parallel [
            (unspec_volatile [
                    (const_int 0 [0])
                ] UNSPECV_VZEROUPPER)
            (clobber (reg:V2DI 20 xmm0))
            (clobber (reg:V2DI 21 xmm1))
            (clobber (reg:V2DI 22 xmm2))
            (clobber (reg:V2DI 23 xmm3))
            (clobber (reg:V2DI 24 xmm4))
            (clobber (reg:V2DI 25 xmm5))
            (set (reg:V2DI 26 xmm6)
                (reg:V2DI 26 xmm6))
            (clobber (reg:V2DI 27 xmm7))
            (clobber (reg:V2DI 44 xmm8))
            (clobber (reg:V2DI 45 xmm9))
            (clobber (reg:V2DI 46 xmm10))
            (clobber (reg:V2DI 47 xmm11))
            (clobber (reg:V2DI 48 xmm12))
            (clobber (reg:V2DI 49 xmm13))
            (clobber (reg:V2DI 50 xmm14))
            (clobber (reg:V2DI 51 xmm15))
        ]) "pr92190.c":8:3 -1
     (nil))

The vzeroupper pass inserts the instruction just after the reload pass, and
with this pattern all xmm registers (xmm0 - xmm15) now become live. This is
not a problem in the SYSV ABI, where all of these registers are call_used,
but in the MS ABI the prologue now tries to save xmm6 - xmm15 to the stack.

So, vzeroupper should be described in a way that won't trigger saves of
xmm6 - xmm15 to the stack, while still marking the high part of each register
as clobbered.

An alternative would be to consider the mode of each call_used register and
save only wide (> 128 bit) registers in the caller. I'm not sure if the
current implementation already clobbers the high part of a 256-bit register.
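
To make the SYSV vs. MS ABI difference concrete, here is a toy model in C.
It is not GCC code: the names (prologue_saves, callee_saved_xmm, the abi
enum) are hypothetical, and it only assumes the usual rule that a prologue
saves those callee-saved registers that the function is seen to modify.
Bit i of a mask stands for xmm<i>.

```c
#include <stdint.h>

/* Toy model only; none of these names exist in GCC. */
enum abi { ABI_SYSV, ABI_MS };

/* Mask of xmm registers the callee must preserve under each ABI. */
static uint16_t callee_saved_xmm(enum abi abi)
{
    /* SysV x86-64: no xmm register is callee-saved.
       MS x64: xmm6..xmm15 are callee-saved (bits 6..15). */
    return abi == ABI_MS ? 0xFFC0u : 0x0000u;
}

/* Registers a prologue would save: the callee-saved registers that
   appear as modified (set or clobbered) somewhere in the function. */
static uint16_t prologue_saves(enum abi abi, uint16_t modified)
{
    return callee_saved_xmm(abi) & modified;
}

/* Count the registers in a mask. */
static int popcount16(uint16_t mask)
{
    int n = 0;
    while (mask) {
        n += mask & 1;
        mask >>= 1;
    }
    return n;
}
```

With a vzeroupper pattern that mentions all sixteen registers (mask 0xFFFF),
prologue_saves(ABI_SYSV, 0xFFFF) is 0, while prologue_saves(ABI_MS, 0xFFFF)
is 0xFFC0, i.e. ten saves for xmm6 - xmm15 in this simplified model.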