https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86625
--- Comment #5 from Chris Elrod <elrodc at gmail dot com> ---
Created attachment 44424
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44424&action=edit
Smaller avx512 kernel that still spills into the stack

This generated 18 `vmovapd` instructions in total (I think there would ideally be 0) when compiled with:

    gfortran -march=skylake-avx512 -mprefer-vector-width=512 -O2 -ftree-vectorize -shared -fPIC -S kernels16x32x13.f90 -o kernels16x32x13.s

Four of them stored a register onto the stack, and one loaded a value from the stack back into a register. (The other spilled values were read back from the stack as memory operands of vfmadd instructions, e.g. `vfmadd213pd 72(%rsp), %zmm11, %zmm15`.)

Similar to the larger kernel, using `-O3` instead of `-O2 -ftree-vectorize` eliminated two of the register-to-register `vmovapd` instructions, but none of the spills.
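
For anyone who wants to repeat the count on their own build, the tally described above can be approximated with a short script. This is only a rough sketch, not the method used for the numbers in this comment: the function name `classify_vmovapd` and the category labels are made up for illustration, and it assumes AT&T-syntax output in which every spill slot is addressed relative to `%rsp` (as in the `72(%rsp)` operand quoted above).

```python
import re
from collections import Counter

# AT&T syntax is "vmovapd src, dst". Assumption: spill slots are %rsp-relative.
VMOVAPD = re.compile(r"^\s*vmovapd\s+([^,]+),\s*(\S+)")
VFMADD = re.compile(r"^\s*vfmadd\d+pd\s+(.*)")
STACK = re.compile(r"\(%rsp\)")


def classify_vmovapd(asm_text):
    """Tally vmovapd moves by kind, plus vfmadd instructions reading the stack."""
    counts = Counter()
    for line in asm_text.splitlines():
        mov = VMOVAPD.match(line)
        if mov:
            src, dst = mov.group(1).strip(), mov.group(2).strip()
            if STACK.search(dst):
                counts["store_to_stack"] += 1      # spill
            elif STACK.search(src):
                counts["load_from_stack"] += 1     # explicit reload
            elif src.startswith("%") and dst.startswith("%"):
                counts["reg_to_reg"] += 1          # register copy
            else:
                counts["other_memory"] += 1        # e.g. load from an array
            continue
        fma = VFMADD.match(line)
        if fma and STACK.search(fma.group(1)):
            counts["fma_stack_operand"] += 1       # reload folded into the FMA
    return counts


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as f:
        for kind, n in sorted(classify_vmovapd(f.read()).items()):
            print(f"{kind}: {n}")
```

Saved under a name of your choice, it takes the generated assembly file as its only argument, for example `kernels16x32x13.s`. Frame-pointer-relative or indexed stack addressing would be missed by the simple `(%rsp)` pattern.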