https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393
--- Comment #24 from H.J. Lu <hjl.tools at gmail dot com> ---
Another testcase:

[hjl@gnu-tgl-2 pr103393]$ cat x.c
struct TestData {
  float arr[8];
};

void cpy(struct TestData *s1, struct TestData *s2 ) {
  for(int i=0; i<16; ++i) {
    s1->arr[i] = s2->arr[i];
  }
}
[hjl@gnu-tgl-2 pr103393]$ make x.s
/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/ -O2 -march=skylake-avx512 -S x.c
[hjl@gnu-tgl-2 pr103393]$ cat x.s
	.file	"x.c"
	.text
	.p2align 4
	.globl	cpy
	.type	cpy, @function
cpy:
.LFB0:
	.cfi_startproc
	vmovdqu64	(%rsi), %zmm0
	vmovdqu64	%zmm0, (%rdi)
	vzeroupper
	ret
	.cfi_endproc
.LFE0:
	.size	cpy, .-cpy
	.ident	"GCC: (GNU) 12.0.1 20220301 (experimental)"
	.section	.note.GNU-stack,"",@progbits
[hjl@gnu-tgl-2 pr103393]$

ZMM is used when we try to avoid it.
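
As a rough illustration, not part of the original comment: the loop in x.c copies 16 floats, which (assuming a 4-byte float) is 64 bytes, i.e. exactly one 512-bit ZMM register. If the preferred vector width were 256 bits, one would expect the same copy to be split into two YMM moves instead. The sketch below only does that arithmetic; the helper names copy_bits and moves_needed are hypothetical and do not correspond to anything in GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total number of bits moved by a loop that
   copies COUNT elements of ELEM_SIZE bytes each.  */
size_t
copy_bits (size_t count, size_t elem_size)
{
  return count * elem_size * 8;
}

/* Hypothetical helper: number of full-width vector moves needed to
   cover TOTAL_BITS with registers of VECTOR_BITS bits, rounding up.  */
size_t
moves_needed (size_t total_bits, size_t vector_bits)
{
  assert (vector_bits != 0);
  return (total_bits + vector_bits - 1) / vector_bits;
}
```

With these helpers, moves_needed (copy_bits (16, sizeof (float)), 512) describes the single ZMM load/store pair seen in x.s, while passing 256 as the vector width describes the two-YMM sequence one would expect if ZMM were avoided.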