https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121970
Bug ID: 121970
Summary: struct copy still use zmm even specify -mmove-max=256
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
typedef struct
{
  double ds[8];
} ds;

extern void bar (ds *);

void
foo (double *a, double *b, double *c, double *d, ds *__restrict e, int n)
{
  ds tmp[2];
  for (int j = 0; j != 1024; j++)
    for (int i = 0; i != 8; i++)
      tmp[0].ds[i] += a[8*j + i] * b[8*j + i];
  for (int j = 0; j != 1024; j++)
    for (int i = 0; i != 8; i++)
      tmp[1].ds[i] += c[8*j + i] * c[8*j + i];
  bar (tmp);
  e[0] = tmp[0];
  e[3] = tmp[1];
}

With -Ofast -march=sapphirerapids -mmove-max=256, zmm registers are still
used for the struct copy:

        call    bar(ds*)
        vmovdqu64       (%rsp), %zmm0
        vmovdqu64       %zmm0, (%rbx)
        vmovdqu64       64(%rsp), %zmm0
        vmovdqu64       %zmm0, 192(%rbx)
        vzeroupper
        movq    -8(%rbp), %rbx

With -Ofast -march=sapphirerapids -mstore-max=256 alone, zmm is still used.
Both options have to be given together: with
-Ofast -march=sapphirerapids -mstore-max=256 -mmove-max=256
ymm is used for the struct copy.
https://godbolt.org/z/M739c9ejn
Not sure whether this is intentional or a bug.
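
To look at the copy in isolation, a reduced sketch along these lines may
help. It keeps the same 64-byte struct and the same two assignments but drops
the loops and the call to bar. The names copy_two and copy_two_ok are
hypothetical and not part of the original testcase, and it has not been
verified that this reduced form still produces the zmm moves.

```c
#include <assert.h>
#include <string.h>

/* Same 64-byte aggregate as in the report.  */
typedef struct
{
  double ds[8];
} ds;

/* Hypothetical helper (not in the original testcase): performs only the
   two aggregate assignments, so the moves chosen for the struct copy are
   easy to find in the assembly output.  */
__attribute__ ((noinline)) void
copy_two (ds *__restrict e, const ds *tmp)
{
  e[0] = tmp[0];
  e[3] = tmp[1];
}

/* Hypothetical self-check: returns 1 if both structs were copied to the
   expected slots, 0 otherwise.  */
int
copy_two_ok (void)
{
  ds src[2], dst[4];
  memset (dst, 0, sizeof dst);
  for (int i = 0; i < 8; i++)
    {
      src[0].ds[i] = i;
      src[1].ds[i] = 100 + i;
    }
  copy_two (dst, src);
  return memcmp (&dst[0], &src[0], sizeof (ds)) == 0
         && memcmp (&dst[3], &src[1], sizeof (ds)) == 0;
}
```

Compiling this with each of the option sets above and comparing the moves
emitted for copy_two may show whether the behavior depends on the
surrounding vectorized loops.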