https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97984
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #4)
> Hm, can't replicate for GCC 10, do you use any -mtune or so?

I can reproduce the worse code with GCC 10 at -O3 -mtune=generic:

        ldp     x2, x3, [x0]
        ldr     x4, [x1]
        ldr     q1, [x1]
        mul     x2, x2, x4
        ldr     x4, [x1, 8]
        fmov    d0, x2
        ins     v0.d[1], x3
        mul     x1, x3, x4
        ins     v0.d[1], x1
        add     v0.2d, v0.2d, v1.2d
        str     q0, [x0]

But with -O3 -mtune=cortex-a57 the decent code is generated.
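
For readers following the listing: read instruction by instruction, it appears to multiply two 64-bit elements pairwise and then add the second operand back in, i.e. a[i] = a[i] * b[i] + b[i] for i = 0, 1. The C sketch below is inferred only from the assembly above; it is not the testcase attached to this bug, and the function name `mul_add_pair` and the parameter names are made up for illustration. Note also that the first `ins v0.d[1], x3` is overwritten by the second `ins` before the lane is read, which is presumably part of what makes this sequence worse.

```c
#include <stdint.h>

/* Hypothetical reconstruction of what the listing computes.
 * x0 is assumed to hold `a` and x1 to hold `b`, following the
 * AArch64 procedure call standard for the first two pointer arguments. */
void mul_add_pair(int64_t *a, const int64_t *b)
{
    /* ldp x2, x3, [x0]: load a[0] and a[1]. */
    int64_t a0 = a[0];
    int64_t a1 = a[1];

    /* ldr x4, [x1] / ldr x4, [x1, 8] / ldr q1, [x1]:
     * b[0] and b[1] are loaded both as scalars and as one vector. */
    int64_t b0 = b[0];
    int64_t b1 = b[1];

    /* mul + fmov/ins: the two products are computed in general
     * registers and then moved into the lanes of v0. */
    int64_t p0 = a0 * b0;
    int64_t p1 = a1 * b1;

    /* add v0.2d, v0.2d, v1.2d followed by str q0, [x0]. */
    a[0] = p0 + b0;
    a[1] = p1 + b1;
}
```

Compiling a function of this shape with -O3 under both -mtune values mentioned above is one way to compare the generated sequences, assuming it triggers the same code path as the original testcase.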