https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97984
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #4)
> Hm, can't replicate for GCC 10, do you use any -mtune or so?
I can reproduce worse code for GCC 10 at -O3 -mtune=generic:
ldp x2, x3, [x0]
ldr x4, [x1]
ldr q1, [x1]
mul x2, x2, x4
ldr x4, [x1, 8]
fmov d0, x2
ins v0.d[1], x3
mul x1, x3, x4
ins v0.d[1], x1
add v0.2d, v0.2d, v1.2d
str q0, [x0]
But with -O3 -mtune=cortex-a57, GCC generates decent code.
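
For anyone reading along without the testcase at hand: going only by the loads, multiplies, and the final vector add in the listing above, the code appears to compute a[i] = a[i] * b[i] + b[i] over two 64-bit elements. Below is a minimal C sketch of that shape. The function name, signature, and exact expression are my reconstruction from the assembly and are hypothetical, not necessarily the testcase attached to this report.

```c
/* Hypothetical reconstruction from the assembly in this comment.
   The name mul_add2 and the restrict-qualified signature are
   assumptions; x0 is taken to be `a` and x1 to be `b`. */
void mul_add2 (long long *restrict a, const long long *restrict b)
{
  /* Two scalar 64-bit multiplies (the two `mul` instructions)... */
  a[0] = a[0] * b[0] + b[0];
  /* ...followed by an add of b, which is done as one 2x64-bit vector add. */
  a[1] = a[1] * b[1] + b[1];
}
```

Compiling something like this for aarch64 at -O3 with -mtune=generic and then with -mtune=cortex-a57 should allow the two outputs to be compared, assuming the sketch matches the original testcase closely enough.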