https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120398
--- Comment #2 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Right, aarch64 vectorizes at -O3 to the desired form, but not -O2:

.L3:
        ldr     d1, [x0, x2, lsl 3]
        add     x2, x2, 1
        fmla    v31.2s, v1.2s, v1.2s
        cmp     x1, x2
        bne     .L3
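
For reference, the vectorized loop can be read as roughly the following scalar C. This is only a sketch inferred from the instructions in the listing, not the test case from the bug report: the function name, parameter names, and the separate output array are hypothetical. It assumes x0 holds the base pointer, x1 the number of float pairs, x2 the loop index, and v31.2s a two-lane float accumulator.

```c
#include <stddef.h>

/* Hypothetical scalar model of the loop in the listing.
 * Each iteration loads one 8-byte element (two adjacent floats, as
 * "ldr d1, [x0, x2, lsl 3]" does) and accumulates the square of each
 * float into its own lane (as "fmla v31.2s, v1.2s, v1.2s" does).
 * The assembly loop is bottom-tested, so it assumes n >= 1; this
 * for loop additionally handles n == 0. */
void sum_squares_pairs(const float *p, size_t n, float acc[2])
{
    for (size_t i = 0; i < n; i++) {
        acc[0] += p[2 * i]     * p[2 * i];      /* lane 0 */
        acc[1] += p[2 * i + 1] * p[2 * i + 1];  /* lane 1 */
    }
}
```

The two lanes stay independent inside the loop; any combination of them would happen outside the part shown in the listing.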