https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112384
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Oh, f2 just goes to memory. GCC produces:
```
        and     x0, x0, 3
        str     q0, [sp]
        ldr     s0, [sp, x0, lsl 2]
        dup     v0.4s, v0.s[0]
```
Now clang (LLVM) produces:
```
        mov     x8, sp
        and     w9, w0, #0x3
        str     q0, [sp]
        orr     x8, x8, x9, lsl #2
        ld1r    { v0.4s }, [x8]
```
I don't know which is better, but it might be the case that GCC's is better for some micro-architectures.
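
For reference, both sequences appear to do the same thing: spill the vector to the stack, load the 32-bit lane selected by the index masked with 3, and broadcast it to all four lanes. A rough C sketch of that behaviour follows. The name `splat_lane`, the `int` element type, and the signature are assumptions made for illustration; this is not the f2 from the test case, which is not shown here.

```c
#include <string.h>

/* Assumed 4 x 32-bit vector type (GCC/clang vector extension).
   The element type used by the real test case is not known here. */
typedef int v4si __attribute__((vector_size(16)));

/* Hypothetical stand-in for f2: broadcast lane (i & 3) of v to every lane. */
v4si splat_lane(v4si v, unsigned long i)
{
    int tmp[4];

    memcpy(tmp, &v, sizeof tmp);  /* spill to the stack:  str q0, [sp]            */
    int x = tmp[i & 3];           /* mask and load:       and + ldr s0, [sp, ...] */
    return (v4si){x, x, x, x};    /* broadcast the lane:  dup v0.4s, v0.s[0]      */
}
```

The clang sequence folds the last two steps into one load-and-replicate (`ld1r`). Its `orr` can stand in for an add presumably because sp is 16-byte aligned and the scaled index is at most 12, so the two operands share no set bits.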