https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113827
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Robin Dapp from comment #0)
> A hot block in the MrBayes benchmark (as used in the Phoronix testsuite) has
> a redundant scalar load when vectorized.
>
> Minimal example, compiled with -march=rv64gcv -O3
>
> int foo (float **a, float f, int n)
> {
>   for (int i = 0; i < n; i++)
>     {
>       a[i][0] /= f;
>       a[i][1] /= f;
>       a[i][2] /= f;
>       a[i][3] /= f;
>       a[i] += 4;
>     }
> }

LLVM for aarch64 with the above testcase:
```
.L3:
        ldr     x2, [x0]
        mov     x1, x2
        ldr     q31, [x2]
        fdiv    v31.4s, v31.4s, v0.4s
        str     q31, [x1], 16
        str     x1, [x0], 8 ;;;; HERE
        cmp     x3, x0
        bne     .L3
```

There is a store of x1 there.
I really think you messed up reducing the testcase.
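
For illustration only (this is not from the original report), here is a minimal C sketch of why that store cannot be dropped: `a[i] += 4` writes the advanced pointer back into the array, and the caller can observe it. The helper name `advance_after_call` and the `void` return type on `foo` are assumptions made so the sketch compiles cleanly.

```c
#include <assert.h>
#include <stddef.h>

/* Same loop body as the quoted testcase; the return type is changed to
   void here (an assumption) because the original returns nothing. */
static void foo(float **a, float f, int n)
{
    for (int i = 0; i < n; i++) {
        a[i][0] /= f;
        a[i][1] /= f;
        a[i][2] /= f;
        a[i][3] /= f;
        a[i] += 4;   /* stores the advanced pointer back into a[i] */
    }
}

/* Hypothetical helper: runs foo on one row and returns how far the
   pointer stored in the array moved, measured in floats. */
static ptrdiff_t advance_after_call(void)
{
    float row[8] = {2, 4, 6, 8, 0, 0, 0, 0};
    float *ptrs[1] = {row};

    foo(ptrs, 2.0f, 1);

    /* The four elements were divided in place. */
    assert(row[0] == 1.0f && row[1] == 2.0f);
    assert(row[2] == 3.0f && row[3] == 4.0f);

    /* The pointer held in the array was updated by the callee. */
    return ptrs[0] - row;
}
```

Calling `advance_after_call()` from a test or from `main` shows the pointer stored in the array moving by four elements, which is the memory write that the `str x1, [x0], 8` instruction implements.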