https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72517
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
As for the cost model, the vectorizer uses vec_to_scalar for element extraction, which it was not originally added for (it was added for the cost of extracting element zero only).

Ok, so I can reproduce the regression (-Ofast -march=native):

436.cactusADM     11950    304    39.3 *    11950    420    28.5 *

The function that regresses is bench_staggeredleapfrog2_ (that was probably obvious). The set of vectorized loops / BBs does _not_ change with the patch.

We do a lot less work in dependence analysis after the patch, as no read-read dependences are computed, which unfortunately had the side-effect of mostly disabling STMT_VINFO_SAME_ALIGN_REFS computation, which causes us to peel for alignment for a different DR. I'll revert that part of the change.
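
For readers unfamiliar with peeling for alignment, here is a minimal sketch of the idea in plain C. It is not GCC code: the helper names peel_count and same_alignment and the 32-byte vector width are assumptions made for illustration only. It shows how many scalar iterations must be peeled to align one data reference, and how to tell whether a second reference shares that misalignment and so becomes aligned by the same peel.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed vector width in bytes (e.g. AVX); illustrative only. */
#define VEC_BYTES 32u

/* Number of scalar iterations to peel so that p + peel is
   VEC_BYTES-aligned.  Assumes p is at least aligned to
   sizeof(double), so the misalignment is a whole number of elements. */
static size_t peel_count(const double *p)
{
    uintptr_t mis = (uintptr_t)p % VEC_BYTES;
    return mis ? (VEC_BYTES - mis) / sizeof(double) : 0;
}

/* Nonzero if both references have the same misalignment, i.e.
   peeling to align one of them also aligns the other. */
static int same_alignment(const double *a, const double *b)
{
    return (uintptr_t)a % VEC_BYTES == (uintptr_t)b % VEC_BYTES;
}
```

If the information about which references share an alignment is missing, each reference looks as if peeling for it would align only itself. A different reference can then be chosen for peeling, which would be consistent with the behavior change described above.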