https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200
--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, I see with -mavx2

        addq    (%r9), %rax
        jns     .L90
.L90:
        je      .L92
        cmpl    $2, 24(%rdx)
        je      .L91

thus there is no extra cmpq $0, %rdi in the predecessor.

Note when I profile avx (base) vs. avx2 (peak) I see

  18.17%  451662  mcf_base.amd64-  mcf_base.amd64-m64-gcc42-nn  [.] refresh_potential
  18.12%  424592  mcf_base.amd64-  mcf_base.amd64-m64-gcc42-nn  [.] primal_bea_mpp
  17.96%  465325  mcf_peak.amd64-  mcf_peak.amd64-m64-gcc42-nn  [.] primal_bea_mpp
  14.93%  373309  mcf_peak.amd64-  mcf_peak.amd64-m64-gcc42-nn  [.] refresh_potential

plus a 3-run of avx (base) vs. avx2 (peak) gives me

429.mcf   9120   252   36.1 *   9120   264   34.6 S
429.mcf   9120   257   35.5 S   9120   253   36.0 S
429.mcf   9120   232   39.3 S   9120   258   35.4 *

which isn't really conclusive.

If you are trying to narrow down a regression GCC 6 vs. GCC 7 I wouldn't look
at flags but at profiling and what changed.
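
A minimal sketch of why the three runs are hard to read as a win or a loss. It assumes that the third column of each 429.mcf line is the run time in seconds and uses max minus min as a crude noise estimate; both the interpretation and the helper name `summarize` are assumptions made for illustration, not part of the SPEC tooling.

```python
from statistics import median

# Run times in seconds, copied from the 429.mcf lines quoted above.
base_avx = [252, 257, 232]    # avx (base)
peak_avx2 = [264, 253, 258]   # avx2 (peak)


def summarize(times):
    """Return (median, spread), where spread is max minus min (hypothetical helper)."""
    return median(times), max(times) - min(times)


base_median, base_spread = summarize(base_avx)
peak_median, peak_spread = summarize(peak_avx2)

# Difference between the two medians; positive means peak was slower.
median_gap = peak_median - base_median

print(f"base (avx):  median {base_median} s, spread {base_spread} s")
print(f"peak (avx2): median {peak_median} s, spread {peak_spread} s")
print(f"median gap:  {median_gap} s")
```

Under these assumptions the gap between the two medians is smaller than the run-to-run spread of the base runs alone, which is consistent with calling the result inconclusive.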