https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, I see with -mavx2

        addq    (%r9), %rax
        jns     .L90

.L90:
        je      .L92
        cmpl    $2, 24(%rdx)
        je      .L91

thus there is no extra cmpq $0, %rdi in the predecessor.

Note that when I profile avx (base) vs. avx2 (peak) I see

 18.17%        451662  mcf_base.amd64-  mcf_base.amd64-m64-gcc42-nn  [.] refresh_potential
 18.12%        424592  mcf_base.amd64-  mcf_base.amd64-m64-gcc42-nn  [.] primal_bea_mpp
 17.96%        465325  mcf_peak.amd64-  mcf_peak.amd64-m64-gcc42-nn  [.] primal_bea_mpp
 14.93%        373309  mcf_peak.amd64-  mcf_peak.amd64-m64-gcc42-nn  [.] refresh_potential

plus a 3-run comparison of avx (base) vs. avx2 (peak) gives me

429.mcf          9120        252       36.1 *    9120        264       34.6 S
429.mcf          9120        257       35.5 S    9120        253       36.0 S
429.mcf          9120        232       39.3 S    9120        258       35.4 *

which isn't really conclusive.
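
To put a number on "not conclusive", here is a minimal sketch that compares the median run time of each configuration with the run-to-run spread. It assumes the third column of each half of the 429.mcf lines is the run time in seconds (the fourth being the ratio); the helper name summarize is made up for illustration.

```python
from statistics import median

# Run times in seconds, copied from the three 429.mcf runs above
# (assumption: third column of each half is the run time).
base_times = [252, 257, 232]   # avx (base)
peak_times = [264, 253, 258]   # avx2 (peak)

def summarize(times):
    """Return (median, spread), where spread is max minus min."""
    return median(times), max(times) - min(times)

base_median, base_spread = summarize(base_times)
peak_median, peak_spread = summarize(peak_times)

# Difference between the two medians, to be compared with the noise.
median_gap = peak_median - base_median

print(f"base: median {base_median} s, spread {base_spread} s")
print(f"peak: median {peak_median} s, spread {peak_spread} s")
print(f"median gap (peak - base): {median_gap} s")
```

With these three runs the gap between the medians is smaller than the spread of the base runs alone, so the difference is within the noise.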

If you are trying to narrow down a regression between GCC 6 and GCC 7, I
wouldn't look at flags but at profiles and at what changed between them.
