https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90911
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2019-06-25
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
     Ever confirmed|0                             |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
So looking at perf the reason seems obvious:

Samples: 717K of event 'cycles:pu', Event count (approx.): 586330968682
Overhead  Command          Shared Object                  Symbol
  60.74%  hmmer_base.amd6  hmmer_base.amd64-m64-gcc42-nn  [.] P7Viterbi
  31.35%  hmmer_base.amd6  hmmer_base.amd64-m64-gcc42-nn  [.] P7Viterbi.cold
   2.32%  hmmer_base.amd6  hmmer_base.amd64-m64-gcc42-nn  [.] FChoose
   2.02%  hmmer_base.amd6  hmmer_base.amd64-m64-gcc42-nn  [.] sre_random

and caused by

  /* ???  if-conversion uses profile_probability::always () but
     prob below is profile_probability::likely ().  */

thus we keep the profile from if-conversion which uses always () compared
to previously versioning the loop with the vectorized path being only
likely ().

The inconsistent profile from if-conversion persists and seems to confuse
us later:

  /* At this point we invalidate porfile confistency until IFN_LOOP_VECTORIZED
     is re-merged in the vectorizer.  */
  new_loop = loop_version (loop, cond, &cond_bb,
                           profile_probability::always (),
                           profile_probability::always (),
                           profile_probability::always (),
                           profile_probability::always (), true);

Just updating the edge probability of the guard seems to avoid creating
the bogus hot/cold partitioning and should not affect further copying
from the scalar loop from prologue/epilogue peeling.