On Tue, 2 Aug 2022, Aldy Hernandez wrote:

> On Tue, Aug 2, 2022 at 1:59 PM Richard Biener <rguent...@suse.de> wrote:
> >
> > On Tue, 2 Aug 2022, Aldy Hernandez wrote:
> >
> > > On Tue, Aug 2, 2022 at 1:45 PM Richard Biener <rguent...@suse.de> wrote:
> > > >
> > > > On Tue, 2 Aug 2022, Aldy Hernandez wrote:
> > > >
> > > > > Unfortunately, this was before my time, so I don't know.
> > > > >
> > > > > That being said, thanks for tackling these issues that my work
> > > > > triggered last release. Much appreciated.
> > > >
> > > > Ah. But it was your r12-324-g69e5544210e3c0 that did
> > > >
> > > > -  else if (n_insns > 1)
> > > > +  else if (!m_speed_p && n_insns > 1)
> > > >
> > > > causing the breakage on the 12 branch. That leads to a simpler
> > > > fix I guess. Will re-test and also backport to GCC 12 if successful.
> > >
> > > Huh. It's been a while, but that looks like a typo. That patch was
> > > supposed to be non-behavior changing.
> >
> > Exactly my thinking so reverting it shouldn't be a reason for
> > detailed questions. Now, the contains_hot_bb computation is,
>
> Sorry for the pain.

So - actually the change was probably done on purpose (even if
reverting - which I've now already done - caused no testsuite
regressions). That's because the whole function is invoked N + 1
times for a path of length N and we definitely want to avoid using
the size optimization heuristics when the path is not complete yet.
I think the proper way is to do

diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
index ba114e98a41..6979398ef76 100644
--- a/gcc/tree-ssa-threadbackward.cc
+++ b/gcc/tree-ssa-threadbackward.cc
@@ -767,7 +767,11 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
      as in PR 78407 this leads to noticeable improvements. */
   if (m_speed_p
       && ((taken_edge && optimize_edge_for_speed_p (taken_edge))
-          || contains_hot_bb))
+          || contains_hot_bb
+          /* Avoid using the size heuristics when not doing the final
+             thread evaluation, we get called for each added BB
+             to the path. */
+          || !taken_edge))
     {
       if (n_insns >= param_max_fsm_thread_path_insns)
         {

thus assume there'll be a hot BB in the future.

That said, the very best fix would be to not call this function
N + 1 times (I have a patch to call it only N times - yay), but
instead factor out parts to be called per BB plus keeping enough
state so we can incrementally collect info.

There's more "odd" things in the backward threader, of course :/

I'm looking for things applicable to the GCC 12 branch right now
so will try the above.

Richard.
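
For illustration only, the effect of the extra "|| !taken_edge" clause can
be sketched as a standalone toy model. This is not GCC code: PathQuery and
both helper functions are hypothetical names, and plain bools stand in for
m_speed_p, taken_edge, optimize_edge_for_speed_p (taken_edge) and
contains_hot_bb. It only shows which heuristic branch would be selected
while the path is still being built versus on the final evaluation.

```cpp
#include <cassert>

// Hypothetical stand-in for the inputs of the profitability check.
// None of these are GCC types; each bool models one operand of the
// condition in the patch.
struct PathQuery
{
  bool speed_p;          // models m_speed_p
  bool has_taken_edge;   // models taken_edge != NULL (final evaluation)
  bool taken_edge_hot;   // models optimize_edge_for_speed_p (taken_edge)
  bool contains_hot_bb;  // models contains_hot_bb
};

// Condition as it stands before the patch.
constexpr bool
use_speed_heuristics_old (const PathQuery &q)
{
  return q.speed_p
	 && ((q.has_taken_edge && q.taken_edge_hot) || q.contains_hot_bb);
}

// Condition with the added "|| !taken_edge" clause: a path that is
// still being extended is treated as if a hot BB may yet be added.
constexpr bool
use_speed_heuristics_new (const PathQuery &q)
{
  return q.speed_p
	 && ((q.has_taken_edge && q.taken_edge_hot)
	     || q.contains_hot_bb
	     || !q.has_taken_edge);
}

// Small usage example contrasting an incomplete path with a final one.
inline void
demo ()
{
  PathQuery partial = { true, false, false, false };   // path not complete
  PathQuery final_cold = { true, true, false, false }; // complete, all cold

  // Incomplete path: the old condition falls through to the size
  // heuristics, the new one stays on the speed heuristics.
  assert (!use_speed_heuristics_old (partial));
  assert (use_speed_heuristics_new (partial));

  // Final evaluation of a cold path: both use the size heuristics.
  assert (!use_speed_heuristics_old (final_cold));
  assert (!use_speed_heuristics_new (final_cold));
}
```

In this model the two conditions differ only when there is no taken edge
yet, which is the case the patch is meant to cover.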