Feel free to blame me for everything except the profitability code and the generic block copier. That stuff was all there before and I mostly avoided it.
:-)

Thanks for the work in this space.
Aldy

On Tue, Aug 2, 2022, 15:29 Richard Biener <rguent...@suse.de> wrote:

> On Tue, 2 Aug 2022, Aldy Hernandez wrote:
>
> > On Tue, Aug 2, 2022 at 1:59 PM Richard Biener <rguent...@suse.de> wrote:
> > >
> > > On Tue, 2 Aug 2022, Aldy Hernandez wrote:
> > >
> > > > On Tue, Aug 2, 2022 at 1:45 PM Richard Biener <rguent...@suse.de> wrote:
> > > > >
> > > > > On Tue, 2 Aug 2022, Aldy Hernandez wrote:
> > > > >
> > > > > > Unfortunately, this was before my time, so I don't know.
> > > > > >
> > > > > > That being said, thanks for tackling these issues that my work
> > > > > > triggered last release.  Much appreciated.
> > > > >
> > > > > Ah.  But it was your r12-324-g69e5544210e3c0 that did
> > > > >
> > > > > -  else if (n_insns > 1)
> > > > > +  else if (!m_speed_p && n_insns > 1)
> > > > >
> > > > > causing the breakage on the 12 branch.  That leads to a simpler
> > > > > fix I guess.  Will re-test and also backport to GCC 12 if successful.
> > > >
> > > > Huh.  It's been a while, but that looks like a typo.  That patch was
> > > > supposed to be non-behavior changing.
> > >
> > > Exactly my thinking so reverting it shouldn't be a reason for
> > > detailed questions.  Now, the contains_hot_bb computation is,
> >
> > Sorry for the pain.
>
> So - actually the change was probably done on purpose (even if
> reverting - which I've now already done - caused no testsuite regressions).
> That's because the whole function is invoked N + 1 times for a path
> of length N and we definitely want to avoid using the size optimization
> heuristics when the path is not complete yet.  I think the proper
> way is to do
>
> diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
> index ba114e98a41..6979398ef76 100644
> --- a/gcc/tree-ssa-threadbackward.cc
> +++ b/gcc/tree-ssa-threadbackward.cc
> @@ -767,7 +767,11 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
>       as in PR 78407 this leads to noticeable improvements.  */
>    if (m_speed_p
>        && ((taken_edge && optimize_edge_for_speed_p (taken_edge))
> -         || contains_hot_bb))
> +         || contains_hot_bb
> +         /* Avoid using the size heuristics when not doing the final
> +            thread evaluation, we get called for each added BB
> +            to the path.  */
> +         || !taken_edge))
>      {
>        if (n_insns >= param_max_fsm_thread_path_insns)
>         {
>
> thus assume there'll be a hot BB in the future.
>
> That said, the very best fix would be to not call this function
> N + 1 times (I have a patch to call it only N times - yay), but
> instead factor out parts to be called per BB plus keeping enough
> state so we can incrementally collect info.
>
> There's more "odd" things in the backward threader, of course :/
>
> I'm looking for things applicable to the GCC 12 branch right now
> so will try the above.
>
> Richard.
>
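
To make the two ideas in the quoted message concrete, here is a minimal sketch in plain C++: the condition from the diff, and the "per BB plus state" accumulation suggested as the better long-term fix. It does not use GCC's real types. The names block_info, path_state and use_speed_limits are hypothetical and exist only for illustration, and the boolean has_taken_edge stands in for the taken_edge pointer being non-null.

```cpp
// Hypothetical stand-in for a basic block; this is NOT GCC's basic_block.
struct block_info
{
  int n_insns;  // statements that threading would have to copy
  bool hot;     // block is being optimized for speed
};

// Hypothetical per-path state, updated once for each block added to the
// path instead of being recomputed from scratch on every call.
struct path_state
{
  int n_insns = 0;
  bool contains_hot_bb = false;

  void add_block (const block_info &bb)
  {
    n_insns += bb.n_insns;
    contains_hot_bb = contains_hot_bb || bb.hot;
  }
};

// Mirrors the condition in the quoted diff.  has_taken_edge is false while
// the path is still being extended; in that case the speed limits are used
// so the size heuristics cannot reject a path that is not complete yet.
bool use_speed_limits (bool speed_p, bool has_taken_edge,
                       bool taken_edge_hot, const path_state &st)
{
  return speed_p
         && ((has_taken_edge && taken_edge_hot)
             || st.contains_hot_bb
             || !has_taken_edge);
}
```

With a path made only of cold blocks, use_speed_limits returns true for every intermediate call (no taken edge yet) and false only on the final call with a cold taken edge, which is the behavior the diff aims for.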