On Tue, 2 Aug 2022, Aldy Hernandez wrote:

> On Tue, Aug 2, 2022 at 1:59 PM Richard Biener <rguent...@suse.de> wrote:
> >
> > On Tue, 2 Aug 2022, Aldy Hernandez wrote:
> >
> > > On Tue, Aug 2, 2022 at 1:45 PM Richard Biener <rguent...@suse.de> wrote:
> > > >
> > > > On Tue, 2 Aug 2022, Aldy Hernandez wrote:
> > > >
> > > > > Unfortunately, this was before my time, so I don't know.
> > > > >
> > > > > That being said, thanks for tackling these issues that my work
> > > > > triggered last release.  Much appreciated.
> > > >
> > > > Ah.  But it was your r12-324-g69e5544210e3c0 that did
> > > >
> > > > -  else if (n_insns > 1)
> > > > +  else if (!m_speed_p && n_insns > 1)
> > > >
> > > > causing the breakage on the 12 branch.  That leads to a simpler
> > > > fix I guess.  Will re-test and also backport to GCC 12 if successful.
> > >
> > > Huh.  It's been a while, but that looks like a typo.  That patch was
> > > supposed to be non-behavior changing.
> >
> > Exactly my thinking so reverting it shouldn't be a reason for
> > detailed questions.  Now, the contains_hot_bb computation is,
> 
> Sorry for the pain.

So - actually the change was probably done on purpose (even if
reverting - which I've now already done - caused no testsuite regressions).
That's because the whole function is invoked N + 1 times for a path
of length N and we definitely want to avoid using the size optimization
heuristics when the path is not complete yet.  I think the proper
way is to do

diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
index ba114e98a41..6979398ef76 100644
--- a/gcc/tree-ssa-threadbackward.cc
+++ b/gcc/tree-ssa-threadbackward.cc
@@ -767,7 +767,11 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
      as in PR 78407 this leads to noticeable improvements.  */
   if (m_speed_p
       && ((taken_edge && optimize_edge_for_speed_p (taken_edge))
-         || contains_hot_bb))
+         || contains_hot_bb
+         /* Avoid using the size heuristics when not doing the final
+            thread evaluation, we get called for each added BB
+            to the path.  */
+         || !taken_edge))
     {
       if (n_insns >= param_max_fsm_thread_path_insns)
        {

thus assume there'll be a hot BB in the future.
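
To make the effect of the extra clause concrete, here is a toy model of
the condition in the hunk above.  It is only a sketch and not GCC code:
the PathQuery struct, its field names and use_speed_heuristics are all
hypothetical, and the booleans stand in for m_speed_p, the presence of
taken_edge, optimize_edge_for_speed_p (taken_edge) and contains_hot_bb.

```cpp
// Toy model of the patched condition; every name here is hypothetical.
struct PathQuery {
  bool speed_p;          // stands in for m_speed_p
  bool have_taken_edge;  // false while the path is still being extended
  bool taken_edge_hot;   // only meaningful when have_taken_edge is true
  bool contains_hot_bb;  // some block seen so far is hot
};

// Returns true when the speed limits, not the size heuristics, apply.
inline bool use_speed_heuristics(const PathQuery &q) {
  return q.speed_p
         && ((q.have_taken_edge && q.taken_edge_hot)
             || q.contains_hot_bb
             // Path not complete yet: assume a hot BB may still follow.
             || !q.have_taken_edge);
}
```

In this model an incomplete path with no hot block yet still takes the
speed branch, while a complete path with a cold taken edge and no hot
block falls through to the size heuristics.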

That said, the very best fix would be to not call this function
N + 1 times (I have a patch to call it only N times - yay), but
instead factor out parts to be called per BB plus keeping enough
state so we can incrementally collect info.
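
As a rough illustration of what "incrementally collect info" could look
like, the sketch below keeps a running instruction count and a hot-block
flag that are updated once per added block.  It is an assumption about
the shape of such a refactoring, not the actual patch: Block, PathState,
add_block and within_limit are hypothetical names, and max_insns stands
in for param_max_fsm_thread_path_insns.

```cpp
// Hypothetical stand-in for a basic block; not GCC's basic_block.
struct Block {
  int n_insns;  // instructions this block would contribute to the path
  bool hot;     // whether the block is considered hot
};

// State kept across calls so each added block is inspected only once.
struct PathState {
  int n_insns = 0;
  bool contains_hot_bb = false;

  // Per-BB step: fold one more block into the accumulated info.
  void add_block(const Block &bb) {
    n_insns += bb.n_insns;
    contains_hot_bb = contains_hot_bb || bb.hot;
  }

  // Cheap check usable after every step; mirrors the
  // n_insns >= limit rejection in the hunk above.
  bool within_limit(int max_insns) const { return n_insns < max_insns; }
};
```

The final, complete-path evaluation would then read n_insns and
contains_hot_bb from this state instead of walking the whole path again.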

There are more "odd" things in the backward threader, of course :/

I'm looking for things applicable to the GCC 12 branch right now
so will try the above.

Richard.
