[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

rguenth at gcc dot gnu.org via Gcc-bugs Mon, 08 Nov 2021 01:50:36 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997


--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to hubicka from comment #19)
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997
> > 
> > --- Comment #18 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
> > 
> > > If I read it correctly, for a path that enters the loop and later leaves
> > > it (where threading is desirable since we skip the whole loop) the logic
> > > above will still return true (after finishing the whole walk which seems
> > > like a waste).
> > 
> > Perhaps Richi can comment here, since this is his code?
> 
> The testcase would be
> 
> void test ()
> {
>         int i;
>         if (test())
>           i=0;
>         else
>           i=100;
>         for (;i<100 && bar ();i++)
>                 do_work ();
> }
> 
> Here I think we can thread else path to return w/o any issues with loop
> nests even with -Os.

we fail this because

  // If we crossed a loop into an outer loop without crossing the
  // latch, this is just an early exit from the loop.
  if (loops_crossed == 1
      && !crossed_latch
      && flow_loop_nested_p (exit->dest->loop_father, exit->src->loop_father))
    return false;

does not trigger (loops_crossed == 2), and we then run into


  if (loops_crossed)
    {
      cancel_thread (&path, "Path crosses loops");
      return true;
    }

but the loops_crossed stuff is old and not from me, not sure what that's
useful for.  We could "fix" the accounting with

diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 8aac733ac25..52ec22e85e1 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2810,7 +2810,7 @@ jt_path_registry::cancel_invalid_paths
(vec<jump_thread_edge *> &path)

   // If we crossed a loop into an outer loop without crossing the
   // latch, this is just an early exit from the loop.
-  if (loops_crossed == 1
+  if (loops_crossed <= 2
       && !crossed_latch
       && flow_loop_nested_p (exit->dest->loop_father, exit->src->loop_father))
     return false;

to also allow to thread through a loop path not crossing the latch but
at least for the issue of "breaking loops" the loops_crossed stuff shouldn't
be necessary.  It might still prevent us from doing such transforms when
it involves peeling.

> > 
> > > 
> > > This may trigger more often at -Os since we limit loop header copying.
> > 
> > Interesting.  We have PR102906 which is a code size increase at -Os on ARM 
> > for
> > the same commit.  I wonder if it's related.
> > 
> > > 
> > > And indeed, fixing profile updating would be nice.  Why the updating
> > > code is not reused across different threaders?  (I wrote several thread
> > > updating functions for varioius threaders introduced & remoed in the
> > > past and I wonder why we need to keep reinventing it)
> > 
> > This all predates me.  I don't know.  For the record, both threaders use
> > completely different BB copiers and update mechanisms.  It could be they're
> > sufficiently different that reusing it is unfeasible ??.
> 
> Updating after trheading is not too hard however there is a painful
> point that profile may disagree with fact you proved. In general you do
>  1) you compute count of path your are threading (i.e. all edges you
>  redirect to destination)
>  2) all BBs along the path needs to have this count subtracted since
>  they are not going to be executed on it anymore
>  3) all conditionas on the path needs probabilities updated: you know
>  the count entering the conditional and with same count you exit it via
>  a particular edge.
>  However here one needs to take into account that especially for guessed
>  profiles (but for feedback too if code was duplicated earlier) the
>  branch probability may be in a way that the leaving edge count is less
>  than the count you want to subtract.
> 
> Honza

[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

Reply via email to