https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997
--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> --- (In reply to hubicka from comment #19) > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997 > > > > --- Comment #18 from Aldy Hernandez <aldyh at gcc dot gnu.org> --- > > > > > If I read it correctly, for a path that enters the loop and later leaves > > > it (where threading is desirable since we skip the whole loop) the logic > > > above will still return true (after finishing the whole walk which seems > > > like a waste). > > > > Perhaps Richi can comment here, since this is his code? > > The testcase would be > > void test () > { > int i; > if (test()) > i=0; > else > i=100; > for (;i<100 && bar ();i++) > do_work (); > } > > Here I think we can thread else path to return w/o any issues with loop > nests even with -Os. we fail this because // If we crossed a loop into an outer loop without crossing the // latch, this is just an early exit from the loop. if (loops_crossed == 1 && !crossed_latch && flow_loop_nested_p (exit->dest->loop_father, exit->src->loop_father)) return false; does not trigger (loops_crossed == 2), and we then run into if (loops_crossed) { cancel_thread (&path, "Path crosses loops"); return true; } but the loops_crossed stuff is old and not from me, not sure what that's useful for. We could "fix" the accounting with diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c index 8aac733ac25..52ec22e85e1 100644 --- a/gcc/tree-ssa-threadupdate.c +++ b/gcc/tree-ssa-threadupdate.c @@ -2810,7 +2810,7 @@ jt_path_registry::cancel_invalid_paths (vec<jump_thread_edge *> &path) // If we crossed a loop into an outer loop without crossing the // latch, this is just an early exit from the loop. - if (loops_crossed == 1 + if (loops_crossed <= 2 && !crossed_latch && flow_loop_nested_p (exit->dest->loop_father, exit->src->loop_father)) return false; to also allow to thread through a loop path not crossing the latch but at least for the issue of "breaking loops" the loops_crossed stuff shouldn't be necessary. It might still prevent us from doing such transforms when it involves peeling. > > > > > > > > This may trigger more often at -Os since we limit loop header copying. > > > > Interesting. We have PR102906 which is a code size increase at -Os on ARM > > for > > the same commit. I wonder if it's related. > > > > > > > > And indeed, fixing profile updating would be nice. Why the updating > > > code is not reused across different threaders? (I wrote several thread > > > updating functions for varioius threaders introduced & remoed in the > > > past and I wonder why we need to keep reinventing it) > > > > This all predates me. I don't know. For the record, both threaders use > > completely different BB copiers and update mechanisms. It could be they're > > sufficiently different that reusing it is unfeasible ??. > > Updating after trheading is not too hard however there is a painful > point that profile may disagree with fact you proved. In general you do > 1) you compute count of path your are threading (i.e. all edges you > redirect to destination) > 2) all BBs along the path needs to have this count subtracted since > they are not going to be executed on it anymore > 3) all conditionas on the path needs probabilities updated: you know > the count entering the conditional and with same count you exit it via > a particular edge. > However here one needs to take into account that especially for guessed > profiles (but for feedback too if code was duplicated earlier) the > branch probability may be in a way that the leaving edge count is less > than the count you want to subtract. > > Honza