On 15/04/2025 19:48, Jan Hubicka wrote: > Hi, > > gcc/ChangeLog: > > > > PR tree-optimization/117790 > > * tree-vect-loop.cc (scale_profile_for_vect_loop): Use > > scale_loop_profile_hold_exit_counts instead of scale_loop_profile. Drop > > the exit edge parameter, since the code now handles multiple exits. > > Adjust the caller ... > > (vect_transform_loop): ... here. > > > >gcc/testsuite/ChangeLog: > > > > PR tree-optimization/117790 > > * gcc.dg/vect/vect-early-break-profile-2.c: New test. > > > > > >- if (entry_count.nonzero_p ()) > >- set_edge_probability_and_rescale_others > >- (exit_e, > >- entry_count.probability_in (loop->header->count / vf)); > >- /* Avoid producing very large exit probability when we do not have > >- sensible profile. */ > >- else if (exit_e->probability < profile_probability::always () / (vf * 2)) > > This is handling relatively common case wehre we decide to vectorize > loop with, say, factor of 32 and have no profile-feedback. > In this case if the loop trip count is unknown at early loop, we will > esitmate it to iterate few times (approx 3-5 as that is average > iteration count of random loop based on some measurements). > > The fact that we want to vecotirze by factor 32 implies that vectorizer > does not take this info seriously and its heuristics thinks better. > In this case we do not wan to drop the loop to 0 iteraitons as that > would result of poor code layout and regalloc. > > I don't think you kept this logic in the new code?
To be honest, I didn't really follow the logic here. Thinking about the single-exit case (which the current code is designed to handle), both the body of the if and the else if logically implement the same update: P'(exit) = vf * P(exit) I suppose the code in the body of the if is just a more numerically stable way of doing this? Also, AFAICT, the condition of the else if is trying to make sure that: vf * P(exit) < 1/2 i.e. P'(exit) < 1/2 so it ensures we don't inflate the exit probably above 1/2 (equivalently we don't reduce the expected niters below 2). But I don't understand why this guard isn't applied to the whole if-else chain. Why should we allow inflating the exit probability above 1/2 if entry_count.nonzero_p (), but otherwise not? I dropped this logic in the 4/4 patch since it seemed inconsistent, and in general the updating of exit probabilities is already handled by scale_loop_profile_hold_exit_counts. Can you explain the intent behind the current logic, so I know what behaviour we want to preserve? A concrete testcase would be useful, too. FWIW, I tried doing a testsuite run on aarch64 with the following patch: diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 3de1ea646b4..71787301e6d 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -12092,6 +12092,9 @@ scale_profile_for_vect_loop (class loop *loop, edge exit_e, unsigned vf, bool fl sensible profile. */ else if (exit_e->probability < profile_probability::always () / (vf * 2)) set_edge_probability_and_rescale_others (exit_e, exit_e->probability * vf); + else + gcc_unreachable (); + on top of current trunk, and the unreachable wasn't hit once. Thanks, Alex > > Honza > >- set_edge_probability_and_rescale_others (exit_e, exit_e->probability * > >vf); > >- loop->latch->count = single_pred_edge (loop->latch)->count (); > >- > >- scale_loop_profile (loop, profile_probability::always () / vf, > >- get_likely_max_loop_iterations_int (loop)); > >+ const auto likely_max_niters = get_likely_max_loop_iterations_int (loop); > >+ scale_loop_profile_hold_exit_counts (loop, > >+ profile_probability::always () / vf, > >+ likely_max_niters); >