On 15/04/2025 19:48, Jan Hubicka wrote:
> Hi,
> > gcc/ChangeLog:
> > 
> >     PR tree-optimization/117790
> >     * tree-vect-loop.cc (scale_profile_for_vect_loop): Use
> >     scale_loop_profile_hold_exit_counts instead of scale_loop_profile.  Drop
> >     the exit edge parameter, since the code now handles multiple exits.
> >     Adjust the caller ...
> >     (vect_transform_loop): ... here.
> >
> >gcc/testsuite/ChangeLog:
> >
> >     PR tree-optimization/117790
> >     * gcc.dg/vect/vect-early-break-profile-2.c: New test.
> >
> >
> >-  if (entry_count.nonzero_p ())
> >-    set_edge_probability_and_rescale_others
> >-        (exit_e,
> >-         entry_count.probability_in (loop->header->count / vf));
> >-  /* Avoid producing very large exit probability when we do not have
> >-     sensible profile.  */
> >-  else if (exit_e->probability < profile_probability::always () / (vf * 2))
> 
> This is handling relatively common case wehre we decide to vectorize
> loop with, say, factor of 32 and have no profile-feedback.
> In this case if the loop trip count is unknown at early loop, we will
> esitmate it to iterate few times (approx 3-5 as that is average
> iteration count of random loop based on some measurements).
> 
> The fact that we want to vecotirze by factor 32 implies that vectorizer
> does not take this info seriously and its heuristics thinks better.
> In this case we do not wan to drop the loop to 0 iteraitons as that
> would result of poor code layout and regalloc.
> 
> I don't think you kept this logic in the new code?

To be honest, I didn't really follow the logic here.  Thinking about the
single-exit case (which the current code is designed to handle), both
the body of the if and the else if logically implement the same update:

  P'(exit) = vf * P(exit)

I suppose the code in the body of the if is just a more numerically stable way
of doing this?

Also, AFAICT, the condition of the else if is trying to make sure that:

  vf * P(exit) < 1/2
  i.e.
  P'(exit) < 1/2

so it ensures we don't inflate the exit probably above 1/2 (equivalently
we don't reduce the expected niters below 2).  But I don't understand
why this guard isn't applied to the whole if-else chain.  Why should we
allow inflating the exit probability above 1/2 if entry_count.nonzero_p (),
but otherwise not?

I dropped this logic in the 4/4 patch since it seemed
inconsistent, and in general the updating of exit probabilities
is already handled by scale_loop_profile_hold_exit_counts.

Can you explain the intent behind the current logic, so I know what
behaviour we want to preserve?  A concrete testcase would be useful,
too.  FWIW, I tried doing a testsuite run on aarch64 with the following patch:

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 3de1ea646b4..71787301e6d 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -12092,6 +12092,9 @@ scale_profile_for_vect_loop (class loop *loop, edge 
exit_e, unsigned vf, bool fl
      sensible profile.  */
   else if (exit_e->probability < profile_probability::always () / (vf * 2))
     set_edge_probability_and_rescale_others (exit_e, exit_e->probability * vf);
+  else
+    gcc_unreachable ();
+

on top of current trunk, and the unreachable wasn't hit once.

Thanks,
Alex

> 
> Honza
> >-    set_edge_probability_and_rescale_others (exit_e, exit_e->probability * 
> >vf);
> >-  loop->latch->count = single_pred_edge (loop->latch)->count ();
> >-
> >-  scale_loop_profile (loop, profile_probability::always () / vf,
> >-                  get_likely_max_loop_iterations_int (loop));
> >+  const auto likely_max_niters = get_likely_max_loop_iterations_int (loop);
> >+  scale_loop_profile_hold_exit_counts (loop,
> >+                                   profile_probability::always () / vf,
> >+                                   likely_max_niters);
> 

Reply via email to