https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89992

--- Comment #2 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> It's simply that inlining makes the guessed profile not consider the loop
> worth optimizing for speed.  Part of that is because the loop ends up in
> main() which we know is executed exactly once and bb->count is less than the
> entry block count so we hit
> 
> maybe_hot_count_p (struct function *fun, profile_count count)
> {
> ...
>       if (node->frequency == NODE_FREQUENCY_EXECUTED_ONCE
>           && count < (ENTRY_BLOCK_PTR_FOR_FN (fun)->count.apply_scale (2, 3)))
>         return false;
> 
> this is probably due to predictors saying that
> 
>   if (__eax <= 6)
>     return 0; // return from main
> 
> is likely (it gets 66% hit predicted).  The foo() != 0 gets even probability
> and the following == 230 test gets only 11% probability to hit.
> 
> The "fun" of static profile... (and doing benchmarking in main()).
> 
> But it doesn't have anything to do with the vectorizer or calls.

As Richi says, the static probability of calling 'do_test' in main is 3.8%. You
can use __builtin_expect or __builtin_expect_with_probability if you want to
make that path more probable.
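
A minimal sketch of how such hints might look, assuming a GCC recent enough
to provide __builtin_expect_with_probability (GCC 9 or later). The function
names, the 'selector' parameter, and the 0.9 probability are hypothetical and
only stand in for the testcase's early return and '== 230' check; this is not
the reporter's actual code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the benchmarked kernel: a simple summing loop. */
static long do_test(const int *data, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

/* Hypothetical driver mirroring the shape of the checks discussed above. */
long run_benchmark(int selector, const int *data, size_t n)
{
    /* Tell the compiler the early return is NOT the expected path
       (the static predictors would otherwise guess it is likely). */
    if (__builtin_expect(selector <= 6, 0))
        return 0;

    /* Tell the compiler the equality test is expected to hold with an
       assumed 90% probability, so the do_test path is treated as hot. */
    if (__builtin_expect_with_probability(selector == 230, 1, 0.9))
        return do_test(data, n);

    return -1;
}
```

The builtins only adjust the guessed profile; they do not change what the code
computes. Whether the loop then gets optimized for speed still depends on the
rest of the profile estimate, so checking the dump files is advisable.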