Sender:Jan Hubicka <hubi...@ucw.cz>
Sent at:2018 Nov 5 (Mon) 22:21
To:Richard Biener <richard.guent...@gmail.com>
Cc:bin.cheng <bin.ch...@linux.alibaba.com>; GCC Patches 
<gcc-patches@gcc.gnu.org>
Subject:Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count
 
> 
> > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.ch...@linux.alibaba.com> 
> > wrote:
> > >
> > > Hi,
> > > In new profile probability/count infra, we have different precision 
> > > quality categories,
> > > and probabilities/counts of different categories are not supposed to be 
> > > compared or
> > > calculated.  Though in general is an improvement, it introduces 
> > > unexpected behavior.
> > > Specifically, class profile_probablity and profile_count themselves are 
> > > implemented
> > > by comparing probabilities/counts against profile_count::zero().  while 
> > > zero() is of
> > > profile_precision category, it's always compared different to zero of 
> > > other precision
> > > categories including afdo.
> > >
> > > I can see two ways fixing this: 1) Treat zero as a common 
> > > probability/count regardless
> > > of its category; 2) Provide an "is_zero" method rather than relying on 
> > > "==" comparison
> > > against probability_count::zero().  2) requires lots of code changes so I 
> > > went with 1)
> > > in this patch set.  This patch doesn't handle "always" but it might be.
> > >
> > > This patch also corrects a minor issue where we try to invert an 
> > > uninitialized value.
> > >
> > > Bootstrap and test on x86_64 in patch set.  Is it OK?
> > 
> > I'll defer on the emit_store_flag_force change, likewise for the zero
> > handling in
> > compares - I don't think zeros of different qualities should compare equal.
> > Would compares against ::always() not have the very same issue?
> > Likewise ::even(),
> > ::likely(), etc.?  Those always get guessed quality.
> > 
> > The invert change looks OK to me.  The related change to the always() API 
> > would
> > suggest to replace guessed_always() with always (guessed) and also do 
> > similar
> > changes throughout the whole API...
> > 
> > Honza?
> 
> The zeros are really differenct zeros.  profile_count::zero makes us to
> drop the basic block into cold section because we know that it won't be
> executed in normal run of program (either we have accurate profile
> feedback or by proving that the program is on way to crash or user
> annotated cold section).  Having guessed zero or auto-fdo zero won't
> make us to do such agressive size optimization. 
> This is important since those zeros relatively commonly happens by
> accident and thus if we dropped all the code to cold section the cold
> section would be visited relativel often during execution of program
> which would eliminate its need.
> 
> Most comparsion in profile-count.h which goes agains profile_count==zero
> are realy intended to pass only for this "aboslute zero". They bypass
> the precision adjusmtents which normally happen when you merge values
> of different precision. 
> 
> What kind of unexpected behaviour are you seeing?
> We already have nonzero_p which is what we use when we want to know that
> count is non-zero in some sense of precision.
Hi Honza,
Sorry for letting this slip away.  So in case of AutoFDO, due to the nature
of sampling, lots of funcs/bbs are annotated with zero profile_count in afdo
precision, and we have checks against zero profile_count in precise precision
All these checks end up with false and cause issues.  Take the code in
update_profiling_info as an example:

update_profiling_info (struct cgraph_node *orig_node,
                       struct cgraph_node *new_node)
{
   struct cgraph_edge *cs;
   struct caller_statistics stats;
   profile_count new_sum, orig_sum;
   profile_count remainder, orig_node_count = orig_node->count;

   if (!(orig_node_count.ipa () > profile_count::zero ()))
     return;
   //...
   for (cs = new_node->callees; cs; cs = cs->next_callee)
     cs->count = cs->count.apply_scale (new_sum, orig_node_count);

Since we also have below code in profile_count::operator>,
      if (other == profile_count::zero ())
        return !(*this == profile_count::zero ());

If orig_node_count is afdo zero, the above zero check for orig_node_count
returns false, we end up with passing zero density to apply_scale issue and
asserting.

In this updated patch, I restrcited changes only to profile_count::operator
<, >, <= and >=.  Plus, I think there is a latent typo in operator>= because
current code return TRUE if '*this' is precise zero and 'other' is precise
non-zero.
@@ -879,7 +879,7 @@ public:
       if (other == profile_count::zero ())
        return true;
       if (*this == profile_count::zero ())
-       return !(other == profile_count::zero ());
+       return !other.nonzero_p ();

Bootstrap and test on x86_64 along with other patches.

Thanks,
bin

2018-11-19  Bin Cheng  <bin.ch...@linux.alibaba.com>

        * profile-count.h (profile_count::operator<, >, <=): Check ZERO count
        using nonzero_p.
        (profile_count::oeprator>=): Invert return condition when *this is
        precise zero.  Check ZERO count in that condition using nonzero_p.

Attachment: 0003-Check-ZERO-profile-count-regardless-of-precision.patch
Description: Binary data

Reply via email to