https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157

--- Comment #7 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #6)
> (In reply to Aldy Hernandez from comment #4)
> 
> > 
> > The patch below does this, but it does have a 3% penalty for VRP (though no
> > penalty to overall compilation).  I'm inclined to pursue this route, since
> > it makes nonzero mask optimization more pervasive across the board.
> > 
> > What do you think Andrew?
> > 
> 
> 1) Why wouldn't this be done in set_range_from_nonzero_bits()?  That call is

You're right.

> 2) That seems expensive; we must be doing unnecessary work.  Maybe it would
> speed up if we first checked whether either the ctz or clz would cause it to
> do anything, thus avoiding creating a couple of ranges and performing a
> union and intersection in cases where neither the leading nor trailing bit
> is a zero?

I had already played with that.  It made a marginal difference.

> 
> 3) It also seems to me that you then only need to add the zero/union iff the
> trailing bit has zeros, i.e., if there are no trailing zeros, then just set
> the lb to 0, and calculate the UB based on the clz.

That actually made it slightly worse.
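
For anyone following along, the ctz/clz idea from points 2 and 3 can be
sketched roughly as below.  This is an illustration only: it is not the patch
attached to this PR and not GCC's irange code.  Interval, ctz32, clz32 and
range_from_nonzero_mask are hypothetical names, and the sketch assumes plain
32-bit unsigned values instead of wide ints.  The upper bound is the
conservative clz-based one discussed above.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a sub-range: closed interval [lo, hi].
struct Interval { uint32_t lo, hi; };

// Count trailing zero bits.  The mask must be nonzero.
static unsigned ctz32(uint32_t m)
{
  unsigned n = 0;
  while ((m & 1u) == 0) { m >>= 1; ++n; }
  return n;
}

// Count leading zero bits.  The mask must be nonzero.
static unsigned clz32(uint32_t m)
{
  unsigned n = 0;
  while ((m & 0x80000000u) == 0) { m <<= 1; ++n; }
  return n;
}

// Return sub-ranges covering every x with (x & ~mask) == 0, where MASK
// holds the bits that may be nonzero.
std::vector<Interval> range_from_nonzero_mask(uint32_t mask)
{
  // No bit may be set, so the only possible value is zero.
  if (mask == 0)
    return {Interval{0, 0}};

  unsigned tz = ctz32(mask);
  unsigned lz = clz32(mask);

  // Point 2: neither the leading nor the trailing bit is known to be
  // zero, so the mask cannot narrow the range.  Bail out early.
  if (tz == 0 && lz == 0)
    return {Interval{0, UINT32_MAX}};

  // Upper bound from clz: every bit below the highest possibly-set bit.
  uint32_t ub = UINT32_MAX >> lz;

  // Point 3: no trailing zeros, so a single range [0, ub] suffices.
  if (tz == 0)
    return {Interval{0, ub}};

  // Trailing zeros: zero stays possible, and the smallest nonzero value
  // is 1 << tz, hence the union [0, 0] U [1 << tz, ub].
  return {Interval{0, 0}, Interval{uint32_t{1} << tz, ub}};
}
```

With a mask of 0xF0 this gives [0, 0] U [16, 255], and with 0x0F it gives the
single range [0, 15].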

I'll attach what I just tested.
