https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55157
--- Comment #7 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #6)
> (In reply to Aldy Hernandez from comment #4)
> >
> > The patch below does this, but it does have a 3% penalty for VRP (though no
> > penalty to overall compilation).  I'm inclined to pursue this route, since
> > it makes nonzero mask optimization more pervasive across the board.
> >
> > What do you think Andrew?
> >
>
> 1) Why wouldn't this be done in set_range_from_nonzero_bits()?  That call is

You're right.

> 2) That seems expensive.. we must be doing unnecessary work.  Maybe it would
> speed up if we checked if either the ctz or clz would cause it to do
> anything first.  Thus avoiding creating a couple of ranges and performing a
> union and intersection in cases where neither the leading nor trailing bit
> is a zero?

I had already played with that.  It made a marginal difference.

> 3) It also seems to me that you then only need to add the zero/union iff the
> trailing bit has zeros.  i.e., if there are no trailing zeros, then just set the
> lb to 0, and calculate the UB based on the clz.

That actually made it slightly worse.

I'll attach what I just tested.
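
For readers following the ctz/clz discussion in points 2 and 3, here is a minimal standalone sketch of the idea. It is not the attached patch and does not use GCC's irange API: the names mask_range and range_from_nonzero_mask are hypothetical, a 32-bit unsigned type is assumed, and the only real calls are the __builtin_ctz and __builtin_clz builtins.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration type; this is NOT GCC's irange.
struct mask_range {
  bool refined;           // false: the mask says nothing beyond the full type range
  bool needs_zero_union;  // true: result is {0} U [lo, hi] rather than a single [0, hi]
  uint32_t lo;
  uint32_t hi;
};

// Derive bounds for a 32-bit unsigned value from a mask of bits that
// may be nonzero (every bit outside the mask is known to be zero).
mask_range range_from_nonzero_mask(uint32_t mask) {
  static_assert(sizeof(unsigned) == 4, "sketch assumes 32-bit unsigned int");

  // No bit may be set, so the value is exactly zero.
  if (mask == 0)
    return {true, false, 0, 0};

  unsigned tz = __builtin_ctz(mask);  // trailing bits known to be zero
  unsigned lz = __builtin_clz(mask);  // leading bits known to be zero

  // Point 2: if neither count is nonzero there is nothing to refine,
  // so return early without building any sub-ranges.
  if (tz == 0 && lz == 0)
    return {false, false, 0, UINT32_MAX};

  // mask != 0 guarantees lz < 32, so the shift below is well defined.
  assert(lz < 32);
  uint32_t hi = UINT32_MAX >> lz;  // upper bound derived from the clz

  // Point 3: with no trailing zeros, a single range [0, hi] suffices.
  if (tz == 0)
    return {true, false, 0, hi};

  // Trailing zeros: the smallest nonzero value is 1 << tz, and zero
  // itself has to be added back with a separate union.
  uint32_t lo = uint32_t(1) << tz;
  return {true, true, lo, hi};
}
```

With a mask of 0x0FF0 this yields {0} U [0x10, 0xFFF]; with 0x00FF it yields the single range [0, 0xFF]; with 0xFFFFFFFF it takes the early exit. Whether those shortcuts pay off in the real pass is the measurement question discussed above.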