http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50268

--- Comment #6 from Marc Glisse <marc.glisse at normalesup dot org> 2011-09-02 08:03:22 UTC ---
(In reply to comment #5)
> This one is much better, and actually should lead to slightly better code than
> C++98, because we don't do anything if _Nw > 1 (the 32-bit case is also better
> but doesn't optimize the case _Nb % _GLIBCXX_BITSET_BITS_PER_WORD == 0 &&
> _Nb % _GLIBCXX_BITSET_BITS_PER_ULL != 0. I don't care much these times)

Looks better indeed. I think the compiler should be responsible for optimizing
x&~0UL, not the library. I'll have to check that bitset<32>(x).count() has no
overhead compared to a call to __builtin_popcount.
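
A minimal sketch of the two functions I would compare, assuming GCC or Clang
(for __builtin_popcount) and a 32-bit unsigned int. The helper names are
hypothetical and only serve the comparison; looking at the assembly generated
at -O2 for each should show whether the bitset version carries any overhead.

```cpp
#include <bitset>

// Hypothetical helper: population count through std::bitset<32>.
// Assumes unsigned int is 32 bits wide, so no bits of x are dropped.
inline unsigned count_via_bitset(unsigned x)
{
    return static_cast<unsigned>(std::bitset<32>(x).count());
}

// Hypothetical helper: population count through the GCC/Clang builtin.
inline unsigned count_via_builtin(unsigned x)
{
    return static_cast<unsigned>(__builtin_popcount(x));
}
```

Both functions must return the same value for every x; only the generated
code is in question.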

It looks to me like _DoWork is really just _Nb < _GLIBCXX_BITSET_BITS_PER_ULL
(more intuitive, and it makes _Nw and _Extrabits unnecessary). I usually write
the mask ~((~static_cast<unsigned long long>(0)) << _Extrabits) as
(1ULL << _Extrabits)-1, and I just noticed that your version would be faster at
runtime (here it is computed at compile time anyway). Cool.
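
For reference, a small sketch showing that the two ways of writing the mask
agree. The function names are hypothetical, and n stands in for _Extrabits.
Both forms are only defined for n smaller than the width of unsigned long
long, since shifting by the full width is undefined.

```cpp
#include <climits>

// Form used in the patch: shift an all-ones word left by n bits, then
// invert, which leaves exactly the low n bits set. Requires n < ull_bits.
constexpr unsigned long long low_mask_not_shift(unsigned n)
{
    return ~((~static_cast<unsigned long long>(0)) << n);
}

// Form I usually write: 2^n - 1. Requires n < ull_bits.
constexpr unsigned long long low_mask_shift_minus_one(unsigned n)
{
    return (1ULL << n) - 1;
}

// Width of unsigned long long in bits.
constexpr unsigned ull_bits = sizeof(unsigned long long) * CHAR_BIT;

// Compile-time spot checks.
static_assert(low_mask_not_shift(0) == 0, "no low bits set");
static_assert(low_mask_not_shift(5) == 0x1F, "five low bits set");
static_assert(low_mask_shift_minus_one(5) == 0x1F, "five low bits set");

// Runtime check over every valid shift count.
inline bool masks_agree()
{
    for (unsigned n = 0; n < ull_bits; ++n)
        if (low_mask_not_shift(n) != low_mask_shift_minus_one(n))
            return false;
    return true;
}
```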
