Dear gcc developers,

Im am not sure if this is the right email address. Please give me
the right email address if necessary.

I checked the implementation of std::bitset::count in gcc
and find that it uses a table look up.

This is not optimal.

First on 64-bit machines divide-and-conquer algorithms are faster than
table look up and save even memory.

Second if the bitset contains several words, there are for both 32 and
64 bit machines population count algorithms that are faster than
the simple loop over all words used by boost.

Together with Y. Edel I developed a new version,which saves 60%
against the simple algorithm. We put the code online
(http://cage.ugent.be/~klein/popc.html). A disadvantage of our
implementation is that we relay on aggressive optimization (-O3) of
the compiler.

If you are searching a compromise between speed and code
complexity. The first iteration of the Harley-Seal-method (25% faster
than the simple loop) would be a candidate.

You find this and other population count algorithms on our homepage, too.

If there is interest I can add one of the advance algorithms to the
bitset libary.

Best wishes
     Andreas Klein

Reply via email to