Pádraig Brady wrote:
I noticed that count_one_bits() branches on the popcount_support variable on each call. Might that negate much of the gain from using this instruction?
Could be. As far as I know it's never been benchmarked. I stole that code from Emacs without investigating its performance (testing it requires MS-Windows, which I don't have). It'd be nice to simplify count_one_bits() by removing that code if it doesn't improve performance significantly on MS-Windows.
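
For anyone following along, the pattern under discussion looks roughly like the sketch below. This is not the actual gnulib or Emacs source: every name ending in _sketch, plus popcount_fast, popcount_fallback, detect_popcount_support and check_paths_agree, is hypothetical, and the CPU probe is a stub that always reports support. It only aims to show where the per-call branch sits.

```c
#include <assert.h>

/* Portable fallback: clear the lowest set bit until none remain.  */
static int
popcount_fallback (unsigned int x)
{
  int count = 0;
  for (; x != 0; x &= x - 1)
    count++;
  return count;
}

/* Stand-in for the hardware instruction path (hypothetical).
   With GCC or Clang this uses the compiler builtin; elsewhere it
   simply reuses the fallback so the sketch stays portable.  */
static int
popcount_fast (unsigned int x)
{
#if defined __GNUC__ || defined __clang__
  return __builtin_popcount (x);
#else
  return popcount_fallback (x);
#endif
}

/* Stand-in for a one-time CPU feature probe (assumption: the real
   code would query the processor; this stub always says "yes").  */
static int
detect_popcount_support (void)
{
  return 1;
}

/* Cached probe result: -1 = not yet probed, 0 = no, 1 = yes.  */
static int popcount_support_sketch = -1;

static int
count_one_bits_sketch (unsigned int x)
{
  /* Probe once, on the first call.  */
  if (popcount_support_sketch < 0)
    popcount_support_sketch = detect_popcount_support ();

  /* This test is evaluated on every call; it is the branch whose
     cost is in question.  */
  if (popcount_support_sketch)
    return popcount_fast (x);
  return popcount_fallback (x);
}

/* Sanity check that both paths agree for a given input.  */
static void
check_paths_agree (unsigned int x)
{
  assert (popcount_fast (x) == popcount_fallback (x));
}
```

Whether the two tests per call cost more than the instruction saves is exactly what a benchmark on the affected platform would have to show; the sketch makes no claim either way.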