https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118984
--- Comment #4 from Maxim Egorushkin <maxim.yegorushkin at gmail dot com> ---
To add more context: I use Mula's AVX2 popcount function from https://arxiv.org/abs/1611.07612. It produces 4 counts in a v4di register, which should be summed into a scalar total. That is what brought me here.
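
For illustration, the reduction described above (four 64-bit counts in one v4di value, summed into a scalar) might be written with GCC's generic vector extensions roughly as follows. This is only a sketch: the helper name hsum_v4di and the typedef are assumptions made for this example, not code from the paper or from the report, and it says nothing about what code GCC actually generates for it.

```c
/* Four 64-bit lanes in a 32-byte vector, declared with GCC's generic
 * vector extension. The name v4di is chosen here to mirror the comment. */
typedef long long v4di __attribute__((vector_size(32)));

/* Hypothetical helper: add the four lanes of *v into one scalar.
 * The vector is taken by pointer so the sketch also builds without -mavx2. */
static inline long long hsum_v4di(const v4di *v)
{
    v4di x = *v;
    /* Pairwise grouping keeps the dependency chain short. */
    return (x[0] + x[1]) + (x[2] + x[3]);
}
```

Called as hsum_v4di(&counts), this returns the total of the four per-lane counts.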