https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101639
--- Comment #13 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
>
> For XOR cstorem4 isn't of help, but if we can get a scalar bit mask we
> can use popcount&1 here. Targets with separate vector modes for masks
> can use reduc_{and,ior,xor}_scal but on x86 with either integer vector modes
> or integer scalar modes that's going to be difficult. A more explicit
> reduc_mask_{and,ior,xor}_scal would be better there.
Yes, indeed. x86 can use vpmovmskb/kmov to convert the vector mask to a scalar
and then use popcnt&1; those implementations can all be done in the backend
expander.
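
To illustrate the idea at the source level (not the actual expander code), here
is a minimal sketch in C. It assumes a 16-byte mask whose lanes are 0x00 or
0xFF. The helper names mask_to_bits and reduc_mask_{xor,ior,and} are
hypothetical, chosen only to echo the optab names suggested above.
_mm_movemask_epi8 is the intrinsic for pmovmskb, and a portable loop stands in
when SSE2 is not available.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Collapse a 16-lane byte mask (each lane 0x00 or 0xFF) into a 16-bit
   scalar, one bit per lane (the lane's most significant bit).  */
static unsigned mask_to_bits(const uint8_t lanes[16])
{
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)lanes);
    return (unsigned)_mm_movemask_epi8(v);   /* pmovmskb */
#else
    /* Portable fallback with the same result.  */
    unsigned bits = 0;
    for (int i = 0; i < 16; i++)
        bits |= (unsigned)(lanes[i] >> 7) << i;
    return bits;
#endif
}

/* XOR reduction: parity of the number of set lanes (popcnt & 1).  */
static int reduc_mask_xor(const uint8_t lanes[16])
{
    return __builtin_popcount(mask_to_bits(lanes)) & 1;
}

/* IOR reduction: true if any lane is set.  */
static int reduc_mask_ior(const uint8_t lanes[16])
{
    return mask_to_bits(lanes) != 0;
}

/* AND reduction: true if all 16 lanes are set.  */
static int reduc_mask_and(const uint8_t lanes[16])
{
    return mask_to_bits(lanes) == 0xFFFFu;
}
```

With AVX512 mask registers the movemask step would presumably be a kmov to a
general register instead; the scalar part stays the same.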