https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68577
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Bet this started with r230475 or so.

The problem is that in scalar form these builtins always return int, but take
an argument of a wider bitsize.  The vectorizer thinks the target can handle a
narrowing conversion builtin, and thus emits:

  vect_l.17_53 = [vec_unpack_lo_expr] vect_vec_iv_.15_49;
  vect_l.17_54 = [vec_unpack_hi_expr] vect_vec_iv_.15_49;
  vect__7.18_55 = POPCOUNT (vect_l.17_53, vect_l.17_54);

where vect_l.17 has V2DImode, and vect__7.18 has V4SImode.  But at least the
power8 popcountv2di2 pattern takes a single V2DImode argument (rather than 2
V2DImode arguments) and returns a V2DImode result, while the caller expects
one that takes two V2DImode arguments and produces 4 results in a V4SImode
vector.

So, first of all, what do we expect from the backends?  Shall they implement
these popcount<mode>2 expanders for vector modes as returning the result in
the same mode as the single argument?  Then the vectorizer needs to be told
about that and needs to do the narrowing manually, e.g. in the above case do
POPCOUNT on each V2DImode argument separately, then combine the two V2DImode
results as for a long -> int vectorized conversion.  Or do they need to do
something different?  But then the question is whether it should be called
popcountv2di2 if it actually takes 2 arguments rather than 1, etc.
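
To make the first option concrete, here is a rough scalar model in C of
"popcount in the same mode, then narrow separately".  It is only a sketch, not
vectorizer or backend code: the v2di/v4si structs and all three function names
are made up for illustration, the lane order of the packed result is an
assumption (the real unpack lo/hi ordering depends on the target), and
__builtin_popcountll is the GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the modes in the dump:
   V2DI = 2 x 64-bit lanes, V4SI = 4 x 32-bit lanes. */
typedef struct { uint64_t e[2]; } v2di;
typedef struct { uint32_t e[4]; } v4si;

/* Same-mode popcount: one V2DI in, one V2DI out.  This is the shape the
   comment attributes to the power8 popcountv2di2 pattern. */
static v2di popcount_v2di(v2di x)
{
    v2di r;
    for (int i = 0; i < 2; i++)
        r.e[i] = (uint64_t)__builtin_popcountll(x.e[i]);
    return r;
}

/* Separate narrowing step, as for a vectorized long -> int conversion:
   pack two V2DI count vectors into one V4SI.  Lane order is assumed. */
static v4si pack_counts_to_v4si(v2di lo, v2di hi)
{
    v4si r;
    for (int i = 0; i < 2; i++) {
        /* A 64-bit popcount is at most 64, so truncation loses nothing. */
        assert(lo.e[i] <= 64 && hi.e[i] <= 64);
        r.e[i]     = (uint32_t)lo.e[i];
        r.e[i + 2] = (uint32_t)hi.e[i];
    }
    return r;
}

/* What the caller in the dump wants: two V2DI inputs, one V4SI result,
   built from two same-mode popcounts plus one pack. */
static v4si popcount_2xv2di_to_v4si(v2di lo, v2di hi)
{
    return pack_counts_to_v4si(popcount_v2di(lo), popcount_v2di(hi));
}
```

In this model the single-argument pattern keeps its name and its meaning, and
the two-input, V4SImode-result operation exists only as a composition that the
vectorizer would have to emit itself.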