https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68577

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Bet this started with r230475 or so.
The problem is that in scalar form these builtins always return int, but may
take an argument of wider bitsize.
The vectorizer thinks the target can handle the builtin as a narrowing
conversion, and thus emits:
  vect_l.17_53 = [vec_unpack_lo_expr] vect_vec_iv_.15_49;
  vect_l.17_54 = [vec_unpack_hi_expr] vect_vec_iv_.15_49;
  vect__7.18_55 = POPCOUNT (vect_l.17_53, vect_l.17_54);
where vect_l.17 has V2DImode, and vect__7.18 has V4SImode.
But at least the power8 popcountv2di2 pattern takes a single V2DImode argument
(rather than 2 V2DImode arguments) and returns a V2DImode result, while the
caller expects one that takes two V2DImode arguments and produces four results
in a single V4SImode vector.
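
For illustration, the width mismatch shows up in any loop of roughly this
shape.  This is a hypothetical sketch, not the testcase from the report: the
function name count_bits and its parameters are made up, and it only assumes
the GCC builtin __builtin_popcountll, which returns int for a 64-bit argument.

```c
#include <stddef.h>

/* Hypothetical loop, not the reporter's testcase.  Each iteration reads a
   64-bit value but stores a 32-bit int, because __builtin_popcountll
   returns int regardless of how wide its argument is.  */
void
count_bits (int *restrict out, const unsigned long long *restrict in,
            size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = __builtin_popcountll (in[i]);
}
```

With 128-bit vectors, two V2DImode input vectors are needed to fill one
V4SImode output vector, which is where the two-argument POPCOUNT call above
comes from.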

So, first of all, what do we expect from the backends?  Should they implement
these popcount<mode>2 expanders for vector modes as returning the result in the
same mode as the single argument?  Then the vectorizer needs to be told about
that and needs to do the narrowing manually, e.g. in the above case do POPCOUNT
on each V2DImode argument separately, then combine the two V2DImode results as
for a long -> int vectorized conversion.  Or should they do something
different?  But then the question is whether it should be called popcountv2di2
if it actually takes 2 arguments rather than 1, etc.
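
A rough model of the first option, written with GCC vector extensions rather
than as vectorizer or backend code.  The names popcount_v2di, pack_trunc_v2di
and popcount_narrow are hypothetical, and the lane order of the pack step is a
simplification (the real order is target- and endianness-dependent).

```c
typedef long long v2di __attribute__ ((vector_size (16)));
typedef int v4si __attribute__ ((vector_size (16)));

/* Same-mode popcount: one V2DI in, one V2DI out, i.e. the shape the
   power8 popcountv2di2 pattern is described as having.  */
static v2di
popcount_v2di (v2di x)
{
  v2di r = { 0, 0 };
  for (int i = 0; i < 2; i++)
    r[i] = __builtin_popcountll ((unsigned long long) x[i]);
  return r;
}

/* Narrowing step, as for a vectorized long -> int conversion: truncate
   the lanes of two V2DI vectors and pack them into one V4SI vector.  */
static v4si
pack_trunc_v2di (v2di lo, v2di hi)
{
  v4si r = { 0, 0, 0, 0 };
  r[0] = (int) lo[0];
  r[1] = (int) lo[1];
  r[2] = (int) hi[0];
  r[3] = (int) hi[1];
  return r;
}

/* What the vectorizer would have to emit under the first option:
   POPCOUNT on each half separately, then combine the two results.  */
static v4si
popcount_narrow (v2di lo, v2di hi)
{
  return pack_trunc_v2di (popcount_v2di (lo), popcount_v2di (hi));
}
```

Under the second option, something with the signature of popcount_narrow
would instead have to be provided by the backend directly, which is what
raises the naming question.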
