https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102494

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
The vectorizer looks for a way to "shift" the whole vector, either with
vec_shr or with an equivalent vec_perm using constant shuffle operands.  When
the target provides neither, you get element extracts and scalar adds.
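
To illustrate the difference, here is a minimal scalar model in plain C of the
two reduction shapes described above.  It is only a sketch and not what GCC
actually emits: the names LANES, shift_lanes_down, reduce_by_shifts and
reduce_by_extracts are hypothetical, and the array stands in for a v4hi
register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 4 /* models a v4hi: four 16-bit lanes (assumption for this sketch) */

/* Model of a whole-vector shift by `count` lanes, i.e. what vec_shr or an
   equivalent constant vec_perm would provide: lane i receives lane i + count,
   and the vacated upper lanes become zero. */
static void shift_lanes_down(int16_t v[LANES], int count)
{
    for (int i = 0; i < LANES; i++)
        v[i] = (i + count < LANES) ? v[i + count] : 0;
}

/* Reduction done with whole-vector shifts and vector adds: log2(LANES) steps,
   each folding the upper part of the vector onto the lower part.  The sum
   ends up in lane 0. */
static int16_t reduce_by_shifts(const int16_t in[LANES])
{
    int16_t acc[LANES], tmp[LANES];

    assert((LANES & (LANES - 1)) == 0); /* the halving scheme needs a power of two */
    memcpy(acc, in, sizeof acc);
    for (int count = LANES / 2; count >= 1; count /= 2) {
        memcpy(tmp, acc, sizeof tmp);
        shift_lanes_down(tmp, count);
        for (int i = 0; i < LANES; i++) /* stands in for one vector add */
            acc[i] = (int16_t)(acc[i] + tmp[i]);
    }
    return acc[0];
}

/* Fallback shape when no whole-vector shift is available: extract every
   element and accumulate with scalar adds. */
static int16_t reduce_by_extracts(const int16_t in[LANES])
{
    int16_t sum = 0;

    for (int i = 0; i < LANES; i++)
        sum = (int16_t)(sum + in[i]);
    return sum;
}
```

Both functions return the same sum; the first needs log2(LANES) shift-and-add
steps on whole vectors, the second needs one extract and one scalar add per
lane.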

So yes, the vectorizer does the work for you but only if you hand it the
pieces.

It could possibly use a larger vector, doing only the "tail" of its final
reduction, so try with v8hi instead of v4hi, but it's not really clear whether
such a strategy would be good in general.
