https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023
--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #0)
> We could vectorize gcc.dg/vect/pr65947-7.c if we implement the
> extendv4siv4hi pattern (sign-extend V4HI to V4SI).  We can already do
> vec_unpacks_lo via
>
>         pcmpgtw   %xmm0, %xmm1
>         movdqa    %xmm0, %xmm2
>         punpcklwd %xmm1, %xmm2
>
> and that would trivially extend to the required pattern - just the
> input is v4hi instead of v8hi.
>
> Other related patterns are probably missing as well, where we can do
> vec_unpack[s]_lo we should be able to implement [zero_]extend.

We have:

(define_expand "<insn>v4hiv4si2"
  [(set (match_operand:V4SI 0 "register_operand")
        (any_extend:V4SI
          (match_operand:V4HI 1 "nonimmediate_operand")))]
  "TARGET_SSE4_1"

in sse.md, so the testcase should be vectorized using -msse4.1.

Is there any other pattern missing for efficient vectorization?
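For reference, here is a rough scalar model of what the quoted pcmpgtw/punpcklwd sequence computes for each 16-bit lane. It is only a sketch: it assumes %xmm1 holds zero before the compare (the quoted snippet does not show that), and the function names are hypothetical, not anything in GCC.

```c
#include <stdint.h>

/* Scalar model of the quoted SSE2 sequence, one 16-bit lane at a time.
   All names here are hypothetical and for illustration only.  */

/* Models pcmpgtw against a zero register (an assumption, see above):
   yields all-ones where 0 > x, i.e. where x is negative, else zero.  */
static uint16_t sign_mask16(int16_t x)
{
    return (0 > x) ? 0xFFFFu : 0x0000u;
}

/* Models one lane of punpcklwd: the original word becomes the low half
   and the sign mask the high half of the 32-bit result.  */
static int32_t widen_lane(int16_t x)
{
    uint32_t lo = (uint16_t)x;
    uint32_t hi = sign_mask16(x);
    /* Conversion back to int32_t is implementation-defined for values
       above INT32_MAX; GCC wraps modulo 2^32, which is what we want.  */
    return (int32_t)((hi << 16) | lo);
}

/* Sign-extend four 16-bit elements (V4HI) to four 32-bit elements (V4SI).  */
void sign_extend_v4hi_to_v4si(const int16_t in[4], int32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = widen_lane(in[i]);
}
```

Each output lane should equal a plain (int32_t) cast of the input lane, which is the operation the any_extend expander above provides when sign-extending under TARGET_SSE4_1.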