https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #0)
> We could vectorize gcc.dg/vect/pr65947-7.c if we implement the
> extendv4siv4hi pattern (sign-extend V4HI to V4SI).  We can already do
> vec_unpacks_lo via
> 
>         pcmpgtw %xmm0, %xmm1
>         movdqa  %xmm0, %xmm2
>         punpcklwd       %xmm1, %xmm2
> 
> and that would trivially extend to the required pattern - just the
> input is v4hi instead of v8hi.
> 
> Other related patterns are probably missing as well, where we can do
> vec_unpack[s]_lo we should be able to implement [zero_]extend.

We have:

(define_expand "<insn>v4hiv4si2"
  [(set (match_operand:V4SI 0 "register_operand")
        (any_extend:V4SI
          (match_operand:V4HI 1 "nonimmediate_operand")))]
  "TARGET_SSE4_1"

in sse.md, so the testcase should already be vectorized with -msse4.1. Are
there any other patterns missing for efficient vectorization?
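
For reference, a minimal scalar sketch of what the SSE2 sequence quoted from
comment #0 computes per lane. It assumes %xmm1 holds zero before the pcmpgtw.
The function names are hypothetical and appear nowhere in GCC; this only
models the arithmetic, not the expander itself.

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of the quoted SSE2 sequence.
   Assumption: the compare register is zero on entry, so pcmpgtw
   computes (0 > x) and yields an all-ones mask for negative lanes. */
int32_t sext_lane_sse2_model(int16_t x)
{
    uint16_t mask = (0 > x) ? 0xFFFFu : 0x0000u;            /* pcmpgtw   */
    /* punpcklwd interleaves the value (low half) with the mask (high half). */
    uint32_t bits = ((uint32_t)mask << 16) | (uint16_t)x;   /* punpcklwd */
    /* Conversion of an out-of-range value to int32_t is
       implementation-defined in ISO C; GCC wraps modulo 2^32. */
    return (int32_t)bits;
}

/* Hypothetical helper: apply the lane model to a V4HI-shaped input,
   producing a V4SI-shaped output. */
void sext_v4hi_model(const int16_t in[4], int32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = sext_lane_sse2_model(in[i]);
}
```

Each lane should equal a plain (int32_t) cast of the input, which is the
result the sign-extending form of the V4HI to V4SI expander above has to
produce.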
