On Thu, Oct 27, 2011 at 9:50 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> Hi!
>
> This patch cleans up the vector/vector shifts, there is no need
> to write them with lots of vec_selects/vec_concats etc.
> Additionally, it hooks them up into the standard vlshr<mode>3,
> vashl<mode>3 and vashr<mode>3 expanders so that the vectorizer
> can use them.  The V16QImode and V8HImode expanders XOP provides
> probably aren't very useful for autovectorization of C/C++ code,
> because the FEs will use int shifts in that case and we can't
> prove using smaller shifts is ok (except for left shifts, if
> the vectorizer got a guarantee that larger-than-width shifts
> just yield zero instead of being clipped).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2011-10-27  Jakub Jelinek  <ja...@redhat.com>
>
>        * config/i386/sse.md (VI4SD_AVX2): Removed.
>        (VI48_AVX2): New iterator.
>        (vlshr<mode>3, vashl<mode>3): For VI48_AVX2 modes
>        implement for TARGET_AVX2, for V4SImode also for
>        TARGET_XOP if !TARGET_AVX2.
>        (vashr<mode>3): For VI4_AVX2 modes implement for
>        TARGET_AVX2, for V4SImode also for
>        TARGET_XOP if !TARGET_AVX2.
>        (avx2_ashrvv8si, avx2_ashrvv4si, avx2_<lshift>vv8si,
>        avx2_<lshift>vv2di): Removed.
>        (avx2_ashrv<mode>): New insn with VI4_AVX2 iterator.
>        (avx2_<lshift>v<mode>): Macroize using VI48_AVX2
>        iterator.  Simplify pattern.
>
>        * gcc.dg/vshift-1.c: New test.
>        * gcc.dg/vshift-2.c: New test.
>        * gcc.target/i386/xop-vshift-1.c: New test.
>        * gcc.target/i386/xop-vshift-2.c: New test.
>        * gcc.target/i386/avx2-vshift-1.c: New test.



> +(define_expand "vlshr<mode>3"
> +  [(match_operand:VI48_AVX2 0 "register_operand" "")
> +   (match_operand:VI48_AVX2 1 "register_operand" "")
> +   (match_operand:VI48_AVX2 2 "register_operand" "")]
> +  "TARGET_AVX2 || (<MODE>mode == V4SImode && TARGET_XOP)"
> +{
> +  if (<MODE>mode == V4SImode && !TARGET_AVX2)
> +    {
> +      rtx neg = gen_reg_rtx (V4SImode);
> +      emit_insn (gen_negv4si2 (neg, operands[2]));
> +      emit_insn (gen_xop_lshlv4si3 (operands[0], operands[1], neg));
> +      DONE;
> +    }
> +  emit_insn (gen_avx2_lshrv<mode> (operands[0], operands[1], operands[2]));
> +  DONE;
> +})

...

>  (define_insn "avx2_<lshift>v<mode>"

> +  [(set (match_operand:VI48_AVX2 0 "register_operand" "=x")
> +       (lshift:VI48_AVX2 (match_operand:VI48_AVX2 1 "register_operand" "x")
> +                         (match_operand:VI48_AVX2 2 "nonimmediate_operand"
> +                                                    "xm")))]
>   "TARGET_AVX2"
>   "vp<lshift_insn>v<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
>   [(set_attr "type" "sseishft")
>    (set_attr "prefix" "vex")
>    (set_attr "mode" "<sseinsnmode>")])

Please use expressive RTX forms for the expanders, similar to the
define_insn RTX above. That way you can avoid calling
gen_avx2_lshrv<mode> at the end of the C code. Also, expanders can have
nonimmediate_operand as operand 2 and conditionally move it to a
register in the C code block if needed.
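
For illustration, an untested sketch of how vlshr<mode>3 might look in
that form. It assumes the lshiftrt RTX falls through to the
avx2_<lshift>v<mode> insn above when the C block does not end in DONE,
and the force_reg call on the XOP path is only a guess at where the
operand would have to be in a register:

(define_expand "vlshr<mode>3"
  [(set (match_operand:VI48_AVX2 0 "register_operand" "")
        (lshiftrt:VI48_AVX2
          (match_operand:VI48_AVX2 1 "register_operand" "")
          (match_operand:VI48_AVX2 2 "nonimmediate_operand" "")))]
  "TARGET_AVX2 || (<MODE>mode == V4SImode && TARGET_XOP)"
{
  if (<MODE>mode == V4SImode && !TARGET_AVX2)
    {
      /* XOP path: negate the shift counts and use the XOP shift,
         as in the original patch.  Forcing operand 2 into a
         register here is an assumption, not a known requirement.  */
      rtx amt = force_reg (V4SImode, operands[2]);
      rtx neg = gen_reg_rtx (V4SImode);
      emit_insn (gen_negv4si2 (neg, amt));
      emit_insn (gen_xop_lshlv4si3 (operands[0], operands[1], neg));
      DONE;
    }
  /* AVX2 path: no DONE, so the RTX template above is emitted as is.  */
})

vashl<mode>3 and vashr<mode>3 would presumably follow the same shape
with ashift and ashiftrt.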

Uros.
