Hello!
> SAD (Sum of Absolute Differences) is a common and important algorithm
> in image processing and other areas. SSE2 even introduced a new
> instruction PSADBW for it. A SAD loop can be greatly accelerated by
> this instruction after being vectorized. This patch introduces a new
> operation SAD_EXPR and a SAD pattern recognizer in the vectorizer.
>
> In order to express this new operation, a new expression SAD_EXPR is
> introduced in tree.def, and the corresponding entry in optabs is
> added. The patch also adds the "define_expand" patterns for SSE2 and
> AVX2 on i386.
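
For reference, the kind of loop the description above refers to can be
sketched in C roughly as follows. This is only an illustration, not code
from the patch: the names sad_scalar and psadbw_model are hypothetical.
The second function models what the 128-bit PSADBW computes, namely one
sum of eight absolute byte differences per 64-bit lane, which is why the
expander below produces a V2DI result before converting it to V4SI.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar SAD loop: the shape of reduction that a SAD
   pattern recognizer would be expected to detect.  */
unsigned int
sad_scalar (const uint8_t *a, const uint8_t *b, size_t n)
{
  unsigned int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += abs (a[i] - b[i]);   /* bytes promote to int before the subtract */
  return sum;
}

/* Hypothetical model of the 128-bit PSADBW: for each 64-bit lane, sum
   the absolute differences of its eight unsigned bytes and store the
   zero-extended result in that lane.  */
void
psadbw_model (const uint8_t a[16], const uint8_t b[16], uint64_t out[2])
{
  for (int lane = 0; lane < 2; lane++)
    {
      uint64_t s = 0;
      for (int i = 0; i < 8; i++)
        s += (uint64_t) abs (a[lane * 8 + i] - b[lane * 8 + i]);
      out[lane] = s;
    }
}
```

Adding the two lane results of psadbw_model gives the same value as
sad_scalar over the same 16 bytes.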
+(define_expand "sadv16qi"
+  [(match_operand:V4SI 0 "register_operand")
+   (match_operand:V16QI 1 "register_operand")
+   (match_operand:V16QI 2 "register_operand")
+   (match_operand:V4SI 3 "register_operand")]
+  "TARGET_SSE2"
+{
+  rtx t1 = gen_reg_rtx (V2DImode);
+  rtx t2 = gen_reg_rtx (V4SImode);
+  emit_insn (gen_sse2_psadbw (t1, operands[1], operands[2]));
+  convert_move (t2, t1, 0);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0],
+                          gen_rtx_PLUS (V4SImode,
+                                        operands[3], t2)));
+  DONE;
+})
Please use the generic expanders (expand_simple_binop) to generate the
plus expression. Also, please use the nonimmediate_operand predicate for
operands 2 and 3.
Please note that nonimmediate operands should be passed as the second
input operand to commutative operators, to match their insn pattern
layout.
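
Put together, the requested changes might look roughly like the sketch
below. It is untested and only illustrates the three points above; the
OPTAB_DIRECT method, the unsignedp value of 0 and the local name "sum"
are assumptions, not something the patch has to use.

```
(define_expand "sadv16qi"
  [(match_operand:V4SI 0 "register_operand")
   (match_operand:V16QI 1 "register_operand")
   (match_operand:V16QI 2 "nonimmediate_operand")
   (match_operand:V4SI 3 "nonimmediate_operand")]
  "TARGET_SSE2"
{
  rtx t1 = gen_reg_rtx (V2DImode);
  rtx t2 = gen_reg_rtx (V4SImode);
  rtx sum;

  emit_insn (gen_sse2_psadbw (t1, operands[1], operands[2]));
  convert_move (t2, t1, 0);
  /* The register t2 goes first and the nonimmediate operands[3] second,
     to match the layout of the commutative plus pattern.  */
  sum = expand_simple_binop (V4SImode, PLUS, t2, operands[3],
                             operands[0], 0, OPTAB_DIRECT);
  /* Assumption: the expander may return a register other than the
     requested target, so copy the result if needed.  */
  if (sum != operands[0])
    emit_move_insn (operands[0], sum);
  DONE;
})
```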
Uros.