On Fri, Dec 20, 2024 at 2:14 PM Vineet Gupta <vine...@rivosinc.com> wrote:
>
> This improves codegen for x264 sum of absolute difference routines.
> The insn count is same, but we avoid double widening ops and ensuing
> whole register moves.
>
> Also for more general applicability, we chose to implement abs diff
> vs. the sum of abs diff variant.
>
> Suggested-by: Robin Dapp <rd...@ventanamicro.com>
> Co-developed-by: Pan Li <pan2...@intel.com>

I think we use Co-Authored-By rather than Co-developed-by.

Thanks,
Andrew Pinski


> Signed-off-by: Vineet Gupta <vine...@rivosinc.com>
>
>         PR target/117722
>
> gcc/ChangeLog:
>         * config/riscv/autovec.md: Add uabd expander.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/autovec/pr117722.c: New test.
>
> Signed-off-by: Vineet Gupta <vine...@rivosinc.com>
> ---
>  gcc/config/riscv/autovec.md                   | 26 +++++++++++++++++++
>  .../gcc.target/riscv/rvv/autovec/pr117722.c   | 23 ++++++++++++++++
>  2 files changed, 49 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 2529dc77f221..4678906fb918 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -2928,3 +2928,29 @@
>      riscv_vector::expand_strided_store (<MODE>mode, operands);
>      DONE;
>    })
> +
> +; ========
> +; == Absolute difference (not including sum)
> +; ========
> +(define_expand "uabd<mode>3"
> +  [(match_operand:V_VLSI 0 "register_operand")
> +   (match_operand:V_VLSI 1 "register_operand")
> +   (match_operand:V_VLSI 2 "register_operand")]
> +  "TARGET_VECTOR"
> +  {
> +    rtx max = gen_reg_rtx (<MODE>mode);
> +    insn_code icode = code_for_pred (UMAX, <MODE>mode);
> +    rtx ops1[] = {max, operands[1], operands[2]};
> +    riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops1);
> +
> +    rtx min = gen_reg_rtx (<MODE>mode);
> +    icode = code_for_pred (UMIN, <MODE>mode);
> +    rtx ops2[] = {min, operands[1], operands[2]};
> +    riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops2);
> +
> +    icode = code_for_pred (MINUS, <MODE>mode);
> +    rtx ops3[] = {operands[0], max, min};
> +    riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops3);
> +
> +    DONE;
> +  });
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c
> new file mode 100644
> index 000000000000..c633f31ef25b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O2" } */
> +
> +/* Generate sum of absolute difference as sub (max, min).
> +   This helps with x264 sad routines.  */
> +
> +inline int abs(int i)
> +{
> +  return (i < 0 ? -i : i);
> +}
> +
> +int pixel_sad_n(unsigned char *pix1, unsigned char *pix2, int n)
> +{
> +  int sum = 0;
> +  for( int i = 0; i < n; i++ )
> +       sum += abs(pix1[x] - pix2[x]);
> +
> +  return sum;
> +}
> +
> +/* { dg-final { scan-assembler {vmin\.v} } } */
> +/* { dg-final { scan-assembler {vmax\.v} } } */
> +/* { dg-final { scan-assembler {vsub\.v} } } */
> --
> 2.43.0
>

Reply via email to