[Bug tree-optimization/117722] RISC-V: Failed to vectorize x264_pixel_sad_4x4

juzhe.zhong at rivai dot ai via Gcc-bugs Wed, 20 Nov 2024 20:11:15 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722


--- Comment #1 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
OK. I see we are lacking the ssad/usad patterns (SAD_EXPR):

Compute the sum of absolute differences of two signed/unsigned elements.
Operand 1 and operand 2 are of the same mode. Their absolute difference, which
is of a wider mode, is computed and added to operand 3. Operand 3 is of a mode
equal or wider than the mode of the absolute difference. The result is placed
in operand 0, which is of the same mode as operand 3. m is the mode of operand
1 and operand 2.
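
To make the quoted definition concrete, here is a minimal scalar sketch of the unsigned case. It is only a reference model, not GCC code: the function name `sad_u8_ref`, the choice of 8-bit inputs and a 32-bit accumulator, and the folding of every difference into a single accumulator are assumptions made for illustration. How a real vector pattern distributes the differences across accumulator lanes is not modeled here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference model (not GCC code).
   a, b : operands 1 and 2, same narrow mode (assumed 8-bit unsigned)
   acc  : operand 3, wider mode (assumed 32-bit)
   Returns the value that would be placed in operand 0. */
uint32_t sad_u8_ref(const uint8_t *a, const uint8_t *b, size_t n, uint32_t acc)
{
    for (size_t i = 0; i < n; i++) {
        /* Widen before subtracting so the difference cannot wrap in 8 bits. */
        int32_t d = (int32_t)a[i] - (int32_t)b[i];
        acc += (uint32_t)(d < 0 ? -d : d);
    }
    return acc;
}
```

With a zero accumulator this reduces to a plain sum of absolute differences over the n elements.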

I think we can implement this pattern by following what Clang does:

https://godbolt.org/z/bvEKf4h1z

        vminu.vv        v10, v8, v9
        vmaxu.vv        v8, v8, v9
        vsub.vv         v8, v8, v10
        vsetvli zero, zero, e32, m4, ta, ma
        vzext.vf4       v12, v8

Since operand 3 is {0, 0, 0, 0} here, no addition is needed.

As an optimization, we should allow operand 3 to be a constant vector.

The general sequence is: min + max + sub + extend + addition.
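
The sequence can be sketched lane by lane in scalar C, purely as an illustration of why it yields the absolute difference. The function name `sad_minmax_lanes` and the 8-bit to 32-bit widths are assumptions, and the trailing add is written as a generic per-lane addition rather than any specific instruction. Because max >= min, the narrow subtraction cannot wrap, so the zero-extension is safe.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-lane model of the min + max + sub + extend + add sequence.
   a, b : narrow unsigned inputs (assumed 8-bit)
   acc  : wide accumulator lanes, i.e. operand 3 (assumed 32-bit)
   out  : wide result lanes, i.e. operand 0 */
void sad_minmax_lanes(const uint8_t *a, const uint8_t *b,
                      const uint32_t *acc, uint32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t lo = a[i] < b[i] ? a[i] : b[i];  /* min, as in vminu.vv */
        uint8_t hi = a[i] > b[i] ? a[i] : b[i];  /* max, as in vmaxu.vv */
        uint8_t diff = (uint8_t)(hi - lo);       /* sub, as in vsub.vv; never wraps */
        uint32_t wide = diff;                    /* zero-extend, as in vzext.vf4 */
        out[i] = acc[i] + wide;                  /* add; redundant when acc is all zeros */
    }
}
```

When every lane of `acc` is zero, the last step is the identity, which matches the Clang output above having no add.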