[Bug target/117769] RISC-V: Worse codegen in x264_pixel_satd_8x4

2024-11-25 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117769

JuzheZhong changed:

What       |Removed     |Added
Resolution |---         |FIXED
Status     |UNCONFIRMED

[Bug target/117769] RISC-V: Worse codegen in x264_pixel_satd_8x4

2024-11-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117769

--- Comment #3 from Robin Dapp ---
Ok, I see. Those x264 functions are sensitive to alignment. Right now the only tune model that enables unaligned vector access by default is generic ooo. But the commit you mentioned cannot have been OK then either?
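
To illustrate the alignment point in this comment, here is a minimal scalar sketch of a 4x4 SATD (sum of absolute Hadamard-transformed differences). It is not the x264 source: the name `satd_4x4_sketch` is hypothetical, the result is left unnormalised, and the 8x4 variant from the bug title is not reproduced. It only shows the shape of the problem: the pixels arrive as plain byte pointers plus an arbitrary stride, so the compiler can assume no more than 1-byte alignment for each row, which is presumably why the vectorized code depends on whether unaligned vector access is allowed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical illustration, not the x264 implementation.
   pix1/pix2 are byte pointers, so each 4-byte row starting at
   pix + y * stride has no guaranteed alignment beyond 1 byte. */
int satd_4x4_sketch(const uint8_t *pix1, int stride1,
                    const uint8_t *pix2, int stride2)
{
    int d[4][4], t[4][4];
    int sum = 0;

    assert(pix1 != NULL && pix2 != NULL);

    /* Pixel differences; these row loads are the alignment-sensitive part. */
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            d[y][x] = pix1[y * stride1 + x] - pix2[y * stride2 + x];

    /* Horizontal 4-point Hadamard butterflies (the permute-heavy step). */
    for (int y = 0; y < 4; y++) {
        int a0 = d[y][0] + d[y][1], a1 = d[y][0] - d[y][1];
        int a2 = d[y][2] + d[y][3], a3 = d[y][2] - d[y][3];
        t[y][0] = a0 + a2;
        t[y][1] = a1 + a3;
        t[y][2] = a0 - a2;
        t[y][3] = a1 - a3;
    }

    /* Vertical butterflies, then accumulate absolute values. */
    for (int x = 0; x < 4; x++) {
        int a0 = t[0][x] + t[1][x], a1 = t[0][x] - t[1][x];
        int a2 = t[2][x] + t[3][x], a3 = t[2][x] - t[3][x];
        sum += abs(a0 + a2) + abs(a1 + a3) + abs(a0 - a2) + abs(a1 - a3);
    }

    return sum; /* unnormalised; any final scaling is left out on purpose */
}
```

Calling it on a buffer and on the same buffer offset by one byte gives the same result in scalar code; only the vectorized form has to care about where the rows start.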

[Bug target/117769] RISC-V: Worse codegen in x264_pixel_satd_8x4

2024-11-25 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117769

--- Comment #2 from JuzheZhong ---
Ok. I see it is not an issue now. When we enable -mno-vect-strict-align (https://godbolt.org/z/MzqzPTcc6), we get the same codegen as ARM SVE: x264_pixel_satd_8x4(unsigned char*, int, unsigned char*, int):

[Bug target/117769] RISC-V: Worse codegen in x264_pixel_satd_8x4

2024-11-25 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117769

--- Comment #1 from Robin Dapp ---
The SLP vec_perm patch has gone upstream since, which seems closely related as it specifically targets SATD's permutes. Surprised to see a higher icount, though.