[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2025-01-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #15 from Robin Dapp --- I think it's r15-2820-gab18785840d7b8.

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2025-01-15 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #14 from Vineet Gupta ---
(In reply to Robin Dapp from comment #7)
> > The problem is GCC-15 has performance regression compare to GCC-14 on both
> > strict align and we should fix it, we can't specify use no strict align in
> > GCC-

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-18 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at gcc dot gnu.org

--- Comment #13

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #12 from Robin Dapp --- Could you please check if the patch helped?

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #11 from GCC Commits ---
The master branch has been updated by Robin Dapp:

https://gcc.gnu.org/g:ce199a952bfef3e27354a4586a17bc55274c1d3c

commit r15-6277-gce199a952bfef3e27354a4586a17bc55274c1d3c
Author: Robin Dapp
Date: Fri De

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #10 from Robin Dapp --- Ah I see - the actual vector code isn't even that bad and the vec_constructs aren't either. The problem is rather that we have slow unaligned (scalar) access with the default tune model. Thus we need to load
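
The comment above is cut off, so the following is only an illustration of why slow unaligned scalar access matters, not code from the bug. It is a minimal C sketch with two hypothetical helpers (`load_u32_bytewise` and `load_u32_memcpy` are names made up here). The first always reads byte by byte; the second leaves the choice to the compiler, which may emit a single word load when the tune model says unaligned access is fast and may fall back to narrower loads when it does not.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit little-endian value one byte at a
   time.  Safe for any address, but needs four loads plus shifts and ORs.  */
static uint32_t
load_u32_bytewise (const uint8_t *p)
{
  assert (p != NULL);
  return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}

/* Hypothetical helper: read the same four bytes through memcpy.  How this
   is expanded (one word load or several narrower loads) is up to the
   compiler and the selected tuning; the result is in host byte order.  */
static uint32_t
load_u32_memcpy (const uint8_t *p)
{
  uint32_t v;
  assert (p != NULL);
  memcpy (&v, p, sizeof v);
  return v;
}
```

Comparing the assembly of both helpers with and without `-mno-strict-align` is one way to see the difference the comment refers to; the exact output depends on the compiler version and tune model.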

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #9 from Robin Dapp --- I think I'll post a patch to increase vec_construct costs first. It's just too cheap right now. That should already help with the default settings.
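
For readers unfamiliar with the term, a vec_construct is the step of assembling a vector from individual scalar elements. The sketch below uses the GNU C vector extension to show the shape of such code; the function name `sum_after_construct` is hypothetical and the example is not taken from the bug. If the vectorizer's cost model rates the construction step as too cheap, it can prefer this kind of code even where plain scalar code would be faster.

```c
#include <assert.h>

/* GNU C vector extension (GCC and Clang): four ints in one 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

static_assert (sizeof (int) == 4, "example assumes 32-bit int");

/* Hypothetical example: the four inputs arrive as scalars, so the vector
   has to be assembled element by element before the vector add can run.
   That assembly step is what a vec_construct cost is meant to cover.  */
static int
sum_after_construct (int a, int b, int c, int d)
{
  v4si v = { a, b, c, d };   /* vector built from four scalars */
  v4si w = v + v;            /* one element-wise vector add */
  return w[0] + w[1] + w[2] + w[3];
}
```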

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-13 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #8 from JuzheZhong ---
(In reply to Robin Dapp from comment #7)
> > The problem is GCC-15 has performance regression compare to GCC-14 on both
> > strict align and we should fix it, we can't specify use no strict align in
> > GCC-15

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #7 from Robin Dapp ---
> The problem is GCC-15 has performance regression compare to GCC-14 on both
> strict align and we should fix it, we can't specify use no strict align in
> GCC-15 to pretend that we don't have such performance

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-13 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #6 from JuzheZhong ---
(In reply to Robin Dapp from comment #5)
> According to Li Pan's results this is "just" vector strict align again?
> We should be vectorizing the first loop, in particular after the
> SLP-grouping changes.
>
>

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #5 from Robin Dapp --- According to Li Pan's results this is "just" vector strict align again? We should be vectorizing the first loop, in particular after the SLP-grouping changes. I realize it's annoying having to resort to strict

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-12 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #4 from Li Pan ---
(In reply to Li Pan from comment #3)
> #include
>
> #define I_P1 16
> #define I_P2 1344
>
> #define HADAMARD4(d0, d1, d2, d3, s0, s1, s2, s3) {\

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-12 Thread pan2.li at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #3 from Li Pan ---
#include

#define I_P1 16
#define I_P2 1344

#define HADAMARD4(d0, d1, d2, d3, s0, s1, s2, s3) {\
    int t0 = s0 + s1;\
    int t1 = s0 - s1;\
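
The HADAMARD4 macro in the snippet above is cut off after its second line. As a reading aid only, here is a sketch of a conventional 4-point Hadamard butterfly that starts with the same two lines. Everything after `t1` is an assumption, and the `hadamard4` wrapper is a hypothetical name; the test case attached to the bug may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed completion of the truncated macro.  Only the t0 and t1 lines
   appear in the archived snippet; the rest is the conventional butterfly
   form and is not confirmed by the source.  */
#define HADAMARD4(d0, d1, d2, d3, s0, s1, s2, s3) {\
    int t0 = s0 + s1;\
    int t1 = s0 - s1;\
    int t2 = s2 + s3;\
    int t3 = s2 - s3;\
    d0 = t0 + t2;\
    d1 = t1 + t3;\
    d2 = t0 - t2;\
    d3 = t1 - t3;\
}

/* Hypothetical wrapper so the macro can be exercised on small arrays.  */
static void
hadamard4 (int out[4], const int in[4])
{
  assert (out != NULL && in != NULL);
  HADAMARD4 (out[0], out[1], out[2], out[3], in[0], in[1], in[2], in[3]);
}
```

Each output is a sum or difference of all four inputs, so loops built on this pattern mix many small loads with simple arithmetic, which is why load alignment and vector construction costs come up in the later comments.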

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-12 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019 --- Comment #2 from JuzheZhong ---
(In reply to Vineet Gupta from comment #1)
> How exactly are you building it ?

-march=rv64gcv_zvl512b -mabi=lp64d -mrvv-vector-bits=zvl -mrvv-max-lmul=m2

[Bug target/118019] RISC-V: Performance regression in hottest function of X264

2024-12-12 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019

Vineet Gupta changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vineetg at gcc dot gnu.org

--- Comment #