On 4/25/23 13:03, Jeff Law wrote:
On 4/25/23 11:25, Vineet Gupta wrote:Just a note, there are some regressions in this table. For example xz input #2. So to be fair when making comparisons it's probably worth noting the regressions as well as improvements.On 4/18/23 11:36, Jeff Law via Gcc-patches wrote:On 4/18/23 08:36, Vineet Gupta wrote:[partial addressing of PR/109279]RISCV splitters have restrictions to not create pesudos due to a combine limitatation. And despite this being a split-during-combine limitation, all split passes take the hit due to way define*_split are used in gcc.With the original combine issue being fixed 61bee6aed2 ("combine: Don'trecord for UNDO_MODE pointers into regno_reg_rtx array [PR104985]") the RV splitters can now be relaxed. This improves the codegen in general. e.g. long long f(void) { return 0x0101010101010101ull; } Before li a0,0x01010000 addi a0,0x0101 slli a0,a0,16 addi a0,a0,0x0101 slli a0,a0,16 addi a0,a0,0x0101 ret With patch li a5,0x01010000 addi a5,a5,0x0101 mv a0,a5 slli a5,a5,32 add a0,a5,a0 ret This is testsuite clean, no regression w/ patch. ========= Summary of gcc testsuite =========| # of unexpected case / # of unique unexpected case | gcc | g++ | gfortran | rv64imafdc/ lp64d/ medlow | 2 / 2 | 1 / 1 | 6 / 1 |rv64imac/ lp64/ medlow | 3 / 3 | 1 / 1 | 43 / 8 |rv32imafdc/ ilp32d/ medlow | 1 / 1 | 3 / 2 | 6 / 1 |rv32imac/ ilp32/ medlow | 1 / 1 | 3 / 2 | 43 / 8 | This came up as part of IRC chat on PR/109279 and was suggested by Andrew Pinski. Signed-off-by: Vineet Gupta <vine...@rivosinc.com> --- gcc/config/riscv/riscv-protos.h | 4 +--gcc/config/riscv/riscv.cc | 46 +++++++++++++--------------------gcc/config/riscv/riscv.md | 8 +++--- 3 files changed, 24 insertions(+), 34 deletions(-)This looks fine, except that you don't have a ChangeLog.Pushed with Changelog and some additional info about qemu icount reductions with this patch500.perlbench_r 0 1235310737733 1231742384460 0.29% 1 744489708820 743515759958 2 714072106766 712875768625 0.17% 502.gcc_r 0 197365353269 197178223030 1 235614445254 235465240341 2 226769189971 226604663947 3 188315686133 188123584015 4 289372107644 289187945424 503.bwaves_r 0 326291538768 326291539697 1 515809487294 515809488863 2 401647004144 401647005463 3 488750661035 488750662484 505.mcf_r 0 681926695281 681925418147 507.cactuBSSN_r 0 3832240965352 3832226068734 508.namd_r 0 1919838790866 1919832527292 510.parest_r 0 3515999635520 3515878553435 511.povray_r 0 3073889223775 3074758622749 519.lbm_r 0 1194077464296 1194077464041 520.omnetpp_r 0 1014144252460 1011530791131 0.26% 521.wrf_r 0 3966715533120 3966265425092 523.xalancbmk_r 0 1064914296949 1064506711802 525.x264_r 0 509290028335 509258131632 1 2001424246635 2001677767181 2 1914660798226 1914869407575 526.blender_r 0 1726083839515 1725974286174 527.cam4_r 0 2336526136415 2333656336419 531.deepsjeng_r 0 1689007489539 1686541299243 0.15% 538.imagick_r 0 3247960667520 3247942048723 541.leela_r 0 2072315300365 2070248271250 544.nab_r 0 1527909091282 1527906483039 548.exchange2_r 0 2086120304280 2086314757502 549.fotonik3d_r 0 2261694058444 2261670330720 554.roms_r 0 2640547903140 2640512733483 557.xz_r 0 388736881767 386880875636 0.48% 1 959356981818 959993132842 2 547643353034 546374038310 0.23% 997.specrand_fr 0 512881578 512599641 999.specrand_ir 0 512881578 512599641
Fair point, but in my defense I dropped anything under 0.1 % (including improvements).
Here's the list of what regressed due to this patch. 511.povray_r 0 3073889223775 3074758622749 -0.03% 525.x264_r 0 509290028335 509258131632 0.01% 1 2001424246635 2001677767181 -0.01% 2 1914660798226 1914869407575 -0.01% 548.exchange2_r 0 2086120304280 2086314757502 -0.01% 557.xz_r 0 388736881767 386880875636 0.48% 1 959356981818 959993132842 -0.07%
Anyway, my results are in line with yours. Given the instruction counts and known IPC, I think we're taking about a 2.5% hit on deepsjeng and about a 1% hit on leela and x264#2 comparing our internal gcc-12 vs gcc-13 trees.
I think you are referring to the bigger deepsjeng regression due to commit 2e886eef7f2b5a which introduced the splitter movconst_internal. This patch is about a small improvement to all splitters which reduces that perf loss slightly (but does regress some tests too as pointed above).
Thx, -Vineet