Re: [Bug target/119373] RISC-V: missed unrolling opportunity

2025-04-24 Thread Robin Dapp via Gcc-bugs
Since the primary underlying scalar mode in the loop is DF, the autodetected vector mode returned by preferred_simd_mode is RVVM1DF. In comparison, AArch64 picks VNx2DF, which allows the vectorisation factor to be 8. By choosing RVVMF8QI, RISC-V is restricted to VF = 4. Generally we pick the lar

Re: [Bug target/120067] New: RISC-V: x264 sub4x4_dct high icount

2025-05-02 Thread Robin Dapp via Gcc-bugs
This is reduced from 525.x264_r's 4th hottest block (https://godbolt.org/z/KdWv1er6f). AArch64 assembly is clean and efficient (35 insns) while RISC-V's is long and messy (114 insns). The most obvious issue is that it keeps spilling and reloading the same data from the stack. Also I do not underst