Maybe this is a good time to create a -mtune=generic, starting from a copy of the rocket tuning and modifying it from there?

On Wed, Jun 18, 2025 at 6:59 AM Jeff Law <jeffreya...@gmail.com> wrote:
>
> On 6/17/25 10:51 AM, Yangyu Chen wrote:
> >
> > On 17/6/2025 20:42, Jeff Law wrote:
> >>
> >> On 6/16/25 10:08 PM, Dongyan Chen wrote:
> >>> Hi, I've come across a question regarding the branch cost of gcc. In
> >>> the link https://gcc.godbolt.org/z/hnddevd5h, gcc fails to recognize
> >>> the optimization branch judgment, while llvm does. I eventually
> >>> discovered that the value of the branch cost was too small. Moreover,
> >>> in that link, if I add "-mbranch-cost=4" (a larger number can also be
> >>> used) for gcc, the zicond extension functions properly. So, is it
> >>> necessary to modify the branch cost for gcc? According to the source
> >>> code, the default mtune is rocket, which has a branch cost of 3. I
> >>> think it should be set to 4.
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>>     * config/riscv/riscv.cc: Change the branch cost.
> >>>
> >>> gcc/testsuite/ChangeLog:
> >>>
> >>>     * gcc.target/riscv/zicond-primitiveSemantics_compare_reg_reg_return_reg_reg.c: New test.
> >>
> >> So I'd be a lot more comfortable with this if someone that knows the
> >> rocket uarch could chime in or if we had wider data on how this
> >> behaves in general. One pico-sized benchmark isn't a great way to
> >> evaluate something like this.
> >
> > The rocket core is quite simple, utilizing a five-stage in-order scalar
> > pipeline with a 3-cycle branch mis-predict penalty.
> >
> > However, there is a trade-off here:
> >
> > - Use branch
> >   - 2-3 dynamic instructions reduced for each loop
> >   - 3 cycles penalty when the branch is mispredicted
> >
> > - Use Zicond
> >   - No branch mis-predict penalty
> >   - 2-3 dynamic instructions overhead for each loop
> >
> > I agree that this might not be helpful for rocket-chip. However, since
> > rocket-chip is the default tune information for RISC-V, and AFAIK every
> > rocket core that has been taped out lacks a zicond extension, I think
> > it's acceptable to adjust this for better RISC-V ecosystems, as branch
> > misprediction on large OoO cores usually incurs a penalty of about 10
> > cycles.
>
> No, that's not a good reason.
>
> You could make the argument that instead of defaulting to rocket that we
> should use a default generic tuning model. That would make much more
> sense than deliberately choosing the wrong values for the rocket uarch
> because it happens to be used as the default.
>
> Jeff
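
For what it's worth, the branch-versus-Zicond trade-off quoted above can be put into rough numbers. This is only a back-of-the-envelope sketch, not anything taken from GCC: the function names are made up for illustration, and it assumes each extra Zicond instruction costs exactly one cycle, which is plausible for a single-issue in-order core but optimistic or pessimistic elsewhere.

```c
#include <assert.h>

/* Hypothetical cost model for illustration only; none of these names
   exist in GCC.  Assumption: one extra instruction == one extra cycle. */

/* Expected cycles lost per executed branch, given the probability that
   the branch is mispredicted and the misprediction penalty in cycles. */
double expected_branch_penalty(double mispredict_rate, double penalty_cycles)
{
    assert(mispredict_rate >= 0.0 && mispredict_rate <= 1.0);
    assert(penalty_cycles > 0.0);
    return mispredict_rate * penalty_cycles;
}

/* Misprediction rate above which the branchless (Zicond) sequence is
   cheaper than the branch, given its overhead in extra instructions. */
double break_even_mispredict_rate(double extra_insns, double penalty_cycles)
{
    assert(extra_insns >= 0.0);
    assert(penalty_cycles > 0.0);
    return extra_insns / penalty_cycles;
}
```

Under those assumptions, 3 extra instructions against rocket's 3-cycle penalty give a break-even rate of 1.0, i.e. the branch would have to be mispredicted every time before Zicond pays off. Against a penalty of about 10 cycles the break-even rate drops to 0.3. That is consistent with the point that the right value depends on the uarch being tuned for.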