https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118886
--- Comment #1 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jeff Law <l...@gcc.gnu.org>:

https://gcc.gnu.org/g:83d19b5d842dadc1720b57486d4675a238966ba4

commit r16-1984-g83d19b5d842dadc1720b57486d4675a238966ba4
Author: Jeff Law <j...@ventanamicro.com>
Date:   Thu Jul 3 06:44:31 2025 -0600

    [RISC-V][PR target/118886] Refine when two insns are signaled as fusion candidates

    A number of folks have had their fingers in this code and it's going to take
    a few submissions to do everything we want to do.

    This patch is primarily concerned with avoiding signaling that fusion can
    occur in cases where it obviously should not be signaling fusion.

    Every DEC-based fusion I'm aware of requires the first instruction to set a
    destination register that is both used and set again by the second
    instruction. If the two instructions set different registers, then the
    destination of the first instruction was not dead and would need to have a
    result produced.

    This is complicated by the fact that we have pseudo registers prior to
    reload. So the approach we take is to signal fusion prior to reload even if
    the destination registers don't match. Post reload we require them to match.

    That allows us to clean up the code ever-so-slightly.

    Second, we sometimes signaled fusion into loads that weren't scalar integer
    loads. I'm not aware of a design that's fusing into FP loads or vector
    loads. So those get rejected explicitly.

    Third, the store pair "fusion" code is cleaned up a little. We use fusion to
    model store pair commits since the basic properties for detection are the
    same. The point where they "fuse" is different. Also this code liked to
    "return false" at each step along the way if fusion wasn't possible. Future
    work for additional fusion cases makes that behavior undesirable. So the
    logic gets reworked a little bit to be more friendly to future work.

    Fourth, if we already fused the previous instruction, then we can't fuse it again.
    Signaling fusion in that case is, umm, bad as it creates an atomic blob of
    code from a scheduling standpoint.

    Hopefully I got everything correct with extracting this work out of a larger
    set of changes. We will contribute some instrumentation & testing code so if
    I botched things in a major way we'll soon have a way to test that and I'll
    be on the hook to fix any goofs.

    From a correctness standpoint this should be a big fat nop. We've seen this
    make measurable differences in pico benchmarks, but obviously as you scale
    up to bigger stuff the gains largely disappear into the noise.

    This has been through Ventana's internal CI and my tester. I'll obviously
    wait for a verdict from the pre-commit tester.

            PR target/118886

    gcc/
            * config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Check
            for fusion being disabled earlier. If PREV is already fused,
            then it can't be fused again. Be more selective about fusing
            when the destination registers do not match. Don't fuse into
            loads that aren't scalar integer modes. Revamp store pair
            commit support.

    Co-authored-by: Daniel Barboza <dbarb...@ventanamicro.com>
    Co-authored-by: Shreya Munnangi <smunnan...@ventanamicro.com>
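
For readers who do not have the patch open, the filtering rules described in the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code in riscv_macro_fusion_pair_p: the `Insn` struct, the `LoadKind` enum, and `may_signal_fusion` are hypothetical names invented for illustration, and real GCC works on RTL insns rather than a struct like this. It also assumes the second instruction has a single source register, which is enough to show the checks.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins. None of these names come from GCC.
enum class LoadKind { None, ScalarInt, Float, Vector };

struct Insn {
  int dest_reg;        // register written by the instruction
  int src_reg;         // single source register (enough for this sketch)
  LoadKind load;       // kind of load, or None if not a load
  bool already_fused;  // true if this insn was already fused with its predecessor
};

// Returns true if the pair (prev, curr) may be signaled as a fusion candidate
// under the rules described in the commit message.
bool may_signal_fusion(const Insn &prev, const Insn &curr, bool reload_completed) {
  // An instruction that is already part of a fused pair cannot be fused again.
  if (prev.already_fused)
    return false;

  // The second instruction must use the result of the first.
  if (curr.src_reg != prev.dest_reg)
    return false;

  // Before reload only pseudo registers are known, so mismatched destinations
  // are tolerated. After reload the destinations must match.
  if (reload_completed && curr.dest_reg != prev.dest_reg)
    return false;

  // Only scalar integer loads are accepted as the second half of a pair.
  if (curr.load == LoadKind::Float || curr.load == LoadKind::Vector)
    return false;

  return true;
}

// Small self-check exercising the pre-reload and post-reload behavior.
inline void self_check() {
  Insn first{10, 0, LoadKind::None, false};
  Insn second{11, 10, LoadKind::ScalarInt, false};
  assert(may_signal_fusion(first, second, /*reload_completed=*/false));
  assert(!may_signal_fusion(first, second, /*reload_completed=*/true));
}
```

In this model the same pair is accepted before reload and rejected after it, which mirrors the "signal early, require a match later" approach the message describes.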