https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98981
--- Comment #5 from Jim Wilson <wilson at gcc dot gnu.org> ---
Neither of the two patches I mentioned in comment 1 can fix the problem by themselves, as we still have a mix of SImode and DImode operations.

I looked at REE. It doesn't work because there is more than one reaching def. But even if it did work, I don't think it would completely solve the problem because it runs after register allocation and hence won't be able to remove move instructions.

To get the best result, we need the register allocator to take two registers with different modes with overlapping live ranges, and realize that they can be allocated to the same hard reg because the overlapping uses are non-conflicting. I haven't tried looking at the register allocator, but it doesn't seem like a good way to try to solve the problem.

We have an inconvenient mix of SImode and DImode because we don't have SImode compare and branch instructions. That requires sign extending 32-bit values to 64-bit to compare them, which then results in the sign extend and register allocation optimization issues. It is unlikely that 32-bit compare and branch instructions will be added to the ISA though.

One useful thing I noticed is that the program is doing a max operation, and the B extension adds a max instruction. Having one instruction instead of a series of instructions including a branch to compute max makes the optimization issues easier, and gcc does give the right result in this case. Using a compiler with B support I get

        lw      a4,0(a5)
        lw      a2,0(a3)
        addi    a5,a5,4
        addi    a3,a3,4
        addw    a4,a4,a2
        max     a0,a4,a0
        bne     a5,a1,.L2

which is good code with the extra moves and sign-extends removed. So I have a workaround of sorts, but only if you have the B extension.

The -mtune=sifive-7-series option can support conditional move via macro fusion, and I was hopeful that this would work as well as max, but unfortunately the sign-extend that was removed in the max case doesn't get removed in the conditional move case.
Also, the conditional move is 2-address, and the register allocator ends up needing a reload, which gives us the unwanted mv again. So the code in this case is the same as without the option. I didn't check to see if this is fixable.
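
For readers without the original test case at hand, here is a minimal sketch of the kind of loop under discussion. It is a reconstruction inferred from the assembly in this comment, not the reporter's actual source; the function name max_pair_sum and its parameters are made up for illustration. The add is a 32-bit (SImode) operation, while the comparison that implements the max has to be done on sign-extended 64-bit (DImode) values when no max instruction is available.

```c
#include <stddef.h>

/* Hypothetical reduction loop (names are illustrative, not from the PR).
   On a 64-bit RISC-V target, int is 32 bits wide, so the add below is an
   SImode operation, but the compare-and-branch form of the max works on
   full 64-bit registers. */
int max_pair_sum(const int *a, const int *b, size_t n, int init)
{
    int best = init;
    for (size_t i = 0; i < n; i++) {
        int sum = a[i] + b[i];  /* 32-bit add (addw in the listing above) */
        if (sum > best)         /* max: a single instruction with B,
                                   otherwise a compare and branch */
            best = sum;
    }
    return best;
}
```

Compiling a loop of this shape with and without B support should be a reasonable way to compare the two code sequences, though the exact output will depend on the compiler version and options.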