https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66204
            Bug ID: 66204
           Summary: [MIPS] LRA: Non-optimal code / regression
           Product: gcc
           Version: 5.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: robert.suchanek at imgtec dot com
  Target Milestone: ---

Hi Vlad,

Whilst looking into whether pr65862 is resolved for good, I came across a case where data is not loaded directly into a floating-point register, and the resulting reload is only partially eliminated.

The following case:

    int a;
    void foo()
    {
      a = (int)(float) a;
    }

has the following sequence with ToT compiler:

    lui     $3,%hi(a)
    addiu   $sp,$sp,-16
    lw      $2,%lo(a)($3)
    mtc1    $2,$f0
    sw      $2,8($sp)
    addiu   $sp,$sp,16
    cvt.s.w $f0,$f0
    trunc.w.s $f0,$f0
    j       $31
    swc1    $f0,%lo(a)($3)

but it used to be:

    lui     $3,%hi(a)
    lwc1    $f0,%lo(a)($3)
    cvt.s.w $f0,$f0
    trunc.w.s $f0,$f0
    jr      $31
    swc1    $f0,%lo(a)($3)

LRA is now inserting unnecessary reloads and I tracked it down to r216154 with the key change:

    ...
    -                 > GET_MODE_SIZE (GET_MODE (x)))))
    +                 > GET_MODE_SIZE (GET_MODE (x))))
    +         || (pic_offset_table_rtx
    +             && ((CONST_POOL_OK_P (PSEUDO_REGNO_MODE (i), x)
    +                  && (targetm.preferred_reload_class
    +                      (x, lra_get_allocno_class (i)) == NO_REGS))
    +                 || contains_symbol_ref_p (x))))
            ira_reg_equiv[i].defined_p = false;
    ...

The above seems to disable equivalence substitutions, decreasing the chances for rematerialization. The new LRA remat is unlikely to help since it doesn't consider instructions that have spilled pseudos or contain memory.

The inheritance subpass does not help here either, because GR_REGS and NO_REGS are disjoint classes. The original pseudo and the allocno have NO_REGS, so I don't see a chance to improve it in this area.

Do you have any suggestions how to improve it? I had in mind another target hook that would return false for the MIPS backend, while other ports could return true and keep using the whole new conditional.

In the test case, there is another problem: the postreload and dse2 passes eliminate the spill only partially. That appears to be a separate issue to resolve, and I think that solving this in LRA is likely to give better results.

With the new hook (pr65862) in place, the impact on code size is about 0.2-0.3% at -O2 and -Os. I measured this by comparing the code size with and without the new conditional.
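
The target-hook idea mentioned above can be sketched as a toy model in plain C. This is only an illustration under stated assumptions, not GCC code: `struct equiv_candidate`, `drop_equiv_hook`, `hook_default`, `hook_mips` and `drop_equiv_p` are all hypothetical names, and the boolean fields merely stand in for the tests that appear in the r216154 hunk. Only the clause added by that revision is modelled; the pre-existing part of the condition, elided in the diff, is left out.

```c
#include <stdbool.h>

/* Toy stand-ins for the inputs of the clause added in r216154.
   These are NOT real GCC types; every name here is hypothetical.  */
struct equiv_candidate
{
  bool pic_enabled;             /* models: pic_offset_table_rtx is set */
  bool const_pool_ok;           /* models: CONST_POOL_OK_P (mode, x) */
  bool preferred_class_is_none; /* models: preferred_reload_class == NO_REGS */
  bool has_symbol_ref;          /* models: contains_symbol_ref_p (x) */
};

/* Hypothetical hook: does the port want the added clause applied?  */
typedef bool (*drop_equiv_hook) (void);

/* Assumed default: other ports keep the whole new conditional.  */
static bool hook_default (void) { return true; }

/* Assumed MIPS override: opt out, so the equivalence is kept.  */
static bool hook_mips (void) { return false; }

/* Returns true when the added clause would clear the equivalence
   (the point where the real code sets defined_p = false).  */
static bool
drop_equiv_p (const struct equiv_candidate *c, drop_equiv_hook hook)
{
  return hook ()
         && c->pic_enabled
         && ((c->const_pool_ok && c->preferred_class_is_none)
             || c->has_symbol_ref);
}
```

In this model, a candidate that contains a symbol reference with the PIC condition set is dropped under `hook_default` but kept under `hook_mips`, which is the behaviour the proposal asks for. Whether a hook is the right mechanism, and what it should be called, is left open here.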