https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66204

            Bug ID: 66204
           Summary: [MIPS] LRA: Non-optimal code / regression
           Product: gcc
           Version: 5.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: robert.suchanek at imgtec dot com
  Target Milestone: ---

Hi Vlad,

Whilst checking whether pr65862 is resolved for good, I came across a case
where data is no longer loaded directly into a floating-point register, and
the resulting reload is only partially eliminated.

The following case:

int a;

void foo()
{
  a = (int)(float) a;
}

has the following sequence with the ToT compiler:

    lui       $3,%hi(a)
    addiu     $sp,$sp,-16
    lw        $2,%lo(a)($3)
    mtc1      $2,$f0
    sw        $2,8($sp)
    addiu     $sp,$sp,16
    cvt.s.w   $f0,$f0
    trunc.w.s $f0,$f0
    j         $31
    swc1      $f0,%lo(a)($3)

but it used to be:

    lui       $3,%hi(a)
    lwc1      $f0,%lo(a)($3)
    cvt.s.w   $f0,$f0
    trunc.w.s $f0,$f0
    jr        $31
    swc1      $f0,%lo(a)($3)

LRA is now inserting unnecessary reloads and I tracked it down to r216154 with
the key change:

...
-                   > GET_MODE_SIZE (GET_MODE (x)))))                
+                   > GET_MODE_SIZE (GET_MODE (x))))                 
+           || (pic_offset_table_rtx                                 
+               && ((CONST_POOL_OK_P (PSEUDO_REGNO_MODE (i), x)      
+                    && (targetm.preferred_reload_class              
+                        (x, lra_get_allocno_class (i)) == NO_REGS)) 
+                   || contains_symbol_ref_p (x))))       
                     ira_reg_equiv[i].defined_p = false;
...

The above seems to disable equivalence substitutions, reducing the chances of
rematerialization. The new LRA remat is unlikely to help since it doesn't
consider instructions that have spilled pseudos or contain memory.

The inheritance subpass does not help here either, because the GR_REGS and
NO_REGS classes are disjoint. The original pseudo and the allocno both have
NO_REGS, so I don't see a chance to improve it in this area.

Do you have any suggestions on how to improve this? I had in mind another
target hook that would return false for the MIPS backend, while other ports
would return true and keep using the whole new conditional.
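
To make the hook idea concrete, here is a rough sketch written as a standalone
toy model, not GCC code. It models only the part of the condition added by
r216154 (the earlier mode-size test is left out), and every name in it
(equiv_query, r216154_drops_equiv, drop_equiv_hook, hook_mips, hook_default,
drops_equiv_with_hook, example_symbol_ref_case) is hypothetical. The boolean
fields merely stand in for the real GCC predicates quoted in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the predicates in the r216154 hunk; not GCC code.  */
struct equiv_query {
  bool pic_offset_table_set;    /* pic_offset_table_rtx is non-null */
  bool const_pool_ok;           /* CONST_POOL_OK_P (...) holds */
  bool preferred_class_no_regs; /* preferred_reload_class (...) == NO_REGS */
  bool has_symbol_ref;          /* contains_symbol_ref_p (x) holds */
};

/* The sub-condition added by r216154, as quoted in the diff.  */
bool r216154_drops_equiv (const struct equiv_query *q)
{
  return q->pic_offset_table_set
         && ((q->const_pool_ok && q->preferred_class_no_regs)
             || q->has_symbol_ref);
}

/* Hypothetical hook: a port returns false to opt out of the new condition.  */
typedef bool (*drop_equiv_hook) (void);
bool hook_mips (void) { return false; }    /* assumed MIPS choice */
bool hook_default (void) { return true; }  /* assumed default for other ports */

/* The new condition, gated by the hypothetical hook.  */
bool drops_equiv_with_hook (const struct equiv_query *q, drop_equiv_hook hook)
{
  return hook () && r216154_drops_equiv (q);
}

/* Example: an equivalence that references a symbol, such as `a' above,
   assuming the PIC register flag is set for the target.  */
void example_symbol_ref_case (void)
{
  struct equiv_query q = { true, false, false, true };
  assert (r216154_drops_equiv (&q));                /* equivalence dropped */
  assert (!drops_equiv_with_hook (&q, hook_mips));  /* kept with opt-out */
  assert (drops_equiv_with_hook (&q, hook_default));
}
```

In this model a port that opts out keeps the equivalence, so the pseudo could
still be substituted by its memory equivalent; whether that is safe for a
given port is exactly the open question.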

In the test case, there is another problem: postreload together with the dse2
pass eliminates the spill only partially. However, that appears to be a
separate issue to resolve.  I think that solving this in LRA is likely to give
better results.

The impact on code size is about 0.2-0.3% at -O2 and -Os with the new hook
(pr65862) in place.  I compared the code size with and without the new
conditional.
