https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119839

            Bug ID: 119839
           Summary: RISC-V gobmk performance regression with Node clones
                    share order patch (bad LTO partitioning)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: lto
          Assignee: unassigned at gcc dot gnu.org
          Reporter: anton at ozlabs dot org
                CC: mjires at gcc dot gnu.org, pinskia at gcc dot gnu.org
  Target Milestone: ---

We are seeing a performance regression on RISC-V when building gobmk from
SPEC CPU2006 with LTO. Victor Ying narrowed it down to this loop, which loads
and stores change_stack_pointer (a static variable) on every iteration:

   15b42:       ff87a783        lw      a5,-8(a5)
   15b46:       c31c            sw      a5,0(a4)
   15b48:       3581b783        ld      a5,856(gp) # 3868b0 <change_stack_pointer.lto_priv.0>
   15b4c:       ff078713        addi    a4,a5,-16
   15b50:       34e1bc23        sd      a4,856(gp) # 3868b0 <change_stack_pointer.lto_priv.0>
   15b54:       ff07b703        ld      a4,-16(a5)
   15b58:       f76d            bnez    a4,15b42 <popgo+0x2e>

With
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0895aef01c64c317b489811dbe4ac55f9c13aab3
reverted, we see the expected behaviour, with the change_stack_pointer
accesses moved out of the loop:

   17f90:       ff078693        addi    a3,a5,-16
   17f94:       ff07b703        ld      a4,-16(a5)
   17f98:       3ad1b423        sd      a3,936(gp) # 381900 <change_stack_pointer>
   17f9c:       1781            addi    a5,a5,-32
   17f9e:       cb09            beqz    a4,17fb0 <popgo+0x4c>
   17fa0:       4f94            lw      a3,24(a5)
   17fa2:       863e            mv      a2,a5
   17fa4:       17c1            addi    a5,a5,-16
   17fa6:       c314            sw      a3,0(a4)
   17fa8:       6b98            ld      a4,16(a5)
   17faa:       fb7d            bnez    a4,17fa0 <popgo+0x3c>

This looks like an LTO partitioning issue: change_stack_pointer was promoted
to change_stack_pointer.lto_priv.0, which indicates that the static variable
is now referenced from more than one partition. The issue goes away with
-flto-partition=one, which seems to confirm this.
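
For reference, here is a simplified C sketch of the kind of loop the
disassembly corresponds to. It is not the gobmk source: the struct layout,
the array size, and the helper names (push_value, begin_frame, pop_frame,
pop_frame_hoisted, run_demo) are assumptions made for illustration. Only
change_stack_pointer and the shape of the loop are taken from the report.
pop_frame touches the static on every iteration, and pop_frame_hoisted is a
hand-written version of what the compiler is expected to do when it keeps the
pointer in a register.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical undo-stack entry: a saved location and its old value. */
struct change_stack_entry {
    int *address;
    int value;
};

static struct change_stack_entry change_stack[64];

/* File-scope static, like the variable named in the report. */
static struct change_stack_entry *change_stack_pointer = change_stack;

/* Start a new undo frame by pushing a NULL-address sentinel. */
static void begin_frame(void)
{
    change_stack_pointer->address = NULL;
    change_stack_pointer->value = 0;
    change_stack_pointer++;
}

/* Save the current value of *address so it can be restored later. */
static void push_value(int *address)
{
    change_stack_pointer->address = address;
    change_stack_pointer->value = *address;
    change_stack_pointer++;
}

/* As written in the source: the static is read and updated on every
   iteration unless the compiler can keep it in a register. */
static void pop_frame(void)
{
    while ((--change_stack_pointer)->address)
        *change_stack_pointer->address = change_stack_pointer->value;
}

/* Hand-hoisted equivalent: one load before the loop, one store after.
   Assumes no saved address aliases change_stack_pointer itself. */
static void pop_frame_hoisted(void)
{
    struct change_stack_entry *p = change_stack_pointer;

    while ((--p)->address)
        *p->address = p->value;
    change_stack_pointer = p;
}

/* Exercise both variants; returns 1 if the stack ends up empty again. */
int run_demo(void)
{
    int a = 1, b = 2;

    begin_frame();
    push_value(&a); a = 10;
    push_value(&b); b = 20;
    pop_frame();                 /* restores a and b */
    assert(a == 1 && b == 2);

    begin_frame();
    push_value(&a); a = 30;
    pop_frame_hoisted();         /* same effect, pointer kept in a local */
    assert(a == 1);

    return change_stack_pointer == change_stack;
}
```

Both variants leave memory in the same state. The difference is only in how
often change_stack_pointer itself is loaded and stored.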

I'm not sure if this is just bad luck, but the patch definitely changes how
LTO partitions things.
