https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92283

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ra
             Target|                            |x86_64-*-*
                 CC|                            |vmakarov at gcc dot gnu.org
          Component|tree-optimization           |rtl-optimization

--- Comment #21 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the bug is here:

# results.f:473:      &              vkl(3,1)*vkl(3,3)
good:   vfmadd231sd     8(%rsp), %xmm0, %xmm2   # %sfp, preph
bad:    vfmadd231sd     %xmm7, %xmm0, %xmm2     # pretmp_8926

where the good case uses 8(%rsp) as operand and the bad case has a stray
use of %xmm7.  Suspicious is that %xmm7 was previously moved to a GPR in
the bad case (but that's actually correct AFAICS):

bad:    vmovq   %xmm7, %rsi     # _874, _878
both:   vmovsd  %xmm7, 488(%rsp)        # _878, eyy

More context (you can see 8(%rsp) used twice in the good case; the bad
case instead loads 72(%rsp) into %xmm7 for the first use, but that reg
is then clobbered by the following insn):

good:   vfmadd132sd     %xmm6, %xmm12, %xmm7    # tmp2767, pr
bad:    vfmadd132sd     %xmm6, %xmm12, %xmm7    # tmp2769, pr
bad:    vmovq   %xmm7, %rsi     # _874, _878
both:   vmovsd  %xmm7, 488(%rsp)        # _878, eyy
# results.f:469:                ezz=ezz+(vkl(1,3)**2+vkl(2,3)
both:   vmovsd  144(%rsp), %xmm15       # %sfp, pretmp_8932
both:   vmulsd  %xmm15, %xmm15, %xmm5   # pretmp_8932, pretmp
good:   vmovsd  8(%rsp), %xmm9  # %sfp, pretmp_8926
bad:    vmovsd  72(%rsp), %xmm7 # %sfp, pretmp_8926
good:   vfmadd132sd     %xmm9, %xmm5, %xmm9     #, _4016, pre
bad:    vfmadd132sd     %xmm7, %xmm5, %xmm7     # pretmp_8926
good:   vmovaps %xmm9, %xmm5    # pretmp_8926, _879
bad:    vmovaps %xmm7, %xmm5    # pretmp_8926, _879
both:   vfmadd231sd     %xmm13, %xmm13, %xmm5   # pretmp_8918
good:   vfmadd132sd     %xmm6, %xmm13, %xmm5    # tmp2767, pr
bad:    vfmadd132sd     %xmm6, %xmm13, %xmm5    # tmp2769, pr
both:   vmovsd  %xmm5, 504(%rsp)        # _884, ezz
# results.f:471:      &              vkl(3,1)*vkl(3,2)
good:   vmovsd  80(%rsp), %xmm9 # %sfp, pretmp_8920
both:   vfmadd231sd     %xmm9, %xmm0, %xmm4     # pretmp_8920
both:   vfmadd231sd     %xmm14, %xmm12, %xmm4   # pretmp_8922
both:   vfmadd231sd     %xmm11, %xmm10, %xmm4   # pretmp_8928
both:   vmovsd  %xmm4, 472(%rsp)        # _8924, exy
# results.f:473:      &              vkl(3,1)*vkl(3,3)
good:   vfmadd231sd     8(%rsp), %xmm0, %xmm2   # %sfp, preph
bad:    vfmadd231sd     %xmm7, %xmm0, %xmm2     # pretmp_8926
both:   vfmadd231sd     %xmm15, %xmm14, %xmm2   # pretmp_8932
both:   vfmadd231sd     %xmm13, %xmm11, %xmm2   # pretmp_8918
both:   vmovsd  %xmm2, 480(%rsp)        # _8930, exz

before IRA the insns with the stack use and the later bogus reg use are

(insn 1815 1814 1816 176 (set (reg:DF 246 [ _879 ])
        (fma:DF (reg:DF 1447 [ pretmp_8926 ])
            (reg:DF 1447 [ pretmp_8926 ])
            (reg:DF 1018 [ _4016 ]))) "results.f":469:0 1960 {*fma_fmadd_df}
     (expr_list:REG_DEAD (reg:DF 1018 [ _4016 ])
        (nil)))

(insn 1825 1824 1826 176 (set (reg:DF 252 [ _902 ])
        (fma:DF (reg:DF 1440 [ prephitmp_8903 ])
            (reg:DF 1447 [ pretmp_8926 ])
            (reg:DF 1449 [ _8930 ]))) "results.f":473:0 1960 {*fma_fmadd_df}
     (expr_list:REG_DEAD (reg:DF 1449 [ _8930 ])
        (nil)))

and after reload it's broken:

(insn 10605 9622 1815 179 (set (reg:DF 27 xmm7 [orig:1447 pretmp_8926 ] [1447])
        (mem/c:DF (plus:DI (reg/f:DI 7 sp)
                (const_int 72 [0x48])) [22 %sfp+-6808 S8 A64])) "results.f":469:0 111 {*movdf_internal}
     (nil))
(insn 1815 10605 10530 179 (set (reg:DF 27 xmm7 [orig:1447 pretmp_8926 ] [1447])
        (fma:DF (reg:DF 27 xmm7 [orig:1447 pretmp_8926 ] [1447])
            (reg:DF 27 xmm7 [orig:1447 pretmp_8926 ] [1447])
            (reg:DF 25 xmm5 [orig:1018 _4016 ] [1018]))) "results.f":469:0 1960 {*fma_fmadd_df}
     (nil))

oops, %xmm7 is clobbered by insn 1815 but re-used later:

(note 10338 1824 9630 179 NOTE_INSN_DELETED)
(note 9630 10338 1825 179 NOTE_INSN_DELETED)
(insn 1825 9630 1826 179 (set (reg:DF 22 xmm2 [orig:252 _902 ] [252])
        (fma:DF (reg:DF 20 xmm0 [orig:1440 prephitmp_8903 ] [1440])
            (reg:DF 27 xmm7 [orig:1447 pretmp_8926 ] [1447])
            (reg:DF 22 xmm2 [orig:1449 _8930 ] [1449]))) "results.f":473:0 1960 {*fma_fmadd_df}
     (nil))

it looks like 10338 and 9630 were inserted (reloads maybe?) but then
discarded:

    Use smallest class of ALL_SSE_REGS and SSE_REGS
      Creating newreg=4364 from oldreg=1447, assigning class ALL_SSE_REGS to inheritance r4364
    Original reg change 1447->4364 (bb176):
 9630: r3696:DF=r4364:DF
    Add inheritance<-original before:
 10338: r4364:DF=r1447:DF

    Inheritance reuse change 1447->4372 (bb176):
 10338: r4364:DF=r4372:DF

   Insn after restoring regs:
 9630: r3696:DF=r1447:DF
           Removing inheritance:
 10338: r4364:DF=r4372:DF
      REG_DEAD r4372:DF
deleting insn with uid = 10338.

so it looks like a reload inheritance issue to me.
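
As a rough illustration of why the stray %xmm7 use yields a wrong result,
here is a toy Python model of the two sequences.  It is not GCC code: the
function names and input values are made up for the example, and it
assumes the stack slot in each case holds pretmp_8926.  Only the three
insns that matter for the clobber are modelled.

```python
# Toy model of the good and bad instruction sequences; not GCC code.
# All names and values here are hypothetical, chosen for illustration.

def fmadd132(a, b, c):
    """AT&T 'vfmadd132sd b, c, a': new value of a is a*b + c."""
    return a * b + c

def fmadd231(a, b, c):
    """AT&T 'vfmadd231sd b, c, a': new value of a is c*b + a."""
    return c * b + a

def good_case(slot, xmm0, xmm2, xmm5):
    xmm9 = slot                        # vmovsd 8(%rsp), %xmm9
    xmm9 = fmadd132(xmm9, xmm9, xmm5)  # vfmadd132sd %xmm9, %xmm5, %xmm9
    # The later use reads the stack slot again, so the overwritten
    # %xmm9 does not matter here.
    return fmadd231(xmm2, slot, xmm0)  # vfmadd231sd 8(%rsp), %xmm0, %xmm2

def bad_case(slot, xmm0, xmm2, xmm5):
    xmm7 = slot                        # vmovsd 72(%rsp), %xmm7
    xmm7 = fmadd132(xmm7, xmm7, xmm5)  # vfmadd132sd %xmm7, %xmm5, %xmm7
    # %xmm7 no longer holds pretmp_8926, but it is used as if it did.
    return fmadd231(xmm2, xmm7, xmm0)  # vfmadd231sd %xmm7, %xmm0, %xmm2

if __name__ == "__main__":
    args = dict(slot=3.0, xmm0=2.0, xmm2=1.0, xmm5=4.0)
    print("good:", good_case(**args))
    print("bad: ", bad_case(**args))
```

The two functions differ only in whether the last insn reads the stack
slot or the overwritten register, which is the difference between the
good and bad assembly above.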

Vladimir?  The testcase is results.f from 454.calculix; compile it with
-O2 -mfma -mtune=znver2 -fdbg-cnt=ivopts_loop:66:67 -fno-schedule-insns2
-mno-stv -fno-tree-slsr -fno-tree-ter (-fdbg-cnt=ivopts_loop:66:66 yields
correct code).  The same issue probably appears when building it with
-O2 -march=znver2, but the above debugging is with the "reduced" flags.
