[Bug tree-optimization/41783] New: r151561 (PRE fix) regresses zeusmp

matz at gcc dot gnu dot org Wed, 21 Oct 2009 07:34:33 -0700

zeusmp regressed by about 5% again with the PRE fix for PR41101 (r151561).
The problem is that PRE now finds a partial redundancy where in reality
there isn't any, and the PHI node it inserts to compensate prevents
vectorization of a loop because its value is used outside that loop.
Testcase extracted from zeusmp:


% cat hsmoc-1.f
      subroutine hsmoc ( )
      implicit NONE
      integer ijkn
      parameter(ijkn =   128+5)
      real*8 dt, fact, db(ijkn), w1dt(ijkn)
      integer i, is, ie, j, js, je
      common /rootr/ dt
      common /scratch/  w1dt
         do 9 i=is,ie
           do 807 j=js-1,je+1
             db (j  ) = j
 807       continue
           fact = dt * i
           do 808 j=js,je+1
             w1dt(j)= fact * db (j)
 808       continue
 9      continue
       return
       end

(compile with -march=barcelona -O3 -ffast-math -funroll-loops -fpeel-loops)

The problem is the access to 'dt' (rootr.dt), which PRE thinks is partially
redundant in the first loop (!?), so it creates this code:

pretmp.11_53 = rootr.dt;
Loop-i:
  prephitmp.12_51 = PHI <pretmp.11_53(9), D.1376_20(20)>
...
  Loop-j1
    prephitmp.12_49 = PHI <prephitmp.12_51(11), pretmp.11_52(14)>
    ...
    pretmp.11_52 = rootr.dt;
    goto Loop-j1
  prephitmp.12_23 = PHI <prephitmp.12_51(12), prephitmp.12_49(13)>
  D.1376_20 = prephitmp.12_23;
  ...
  Loop-j2

Notice especially how we now read rootr.dt on the back edge of Loop-j1,
which executes much more often than before.  Originally we accessed it
about ie-is times; now we access it about (ie-is)*(je-js) times.

It's possible that this alone explains the speed regression, and not
necessarily the missed vectorization.  But the missed vectorization was
much easier to detect.
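
To make the load counts above concrete, here is a rough counting sketch.
It is not part of the testcase: the loop bounds are hypothetical values
picked for illustration, and it assumes the reload of rootr.dt sits on
every back edge of the first j loop (trip count minus one per i
iteration), as the dump above suggests.

```fortran
      program ldcnt
      implicit none
c     Hypothetical loop bounds, chosen only for illustration.
      integer is, ie, js, je
      parameter (is = 1, ie = 100, js = 2, je = 128)
      integer i, j, nold, nnew
c     Before r151561: one load of dt per iteration of the i loop
c     (for fact = dt * i).
      nold = 0
      do 10 i = is, ie
        nold = nold + 1
 10   continue
c     After r151561: one load ahead of the i loop, plus one load on
c     each back edge of the first j loop.  That loop runs from js-1
c     to je+1, so it is assumed to take je-js+2 back edges per i
c     iteration; the loop below has exactly that many trips.
      nnew = 1
      do 30 i = is, ie
        do 20 j = js-1, je
          nnew = nnew + 1
 20     continue
 30   continue
      print *, 'loads of dt before:', nold
      print *, 'loads of dt after: ', nnew
      end
```

With these made-up bounds the model counts 100 loads before and 12801
after, which matches the rough (ie-is) versus (ie-is)*(je-js) estimate.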


-- 
           Summary: r151561 (PRE fix) regresses zeusmp
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: matz at gcc dot gnu dot org
  GCC host triplet: x86_64-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41783
