When profile-based feedback indicates that a loop has a low iteration
count, the loop unroller will often refuse to unroll it even though unrolling
would still be useful.  Moreover, while the unroller knows the average loop
iteration count, it lacks the concept of a prevalent iteration count.

In particular, the header checksumming loop of the EEMBC packetflow benchmark
usually runs for ten iterations.  gcc 3.x unrolled it by a factor of four,
which was mediocre.  Since the introduction of the new loop unroller in 4.0,
the loop is no longer unrolled at all when profile feedback is used.
The proper thing to do would be to unroll this loop by a factor of five.

When an iteration count that is not a multiple of the chosen unroll factor is
deemed sufficiently unlikely, that case can be handled by generating
a non-unrolled loop after the unrolled loop.  The unrolled loop can use
a suitably transformed inequality check on the loop start and end to
verify that a sufficient number of iterations is outstanding, so that no
casesi / tablejump code is needed.
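
As a rough illustration of the shape described above, here is a hand-written
C sketch of a loop unrolled by five with a non-unrolled loop after it.  It is
not taken from the benchmark or from any patch: the function names
sum_words_simple and sum_words_unroll5 are hypothetical, and summing 16-bit
words is only an assumed stand-in for the header checksum loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference version: one 16-bit word per iteration (assumed loop body). */
uint32_t sum_words_simple(const uint16_t *p, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}

/* Hypothetical hand-unrolled version, unroll factor five. */
uint32_t sum_words_unroll5(const uint16_t *p, size_t n)
{
    uint32_t sum = 0;
    size_t i = 0;

    /* Transformed bound check: enter the unrolled body only while at
       least five iterations are outstanding.  i never exceeds n, so
       the unsigned subtraction cannot wrap. */
    for (; n - i >= 5; i += 5) {
        sum += p[i];
        sum += p[i + 1];
        sum += p[i + 2];
        sum += p[i + 3];
        sum += p[i + 4];
    }

    /* Non-unrolled loop for the unlikely case that n is not a multiple
       of five; no casesi / tablejump dispatch is involved. */
    for (; i < n; i++)
        sum += p[i];

    return sum;
}
```

For the prevalent count of ten, the unrolled body runs twice and the trailing
loop is skipped; any other count still gives the same result as the simple
loop, only with less benefit from unrolling.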

See also: http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02373.html


-- 
           Summary: inept unrolling for small iteration counts
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: amylaar at gcc dot gnu dot org
OtherBugsDependingOnThis: 29842


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29946
