The GCC 4.x tree optimizers put int values into long long int temporaries.
By the time RTL expansion runs, the expander sees only a DImode multiply, so
it generates three SImode multiplies instead of a single widening multiply.
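
For illustration, here is the usual way a 64x64->64 multiply is built from
32-bit pieces, which is where the three multiplies come from. This is only a
sketch of the arithmetic; the helper name is hypothetical and the exact RTL
sequence GCC emits is not shown in this report.

```c
#include <stdint.h>

/* Hypothetical helper: low 64 bits of a 64x64 product, computed from
   32-bit halves the way a generic DImode multiply expansion would. */
uint64_t mul64_from_halves(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* Multiply 1: unsigned 32x32->64 on the low halves. */
    uint64_t low = (uint64_t)a_lo * b_lo;

    /* Multiplies 2 and 3: 32x32->32 cross terms; only their low 32 bits
       can affect the 64-bit result. */
    uint32_t cross = a_lo * b_hi + a_hi * b_lo;

    return low + ((uint64_t)cross << 32);
}
```

When both operands are sign-extended ints, the high halves carry no
information, so the two cross multiplies are wasted work compared with one
signed 32x32->64 multiply.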

GCC 3.x sees that the source values are SImode and uses mulsidi3 to generate
32x32->64 multiplies, which are much more efficient.  It also combines the
accumulation with the multiply.
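
A minimal sketch of the kind of source that exercises this path is below. It
is a hypothetical example, not the test case from this report: two int
operands are widened to long long, multiplied, and accumulated. On ARM one
would expect mulsidi3 to map to a signed widening multiply (smull), with the
accumulating form (smlal) where the add is combined.

```c
/* Hypothetical test case, not taken from the report: int operands,
   long long product and accumulator. */
long long dot_widening(const int *a, const int *b, int n)
{
    long long acc = 0;
    for (int i = 0; i < n; i++) {
        /* Only one operand needs the cast; the other is converted
           implicitly, giving a 32x32->64 signed multiply. */
        acc += (long long)a[i] * b[i];
    }
    return acc;
}
```

Compiling something like this with -O3 -S on each compiler version and
counting the multiply instructions should show whether the widening pattern
is being used.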

Results, using -O3 for all compilations:

GCC 3.4 has an 84-byte stack frame, and a body of 372 instructions.
GCC 4.1 has a 1416-byte stack frame, and a body of 1668 instructions.
GCC 4.2 has a 1320-byte stack frame, and a body of 1565 instructions.


-- 
           Summary: 4.1, 4.2 (possibly 4.0?) not using mulsidi3
           Product: gcc
           Version: 4.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: eplondke at gmail dot com
  GCC host triplet: x86_64-suse-linux
GCC target triplet: arm-unknown-elf


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29274
