http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48037

           Summary: Missed optimization: unnecessary register moves
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: schnet...@gmail.com


Created attachment 23587
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23587
Source code

I want to perform certain operations on an SSE double-precision vector. I am
using the intrinsics offered by emmintrin.h to decompose the vector into two
scalars, perform the operation on both elements, and reconstruct the vector. As
an example operation, I calculate the square root using scalar instructions. I
am aware that there is a vector instruction for this; I am only using it as a
placeholder to simplify the code.
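
For illustration, the decompose/operate/reconstruct pattern described above
might look roughly like the sketch below. This is not the code from the
attachment: the function name sqrt_elementwise and the use of _mm_unpackhi_pd
to reach the high element are assumptions, and it shows only one possible way
of extracting the scalars.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (__m128d, _mm_cvtsd_f64, ...)
#include <cmath>        // std::sqrt

// Hypothetical helper; the name and the extraction method are assumptions
// and are not taken from the attached source.
inline __m128d sqrt_elementwise(__m128d x)
{
    // Extract the low element directly.
    double lo = _mm_cvtsd_f64(x);
    // Copy the high element into the low lane, then extract it.
    double hi = _mm_cvtsd_f64(_mm_unpackhi_pd(x, x));
    // Apply the scalar operation to each element and rebuild the vector.
    // Note that _mm_set_pd takes its arguments in (high, low) order.
    return _mm_set_pd(std::sqrt(hi), std::sqrt(lo));
}
```

Compiling such a function with -S, as in the command line given below, is one
way to inspect the generated assembly for register-register moves.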

I use gcc 4.6.0:

$ g++-mp-4.6 --version
g++-mp-4.6 (GCC) 4.6.0 20110305 (experimental)

on a MacBook Pro:

$ uname -a
Darwin erik-schnetters-macbook-pro.local 10.6.0 Darwin Kernel Version 10.6.0:
Wed Nov 10 18:13:17 PST 2010; root:xnu-1504.9.26~3/RELEASE_I386 i386

with a 2.66 GHz Intel Core i7 processor, and I compile with the options:

$ g++-mp-4.6 -S -O3 -march=native -ffast-math vecmath.cc

I tried four different ways of extracting the scalars from the vector, and I
find that gcc generates unnecessary register-register moves in almost every
case.
