http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59578

            Bug ID: 59578
           Summary: Overuse of v prefix for SSE instructions
           Product: gcc
           Version: 4.8.2
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: msharov at users dot sourceforge.net

typedef float v16sf __attribute__((vector_size(16)));
v16sf f (v16sf x)
{ return (__builtin_ia32_shufps (x, x, 0xff)); }

Compiling this on a Haswell 4770 with -march=native -O emits:

vshufps $255, %xmm0, %xmm0, %xmm0

This form is emitted even though all the registers are the same and

shufps $255, %xmm0, %xmm0

would have worked just as well without the extra byte for the VEX encoding
that the v prefix selects.
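
To make the size difference concrete, here is a small sketch that holds the two encodings as byte arrays and compares their lengths. The byte values are worked out by hand from the x86 encoding rules (legacy 0F C6 /r ib versus the 2-byte C5 VEX form) and are not taken from the compiler output above, so they are worth confirming with objdump -d. The array and function names are made up for illustration.

```c
#include <stddef.h>

/* Legacy SSE encoding of: shufps $255, %xmm0, %xmm0
   0F C6 = opcode, C0 = ModRM (xmm0, xmm0), FF = immediate */
static const unsigned char shufps_legacy[] = { 0x0F, 0xC6, 0xC0, 0xFF };

/* 2-byte VEX encoding of: vshufps $255, %xmm0, %xmm0, %xmm0
   C5 F8 = VEX prefix, C6 = opcode, C0 = ModRM, FF = immediate */
static const unsigned char shufps_vex[] = { 0xC5, 0xF8, 0xC6, 0xC0, 0xFF };

/* Length in bytes of the legacy encoding. */
size_t legacy_len(void) { return sizeof shufps_legacy; }

/* Length in bytes of the VEX encoding. */
size_t vex_len(void) { return sizeof shufps_vex; }

/* Extra bytes paid for the VEX form of this instruction. */
size_t vex_overhead(void) { return vex_len() - legacy_len(); }
```

Under these assumptions vex_overhead() comes out to one byte for this instruction, which is the extra byte referred to above.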
The v-prefixed form is emitted for other builtins as well. For example:

typedef long long v16so __attribute__((vector_size(16)));
v16so k (v16so x)
{ return (__builtin_ia32_aeskeygenassist128 (x, 1)); }

This emits vaeskeygenassist even though no memory accesses are present.
