------- Additional Comments From guardia at sympatico dot ca  2005-01-27 02:30 -------
Ok ok, SSE is not enabled by default on Athlon... 

So, is there some sort of "pragma" that could be used to disable SSE registers
(in effect forcing MMX-only code, as if only -mmmx were given) for just part of
the code?

The way I see it, the problem seems to be that gcc views __m64 and __m128 as the
same kind of variable, when they are not. __m64 should always live in mmx
registers, and __m128 should always live in xmm registers. Intel even created
separate types (__m128d for doubles, __m128i for integers) instead of trying to
guess whether integer or float instructions should be used for things like
MOVDQA.

We can easily see in moo2.i that gcc is trying to put an __m64 variable in xmm
registers. I can also prevent it from using an xmm register by using only
__v8qi variables (which are invalid there, i.e. too small for xmm registers):
__v8qi moo(__v8qi mmx1)
{
   mmx1 = __builtin_ia32_punpcklbw (mmx1, mmx1);
   return mmx1;
}
tadam! no movss or movlps...
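
For comparison, here is a rough sketch of the same unpack written with the
documented <mmintrin.h> intrinsics on __m64 instead of the raw builtin. This is
not the moo2.i attached to the bug; the function name unpack_low_bytes is made
up for illustration, and it assumes an x86 target with MMX enabled (-mmmx):

```c
#include <mmintrin.h>  /* MMX intrinsics: __m64, _mm_unpacklo_pi8, ... */

/* Hypothetical helper: interleave the low four bytes of x with themselves
   and return the low 32 bits of the result. */
int unpack_low_bytes(int x)
{
    __m64 mmx1 = _mm_cvtsi32_si64(x);     /* 32 bit integer -> __m64 */
    mmx1 = _mm_unpacklo_pi8(mmx1, mmx1);  /* documented as PUNPCKLBW */
    int low = _mm_cvtsi64_si32(mmx1);     /* low 32 bits back to an int */
    _mm_empty();                          /* EMMS: leave the MMX state clean */
    return low;
}
```

Compiling something like this with gcc -O2 -S and searching the assembly for
movss or movlps is one way to check where the compiler actually put mmx1.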

Shouldn't gcc avoid placing __m64 variables in xmm registers? If one wants to
use an xmm register, one should use __m128 or __m128d (or at least a cast from a
__m64 pointer). I think that makes sense even on the Pentium 4, because moving
data from mmx registers to xmm registers is not cheap either.
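
The "cast from a __m64 pointer" route could look roughly like the sketch below,
assuming SSE is enabled (-msse). The function name low_lane_from_pair is
hypothetical. _mm_loadl_pi is the intrinsic documented for MOVLPS, so the
request for an xmm register is explicit in the source rather than guessed by
the compiler:

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadl_pi, ... */

/* Hypothetical helper: load 64 bits (two floats) into the low half of an
   xmm register through a __m64 pointer, then return the lowest lane. */
float low_lane_from_pair(const float *pair)
{
    __m128 v = _mm_setzero_ps();               /* start from all zeros */
    v = _mm_loadl_pi(v, (const __m64 *)pair);  /* documented as MOVLPS */
    return _mm_cvtss_f32(v);                   /* extract lane 0 */
}
```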

If one wants to move one 32 bit integer to an mmx register, that should be the
job of a specialized intrinsic (_mm_cvtsi32_si64), which maps to a MOVD
instruction. And if one wants to load a single 32 bit float into an xmm
register, that should be the job of _mm_load_ss (and other such functions). At
the moment, these intrinsics (_mm_cvtsi32_si64, _mm_load_ss) do NOT generate a
mov instruction by themselves. From what I can understand of i386.c, they go
through a process of "vector initialization", which starts generating mov
instructions from the MMX, SSE or SSE2 sets without discrimination. In my mind
_mm_cvtsi32_si64 should generate a MOVD, and _mm_load_ss a MOVSS, period. Just
like __builtin_ia32_punpcklbw generates a PUNPCKLBW.
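
A small sketch of the one-to-one mapping I have in mind, assuming an x86 target
with MMX and SSE enabled (-mmmx -msse). The function names are made up for
illustration, and the instruction named in each comment is what the intrinsic
is documented to correspond to, not necessarily what gcc emits today:

```c
#include <mmintrin.h>   /* MMX: _mm_cvtsi32_si64, _mm_cvtsi64_si32 */
#include <xmmintrin.h>  /* SSE: _mm_load_ss, _mm_cvtss_f32 */

/* Hypothetical helper: int -> mmx register -> int. */
int roundtrip_mmx(int x)
{
    __m64 m = _mm_cvtsi32_si64(x);  /* expected: MOVD r32 -> mm */
    int r = _mm_cvtsi64_si32(m);    /* expected: MOVD mm -> r32 */
    _mm_empty();                    /* EMMS before any x87 code runs */
    return r;
}

/* Hypothetical helper: float in memory -> xmm register -> float. */
float roundtrip_sse(const float *p)
{
    __m128 v = _mm_load_ss(p);      /* expected: MOVSS m32 -> xmm */
    return _mm_cvtss_f32(v);        /* read lane 0 back */
}
```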

Does it make sense? Is this what you mean by a complete rewrite or were you
thinking of something else?

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19530
