[Bug target/19530] MMX load intrinsic produces SSE superfluous instructions (movlps)

guardia at sympatico dot ca Fri, 28 Jan 2005 20:47:34 -0800

------- Additional Comments From guardia at sympatico dot ca  2005-01-29 04:47 -------
Hmm, there seems to be a problem in the optimization stages. I cooked up
another snippet:


#include <mmintrin.h>   /* for __m64 */

void moo(__m64 i, unsigned int *r)
{
   unsigned int tmp = __builtin_ia32_vec_ext_v2si (i, 0);
   *r = tmp;
}

With -O0 -mmmx we get:
        movd    %mm0, -4(%ebp)
        movl    8(%ebp), %edx
        movl    -4(%ebp), %eax
        movl    %eax, (%edx)
Which with -O3 gets reduced to:
        movl    8(%ebp), %eax
        movd    %mm0, (%eax)

Now, clearly it understands that "movd" does the same job as "movl", except
that they operate on different registers (MMX versus general-purpose) on an
MMX-only machine. It should do the same with "movlps" and "movq", I think? If
the optimization stages can work this out, maybe we wouldn't need to rewrite
the MMX/SSE1 support...
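
For comparison, here is a minimal sketch of the same 32-bit extract next to
its 64-bit analogue, written with GCC's generic vector extension rather than
the ia32 builtin. The v2si typedef and both function names are made up for
illustration, and this says nothing about which instructions a given GCC
version actually picks; it only shows the two shapes the optimizer would
ideally treat alike.

```c
#include <assert.h>

/* Hypothetical stand-in for __m64: two 32-bit ints packed in 8 bytes. */
typedef int v2si __attribute__ ((vector_size (8)));

/* Same shape as moo() above: pull out element 0 and store it
   (the 32-bit case, where movd/movl are interchangeable). */
void moo_generic (v2si i, unsigned int *r)
{
   assert (r != 0);
   unsigned int tmp = (unsigned int) i[0];
   *r = tmp;
}

/* 64-bit analogue: a plain copy of the whole vector. Ideally this would
   be a single 64-bit move (movq on an MMX-only target). */
void moo64_generic (v2si i, v2si *r)
{
   assert (r != 0);
   *r = i;
}
```

Comparing the -O0 and -O3 assembly of these two functions (gcc -S) should
show whether the 64-bit store gets the same treatment as the 32-bit one.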

(BTW, a correction: when I said 200+ instructions to schedule, I meant per
function. I have a dozen such functions with 200+ instructions each, and it
ain't going to get any smaller.)

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19530
