------- Additional Comments From guardia at sympatico dot ca 2005-01-27 06:19 -------
OK, so from what I gather, the backend is being designed around the autovectorizer, which will probably only work right with SSE2 (on x86, that is), since mucking with emms would probably bring too much trouble. Second, with SSE2 we can do any MMX operation on XMM registers. So the SSE2 code does not need to change, optimization-wise, for intrinsics.
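
To illustrate the SSE2 point, here is a minimal sketch of widening four unsigned bytes to floats entirely in XMM registers, so no MMX state and no emms is involved. The helper name `widen4_u8_sse2` is made up for illustration; the intrinsics are the standard ones from `<emmintrin.h>`.

```c
#include <assert.h>
#include <string.h>
#include <emmintrin.h>  /* SSE2: __m128i integer ops on XMM registers */

/* Hypothetical helper: out[i] = (float)src[i] for i = 0..3, SSE2 only. */
static void widen4_u8_sse2(const unsigned char *src, float *out)
{
    int packed;
    assert(src && out);
    memcpy(&packed, src, sizeof packed);               /* grab 4 bytes safely */

    __m128i zero   = _mm_setzero_si128();
    __m128i bytes  = _mm_cvtsi32_si128(packed);        /* movd into an XMM register */
    __m128i words  = _mm_unpacklo_epi8(bytes, zero);   /* zero-extend 8 -> 16 bit */
    __m128i dwords = _mm_unpacklo_epi16(words, zero);  /* zero-extend 16 -> 32 bit */

    _mm_storeu_ps(out, _mm_cvtepi32_ps(dwords));       /* 4 x int32 -> 4 x float */
}
```

Since everything stays in XMM registers, the compiler has no reason to move data between register files here.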

As for a pragma or something, could we, for example, disable the automatic use of instructions such as movss, movhps, movlps, and the like on SSE1 (if I may call it that)? That would most certainly keep gcc from trying to put __m64 values in xmm registers, however eager it might be to mov them there... (would it?) We would then supply a few built-ins for using those instructions manually. I guess such a solution would be nice, although I realize it might not be too kosher ;)

I use MMX to load char * arrays into shorts and convert them to float in SSE registers, to process them together with float * arrays, so I can't separate the MMX code from the SSE code... Of course, the way things look at the moment, I might end up writing everything in assembler by hand, but scheduling 200+ instructions by hand (yup yup, I have some pretty funky code here) is no fun at all, especially if (ugh, when) the algorithm changes. Also, the same code in C with intrinsics can target x86-64 :) yah, that's cool

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19530
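
For reference, the kind of mixed MMX/SSE1 code described in the comment above might look roughly like the sketch below. It is not the reporter's actual code: the helper name `scale4_u8` and the multiply step are assumptions made up for illustration. It shows why the __m64 and __m128 parts cannot be separated, since `_mm_cvtpi16_ps` takes an MMX value and produces an SSE one.

```c
#include <assert.h>
#include <string.h>
#include <mmintrin.h>   /* MMX: __m64, _mm_unpacklo_pi8, _mm_empty */
#include <xmmintrin.h>  /* SSE1: __m128, _mm_cvtpi16_ps, _mm_mul_ps */

/* Hypothetical helper: out[i] = (float)src[i] * gain[i] for i = 0..3. */
static void scale4_u8(const unsigned char *src, const float *gain, float *out)
{
    int packed;
    assert(src && gain && out);
    memcpy(&packed, src, sizeof packed);              /* grab 4 bytes safely */

    __m64 bytes = _mm_cvtsi32_si64(packed);           /* movd into an MMX register */
    __m64 words = _mm_unpacklo_pi8(bytes, _mm_setzero_si64()); /* 8 -> 16 bit, zero-extended */

    __m128 f = _mm_cvtpi16_ps(words);                 /* 4 shorts (MMX) -> 4 floats (SSE) */
    __m128 g = _mm_loadu_ps(gain);
    _mm_storeu_ps(out, _mm_mul_ps(f, g));

    _mm_empty();                                      /* emms: leave MMX state before any x87 use */
}
```

Written this way, the same source should also build for x86-64, which is the portability advantage over hand-written assembler mentioned above.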