Summary: gcc 4.1.1 generates incorrect code when compiling vectorized code on a PS3 running Yellow Dog Linux.
Other platforms: the code compiles correctly on a PPC Macintosh, and compiles correctly on the PS3 if Sony's ppu-gcc is used.

Discussion: It seems to be an ABI issue: the caller assumes vr0 is stable across a function call, but the callee alters vr0 and doesn't restore it. If the function is inlined then the problem is not seen.

I have reduced the problem to a small test case. Compiler options and C source are shown below. The variable tkr should be unaffected by the function call, but its value is being corrupted.

$ gcc -maltivec -O3 main.c -o my_program
$ ./my_program
1.000000 1.000000 1.000000 1.000000
0.000000 0.000000 0.000000 0.000000

$ ppu-gcc -maltivec -O3 main.c -o my_program
$ ./my_program
1.000000 1.000000 1.000000 1.000000
1.000000 1.000000 1.000000 1.000000

$ gcc -v
Using built-in specs.
Target: ppu
Configured with: ../src/configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-threads --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-languages=c,c++,fortran --disable-nls --enable-version-specific-runtime-libs --enable-install-libbfd --with-long-double-128 --program-prefix=ppu- --target=ppu --enable-targets=spu
Thread model: posix
gcc version 4.1.1

$ cat main.c
#include <altivec.h>
#include <stdio.h>

typedef vector float vFloat;

static void PrintFloat(float *temp)
{
    printf("%f %f %f %f\n", temp[0], temp[1], temp[2], temp[3]);
}

static void Print(vFloat v)
{
    PrintFloat((float *)&v);
}

void subfunc(vFloat *outABValues) __attribute__((noinline));

/* this function wipes out v0 */
void subfunc(vFloat *outABValues)
{
    vFloat zero = (vFloat) { 0.0, 0.0, 0.0, 0.0 };
    outABValues[0] = zero;
}

int main (int argc, char * argv[])
{
    vFloat temp;
    vFloat tkr = (vFloat) { 1.0, 1.0, 1.0, 1.0 };

    Print(tkr);
    /* the compiler seems to assume v0 is unchanged across this function call */
    subfunc(&temp);
    Print(tkr);
}

-- 
           Summary: incorrect vector codegen on PS3 at O3 level optimization
           Product: gcc
           Version: 4.1.1
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: j dot m dot taylor at dur dot ac dot uk
  GCC host triplet: ppc64-yellowdog-linux
GCC target triplet: ppc64-yellowdog-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33671
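
Additional note: a self-checking variant of the test case above may be easier to drop into a regression test than comparing printed output by eye. The sketch below is only an illustration and makes some assumptions that are not part of the original report: it uses GCC's generic vector_size extension in place of <altivec.h> so that it also builds on non-PowerPC hosts, and the names v4f and vector_survives_call are hypothetical. It has not been confirmed to trigger the same miscompilation on the PS3; an optimizing compiler may fold the comparison away entirely.

```c
#include <assert.h>
#include <string.h>

/* Four packed floats via GCC's generic vector extension.
   Assumption: stands in for `vector float` from <altivec.h>. */
typedef float v4f __attribute__((vector_size(16)));

void subfunc(v4f *out) __attribute__((noinline));

/* Same role as subfunc in the test case: stores a zero vector,
   which needs a vector register inside the callee. */
void subfunc(v4f *out)
{
    v4f zero = { 0.0f, 0.0f, 0.0f, 0.0f };
    out[0] = zero;
}

/* Returns 1 if tkr holds { 1, 1, 1, 1 } both before and after the
   call to subfunc, and 0 otherwise. */
int vector_survives_call(void)
{
    const float expected[4] = { 1.0f, 1.0f, 1.0f, 1.0f };
    v4f temp;
    v4f tkr = { 1.0f, 1.0f, 1.0f, 1.0f };
    float before[4], after[4];

    /* Sanity check: the vector type is exactly four floats wide. */
    assert(sizeof(v4f) == sizeof(before));

    memcpy(before, &tkr, sizeof before);  /* snapshot before the call */
    subfunc(&temp);                       /* callee may clobber vector registers */
    memcpy(after, &tkr, sizeof after);    /* snapshot after the call */

    return memcmp(before, expected, sizeof expected) == 0
        && memcmp(after, expected, sizeof expected) == 0;
}
```

A caller would invoke vector_survives_call() from main and exit with a nonzero status when it returns 0, so that the two compilers can be compared by exit code alone.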