------- Comment #3 from mikpe at it dot uu dot se 2009-07-07 11:35 -------

Confirmed: it works with gcc-4.3-20090705 and fails with gcc-4.4-20090630. Compiling with -S and comparing the .s files, it looks like 4.4 completely mis-schedules the code for put_uint32:

put_uint32:
        .register       %g2, #scratch
        .register       %g3, #scratch
        ldub    [%sp+2175], %g1
        ldub    [%sp+2176], %g3
        ldub    [%sp+2177], %g2
        ldub    [%sp+2178], %g4
        st      %o0, [%sp+2175]
        stb     %g4, [%o1+3]
        stb     %g1, [%o1]
        stb     %g3, [%o1+1]
        jmp     %o7+8
         stb    %g2, [%o1+2]

Notice how the store of %o0 to the four bytes at %sp+2175 comes after the corresponding byte loads, so %g1 to %g4 are loaded with garbage, likely zeroes.

In contrast, gcc-4.3 generates the store before the loads:

put_uint32:
        .register       %g2, #scratch
        .register       %g3, #scratch
        st      %o0, [%sp+2175]
        ldub    [%sp+2176], %g3
        ldub    [%sp+2177], %g4
        ldub    [%sp+2178], %g2
        ldub    [%sp+2175], %g1
        stb     %g2, [%o1+3]
        stb     %g1, [%o1]
        stb     %g3, [%o1+1]
        jmp     %o7+8
         stb    %g4, [%o1+2]

-- 

mikpe at it dot uu dot se changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mikpe at it dot uu dot se

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40668
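
For readers who want to see the dependency in source form, here is a minimal C sketch of the pattern the listings above suggest: a 32-bit value is stored to a stack temporary and then read back one byte at a time into the caller's buffer. This is a hypothetical reconstruction inferred from the assembly only (value in %o0, destination pointer in %o1), not the reporter's actual test case. The signature, the tmp array, and the check_put_uint32 helper are assumptions made for illustration, and the sketch is not claimed to reproduce the miscompilation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reconstruction, NOT the source from the bug report:
   copy the four bytes of v to buf in memory order. */
void put_uint32(uint32_t v, unsigned char *buf)
{
    unsigned char tmp[4];

    /* Corresponds to the 32-bit store (st %o0, [%sp+2175]). */
    memcpy(tmp, &v, sizeof tmp);

    /* Correspond to the ldub/stb pairs. Each byte load depends on the
       store above, so the compiler must not move it ahead of the store. */
    buf[0] = tmp[0];
    buf[1] = tmp[1];
    buf[2] = tmp[2];
    buf[3] = tmp[3];
}

/* Illustrative helper: aborts if buf does not end up holding the
   in-memory bytes of v. The comparison is endianness-neutral. */
void check_put_uint32(uint32_t v)
{
    unsigned char buf[4] = { 0, 0, 0, 0 };

    put_uint32(v, buf);
    assert(memcmp(buf, &v, sizeof buf) == 0);
}
```

Calling check_put_uint32 with a value whose four bytes all differ, such as 0x12345678, passes when the store is ordered before the loads. With the ordering shown in the 4.4 listing, buf would receive whatever was on the stack beforehand.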