On Mon, Sep 16, 2013 at 10:24:10PM +0200, Søren Sandmann wrote:
> Lennart Sorensen <[email protected]> writes:
>
> > And to make it even more annoying to track down:
> >
> > It doesn't fail on a power6, only on a power7.  power7 machines are
> > known to have found numerous powerpc memory barrier bugs in code
> > (including compiler and library code), where earlier generations let
> > you get away with stuff that the architecture didn't actually allow,
> > but which usually worked.
> >
> > So it seems to be a bug triggered by: vmx code used with openmp with
> > libc 2.13 on power7.  Change any one of those 4 things, and the bug
> > vanishes.
>
> As far as I can see, this is all still consistent with the bug being
> that the VMX combiners are writing outside the malloced memory:
>
> - Disable OpenMP, and it doesn't matter because the same bytes are read
>   and written. With OpenMP, two threads can mess each other's memory up.
>
> - Use different libc: malloc() may allocate different amounts of memory
>   so that the combiners don't write outside of the allocated area.
>
> - Disable VMX: There is no writing outside the malloc()ed area
>
> - Power 6: Could just be timing differences, but may also have to do
>   with different atomicity of the incorrect memory accesses.

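To make the "writing outside the malloced memory" hypothesis concrete,
here is a minimal portable C sketch of which bytes an aligned-only
16-byte store touches.  It models the documented behaviour of vec_st
(stvx), which ignores the low 4 bits of the address.  The names
store_range, aligned_store_range, bytes_before and bytes_after are
made up for illustration, and nothing here is taken from pixman's
actual VMX combiners; whether they really store this way is not
established in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte range [start, end) touched by one 16-byte vector store. */
typedef struct {
    uintptr_t start;
    uintptr_t end;
} store_range;

/* Model of an aligned-only store such as AltiVec vec_st (stvx):
 * the low 4 bits of the address are ignored, so the store lands on
 * the enclosing 16-byte-aligned block.  (Hypothetical helper.) */
store_range aligned_store_range(uintptr_t addr)
{
    store_range r;
    r.start = addr & ~(uintptr_t)15;
    r.end = r.start + 16;
    return r;
}

/* Bytes the store touches in front of a buffer starting at buf_start. */
size_t bytes_before(uintptr_t buf_start, store_range r)
{
    return r.start < buf_start ? (size_t)(buf_start - r.start) : 0;
}

/* Bytes the store touches past the end of a buffer of len bytes. */
size_t bytes_after(uintptr_t buf_start, size_t len, store_range r)
{
    uintptr_t buf_end = buf_start + len;
    return r.end > buf_end ? (size_t)(r.end - buf_end) : 0;
}
```

With a hypothetical 4-byte buffer at address 0x1008, a store aimed at
0x1008 covers 0x1000..0x100f, so bytes_before() reports 8 and
bytes_after() reports 4.  In this model the overrun can fall in front
of the buffer as well as past its end.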
The thing is that if I set the openmp environment to only use one
thread, the problem does NOT go away.  So either I did it wrong, or it's
a bit more tricky than that.

But perhaps it really is just that the unaligned vector operations are
smashing each other.  I did see one altivec document showing the use of
vec_ste rather than vec_st in a complex manner to do unaligned stores
with a comment about thread safety.  It may be that vec_st is NOT thread
safe for unaligned stores.

I did try increasing the malloc size by some extra margin amount, and
that did not help at all.  If it was just each thread going into the
range of the bytes used by another thread, I would not expect that to
corrupt libc's data structures for the malloc, which certainly seems to
be happening.  libc complains about double free or corrupt linked lists
on some runs (other times it just segfaults).

-- 
Len Sorensen
_______________________________________________
Pixman mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pixman
