On Thu, Aug 29, 2013 at 10:56:49PM +0300, Siarhei Siamashka wrote:
> On Thu, 29 Aug 2013 15:18:57 -0400
> "Lennart Sorensen" <[email protected]> wrote:
>
> > I get crashes in the scaling and affinity tests on power7. The crashes
> > are always in the vmx code, so building with vmx support disabled makes
> > the problem go away.
> >
> > The error is not consistent, so my current guess is that multiple threads
> > are running and depending on timing one thread manages to sometimes
> > corrupt another and cause it to fail.
> >
> > As far as I can tell, it doesn't fail on power5 or power6 machines,
> > but given the interesting memory model of the powerpc and requirement
> > for explicit syncs and barriers to ensure things have really made it to
> > memory and other CPUs, the power7 has managed to show up bugs in glibc
> > and gcc already where power5 and power6 and other powerpc systems never
> > failed before.
> >
> > Any suggestions on how to debug this or where to look? Any traces or
> > logs that would be helpful?
> >
> > I am currently using version 0.26.0-4 debian package on Debian 7 (wheezy).
> >
> > Interestingly, if I change the version of libc to 2.17 instead of 2.13
> > that wheezy is using, then the problem also disappears, but again, this
> > might just be a timing change causing this, or perhaps there is something
> > relevant changed in the newer libc, although I haven't spotted anything
> > suspicious looking when doing a diff so far.
>
> VMX/Altivec is a bit tricky because all the vector load/store
> operations must be aligned. For the unaligned reads/writes, pixman
> seems to use the LOAD_VECTORS and STORE_VECTOR macros:

My understanding was that vec_lda must be aligned but vec_ld does not
have to be aligned.

> http://cgit.freedesktop.org/pixman/tree/pixman/pixman-vmx.c?id=pixman-0.30.2#n151
>
> The STORE_VECTOR macro is particularly interesting because it performs
> two stores. We can have a look at the typical combiner function, such
> as "vmx_combine_over_u_no_mask":
>
> http://cgit.freedesktop.org/pixman/tree/pixman/pixman-vmx.c?id=pixman-0.30.2#n187
>
> If the destination buffer is unaligned and the width is a
> perfect multiple of 4 pixels, I believe that we may have some writes
> crossing the boundaries of the destination buffer.
>
> I suspect that it just reads the data outside the destination buffer,
> modifies the parts which really belong to the destination image and
> writes everything back (so that the chunk of memory outside the
> destination buffer is restored by the STORE_VECTOR macro to the value
> that it had at the time of LOAD_VECTORS invocation). Without heavy
> multithreading this kinda works just fine. But with many concurrent
> threads, the chunk of data beyond the destination buffer may be
> actively used by some other thread, creating a race condition.
>
> That was just a guess based on a quick look at the pixman vmx code.
> You can possibly try to experiment with overriding malloc by something
> that allocates memory blocks with 16 bytes granularity (for both the
> starting address and size). This would make sure that each 16 bytes
> aligned memory chunk is never shared by multiple threads. If the crashes
> disappear, then that's probably it. And the libc 2.17 might perhaps be
> enforcing something like this.

Well, running under valgrind shows that sometimes the LOAD_VECTORS and
STORE_VECTOR do read and write outside the malloc area. I tried just
making the malloc get 16 bytes extra, but that did not solve the issue.
It seems it has to be something more complicated than that.
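
A minimal sketch of an allocator with 16-byte granularity for both the
starting address and the size, as suggested in the quoted text, might
look like the following. The names round_up_16 and alloc_16 are
hypothetical (they are not part of pixman or glibc), and the sketch
assumes a POSIX system that provides posix_memalign.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: round a byte count up to the next multiple of 16. */
static size_t round_up_16(size_t n)
{
    return (n + 15u) & ~(size_t)15u;
}

/*
 * Hypothetical allocator: returns a block whose start address is 16-byte
 * aligned and whose usable size is a multiple of 16, so every aligned
 * 16-byte chunk it touches belongs to this allocation alone.
 * Returns NULL on failure or if the padded size would overflow.
 */
static void *alloc_16(size_t size)
{
    void *p = NULL;
    size_t padded = round_up_16(size ? size : 1);

    if (padded < size)                 /* size + 15 wrapped around */
        return NULL;
    if (posix_memalign(&p, 16, padded) != 0)
        return NULL;

    assert(((uintptr_t)p & 15u) == 0); /* start is 16-byte aligned */
    return p;
}
```

Blocks obtained this way are released with plain free(). Note that this
is not the same as just adding 16 bytes to each request: it pins both
ends of every block to a 16-byte boundary instead of only making the
block longer.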

I am not sure if the vec_ld is implemented in the compiler or libc, and I
can't remember if I still used the same gcc version when testing with
libc 2.17. I am using gcc 4.6 from Debian wheezy at the moment. I am
pretty sure I tried with 4.7 as well with no change in behaviour.

-- 
Len Sorensen
_______________________________________________
Pixman mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pixman
