On Wed, Jun 27, 2012 at 12:20:05AM +0200, intrigeri wrote:
> shawn wrote (17 Jun 2012 03:13:04 GMT) :
> > on 3.4 (custom) + sheevaplug:
> >
> > > $ gcc -O0 625828.c
> > > $ ./a.out
> > > ok 1 - got 2000, expected 2000
>
> I asked for more thorough testing, and shawn reported
> "I ran it 500 times, it succeeded only 191 of those times."
This seems to be another instance of VIVT L1 caching weirdness. The
problem was discussed years ago on the arm kernel mailing list (see
[1]) and a patch was proposed at that time. Apparently, it never made
it into the kernel.

I think what happens in the test program is the following:

1. When mapping 1 is initialized, it is cacheable (in L1 and L2).

2. When mapping 2 is initialized (by "c = *shmaddr2;"), both mappings
   become uncacheable (neither in L1 nor L2) since they belong to the
   same process. If they were cacheable, the two mappings could become
   incoherent within the same process. (Incoherencies between
   processes can be avoided by flushing the L1 cache at context
   switch. However, this is not possible within a process, and thus
   the kernel sets such shared mappings to uncacheable.)

3. The forked process only uses mapping 1, i.e. the mapping remains
   cacheable in this process.

4. Task switches between the two processes flush the L1 caches, but
   not the L2 cache. Since the original process does not use the L2
   cache at all and the second process does, the two processes have
   incoherent views of the memory.

If this theory is true, the following should make the problem
disappear (tested on kirkwood):

1. Don't initialize the second mapping by "c = *shmaddr2;" -> The
   mapping remains cacheable in both processes. Problem goes away, as
   already stated in the original test program.

2. Initialize the second mapping in both processes. (Don't detach the
   second range, and move "c = *shmaddr2;" to happen after the fork.)
   -> The mappings are non-cacheable in both processes. Problem goes
   away.

3. Disable the L2 cache in both U-Boot and the kernel (this requires
   compiling the kernel with CONFIG_CACHE_FEROCEON_L2=n) -> L1
   flushes are now sufficient to ensure coherency. Problem goes away.

All three results seem to support my theory.
The patch proposed in the thread mentioned above would render the
mapping(s) of the second process uncacheable as soon as there are
other uncacheable mappings (those of the original process). It thus
leads to the situation described in 2 above and should fix the
problem. Unfortunately, it no longer applies to my current kernel
(3.5-rc4), so I could not test whether it really helps.

- Simon

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2009-December/005471.html