Hi all,

The __sync builtins are described in the documentation as being full memory barriers, with the possible exception of __sync_lock_test_and_set. However, GCC does not enforce that they are also full _optimization_ barriers. The RTL produced for the builtins does not in general include a memory optimization barrier such as a set of (mem/v:BLK (scratch:P)), so nothing stops the optimizers from moving ordinary memory accesses across them.

This can cause problems with lock-free algorithms; see for example:

http://libdispatch.macosforge.org/trac/ticket/35
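
To make the concern concrete, here is a minimal sketch of the kind of code that could be affected. It is not taken from the ticket above; the struct node type and the push function are hypothetical names chosen for illustration. The plain store to n->value must be visible before the node is published by the compare-and-swap. If the builtin is not treated as an optimization barrier, the compiler would in principle be free to sink that store below it.

```c
#include <stddef.h>

/* Hypothetical lock-free stack node, for illustration only. */
struct node {
    int value;
    struct node *next;
};

static struct node *head;

/* Push n onto the stack. The store to n->value is an ordinary
 * (non-atomic, non-volatile) store that must not be moved after
 * the compare-and-swap that publishes n. */
static void push(struct node *n, int v)
{
    n->value = v;               /* plain store, must stay above the CAS */
    do {
        n->next = head;         /* snapshot the current head */
    } while (!__sync_bool_compare_and_swap(&head, n->next, n));
}
```

Whether a given GCC version and target actually reorders the store here is not something this sketch demonstrates; it only shows where the ordering requirement sits.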

This can be solved either in generic code, by wrapping each sync builtin (before and after) with an asm("":::"memory"), or in the individual machine descriptions, by adding a memory barrier in parallel to the locked instructions or with the ll/sc instructions.
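
As a rough user-level illustration of the first option, the wrapping could look like the sketch below. The names compiler_barrier and cas_with_barriers are hypothetical and not part of GCC; the generic-code fix proposed above would do the equivalent inside the compiler rather than in user code.

```c
/* Compiler-only barrier: emits no instructions, but the "memory"
 * clobber tells GCC that memory may have been read or written, so
 * it cannot cache values in registers or move memory accesses
 * across this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical wrapper: a compare-and-swap with an explicit
 * optimization barrier before and after the builtin. */
static inline int cas_with_barriers(int *ptr, int oldval, int newval)
{
    int swapped;

    compiler_barrier();
    swapped = __sync_bool_compare_and_swap(ptr, oldval, newval);
    compiler_barrier();
    return swapped;
}
```

The asm statement adds no hardware fence; ordering at the hardware level still comes from the builtin itself. The wrapper only constrains what the optimizers may do around it.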

Is the above analysis correct? Or should users insert explicit compiler barriers themselves?

Paolo
