https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697

--- Comment #38 from torvald at gcc dot gnu.org ---
(In reply to Andrew Macleod from comment #34)
> > However, I guess some people relying on data races in their programs could
> > (mis?)understand the __sync_lock_release semantics to mean that it is a
> > means to get the equivalent of a C11 release *fence* -- which it is not
> > because the fence would apply to the (erroneously non-atomic) store after
> > the barrier, which could one lead to believe that if one observes the store
> > after the barrier, the fence must also be in effect.  Thoughts?
> 
> before we get too carried away, maybe we should return to what we *think*
> __sync are suppose to do. It represents a specific definition by intel..
> From the original documentation for __sync "back in the day", and all legacy
> uses of sync should expect this behaviour:

The problem I see with that is that just looking at the psABI doesn't give you
enough information to reason about what you are and aren't allowed to do.
Strictly speaking, the psABI gives you no guarantees about normal memory
accesses unless the __sync builtins make them data-race-free.  Nonetheless,
legacy code does use racy normal accesses in combination with the __sync
builtins.
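
To make the pattern concrete, here is a minimal sketch of the kind of legacy
idiom I mean.  The names (payload, ready, legacy_publish, legacy_try_consume)
are made up for illustration and are not taken from any real code base.  The
flag is a plain store after the barrier, so under C11 rules this is a data
race, and the psABI wording alone does not tell you what a reader that observes
the flag is entitled to assume.

```c
static int payload;         /* plain, non-atomic data */
static volatile int ready;  /* plain flag, as legacy code often declares it */

/* Writer: plain store, full barrier, then a plain store to the flag. */
void legacy_publish(int value)
{
    payload = value;
    __sync_synchronize();   /* documented by GCC as a full memory barrier */
    ready = 1;              /* non-atomic store after the barrier */
}

/* Reader: returns 1 and fills *out if the flag was observed, 0 otherwise. */
int legacy_try_consume(int *out)
{
    if (!ready)
        return 0;
    __sync_synchronize();   /* legacy code typically pairs the barrier here */
    *out = payload;
    return 1;
}
```

Run single-threaded this behaves as expected; the open question is only what
is guaranteed when writer and reader run concurrently.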

Also, if you look at the IA-64 __sync_lock_release vs. GCC docs'
__sync_lock_release, the latter is like x86/TSO.  Do you have any info on which
other semantics __sync was supposed to adhere to?
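
For reference, a sketch of the two ways an unlock could be spelled.  The
helper names are hypothetical.  The second helper is just one possible reading
of the GCC docs' "release barrier" wording, expressed with the __atomic
builtins; I am not claiming the two are equivalent on every target, and whether
they are is exactly what is in question here.

```c
/* Returns 1 if the lock was acquired, 0 if it was already held.
   __sync_lock_test_and_set is documented as an acquire barrier and
   returns the previous contents of *lock. */
int try_lock(int *lock)
{
    return __sync_lock_test_and_set(lock, 1) == 0;
}

/* Legacy spelling: writes 0 to *lock, documented as a release barrier. */
void sync_style_unlock(int *lock)
{
    __sync_lock_release(lock);
}

/* One possible C11-style reading of the same operation (an assumption,
   not a statement about what GCC emits for __sync_lock_release). */
void atomic_style_unlock(int *lock)
{
    __atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}
```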

One potential way to solve it would be to just require code that uses __sync to
more or less implement an IA-64 or x86 memory model, modulo allowing
compiler-reordering and optimization between adjacent non-__sync memory
accesses.  This could be inefficient on ARM (see James' examples) and perhaps
Power too (or not -- see Jakub's comments).
