On 09/11/2011 02:22 PM, Paolo Bonzini wrote:
> On 09/11/2011 04:12 PM, Andrew MacLeod wrote:
> > tail->value = othervalue // global variable write
> > atomic_exchange (&var, tail) // acquire operation
> > although the optimizer moving the store of tail->value to AFTER the
> > exchange seems very wrong on the surface, it's really emulating what
> > another thread could possibly see. When another thread synchronizes
> > and reads 'var', an acquire operation doesn't cause outstanding stores
> > to be fully flushed, so the other process has no guarantee that the
> > store to tail->value has happened yet even though it gets the expected
> > value of 'var'.
> You're right that using lock_test_and_set as an exchange is very wrong
> because of the compiler barrier semantics, but I think this is
> entirely a red herring in this case. The same problem could happen
> with a fetch_and_add or even a lock_release operation.
My point is that even once we get the right barriers in place, this
testcase could actually still fail because the exchange is defined as
acquire, AND the optimization is valid... unless we decide to
retroactively make all the original sync routines seq_cst. I'm not
saying we don't have other issues with rtl optimizations right now.
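To make the distinction concrete, here is a minimal sketch using C11
<stdatomic.h>. The struct, the function names and the use of C11 atomics
are assumptions for illustration only; the actual testcase uses the
__sync builtins and is not reproduced here.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type, standing in for whatever 'tail' points to.  */
struct node { int value; };

/* Shared pointer published by the exchange; starts out NULL.  */
static _Atomic(struct node *) var;

/* Acquire-only exchange.  Acquire keeps LATER accesses from moving
   above the exchange, but nothing keeps the EARLIER plain store from
   sinking below it, so another thread that sees 'tail' in 'var' has no
   guarantee about tail->value.  */
struct node *
publish_acquire (struct node *tail, int othervalue)
{
  tail->value = othervalue;
  return atomic_exchange_explicit (&var, tail, memory_order_acquire);
}

/* seq_cst exchange.  This includes release semantics, so the store to
   tail->value is ordered before the exchange and is visible to any
   thread that acquire-loads 'var' and observes 'tail'.  */
struct node *
publish_seq_cst (struct node *tail, int othervalue)
{
  tail->value = othervalue;
  return atomic_exchange_explicit (&var, tail, memory_order_seq_cst);
}
```

Run on a single thread the two functions behave identically; the
difference only shows up in what a second thread is allowed to observe,
which is why the reordering in the first form is a valid optimization.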
Andrew