On 06/03/2016 05:32 AM, Jakub Jelinek wrote:
>> A change from wide CAS to locking would be an ABI change I suppose, but
>> it could also be considered a necessary bugfix if we don't want to write
>> to read-only memory. Does this affect anything but i686?
> Also x86_64 (for 128-bit atomics), clearly also either arm or aarch64
> (judging from who initiated this thread), I bet there are many others.
Both arm and aarch64 are not affected. Both use the double-word load-locked
instruction for the atomic load; they simply don't pair it with a
store-conditional in that case.
There's nothing to be done about <= i686, but for recent-ish cpus we should be
able to use either sse or fpu loads. While it's not promised in the spec, I
would suggest that any 64-bit capable cpu will in practice perform all aligned
64-bit loads atomically.
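
To make the sse variant concrete, here is a rough sketch of a 64-bit load done with a single movq through an xmm register instead of cmpxchg8b. The helper name load64_relaxed is made up for illustration, and the atomicity of the aligned movq is the in-practice assumption described above, not something the spec promises. A real fix would live in the backend patterns rather than in inline asm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not an existing GCC or libatomic entry point.
   Assumes an aligned 64-bit SSE load is performed atomically in
   practice on SSE2-capable CPUs (not architecturally guaranteed). */
static inline uint64_t load64_relaxed(const uint64_t *p)
{
    /* The assumption only covers naturally aligned addresses. */
    assert(((uintptr_t)p & 7) == 0);
#if (defined(__i386__) || defined(__x86_64__)) && defined(__SSE2__)
    uint64_t out;
    double tmp;   /* only used to claim an xmm register ("x" constraint) */
    __asm__ volatile ("movq %2, %1\n\t"   /* one 64-bit load: mem -> xmm   */
                      "movq %1, %0"       /* spill xmm to a private local  */
                      : "=m" (out), "=&x" (tmp)
                      : "m" (*p));
    return out;
#else
    /* Elsewhere, defer to whatever the compiler does for a 64-bit load. */
    return __atomic_load_n(p, __ATOMIC_RELAXED);
#endif
}
```

Note that the pointer can be const here: nothing is ever stored to *p, so the load is safe on read-only mappings. The asm has no "memory" clobber, so this is only a relaxed load as far as the compiler is concerned.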
There's nothing that can be done for x86_64 128-bit load. There are probably
some implementations for which an aligned 128-bit sse load is atomic, but we
also know that there are some implementations for which it is not (those that
split sse instructions into 2 64-bit micro-ops, e.g. some AMD and all Atoms).
It's an oversight that it would be nice for Intel+AMD to fix for us...
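
For reference, the shape of the problem with a CAS-based load, sketched at 64 bits so it builds without -mcx16 or libatomic. The name load64_via_cas is hypothetical; the 128-bit case on x86_64 has the same structure with cmpxchg16b.

```c
#include <stdint.h>

/* Hypothetical helper: an atomic load built from compare-and-swap.
   Expected and desired values are both 0, so memory contents never
   change, but the old value comes back atomically. */
static inline uint64_t load64_via_cas(uint64_t *p)
{
    /* On x86 this becomes lock cmpxchg (cmpxchg8b on i686). The
       instruction issues a write cycle to *p even when the compare
       fails, which is why it faults on read-only memory. */
    return __sync_val_compare_and_swap(p, (uint64_t)0, (uint64_t)0);
}
```

The pointer cannot be const-qualified here, which is the API-level symptom of the same issue: the "load" needs write access to the object.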
r~