On 27/12/14 00:02, Matt Godbolt wrote:
> On Fri, Dec 26, 2014 at 5:19 PM, Andrew Haley <a...@redhat.com> wrote:
>> On 26/12/14 22:49, Matt Godbolt wrote:
>>> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley <a...@redhat.com> wrote:
>
>>> Thanks. I realise I was unclear in my original email. I'm really
>>> looking for a way to say "do a non-lock-prefixed increment".
>>
>> Why?
>
> Performance. The single-threaded writers do not need to use a lock
> prefix: the atomicity of their read-add-write is guaranteed by my
> knowing no other threads write to the value. Thus the bus lock they
> take out unnecessarily slows down the instruction and potentially
> causes extra coherency traffic. The order of stores (on x86) is
> guaranteed and so provided I take a relaxed view in the consumer
> there's not even a need for any other flush. The memory write will
> necessarily "eventually" become visible to the reader. Within the
> constraints of the architecture I'm working in, this is plenty enough
> for a metric.

Okay, but that's not what I was trying to ask: if you don't need an
atomic access, why do you care that it uses a read-modify-write
instruction instead of three instructions? Is it faster? Have you
measured it? Is it so much faster that it's critical for your
application?

Andrew.
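
For reference, the two forms under discussion can be sketched roughly as
below. This is only an illustration, assuming a 64-bit counter with
exactly one writer thread; the function names (bump_single_writer,
bump_locked, read_metric) are hypothetical and do not come from the
thread. What the first form compiles to (a single non-locked add to
memory, or a separate load, add and store) depends on the compiler and
its version, so the generated code and the timing would both need to be
checked rather than assumed.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical names, for illustration only.

// Single-writer increment: a relaxed load followed by a relaxed store.
// This is NOT an atomic read-modify-write; it is only correct when no
// other thread ever writes to `counter`.
inline void bump_single_writer(std::atomic<std::uint64_t>& counter) {
    const std::uint64_t next = counter.load(std::memory_order_relaxed) + 1;
    counter.store(next, std::memory_order_relaxed);
}

// Atomic read-modify-write: safe with multiple writers. On x86 this is
// typically implemented with a lock-prefixed instruction.
inline void bump_locked(std::atomic<std::uint64_t>& counter) {
    counter.fetch_add(1, std::memory_order_relaxed);
}

// Reader side: a relaxed load never sees a torn value, but it may see a
// slightly stale one, which is acceptable for a metric.
inline std::uint64_t read_metric(const std::atomic<std::uint64_t>& counter) {
    return counter.load(std::memory_order_relaxed);
}
```

Timing both functions in a tight loop on the target machine, and looking
at the assembly produced for each, is one way to answer the "have you
measured it?" question directly.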