Re: volatile access optimization (C++ / x86_64)

Andrew Haley Sat, 27 Dec 2014 09:58:07 -0800

On 27/12/14 00:02, Matt Godbolt wrote:
> On Fri, Dec 26, 2014 at 5:19 PM, Andrew Haley <a...@redhat.com> wrote:
>> On 26/12/14 22:49, Matt Godbolt wrote:
>>> On Fri, Dec 26, 2014 at 4:26 PM, Andrew Haley <a...@redhat.com> wrote:
> 
>>> Thanks. I realise I was unclear in my original email. I'm really
>>> looking for a way to say "do a non-lock-prefixed increment".
>>
>> Why?
> 
> Performance. The single-threaded writers do not need to use a lock
> prefix: the atomicity of their read-add-write is guaranteed by my
> knowing no other threads write to the value. Thus the bus lock they
> take out unnecessarily slows down the instruction and potentially
> causes extra coherency traffic.  The order of stores (on x86) is
> guaranteed and so provided I take a relaxed view in the consumer
> there's not even a need for any other flush.  The memory write will
> necessarily "eventually" become visible to the reader. Within the
> constraints of the architecture I'm working in, this is plenty enough
> for a metric.


Okay, but that's not what I was trying to ask: if you don't need an
atomic access, why do you care that it uses a read-modify-write
instruction instead of three instructions?  Is it faster?  Have you
measured it?  Is it so much faster that it's critical for your
application?
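
To make the comparison concrete, here is a minimal sketch of the two
alternatives under discussion, assuming a single writer thread and
relaxed readers as described in the quoted text. The names (metric,
bump_single_writer, bump_atomic_rmw, read_metric) are hypothetical and
do not come from the thread. On x86_64 the first form typically
compiles to a separate load, add and store with no lock prefix, while
the second typically compiles to a lock-prefixed instruction; the exact
code generated depends on the compiler and options, so check the
assembly and measure before drawing conclusions.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical metric counter: exactly one writer thread, any number of readers.
std::atomic<std::uint64_t> metric{0};

// Writer-only increment: a relaxed load followed by a relaxed store.
// This is NOT an atomic read-modify-write; it is only correct because
// we assume no other thread ever writes to `metric`.
inline void bump_single_writer() {
    metric.store(metric.load(std::memory_order_relaxed) + 1,
                 std::memory_order_relaxed);
}

// For comparison: a true atomic read-modify-write, safe with multiple writers.
inline void bump_atomic_rmw() {
    metric.fetch_add(1, std::memory_order_relaxed);
}

// Reader side: a relaxed load sees some recent value, never a torn one.
inline std::uint64_t read_metric() {
    return metric.load(std::memory_order_relaxed);
}
```

Timing both functions in a tight loop on the target hardware is one way
to answer the "have you measured it?" question.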

Andrew.
