https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80817
Bug ID: 80817
Summary: [missed optimization][x86] relaxed atomics
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: Joost.VandeVondele at mat dot ethz.ch
Target Milestone: ---
Using gcc 7.1 on x86, the following function

#include <atomic>
#include <cstdint>

void increment_relaxed(std::atomic<uint64_t>& counter) {
    atomic_store_explicit(&counter,
        atomic_load_explicit(&counter, std::memory_order_relaxed) + 1,
        std::memory_order_relaxed);
}
compiles to:
.cfi_startproc
movq (%rdi), %rax
addq $1, %rax
movq %rax, (%rdi)
ret
.cfi_endproc
while I would expect that
.cfi_startproc
addq $1, (%rdi)
ret
.cfi_endproc
would be correct and more efficient: the load and the store of a non-locked memory-operand add are each individually atomic on x86 for an aligned 64-bit operand, which is all that the relaxed load and relaxed store require (no atomic read-modify-write is requested here).
I also looked at
atomic_fetch_add_explicit(&counter, uint64_t(1), std::memory_order_relaxed);
but that surprised me with
.cfi_startproc
lock addq $1, (%rdi)
ret
.cfi_endproc