https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110184
Ivan Bodrov <securesneakers at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |securesneakers at gmail dot com
--- Comment #2 from Ivan Bodrov <securesneakers at gmail dot com> ---
This seems to have been implemented, at least for __atomic_fetch_and, but the
optimization is very fragile: it fails when the value passed to the atomic AND
and the mask used in the subsequent check come from two separate (identical)
literals:
$ cat fragile-fetch-and.c
void slowpath(unsigned long *p);

void func_bad(unsigned long *p)
{
        if (__atomic_fetch_and(p, ~1UL, __ATOMIC_RELAXED) & ~1UL)
                slowpath(p);
}

void func_good(unsigned long *p)
{
        unsigned long mask = ~1UL;

        if (__atomic_fetch_and(p, mask, __ATOMIC_RELAXED) & mask)
                slowpath(p);
}
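
As a side note (my own variant, not part of the reproducer file), hiding the
constant behind a preprocessor macro does not help either: the macro expands
back into two separate ~1UL literals, which is exactly the func_bad situation,
so it presumably hits the same unoptimized path:

#define MASK (~1UL)

/* Expands to the same two ~1UL literals as func_bad above, so the mask
   really has to go through a variable, as in func_good. */
void func_macro(unsigned long *p)
{
        if (__atomic_fetch_and(p, MASK, __ATOMIC_RELAXED) & MASK)
                slowpath(p);
}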
Compiling fragile-fetch-and.c, we can see that even though func_bad and
func_good are equivalent, only the second one gets optimized:
$ gcc --version
gcc (GCC) 13.2.1 20230801
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ uname -s -m
Linux x86_64
$ gcc -O2 -c fragile-fetch-and.c
$ objdump -d fragile-fetch-and.o
fragile-fetch-and.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <func_bad>:
   0:   48 8b 07                mov    (%rdi),%rax
   3:   48 89 c1                mov    %rax,%rcx
   6:   48 89 c2                mov    %rax,%rdx
   9:   48 83 e1 fe             and    $0xfffffffffffffffe,%rcx
   d:   f0 48 0f b1 0f          lock cmpxchg %rcx,(%rdi)
  12:   75 ef                   jne    3 <func_bad+0x3>
  14:   48 83 fa 01             cmp    $0x1,%rdx
  18:   77 06                   ja     20 <func_bad+0x20>
  1a:   c3                      ret
  1b:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  20:   e9 00 00 00 00          jmp    25 <func_bad+0x25>
  25:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
  2c:   00 00 00 00

0000000000000030 <func_good>:
  30:   f0 48 83 27 fe          lock andq $0xfffffffffffffffe,(%rdi)
  35:   75 09                   jne    40 <func_good+0x10>
  37:   c3                      ret
  38:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  3f:   00
  40:   e9 00 00 00 00          jmp    45 <func_good+0x15>
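
For completeness, below is a small driver I used as a sanity check (the file
name driver.c and the call counting are my own additions, not part of the
reproducer). It only confirms that func_bad and func_good behave identically,
i.e. both clear bit 0 and call slowpath() exactly when some other bit was set
beforehand, so the difference above is purely in the generated code.

/* driver.c -- hypothetical sanity check, not part of the reproducer.
   Build: gcc -O2 fragile-fetch-and.c driver.c */
#include <assert.h>

void func_bad(unsigned long *p);
void func_good(unsigned long *p);

static int slowpath_calls;

/* Called by both functions when the fetched value had bits other than
   bit 0 set. */
void slowpath(unsigned long *p)
{
        (void)p;
        slowpath_calls++;
}

int main(void)
{
        for (unsigned long v = 0; v < 8; v++) {
                unsigned long a = v, b = v;

                int before = slowpath_calls;
                func_bad(&a);
                int bad_called = slowpath_calls - before;

                before = slowpath_calls;
                func_good(&b);
                int good_called = slowpath_calls - before;

                assert(a == (v & ~1UL) && b == a);  /* same stored value, bit 0 cleared */
                assert(bad_called == good_called);  /* same slowpath decision */
                assert(bad_called == ((v & ~1UL) != 0));
        }
        return 0;
}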