https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96189
Uroš Bizjak <ubizjak at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |---
--- Comment #5 from Uroš Bizjak <ubizjak at gmail dot com> ---
Hm...
Please note that peephole2 scanning requires exact RTL sequences, and it
already fails for e.g.:
_Bool
foo (unsigned int *x, unsigned int z)
{
  unsigned int y = 0;
  __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
                               __ATOMIC_RELAXED);
  return y == 0;
}
(which is used in a couple of places throughout glibc), due to an early
peephole2 optimization that converts:
(insn 7 4 8 2 (set (reg:SI 0 ax [90])
        (const_int 0 [0])) "cmpx0.c":5:3 75 {*movsi_internal}
to:
(insn 31 4 8 2 (parallel [
            (set (reg:DI 0 ax [90])
                (const_int 0 [0]))
            (clobber (reg:CC 17 flags))
Other than that, the required sequence is quite often broken by various
reloads, due to the complexity of the CMPXCHG insn.
However, __atomic_compare_exchange_n returns a boolean value that is exactly
what the first function is testing, so the following two functions are
equivalent:
--cut here--
_Bool
foo (unsigned int *x, unsigned int y, unsigned int z)
{
  unsigned int old_y = y;
  __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
                               __ATOMIC_RELAXED);
  return y == old_y;
}

_Bool
bar (unsigned int *x, unsigned int y, unsigned int z)
{
  return __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
                                      __ATOMIC_RELAXED);
}
--cut here--
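The equivalence can be sanity-checked at runtime with a small sketch like the
one below. The two functions are the same as in the testcase above, only
renamed so they can live in one file; the names cmpxchg_by_compare,
cmpxchg_by_result and same_behaviour are hypothetical and not part of the
original testcase. This only demonstrates the source-level semantics of the
strong (weak == 0) compare-exchange; it says nothing about the code GCC
generates.

```c
/* Variant that compares the updated "expected" value with the old one.  */
_Bool
cmpxchg_by_compare (unsigned int *x, unsigned int y, unsigned int z)
{
  unsigned int old_y = y;
  __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
                               __ATOMIC_RELAXED);
  return y == old_y;
}

/* Variant that uses the boolean result of the builtin directly.  */
_Bool
cmpxchg_by_result (unsigned int *x, unsigned int y, unsigned int z)
{
  return __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
                                      __ATOMIC_RELAXED);
}

/* Hypothetical helper: return 1 if both variants return the same value
   and leave the same value in memory for the given inputs.  */
int
same_behaviour (unsigned int initial, unsigned int expected,
                unsigned int desired)
{
  unsigned int a = initial, b = initial;
  _Bool ra = cmpxchg_by_compare (&a, expected, desired);
  _Bool rb = cmpxchg_by_result (&b, expected, desired);
  return ra == rb && a == b;
}
```

On success the builtin leaves y untouched, so y == old_y holds; on failure it
stores the current contents of *x, which must differ from old_y, into y. In
both cases the comparison matches the builtin's return value.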
I wonder if the above transformation could happen at the tree level, so that it
would apply universally to all targets and would also handle the
CMPXCHG[8,16]B doubleword instructions on x86 targets.
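For illustration, the doubleword case has the same shape at the source level.
The sketch below uses unsigned long long; the names foo64, bar64 and agree64
are hypothetical. Whether this actually maps to CMPXCHG8B depends on the target
and flags (presumably a 32-bit x86 build, e.g. with -m32); that is an
assumption and not something verified here.

```c
/* 64-bit analogue of foo: compare the updated expected value.  */
_Bool
foo64 (unsigned long long *x, unsigned long long y, unsigned long long z)
{
  unsigned long long old_y = y;
  __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
                               __ATOMIC_RELAXED);
  return y == old_y;
}

/* 64-bit analogue of bar: use the builtin's boolean result.  */
_Bool
bar64 (unsigned long long *x, unsigned long long y, unsigned long long z)
{
  return __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
                                      __ATOMIC_RELAXED);
}

/* Hypothetical helper: return 1 if both variants agree on the result
   and on the final memory contents.  */
int
agree64 (unsigned long long initial, unsigned long long expected,
         unsigned long long desired)
{
  unsigned long long a = initial, b = initial;
  _Bool ra = foo64 (&a, expected, desired);
  _Bool rb = bar64 (&b, expected, desired);
  return ra == rb && a == b;
}
```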
Let's ask the experts.