https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96189

--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On July 16, 2020 9:05:52 AM GMT+02:00, ubizjak at gmail dot com
<gcc-bugzi...@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96189
>
>Uroš Bizjak <ubizjak at gmail dot com> changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>              CC|                            |jakub at gcc dot gnu.org,
>               |                            |rguenth at gcc dot gnu.org
>             Status|RESOLVED                    |REOPENED
>         Resolution|FIXED                       |---
>
>--- Comment #5 from Uroš Bizjak <ubizjak at gmail dot com> ---
>Hm...
>
>Please note that peephole2 scanning requires exact RTL sequences, and already
>fails for e.g.:
>
>_Bool
>foo (unsigned int *x, unsigned int z)
>{
>  unsigned int y = 0;
>  __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
>                               __ATOMIC_RELAXED);
>  return y == 0;
>}
>
>(which is used in a couple of places throughout glibc), due to an early
>peephole2 optimization that converts:
>
>(insn 7 4 8 2 (set (reg:SI 0 ax [90])
>        (const_int 0 [0])) "cmpx0.c":5:3 75 {*movsi_internal}
>
>to:
>
>(insn 31 4 8 2 (parallel [
>            (set (reg:DI 0 ax [90])
>                (const_int 0 [0]))
>            (clobber (reg:CC 17 flags))
>
>Other than that, the required sequence is broken quite often by various
>reloads, due to the complexity of the CMPXCHG insn.
>
>However, __atomic_compare_exchange_n returns a boolean value that is exactly
>what the first function is testing, so the following two functions are
>equivalent:
>
>--cut here--
>_Bool
>foo (unsigned int *x, unsigned int y, unsigned int z)
>{
>  unsigned int old_y = y;
>  __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
>                               __ATOMIC_RELAXED);
>  return y == old_y;
>}
>
>_Bool
>bar (unsigned int *x, unsigned int y, unsigned int z)
>{
>  return __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
>                                      __ATOMIC_RELAXED);
>}
>--cut here--
>
>I wonder whether the above transformation can happen at the tree level, so
>that it would apply universally to all targets and would also handle the
>CMPXCHG[8,16]B doubleword instructions on x86 targets.
>
>Let's ask experts.

In principle, value numbering can make the comparison available at the cmpxchg
and replace the later comparison. We've pondered this trick for memcpy
results, for example.
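
For illustration only, here is a minimal sketch of the equivalence from
comment #5 that such a value-numbering transformation would rely on. It is
not taken from any GCC patch. foo and bar are the two functions quoted
above; forms_agree is a hypothetical helper added here to run both forms on
the same inputs. The sketch assumes the strong (non-weak) variant of the
builtin, as in the quoted code.

```c
#include <stdbool.h>

/* Form 1: compare the (possibly updated) expected value with a saved copy. */
static bool
foo (unsigned int *x, unsigned int y, unsigned int z)
{
  unsigned int old_y = y;
  __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
                               __ATOMIC_RELAXED);
  return y == old_y;
}

/* Form 2: use the boolean result of the builtin directly. */
static bool
bar (unsigned int *x, unsigned int y, unsigned int z)
{
  return __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED,
                                      __ATOMIC_RELAXED);
}

/* Hypothetical helper (not in the bug report): run both forms on separate
   copies of the same initial memory value and report whether the returned
   booleans and the final memory contents agree.  */
static bool
forms_agree (unsigned int initial, unsigned int expected,
             unsigned int desired)
{
  unsigned int a = initial, b = initial;
  bool r1 = foo (&a, expected, desired);
  bool r2 = bar (&b, expected, desired);
  return r1 == r2 && a == b;
}
```

On success the builtin leaves y untouched, so y == old_y holds. On failure
it stores the current contents of *x into y, which by definition differ from
the expected value, so y == old_y is false. That is why the later comparison
carries the same information as the builtin's return value.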

Richard.
