https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96189
--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On July 16, 2020 9:05:52 AM GMT+02:00, ubizjak at gmail dot com <gcc-bugzi...@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96189
>
>Uroš Bizjak <ubizjak at gmail dot com> changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>                 CC|                            |jakub at gcc dot gnu.org,
>                   |                            |rguenth at gcc dot gnu.org
>             Status|RESOLVED                    |REOPENED
>         Resolution|FIXED                       |---
>
>--- Comment #5 from Uroš Bizjak <ubizjak at gmail dot com> ---
>Hm...
>
>Please note that peephole2 scanning require exact RTL sequences, and already
>fails for e.g.:
>
>_Bool
>foo (unsigned int *x, unsigned int z)
>{
>  unsigned int y = 0;
>  __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
>  return y == 0;
>}
>
>(which is used in a couple of places throughout glibc), due to early peephole2
>optimization that converts:
>
>(insn 7 4 8 2 (set (reg:SI 0 ax [90])
>        (const_int 0 [0])) "cmpx0.c":5:3 75 {*movsi_internal}
>
>to:
>
>(insn 31 4 8 2 (parallel [
>            (set (reg:DI 0 ax [90])
>                (const_int 0 [0]))
>            (clobber (reg:CC 17 flags))
>
>Other than that, the required sequence is broken quite often by various
>reloads, due to the complexity of CMPXCHG insn.
>
>However, __atomic_compare_exchange_n returns a boolean value that is exactly
>what the first function is testing, so the following two functions are
>equivalent:
>
>--cut here--
>_Bool
>foo (unsigned int *x, unsigned int y, unsigned int z)
>{
>  unsigned int old_y = y;
>  __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
>  return y == old_y;
>}
>
>_Bool
>bar (unsigned int *x, unsigned int y, unsigned int z)
>{
>  return __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
>}
>--cut here--
>
>I wonder, if the above transformation can happen on the tree level, so it would
>apply universally for all targets, and would also handle CMPXCHG[8,16]B
>doubleword instructions on x86 targets.
>
>Let's ask experts.

In principle, value numbering can make the comparison available at the cmpxchg
and replace the later comparison. We've pondered this trick for memcpy results,
for example.

Richard.
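
For anyone who wants to check the claimed equivalence at the source level, here is a minimal sketch. foo and bar are the two functions from comment #5, lightly reformatted. same_outcome is a hypothetical helper that is not part of the bug report: it runs both functions on identical copies of the initial memory value and compares the return values and the final memory contents. It assumes the strong form of the builtin (weak == 0, as in the testcase), so a spurious failure cannot occur.

```c
/* Variant that re-derives success by comparing the (possibly updated)
   expected value against its original contents.  */
_Bool
foo (unsigned int *x, unsigned int y, unsigned int z)
{
  unsigned int old_y = y;
  __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
  /* On failure the builtin stores the current *x into y, which then differs
     from old_y; on success y is left untouched.  */
  return y == old_y;
}

/* Variant that uses the builtin's boolean result directly.  */
_Bool
bar (unsigned int *x, unsigned int y, unsigned int z)
{
  return __atomic_compare_exchange_n (x, &y, z, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* Hypothetical helper (not from the report): returns nonzero when foo and bar
   report the same success value and leave memory in the same state.  */
_Bool
same_outcome (unsigned int initial, unsigned int expected, unsigned int desired)
{
  unsigned int a = initial, b = initial;
  _Bool ra = foo (&a, expected, desired);
  _Bool rb = bar (&b, expected, desired);
  return ra == rb && a == b;
}
```

This only demonstrates the semantic equivalence. It says nothing about whether a given GCC version generates the same code for both functions, which is the subject of this PR.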