https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70823

            Bug ID: 70823
           Summary: x86_64: __atomic_fetch_and/or/xor() should perhaps use
                    BTR/BTS/BTC if they can
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dhowells at redhat dot com
  Target Milestone: ---

Created attachment 38347
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38347&action=edit
Test source

If given a mask that clears, sets or flips a single bit and the result is
checked for just that bit and reduced to bool, then the __atomic_fetch_and, _or
and _xor functions should consider using BTR, BTS or BTC as appropriate.

So, something like:

   static __always_inline bool test_and_set_bit(unsigned bit, unsigned long *ptr)
   {
      unsigned long mask = 1UL << (bit & (BITS_PER_LONG - 1));
      unsigned long old;

      ptr += bit / BITS_PER_LONG;

      old = __atomic_fetch_or(ptr, mask, __ATOMIC_SEQ_CST);
      return old & mask;
   }

where the mask is constructed as 1UL << bitnr.  As things stand, for the
example above, GCC emits a CMPXCHG loop rather than a BTS instruction:

   b:   89 f9                   mov    %edi,%ecx
   d:   ba 01 00 00 00          mov    $0x1,%edx
  12:   c1 ef 06                shr    $0x6,%edi
  15:   48 d3 e2                shl    %cl,%rdx
  18:   89 f9                   mov    %edi,%ecx
  1a:   48 8b 04 ce             mov    (%rsi,%rcx,8),%rax
  1e:   49 89 c0                mov    %rax,%r8
  21:   48 89 c7                mov    %rax,%rdi
  24:   49 09 d0                or     %rdx,%r8
  27:   f0 4c 0f b1 04 ce       lock cmpxchg %r8,(%rsi,%rcx,8)
  2d:   75 ef                   jne    1e <set_bit+0x13>
  2f:   48 85 fa                test   %rdi,%rdx
  32:   0f 95 c0                setne  %al
  35:   c3                      retq   

Could we instead get something like:

    lock bts %edi,(%rsi)
    setc   %al
    retq
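
For comparison, the requested sequence can be approximated by hand with inline
assembly.  This is only a sketch under stated assumptions, not what GCC would
necessarily emit: the helper name test_and_set_bit_asm is hypothetical, the bit
number is taken as unsigned long so that the 64-bit form of BTS is used, and
the non-x86_64 branch simply falls back to __atomic_fetch_or.  BTS reports the
previous value of the bit in the carry flag, so the sketch reads it with SETC,
and the LOCK prefix is what makes the read-modify-write atomic.

```c
#include <stdbool.h>

/* Hypothetical helper: a hand-written stand-in for the requested code
 * generation.  Not taken from the attached test source. */
static inline bool test_and_set_bit_asm(unsigned long bit, unsigned long *ptr)
{
#if defined(__x86_64__) && defined(__LP64__) && defined(__GNUC__)
    unsigned char old;

    /* With a memory operand and a register bit offset, BTS itself selects
     * the word that holds the bit, so no explicit "ptr += bit / 64" is
     * needed.  The "memory" clobber covers words beyond *ptr. */
    __asm__ __volatile__("lock btsq %2,%1\n\t"
                         "setc %0"
                         : "=q"(old), "+m"(*ptr)
                         : "r"(bit)
                         : "memory", "cc");
    return old;
#else
    /* Portable fallback for other targets (assumes 8-bit bytes). */
    unsigned long bits = sizeof(unsigned long) * 8;
    unsigned long mask = 1UL << (bit % bits);

    return (__atomic_fetch_or(ptr + bit / bits, mask, __ATOMIC_SEQ_CST) & mask) != 0;
#endif
}
```

The function returns the previous state of the bit, matching the C version
above for bit numbers that fit in a signed 64-bit offset.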

See the attached test source, which should be compiled to a .s file (for
example with -S) so that the generated code can be inspected.
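
The attachment is not reproduced here.  As a rough stand-in, the three patterns
described above might be written as follows.  This is a minimal sketch, not the
attached file: the BITS_PER_LONG definition and the names test_and_clear_bit
and test_and_change_bit are assumptions added for illustration, and plain
static inline is used in place of __always_inline so that the fragment builds
outside the kernel.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumed definition; the example in this report takes BITS_PER_LONG from
 * elsewhere. */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* fetch_or with a single-bit mask: the candidate for LOCK BTS. */
static inline bool test_and_set_bit(unsigned bit, unsigned long *ptr)
{
    unsigned long mask = 1UL << (bit & (BITS_PER_LONG - 1));

    ptr += bit / BITS_PER_LONG;
    return (__atomic_fetch_or(ptr, mask, __ATOMIC_SEQ_CST) & mask) != 0;
}

/* fetch_and with the inverted single-bit mask: the candidate for LOCK BTR. */
static inline bool test_and_clear_bit(unsigned bit, unsigned long *ptr)
{
    unsigned long mask = 1UL << (bit & (BITS_PER_LONG - 1));

    ptr += bit / BITS_PER_LONG;
    return (__atomic_fetch_and(ptr, ~mask, __ATOMIC_SEQ_CST) & mask) != 0;
}

/* fetch_xor with a single-bit mask: the candidate for LOCK BTC. */
static inline bool test_and_change_bit(unsigned bit, unsigned long *ptr)
{
    unsigned long mask = 1UL << (bit & (BITS_PER_LONG - 1));

    ptr += bit / BITS_PER_LONG;
    return (__atomic_fetch_xor(ptr, mask, __ATOMIC_SEQ_CST) & mask) != 0;
}
```

In each case the old value is tested only against the bit that was modified
and the result is reduced to bool, which is the condition under which a single
bit-test instruction could replace the loop.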

The CMPXCHG loop is generated by all of the following compilers:

    gcc version 5.3.1 20151207 (Red Hat 5.3.1-2) (GCC)
    gcc version 6.0.0 20160219 (Red Hat Cross 6.0.0-0.1) (GCC)
    gcc version 4.8.5 20150623 (Red Hat 4.8.5-2.x) (GCC)
