Hi Richard,

>> I noticed that sync_lock_release uses lwsync if available but every other
>> sync_* builtin uses a heavyweight sync. eg:
>
> Every other sync builtin has full-barrier semantics.  AFAIK, isync is  
> correct.

I think we can change the sync to an lwsync and still maintain full
barrier semantics. The code sequence now is:

<fetch_and_add>:
  60:   7c 00 04 ac     sync
  64:   7c 69 1b 78     mr      r9,r3
  68:   7c 60 48 28     lwarx   r3,0,r9
  6c:   39 63 00 01     addi    r11,r3,1
  70:   7d 60 49 2d     stwcx.  r11,0,r9
  74:   40 a2 ff f4     bne-    68 <sync_fetch_and_add+0x8>
  78:   4c 00 01 2c     isync
  7c:   4e 80 00 20     blr

The only ordering lwsync does not enforce is a store followed by a load.
Since the lwsync will always be paired with a store (the stwcx.), it
orders all earlier accesses before that store and so provides a release
barrier.

The stwcx.; bne-; isync combination provides the acquire barrier.
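
For reference, here is a minimal C sketch of the kind of source I would
expect to give a sequence like the one above when built with optimization
for 32-bit PowerPC. The function name is hypothetical and only mirrors the
label in the disassembly; the builtin is the legacy GCC
__sync_fetch_and_add, which is documented as a full barrier.

```c
/* Hypothetical wrapper; the name only mirrors the disassembly label. */
int fetch_and_add(int *p)
{
        /*
         * Atomically add 1 to *p and return the value *p held before
         * the addition.  On PowerPC this is where the leading barrier,
         * the lwarx/stwcx. retry loop, and the trailing isync come from.
         */
        return __sync_fetch_and_add(p, 1);
}
```

The proposal only affects the barrier emitted before the lwarx/stwcx.
loop; the loop itself and the trailing isync stay as they are.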

Anton
