Hi Richard,

>> I noticed that sync_lock_release uses lwsync if available but every other
>> sync_* builtin uses a heavyweight sync. eg:
>
> Every other sync builtin has full-barrier semantics. AFAIK, isync is
> correct.

I think we can change the sync to an lwsync and still maintain full barrier
semantics. The code sequence now is:

<fetch_and_add>:
  60:   7c 00 04 ac     sync
  64:   7c 69 1b 78     mr      r9,r3
  68:   7c 60 48 28     lwarx   r3,0,r9
  6c:   39 63 00 01     addi    r11,r3,1
  70:   7d 60 49 2d     stwcx.  r11,0,r9
  74:   40 a2 ff f4     bne-    68 <sync_fetch_and_add+0x8>
  78:   4c 00 01 2c     isync
  7c:   4e 80 00 20     blr

The only thing lwsync won't order is a store followed by a load. Since the
lwsync will always be paired with a store (the stwcx.), we will order all
accesses before it and provide a release barrier. The stwcx.; bne-; isync
combination provides the acquire barrier.

Anton
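
P.S. For anyone who doesn't read PowerPC assembly, here is a rough C sketch of
the retry structure in the listing. It is only an analogue: the function name
fetch_and_add_sketch is made up for illustration, and it uses a
compare-and-swap loop in place of the lwarx/stwcx. reservation pair, so it is
not what GCC emits. Which barrier instruction ends up around the loop (sync or
lwsync) is still decided by the compiler's expansion of the builtin.

```c
#include <stdint.h>

/* Hypothetical helper, not GCC's implementation: a portable analogue of the
 * lwarx / addi / stwcx. / bne- retry loop. */
static int32_t fetch_and_add_sketch(volatile int32_t *p, int32_t inc)
{
    int32_t old;

    do {
        old = *p;   /* plays the role of lwarx: read the current value */
        /* The CAS plays the role of stwcx. + bne-: if another thread
         * changed *p since the read, the store fails and we retry. */
    } while (!__sync_bool_compare_and_swap(p, old, old + inc));

    return old;     /* like r3 in the listing: the value before the add */
}
```

As with __sync_fetch_and_add, the caller gets back the old value, and the
memory location ends up holding old + inc.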