On Sun, Jan 13, 2013 at 9:36 PM, Andi Kleen <a...@firstfloor.org> wrote:
>> > __atomic_clear and __atomic_store_n didn't have code to generate >> > the TSX HLE RELEASE prefix. Add this plus test cases. >> >> +(define_insn "atomic_store_hle_release<mode>" >> + [(set (match_operand:ATOMIC 0 "memory_operand") >> + (unspec:ATOMIC [(match_operand:ATOMIC 1 "register_operand") >> + (match_operand:SI 2 "const_int_operand")] >> + UNSPEC_MOVA_RELEASE))] >> + "" >> + "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}") >> >> This pattern doesn't have any constraints! Also, mov insn can store >> immediates directly. > > Can you suggest a better pattern? It is implemented in the patch, attached to my previous message. >> >> + if (model & IX86_HLE_RELEASE) >> + { >> + emit_insn (gen_atomic_store_hle_release<mode> (operands[0], >> operands[1], >> + operands[2])); >> + DONE; >> + } >> + >> /* For seq-cst stores, when we lack MFENCE, use XCHG. */ >> if (model == MEMMODEL_SEQ_CST && !(TARGET_64BIT || TARGET_SSE2)) >> >> What about __ATOMIC_SEQ_CST; should >> >> __atomic_clear (p, __ATOMIC_SEQ_CST | __ATOMIC_HLE_RELEASE); >> >> emit a mfence at the end; in case of for your test: > > Originally I thought not, but now on reconsideration it's needed for > older CPUs that don't know about the XRELEASE. And it may be even needed > with TSX for the non transactional fallback execution. I'll fix the patch. Also fixed in my patch. It emits mfence at the end. >> + >> +void >> +hle_clear (int *p, int v) >> >> hle_clear (char *p) >> >> This argument should correspond to a bool, please see documentation. > > > Not sure I understand? Which documentation? This is just a random test case Ah, I was referring to the gcc documentation about __atomic_clear. >> +(define_insn "atomic_store<mode>_1" >> + [(set (match_operand:ATOMIC 0 "memory_operand" "=m") >> + (unspec:ATOMIC [(match_operand:ATOMIC 1 "<nonmemory_operand>" "<r><i>") >> + (match_operand:SI 2 "const_int_operand")] >> + UNSPEC_MOVA))] >> + "" >> + "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}") > > Is that the updated pattern you wanted? It looks similar to mine. Yes the attached patch actually implements all proposed fixes. Uros.