On Sun, Jan 13, 2013 at 9:36 PM, Andi Kleen <a...@firstfloor.org> wrote:

>> > __atomic_clear and __atomic_store_n didn't have code to generate
>> > the TSX HLE RELEASE prefix. Add this plus test cases.
>>
>> +(define_insn "atomic_store_hle_release<mode>"
>> +  [(set (match_operand:ATOMIC 0 "memory_operand")
>> +     (unspec:ATOMIC [(match_operand:ATOMIC 1 "register_operand")
>> +                     (match_operand:SI 2 "const_int_operand")]
>> +                    UNSPEC_MOVA_RELEASE))]
>> +  ""
>> +  "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}")
>>
>> This pattern doesn't have any constraints! Also, mov insn can store
>> immediates directly.
>
> Can you suggest a better pattern?

It is implemented in the patch, attached to my previous message.

>>
>> +      if (model & IX86_HLE_RELEASE)
>> +        {
>> +               emit_insn (gen_atomic_store_hle_release<mode> (operands[0],
>> operands[1],
>> +                                                      operands[2]));
>> +       DONE;
>> +        }
>> +
>>        /* For seq-cst stores, when we lack MFENCE, use XCHG.  */
>>        if (model == MEMMODEL_SEQ_CST && !(TARGET_64BIT || TARGET_SSE2))
>>
>> What about __ATOMIC_SEQ_CST; should
>>
>>   __atomic_clear (p, __ATOMIC_SEQ_CST | __ATOMIC_HLE_RELEASE);
>>
>> emit a mfence at the end; in case of for your test:
>
> Originally I thought not, but now on reconsideration it's needed for
> older CPUs that don't know about the XRELEASE. And it may be even needed
> with TSX for the non transactional fallback execution. I'll fix the patch.

Also fixed in my patch. It emits mfence at the end.

>> +
>> +void
>> +hle_clear (int *p, int v)
>>
>> hle_clear (char *p)
>>
>> This argument should correspond to a bool, please see documentation.
>
>
> Not sure I understand? Which documentation? This is just a random test case

Ah, I was referring to the gcc documentation about __atomic_clear.

>> +(define_insn "atomic_store<mode>_1"
>> +  [(set (match_operand:ATOMIC 0 "memory_operand" "=m")
>> +     (unspec:ATOMIC [(match_operand:ATOMIC 1 "<nonmemory_operand>" "<r><i>")
>> +                     (match_operand:SI 2 "const_int_operand")]
>> +                    UNSPEC_MOVA))]
>> +  ""
>> +  "%K2mov{<imodesuffix>}\t{%1, %0|%0, %1}")
>
> Is that the updated pattern you wanted? It looks similar to mine.

Yes the attached patch actually implements all proposed fixes.

Uros.

Reply via email to