Hi Andrew,

> Does it make any sense to simply predefine the possible valid 
> combinations with the HLE bit already set?  it at least removes any 
> possible invalid combinations and forces the programmer to consciously 
> choose their memory model.
> 
> ie,
> __ATOMIC_HLE_XACQ_CONSUME
> __ATOMIC_HLE_XACQ_ACQUIRE
> __ATOMIC_HLE_XACQ_ACQ_REL
> __ATOMIC_HLE_XACQ_SEQ_CST
> 
> __ATOMIC_HLE_XREL_RELEASE
> __ATOMIC_HLE_XREL_ACQ_REL
> __ATOMIC_HLE_XREL_SEQ_CST
> 
> or whatever happens to be valid...   Doesn't really scale to adding more 
> new bits later, but perhaps that doesn't matter.

The idea sounds good to me. It would certainly be much harder to misuse,
so it would generally be a better interface.

As to what combinations make sense:

The ACQUIRE of an HLE region has somewhat interesting ordering semantics.
It's a fairly strong barrier (LOCK prefix) for both reads and writes.
The HLE RELEASE is either a LOCKed instruction too, or a MOV without LOCK.
If the region is running transactionally, the whole block acts like a
LOCKed operation too. But we have to assume the weakest of these cases.

I suppose that would map to always using _SEQ_CST for most instructions,
except for the MOV release, which can be _RELEASE too (and would need
an additional MFENCE generated for anything stronger).
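
To make that mapping concrete, here is a minimal sketch of a spinlock
built on two of the predefined combinations. It assumes the compiler
exposes HLE hint bits named __ATOMIC_HLE_ACQUIRE and __ATOMIC_HLE_RELEASE
that get ORed into the memory model, and it falls back to 0 (no hint)
where they are missing. The names HLE_XACQ_SEQ_CST, HLE_XREL_RELEASE and
the elided_* helpers are made up for illustration only; they are not an
existing interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the compiler provides these HLE hint bits. Where it does
   not, fall back to 0 so this degrades to an ordinary spinlock. */
#ifndef __ATOMIC_HLE_ACQUIRE
#define __ATOMIC_HLE_ACQUIRE 0
#endif
#ifndef __ATOMIC_HLE_RELEASE
#define __ATOMIC_HLE_RELEASE 0
#endif

/* Hypothetical predefined combinations: the caller picks one name and
   cannot assemble an invalid pair of HLE bit and memory model. */
#define HLE_XACQ_SEQ_CST (__ATOMIC_SEQ_CST | __ATOMIC_HLE_ACQUIRE)
#define HLE_XREL_RELEASE (__ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE)

typedef struct { int locked; } elided_lock;

/* Acquire side: an atomic exchange, which is a LOCKed operation on x86,
   so the strongest model is requested. Returns true if the lock was free. */
static bool elided_trylock(elided_lock *l)
{
    return __atomic_exchange_n(&l->locked, 1, HLE_XACQ_SEQ_CST) == 0;
}

static void elided_lock_acquire(elided_lock *l)
{
    while (!elided_trylock(l)) {
        /* Spin on plain loads until the lock looks free, then retry. */
        while (__atomic_load_n(&l->locked, __ATOMIC_RELAXED))
            ;
    }
}

/* Release side: a plain store (MOV without LOCK on x86), so only
   release ordering is requested here. */
static void elided_unlock(elided_lock *l)
{
    /* Unlocking a lock that is not held is a caller bug. */
    assert(__atomic_load_n(&l->locked, __ATOMIC_RELAXED) == 1);
    __atomic_store_n(&l->locked, 0, HLE_XREL_RELEASE);
}
```

A caller would bracket the critical section with elided_lock_acquire()
and elided_unlock(). Asking for something stronger than release on the
store is where the extra MFENCE mentioned above would come in.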

There is probably not a lot of value in allowing the optimizer
weaker models than what the CPU actually provides.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.
