On 10/05/2011 10:44 AM, Jeffrey Yasskin wrote:
Yes, that's what I'm suggesting. The rule for 'volatile' from the language is just that "Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine." If the instruction-level implementation for a 16-byte atomic load is cmpxchg16b, then that's just how the abstract machine is implemented, and the rule says you have to do that consistently for volatile objects rather than sometimes optimizing it away. That's my argument anyway. If there's another standard you're following beyond "kernel people tend to ask for it," the situation may be trickier.
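As an illustration of the rule quoted above (a minimal sketch, not code from this thread): `load_twice` is a hypothetical helper that reads a volatile atomic twice in a row. Because the object is volatile, each `load` is an access that has to be performed as the abstract machine specifies, so the second load should not be folded into the first. The example is instantiated with an 8-byte type to stay portable; the assumption is that the same reasoning carries over to a 16-byte type whose load is implemented with cmpxchg16b, which on some toolchains may also need an atomic support library at link time.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper (not from the thread): performs two separate loads
// from the same volatile atomic object and returns both values.
// The volatile qualifier is what requires both accesses to be carried out
// rather than merged into one.
template <typename T>
std::pair<T, T> load_twice(const volatile std::atomic<T>& a) {
    T first = a.load(std::memory_order_seq_cst);   // first access
    T second = a.load(std::memory_order_seq_cst);  // second access, also required
    return {first, second};
}

// Convenience wrapper used for a quick check with an 8-byte value.
inline bool loads_agree(std::uint64_t initial) {
    volatile std::atomic<std::uint64_t> v{initial};
    auto p = load_twice(v);
    return p.first == initial && p.second == initial;
}
```

Comparing the generated code for this helper with and without the `volatile` qualifier is one way to see whether a given compiler keeps both loads.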
Perfect, I like it.

Andrew