On Fri, Apr 1, 2011 at 6:24 PM, Richard Henderson <r...@redhat.com> wrote:
> On 03/31/2011 08:28 AM, Richard Guenther wrote:
>>>> Well, I'm not sure that strict-align targets that provide byte access do
>>>> not simply hide the issue inside the CPU (thus, perform the
>>>> read-modify-write there and do not guarantee any atomicity unless you
>>>> ask for it).
>>> Certainly some do this internally, but that's clearly out of our
>>> control.
>>
>> Sure. My argument is that the memory model which guarantees
>> this kind of things for _any_ memory access is fundamentally flawed.
>> They should have simply required annotating objects which should
>> behave that way (and then only behave that way "per object", not
>> for any concurrent field accesses).
>
> (0) Let's limit our discussion to cpus that are actually put into SMP systems,
>     and have been manufactured in the last decade.
>
> (1) Do we agree that all such cpus have user-level store insns with byte
>     granularity. Honestly the only non-microcontroler I ever heard of
>     without this was the original Alpha. Which is excluded per (0).
>
> (2) Do we agree that all such cpus have on-chip caches?
>
> (3) Let us at this point limit our discussion to cacheable, i.e. non-I/O,
>     memory. I believe we can agree that all sorts of system-dependent stuff
>     happens in memory-mapped registers.
>
> (4) Do we agree that all such cpus transfer entire cachelines to and fro
>     the memory bus? And further that they simultaneously transfer a
>     modification mask as part of their cache coherency protocol?
>
> (5) Do we agree that all such cpus use a byte-granular modification mask?
>
> I'm guessing that you don't actually agree on point (5), but ... honestly,
> please name the offender because I can't think of one. For the mainstream
> processors we really care about, I think every one of them Does The Right
> Thing.

Yes, we don't agree on (5). I can't name a CPU, but I was just guessing
that strict-alignment CPUs would have such a requirement to also make
their store queues simpler (no need for such a mask).

Now, as for (0), I might agree to disregard the original Alpha, but as
the embedded world moves to SMP I'm not sure we can disregard
non-cache-coherent NUMA setups or even CPUs without a byte store.

But well, I guess the thing I don't like about the standard is that it
makes people who have started to be somewhat aware of threading issues
_less_ aware of them by giving them some "false" safety. It really
smells like a standard designed for a very high-level language where
people don't have to think, rather than a standard suitable for a
C-family language.

Richard.

> r~
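
For reference, a minimal sketch of the kind of access the guarantee under discussion covers, assuming a C++11 compiler with std::thread. The type `Pair` and the function `run_adjacent_stores` are hypothetical names used only for illustration. Two threads each do plain, non-atomic byte stores to adjacent fields of one object:

```cpp
#include <cassert>
#include <thread>

// Hypothetical example type: two adjacent one-byte fields. Under the C++11
// memory model each non-bit-field member is its own memory location.
struct Pair {
    char a;
    char b;
};

// Each thread stores only to "its" field. The stores are plain, non-atomic
// byte stores to distinct memory locations, so the model does not treat
// this as a data race.
Pair run_adjacent_stores(int iterations) {
    Pair p{0, 0};
    std::thread t1([&p, iterations] {
        for (int i = 0; i < iterations; ++i) p.a = 1;
    });
    std::thread t2([&p, iterations] {
        for (int i = 0; i < iterations; ++i) p.b = 2;
    });
    t1.join();  // join() orders the threads' stores before the reads below
    t2.join();
    return p;
}
```

If the store to `a` were implemented as a wider read-modify-write that also covers `b`, one thread could overwrite the other's store. That is the case a byte store instruction and a byte-granular modification mask avoid, and the case a target without byte stores would have to handle some other way.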