On Mon, 13 Jan 2025, Jeff Law wrote:

> > + For non-BWX targets we need to load data from memory, mask it such as
> > + to keep any part outside the area written, insert data to be stored,
> > + and write the result back atomically. For sizes that are not a power
> > + of 2 there are no byte mask or insert machine instructions available
> > + so the mask required has to be built by hand, however ZAP and ZAPNOT
> > + instructions can then be used to apply the mask. Since LL/SC loops
> > + are used, the high and low parts have to be disentangled from each
> > + other and handled sequentially except for size 1 where there is only
> > + the low part to be written. */
> So doesn't this mean that we're doing partial updates and thus have partial
> update visibility problems? Granted, it's still an improvement over the
> current state of the world. Just want to make sure I understand the basics
> here.

 Partial updates are not an issue here, because the high-level-language
operations that result in these atomic sequences do not have atomicity in
the contract. It is no different from, say, MIPS SWL/SWR instruction pairs,
where another thread/CPU can see the intermediate state. The only guarantee
made here is that data *outside* the area written won't be modified by the
atomic sequence.

 I think the SWL/SWR analogy is actually a good one. We do essentially the
same, writing unaligned data by pieces, except that with SWL/SWR the hardware
drives byte lane enable signals so as not to modify data outside the
quantity written, while we have to do this by hand, necessarily with a pair
of atomic sequences.

 Have I made myself clear here?

  Maciej
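
 To illustrate the pair of atomic sequences described above, here is a
minimal C sketch; it is not code from the patch. It assumes little-endian
byte lanes and uses a C11 compare-and-swap retry loop as a stand-in for an
LL/SC loop. The names low_bytes_mask, masked_store and unaligned_store are
hypothetical, made up for this illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: a mask with 0xff in each of the N low byte lanes.  */
static uint64_t
low_bytes_mask (unsigned n)
{
  return n >= 8 ? ~UINT64_C (0) : (UINT64_C (1) << (8 * n)) - 1;
}

/* Atomically replace only the bits of *QW selected by MASK with the
   matching bits of DATA.  The retry loop stands in for an LL/SC loop;
   bits outside MASK are always written back unchanged.  */
static void
masked_store (_Atomic uint64_t *qw, uint64_t mask, uint64_t data)
{
  uint64_t old = atomic_load (qw);
  while (!atomic_compare_exchange_weak (qw, &old,
                                        (old & ~mask) | (data & mask)))
    ; /* On failure OLD holds the current contents; just retry.  */
}

/* Store the low SIZE bytes (1..8) of VALUE at byte offset OFS (0..7)
   from the aligned quadword BASE[0], spilling into BASE[1] if needed.
   Assumes little-endian byte lanes.  */
static void
unaligned_store (_Atomic uint64_t *base, unsigned ofs, unsigned size,
                 uint64_t value)
{
  assert (ofs < 8 && size >= 1 && size <= 8);
  uint64_t mask = low_bytes_mask (size);
  value &= mask;

  /* Low part: first atomic sequence, on the first quadword.  */
  masked_store (&base[0], mask << (8 * ofs), value << (8 * ofs));

  /* High part: second, separate atomic sequence.  Another thread can
     observe memory between the two, which is the partial update.  */
  if (ofs + size > 8)
    {
      unsigned shift = 8 * (8 - ofs); /* OFS >= 1 here, so SHIFT <= 56.  */
      masked_store (&base[1], mask >> shift, value >> shift);
    }
}
```

 Calling unaligned_store (buf, 6, 3, 0xAABBCC) changes bytes 6 and 7 of
buf[0] and byte 0 of buf[1] and leaves every other byte as it was. For
simplicity the sketch skips the high part whenever the data does not cross
into the second quadword, which may differ from what the patch does.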