[PATCH -perfbook 3/3] memorder: Add index markers for "multicopy atomicity"

Akira Yokosawa Sun, 06 Apr 2025 03:34:36 -0700

Notes:
 1.  "other-multicopy atomicity" and "non-multicopy atomicity" want
     macros with alternative forms such as
     \IXalth{other-multicopy atomicity}{other-}{multicopy atomicity},
     so that they don't have extra spaces after "other-" and "non-" in
     the printed text.

 2.  Several instances of "multicopy atomic" and friends are also marked
     for "multicopy atomicity".

Signed-off-by: Akira Yokosawa <[email protected]>
---
 memorder/memorder.tex | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/memorder/memorder.tex b/memorder/memorder.tex
index 46261137..1a367509 100644
--- a/memorder/memorder.tex
+++ b/memorder/memorder.tex
@@ -2273,7 +2273,7 @@ accommodate only one store at a time, then any pair of CPUs would
 agree on the order of all stores that they observed.
 Unfortunately, building a computer system as shown in the figure, without
 store buffers or even caches, would result in glacially slow computation.
-Most CPU vendors interested in providing multicopy atomicity therefore
+Most CPU vendors interested in providing \IXB{multicopy atomicity} therefore
 instead provide the slightly weaker
 \emph{other-multicopy atomicity}~\cite[Section B2.3]{ARMv8A:2017},
 which excludes the CPU doing a given store from the requirement that all
@@ -2287,7 +2287,7 @@ CPUs agree on the order of all stores.\footnote{
        \cpageref{tab:memorder:Summary of Memory Ordering}.}
 This means that if only a subset of CPUs are doing stores, the
 other CPUs will agree on the order of stores, hence the ``other''
-in ``other-multicopy atomicity''.
+in ``\IXBalth{other-multicopy atomicity}{other-}{multicopy atomicity}''.
 Unlike multicopy-atomic platforms, within other-multicopy-atomic platforms,
 the CPU doing the store is permitted to observe its
 store early, which allows its later loads to obtain the newly stored
@@ -2332,8 +2332,8 @@ value directly from the store buffer, which improves performance.
 
 Perhaps there will come a day when all platforms provide some flavor
 of multi-copy atomicity, but
-in the meantime, non-multicopy-atomic platforms do exist, and so software
-must deal with them.
+in the meantime, \IXalth{non-multicopy-atomic}{non-}{multicopy atomicity}
+platforms do exist, and so software must deal with them.
 
 \begin{listing}
 \input{CodeSamples/formal/litmus/[email protected]}
@@ -3447,7 +3447,9 @@ bottom, confirming that this counter-intuitive really can happen.
 If you wish, you can click on ``Undo'' to explore other options or
 click on ``Reset'' to start over.
 It can be very helpful to carry out these steps in different orders
-to better understand how a non-multicopy-atomic architecture operates.
+to better understand how a
+\IXalth{non-multicopy-atomic}{non-}{multicopy atomicity}
+architecture operates.
 
 \QuickQuiz{
        What happens if that \co{lwsync} instruction is instead a
@@ -5364,11 +5366,13 @@ sequential consistency.
 Performance considerations have dictated that no modern mainstream
 system is sequentially consistent.
 
-The next three rows cover multicopy atomicity, which was defined in
+The next three rows cover \IX{multicopy atomicity}, which was defined in
 \cref{sec:memorder:Multicopy Atomicity}.
-The first is full-up (and rare) multicopy atomicity, the second is the
-weaker other-multicopy atomicity, and the third is the weakest
-non-multicopy atomicity.
+The first is full-up (and rare)
+\IXalth{multicopy atomicity}{full}{multicopy atomicity}, the second is the
+weaker \IXalth{other-multicopy atomicity}{other-}{multicopy atomicity},
+and the third is the weakest
+\IXalth{non-multicopy atomicity}{non-}{multicopy atomicity}.
 
 The next row, ``Non-Cache Coherent'', covers accesses from multiple
 threads to a single variable, which was discussed in
@@ -6071,7 +6075,7 @@ instructions~\cite{PowerPC94,MichaelLyons05a}:
        order of these stores.
        Not so on PowerPC, even with an \co{lwsync} instruction between each
        pair of memory-reference instructions, because PowerPC is
-       non-multicopy atomic.
+       \IXalth{non-multicopy atomic}{non-}{multicopy atomicity}.
 \item  [\tco{eieio}] (enforce in-order execution of I/O, in case you
        were wondering) causes all preceding cacheable stores to appear
        to have completed before all subsequent stores.
@@ -6233,7 +6237,8 @@ own stores as having happened earlier than this total global order
 would indicate.
 This exception to the total ordering is needed to allow important
 hardware optimizations involving store buffers.
-In addition, x86 provides other-multicopy atomicity, for example,
+In addition, x86 provides
+\IXalth{other-multicopy atomicity}{other-}{multicopy atomicity}, for example,
 so that if CPU~0 sees a store by CPU~1, then CPU~0 is guaranteed to see
 all stores that CPU~1 saw prior to its store.
 Software may use atomic operations to override these hardware optimizations,
@@ -6287,8 +6292,9 @@ but compiler constraints suffices for both the
 It also has strong memory-ordering semantics, as shown in
 \cref{tab:memorder:Summary of Memory Ordering}.
 In particular, all CPUs will agree on the order of unrelated stores from
-different CPUs, that is, the z~Systems CPU family is fully multicopy
-atomic, and is the only commercially available system with this property.
+different CPUs, that is, the z~Systems CPU family is
+\IXalth{fully multicopy atomic}{full}{multicopy atomicity},
+and is the only commercially available system with this property.
 
 As with most CPUs, the z~Systems architecture does not guarantee a
 cache-coherent instruction stream, hence,
-- 
2.34.1
