https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104831

--- Comment #5 from Patrick O'Neill <patrick at rivosinc dot com> ---
IIUC, Appendix A is incorrect.

We cannot allow any memory ops to be reordered into the LR/SC pair, since
such a reordering is visible to other threads.

Here's a litmus test showing this fact:

(*
  LR/SC with .aq .rl bits does not allow read operations to be reordered
  within/beneath it.
*)

{
0:x6=a; 0:x8=b;
1:x6=a; 1:x8=b;
}

 P0                     | P1          ;
 lw x5,0(x6)            | ori x1,x0,1 ;
 lr.w.aq.rl x7,0(x8)    | sw x1,0(x8) ;
 ori x1,x0,1            | fence rw,rw ;
 sc.w.aq.rl x1,x1,0(x8) | sw x1,0(x6) ;

~exists (0:x5=1 /\ 0:x7=0 /\ b=1)

In a sequentially consistent atomic operation (which this LR/SC pair is
emulating), it is not possible for x5 to be loaded with a 1 while the
LR/SC pair loads and operates on a 0.
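
As a rough cross-check of that claim, here is a minimal Python sketch that
enumerates every sequentially consistent interleaving of the test above. It
is not an RVWMO model: the LR/SC pair is treated as one indivisible
read-modify-write, the fence is modeled simply by keeping P1 in program
order, and all function names are made up for illustration.

```python
from itertools import combinations

def interleavings(p0, p1):
    """Yield every merge of p0 and p1 that keeps each hart's program order."""
    n = len(p0) + len(p1)
    for slots in combinations(range(n), len(p0)):
        it0, it1 = iter(p0), iter(p1)
        yield [next(it0) if i in slots else next(it1) for i in range(n)]

# P0: plain load of a, then the LR/SC pair as a single atomic RMW on b.
def p0_load_a(mem, regs):
    regs["0:x5"] = mem["a"]

def p0_rmw_b(mem, regs):
    regs["0:x7"] = mem["b"]   # value seen by the LR
    mem["b"] = 1              # value written by the SC

# P1: store to b, (fence), store to a.
def p1_store_b(mem, regs):
    mem["b"] = 1

def p1_store_a(mem, regs):
    mem["a"] = 1

def outcomes():
    """Return the set of (0:x5, 0:x7, b) triples over all interleavings."""
    seen = set()
    for schedule in interleavings([p0_load_a, p0_rmw_b],
                                  [p1_store_b, p1_store_a]):
        mem, regs = {"a": 0, "b": 0}, {}
        for step in schedule:
            step(mem, regs)
        seen.add((regs["0:x5"], regs["0:x7"], mem["b"]))
    return seen

if __name__ == "__main__":
    print(sorted(outcomes()))
```

Under these assumptions the triple (1, 0, 1), i.e. 0:x5=1 /\ 0:x7=0 /\ b=1,
never shows up in the result set.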

With the pairing of LR.aq/SC.aqrl this outcome is possible: without .rl on
the LR, the preceding lw can be reordered beneath the LR, into the pair.

Likewise, for LR.aqrl/SC.rl, the corresponding reordering of a later write
needs to be forbidden:

RISCV LRSC-WRITE

(* 
  LR/SC with .aq .rl bits does not allow write operations to be reordered
  within/above it.
*)

{
0:x8=b; 0:x10=c;
1:x8=b; 1:x10=c;
}

 P0                     | P1          ;
 ori x9,x0,1            | lw x9,0(x10);
 lr.w.aq.rl x7,0(x8)    | fence rw,rw ;
 ori x7,x0,1            | lw x7,0(x8) ;
 sc.w.aq.rl x1,x7,0(x8) |             ;
 sw x9,0(x10)           |             ;

~exists (1:x9=1 /\ 1:x7=0 /\ b=1)

In a sequentially consistent atomic operation, it is not possible for
both Hart 1's x9 to be loaded with a 1 and Hart 1's x7 to be loaded with a 0
(as long as the SC succeeds, which b=1 enforces).

That outcome is possible with LR.aqrl/SC.rl: without .aq on the SC, the
following sw can be reordered above the SC, into the LR/SC pairing.
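
To make the difference concrete, here is a minimal Python sketch that
enumerates interleavings of LRSC-WRITE for two versions of P0. It is not an
RVWMO model: the reordering is modeled by hand-placing the sw between the LR
and the SC, P1 is kept in program order to stand in for the fence, and the
function and constant names are made up for illustration.

```python
from itertools import combinations

def interleavings(p0, p1):
    """Yield every merge of p0 and p1 that keeps each list's order."""
    n = len(p0) + len(p1)
    for slots in combinations(range(n), len(p0)):
        it0, it1 = iter(p0), iter(p1)
        yield [next(it0) if i in slots else next(it1) for i in range(n)]

# P0 building blocks.
def lr_b(mem, regs):
    regs["0:x7"] = mem["b"]

def sc_b(mem, regs):
    # Assumed to always succeed: no other hart writes b in this test.
    mem["b"] = 1

def rmw_b(mem, regs):
    # LR/SC pair as one indivisible read-modify-write.
    lr_b(mem, regs)
    sc_b(mem, regs)

def store_c(mem, regs):
    mem["c"] = 1

# P1: load c, (fence), load b.
def load_c(mem, regs):
    regs["1:x9"] = mem["c"]

def load_b(mem, regs):
    regs["1:x7"] = mem["b"]

ATOMIC = [rmw_b, store_c]        # sw stays after the whole pair
HOISTED = [lr_b, store_c, sc_b]  # sw moved above the SC, inside the pair

def outcomes(p0):
    """Return the set of (1:x9, 1:x7, b) triples over all interleavings."""
    seen = set()
    for schedule in interleavings(p0, [load_c, load_b]):
        mem, regs = {"b": 0, "c": 0}, {}
        for step in schedule:
            step(mem, regs)
        seen.add((regs["1:x9"], regs["1:x7"], mem["b"]))
    return seen

if __name__ == "__main__":
    print("atomic :", sorted(outcomes(ATOMIC)))
    print("hoisted:", sorted(outcomes(HOISTED)))
```

Under these assumptions (1, 0, 1), i.e. 1:x9=1 /\ 1:x7=0 /\ b=1, is absent
from the atomic version and present once the sw sits inside the pair.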
