On Fri, May 5, 2023 at 2:42 PM Hans Boehm <hbo...@google.com> wrote: > > I think A.6-tso also needs to change the last line in the table from lr.aqrl > ... sc to lr.aq ... sc.rl, otherwise I think we have problems with a > subsequent A.7-tso generated l.aq . Otherwise I agree. > > I certainly agree that, given the Ztso extension, there should be a standard > compiler-implemented mapping that leverages it. I'm personally much less > enthusiastic about calling it an ABI. I'd like to see clarity that the RVWMO > ABI is the standard we expect portable libraries to be prepared to use.
There's already a ratified sentiment that effectively implies this. Ztso is not required by the RVA profiles, and so it follows that any binary that's compatible across RVA-profile implementations cannot assume the presence of Ztso. (I agree the ABI should encode this property for de jure purposes, too, but it's already a de facto requirement.) > If they want to test for and use Ztso internally, fine. But having users deal > with two different ABIs seems like a very high cost for avoiding some > (basically no-op?) fences. > > Hans > > > > On Fri, May 5, 2023 at 1:11 PM Andrea Parri <and...@rivosinc.com> wrote: >> >> On Fri, May 05, 2023 at 12:18:12PM -0700, Palmer Dabbelt wrote: >> > On Fri, 05 May 2023 11:55:31 PDT (-0700), Andrea Parri wrote: >> > > On Fri, May 05, 2023 at 10:12:56AM -0700, Patrick O'Neill wrote: >> > > > The RISC-V Ztso extension currently has no effect on generated code. >> > > > With the additional ordering constraints guarenteed by Ztso, we can >> > > > emit >> > > > more optimized atomic mappings than the RVWMO mappings. >> > > > >> > > > This patch implements Andrea Parri's proposed Ztso mappings ("Proposed >> > > > Mapping"). >> > > > >> > > > https://github.com/preames/public-notes/blob/master/riscv-tso-mappings.rst >> > > > >> > > > LLVM has implemented this same mapping (Ztso is still behind a >> > > > experimental flag in LLVM, so there is *not* a defined ABI for this >> > > > yet). >> > > > https://reviews.llvm.org/D143076 >> > > >> > > Given the recent patches/discussions, it seems worth pointing out the >> > > the Ztso mappings referred to above was designed to be compatible with >> > > the mappings in Table A.6 and that they are _not_ compatible with the >> > > mappings in Table A.7 or with a "subset" of A.7 (even assuming RVTSO). >> > >> > I guess that brings up the question of what we should do about WMO/TSO >> > compatibility. IIUC the general plan has been that WMO binaries would be >> > compatible with TSO binaries when run on TSO systems, and that TSO binaries >> > would require TSO systems. >> > >> > I suppose it would be possible to have TSO produce binaries that would run >> > on WMO systems by just emitting a bunch of extra fences, but I don't think >> > anyone wants that? >> > >> > We've always just assumed that WMO binaries would be compatible with TSO >> > binaries, but I don't think it's ever really been concretely discussed. >> > Having an ABI break here wouldn't be the craziest idea as it'd let us fix >> > some other issues, but that'd certainly need to be pretty widely discussed. >> > >> > Do we have an idea of what A.7-compatible TSO mappings would look like? >> >> As in riscv-tso-mappings.rst but with >> >> atomic_store(memory_order_seq_cst) | s{b|h|w|d} ; fence rw,rw >> >> would be A.7-compatible: call the resulting mappings "A.6-tso". >> >> A.6-tso is (also) compatible with the following subset of A.7: >> >> C/C++ Construct | A.7-tso Mapping >> ------------------------------------------------------------------------------ >> Non-atomic load | l{b|h|w|d} >> atomic_load(memory_order_relaxed | l{b|h|w|d} >> atomic_load(memory_order_acquire) | l{b|h|w|d} >> atomic_load(memory_order_seq_cst) | l{b|h|w|d}.aq >> ------------------------------------------------------------------------------ >> Non-atomic store | s{b|h|w|d} >> atomic_store(memory_order_relaxed) | s{b|h|w|d} >> atomic_store(memory_order_release) | s{b|h|w|d} >> atomic_store(memory_order_seq_cst) | s{b|h|w|d}.rl >> ------------------------------------------------------------------------------ >> atomic_thread_fence(memory_order_acquire) | nop >> atomic_thread_fence(memory_order_release) | nop >> atomic_thread_fence(memory_order_acq_rel) | nop >> atomic_thread_fence(memory_order_seq_cst) | fence rw,rw >> ------------------------------------------------------------------------------ >> C/C++ Construct | RVTSO AMO Mapping >> atomic_<op>(memory_order_relaxed) | amo<op>.{w|d} >> atomic_<op>(memory_order_acquire) | amo<op>.{w|d} >> atomic_<op>(memory_order_release) | amo<op>.{w|d} >> atomic_<op>(memory_order_acq_rel) | amo<op>.{w|d} >> atomic_<op>(memory_order_seq_cst) | amo<op>.{w|d} >> ------------------------------------------------------------------------------ >> C/C++ Construct | RVTSO LR/SC Mapping >> atomic_<op>(memory_order_relaxed) | loop: lr.{w|d} ; <op> ; >> | sc.{w|d} ; bnez loop >> atomic_<op>(memory_order_acquire) | loop: lr.{w|d} ; <op> ; >> | sc.{w|d} ; bnez loop >> atomic_<op>(memory_order_release) | loop: lr.{w|d} ; <op> ; >> | sc.{w|d} ; bnez loop >> atomic_<op>(memory_order_acq_rel) | loop: lr.{w|d} ; <op> ; >> | sc.{w|d} ; bnez loop >> atomic_<op>(memory_order_seq_cst) | loop: lr.{w|d}.aq ; <op> ; >> | sc.{w|d}.rl ; bnez >> loop >> >> Andrea