On Mon, Sep 21, 2020 at 02:04:23PM +0100, Andrew Cooper wrote:
> MFENCE is overly heavyweight for SMP semantics on WB memory, because it also
> orders weaker cached writes, and flushes the WC buffers.
> 
> This technique was used as an optimisation in Java[1], and later adopted by
> Linux[2] where it was measured to have a 60% performance improvement in VirtIO
> benchmarks.
> 
> The stack is used because it is hot in the L1 cache, and a -4 offset is used
> to avoid creating a false data dependency on live data.  (For 64bit userspace,
> the offset needs to be under the red zone to avoid false dependences).
> 
> Fix up the 32 bit definitions in HVMLoader and libxc to avoid a false data
> dependency.
> 
> [1] https://shipilev.net/blog/2014/on-the-fence-with-dependencies/
> [2] https://git.kernel.org/torvalds/c/450cbdd0125cfa5d7bbf9e2a6b6961cc48d29730
> 
> Signed-off-by: Andrew Cooper <[email protected]>
> ---
> CC: Jan Beulich <[email protected]>
> CC: Roger Pau Monné <[email protected]>
> CC: Wei Liu <[email protected]>
> CC: Ian Jackson <[email protected]>
> ---
>  tools/firmware/hvmloader/util.h   | 2 +-
>  tools/libs/ctrl/include/xenctrl.h | 4 ++--

If this is ever needed:

Acked-by: Wei Liu <[email protected]>

I have not followed the discussion in the thread closely, but the change
looks to be following what Linux does, so I'm certainly fine with this.
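
For anyone skimming the thread, here is a minimal sketch of the technique
the quoted commit message describes: a LOCKed add of zero to a stack slot
used as a full barrier in place of MFENCE. It is an illustration, not the
code from the patch. The names smp_mb_sketch() and publish_and_check() are
hypothetical, and the -132 offset for 64-bit is an assumption picked to sit
just below the 128-byte red zone of the SysV ABI. Non-x86 builds fall back
to the compiler's generic full barrier.

```c
/*
 * Full-barrier sketch. On x86, a LOCKed read-modify-write orders earlier
 * loads and stores against later ones for ordinary write-back memory.
 * Adding 0 leaves the target bytes unchanged.
 */
static inline void smp_mb_sketch(void)
{
#if defined(__x86_64__)
    /* Assumed offset: below the 128-byte red zone, so no live data is hit. */
    __asm__ __volatile__ ( "lock addl $0, -132(%%rsp)" ::: "memory", "cc" );
#elif defined(__i386__)
    /* -4 offset as in the commit message: hot in L1, not live data. */
    __asm__ __volatile__ ( "lock addl $0, -4(%%esp)" ::: "memory", "cc" );
#else
    /* Not x86: use the compiler's generic full barrier instead. */
    __sync_synchronize();
#endif
}

/* Hypothetical flags for a Dekker-style store-then-load handshake. */
static volatile int my_flag, peer_flag;

static int publish_and_check(void)
{
    my_flag = 1;         /* store: announce ourselves */
    smp_mb_sketch();     /* keep the load below from passing the store */
    return peer_flag;    /* load: has the peer announced itself? */
}
```

The store-then-load pattern in publish_and_check() is the one case where
x86 needs an explicit barrier for SMP ordering on WB memory, which is why
a cheaper instruction than MFENCE is enough there.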
