On Wed,  2 Dec 2020 14:53:41 -0800 Arjun Roy wrote:
> Summarized:
> 1. It is possible that a read payload is not exactly page aligned -
> that there may exist "straggler" bytes that we cannot map into the
> caller's address space cleanly. For this, we allow the caller to
> provide as argument a "hybrid copy buffer", turning
> getsockopt(TCP_ZEROCOPY_RECEIVE) into a "hybrid" operation that allows
> the caller to avoid a subsequent recvmsg() call to read the
> stragglers.
> 
> 2. Similarly, for "small" read payloads that are either below the size
> of a page, or small enough that remapping pages is not a performance
> win - we allow the user to short-circuit the remapping operations
> entirely and simply copy into the buffer provided.
> 
> Some of the patches in the middle of this set are refactors to support
> this "short-circuiting" optimization.
> 
> 3. We allow the user to provide a hint that performing a page zap
> operation (and the accompanying TLB shootdown) may not be necessary,
> for the provided region that the kernel will attempt to map pages
> into. This allows us to avoid this expensive operation while holding
> the socket lock, which provides a significant performance advantage.
> 
> With all of these changes combined, "medium" sized receive traffic
> (multiple tens to few hundreds of KB) sees significant efficiency gains
> when using TCP receive zerocopy instead of regular recvmsg(). For
> example, with RPC-style traffic with 32KB messages, there is a roughly
> 15% efficiency improvement when using zerocopy. Without these changes,
> there is a roughly 60-70% efficiency reduction with such messages when
> employing zerocopy.
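
For readers following along, points 1 and 2 above amount to simple
arithmetic on the payload length. The following is only a rough userspace
sketch of that arithmetic, not the kernel implementation: the 4096-byte
page size, the copy threshold parameter, and the names payload_split and
split_payload are all assumptions made up for illustration.

```c
#include <stddef.h>

/* Assumed page size, for illustration only; real code would query
 * sysconf(_SC_PAGESIZE). */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical helper type: how a receive payload would be divided. */
struct payload_split {
    size_t mapped_bytes; /* whole pages eligible for remapping */
    size_t copied_bytes; /* bytes copied into the hybrid copy buffer */
};

/* Hypothetical helper: payloads at or below copy_threshold are copied
 * entirely (the "short-circuit" case); otherwise whole pages are mapped
 * and the trailing stragglers are copied. */
static struct payload_split split_payload(size_t payload_len,
                                          size_t copy_threshold)
{
    struct payload_split s;

    if (payload_len <= copy_threshold) {
        s.mapped_bytes = 0;
        s.copied_bytes = payload_len;
        return s;
    }

    s.copied_bytes = payload_len % SKETCH_PAGE_SIZE;
    s.mapped_bytes = payload_len - s.copied_bytes;
    return s;
}
```

Under these assumptions, a 32868-byte payload with a 4096-byte threshold
would split into 32768 mapped bytes and 100 copied straggler bytes, while
a 1000-byte payload would be copied in full.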

Applied, thank you!
