On Wed, 2 Dec 2020 14:53:41 -0800 Arjun Roy wrote:
> Summarized:
> 1. It is possible that a read payload is not exactly page aligned -
> that there may exist "straggler" bytes that we cannot map into the
> caller's address space cleanly. For this, we allow the caller to
> provide as argument a "hybrid copy buffer", turning
> getsockopt(TCP_ZEROCOPY_RECEIVE) into a "hybrid" operation that allows
> the caller to avoid a subsequent recvmsg() call to read the
> stragglers.
>
> 2. Similarly, for "small" read payloads that are either below the size
> of a page, or small enough that remapping pages is not a performance
> win - we allow the user to short-circuit the remapping operations
> entirely and simply copy into the buffer provided.
>
> Some of the patches in the middle of this set are refactors to support
> this "short-circuiting" optimization.
>
> 3. We allow the user to provide a hint that performing a page zap
> operation (and the accompanying TLB shootdown) may not be necessary,
> for the provided region that the kernel will attempt to map pages
> into. This allows us to avoid this expensive operation while holding
> the socket lock, which provides a significant performance advantage.
>
> With all of these changes combined, "medium" sized receive traffic
> (multiple tens to few hundreds of KB) see significant efficiency gains
> when using TCP receive zerocopy instead of regular recvmsg(). For
> example, with RPC-style traffic with 32KB messages, there is a roughly
> 15% efficiency improvement when using zerocopy. Without these changes,
> there is a roughly 60-70% efficiency reduction with such messages when
> employing zerocopy.

Applied, thank you!
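
For reference, a minimal sketch of how a caller might issue the hybrid call described in points 1 and 3 of the quoted summary. It assumes uapi headers that already carry this series (the copybuf_address, copybuf_len and flags fields of struct tcp_zerocopy_receive, plus TCP_RECEIVE_ZEROCOPY_FLAG_TLB_CLEAN_HINT). hybrid_zerocopy_receive() is a hypothetical helper name used only for illustration, and map_addr is assumed to be a page-aligned region previously mmap()ed on the socket.

```c
#include <linux/tcp.h>   /* struct tcp_zerocopy_receive, TCP_ZEROCOPY_RECEIVE */
#include <netinet/in.h>  /* IPPROTO_TCP */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of the series): fill in the request and
 * issue a single getsockopt(TCP_ZEROCOPY_RECEIVE).
 *
 * map_addr/map_len  region the kernel may map payload pages into
 *                   (assumed to come from an earlier mmap() on fd)
 * copybuf/copybuf_len  "hybrid copy buffer" for bytes that are copied
 *                   instead of mapped
 */
static int hybrid_zerocopy_receive(int fd, void *map_addr, uint32_t map_len,
                                   void *copybuf, int32_t copybuf_len,
                                   struct tcp_zerocopy_receive *zc)
{
    socklen_t zc_len = sizeof(*zc);

    memset(zc, 0, sizeof(*zc));
    zc->address = (uint64_t)(uintptr_t)map_addr;
    zc->length = map_len;
    zc->copybuf_address = (uint64_t)(uintptr_t)copybuf;
    zc->copybuf_len = copybuf_len;
    /*
     * Assumption: the caller knows the region holds no stale mappings,
     * so it hints that the zap / TLB shootdown may be skipped.
     */
    zc->flags = TCP_RECEIVE_ZEROCOPY_FLAG_TLB_CLEAN_HINT;

    return getsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, zc, &zc_len);
}
```

On success, as I read the uapi header comments, zc->length reports the bytes mapped, zc->copybuf_len the bytes copied into the copy buffer, and zc->recv_skip_hint any bytes that still have to be read with recvmsg().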