Hi,

On Wed, 2017-12-13 at 15:08 -0500, David Miller wrote:
> From: Paolo Abeni <pab...@redhat.com>
> Date: Tue, 12 Dec 2017 14:09:28 +0100
>
> > When a reuseport socket group is using a BPF filter to distribute
> > the packets among the sockets, we don't need to compute any hash
> > value, but the current reuseport_select_sock() requires the
> > caller to compute such a hash in advance.
> >
> > This patch reworks reuseport_select_sock() to compute the hash value
> > only if needed - missing or failing BPF filter. Since different
> > hash functions have different argument types - ipv4 addresses vs ipv6
> > ones - to avoid over-complicating the interface, reuseport_select_sock()
> > is now a macro.
> >
> > Additionally, the sk_reuseport test is moved inside reuseport_select_sock,
> > to avoid some code duplication.
> >
> > Overall this gives a small but measurable performance improvement
> > under UDP flood while using SO_REUSEPORT + BPF.
> >
> > Signed-off-by: Paolo Abeni <pab...@redhat.com>
>
> I don't doubt that this improves the case where the hash is elided, but
> I suspect it makes things slower otherwise.
>
> You're doing two function calls for an operation that used to require
> just one at the bottom of the call chain.
>
> You're also putting something onto the stack that the compiler can't
> possibly optimize into purely using cpu registers to hold.
Thank you for the feedback.

I was unable to measure any performance regression for the hash-based
demultiplexing, and I think the number of function calls is unchanged in
that scenario: with the vanilla kernel we have ehash() and
reuseport_select_sock(), while with the patched one we have
__reuseport_get_info() and ehash().

You are right about the additional stack usage introduced by this patch,
though. Overall I see we need something better than this.

Thanks,

Paolo
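P.S.: for concreteness, below is a minimal userspace sketch of the
"hash as a macro argument" idea from the patch description. All names
here (run_bpf_filter(), select_by_hash(), reuseport_select_sock_sketch())
are illustrative stand-ins, not the actual net/core/sock_reuseport.c
interface, and the real patch differs in detail.

#include <stdio.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's socket type. */
struct sock { int id; };

/* Pretend BPF-based selection: returns a socket, or NULL when the
 * filter is missing or fails (simulated via the 'ok' flag). */
static struct sock *run_bpf_filter(struct sock *socks, int nsocks, bool ok)
{
	(void)nsocks;
	return ok ? &socks[0] : NULL;
}

/* Hash-based fallback selection. */
static struct sock *select_by_hash(struct sock *socks, int nsocks,
				   unsigned int hash)
{
	return &socks[hash % nsocks];
}

/* The core idea: take the hash as an *expression* rather than a value,
 * so it is evaluated only on the fallback path. A macro also sidesteps
 * the fact that the ipv4 and ipv6 hash helpers take different argument
 * types. */
#define reuseport_select_sock_sketch(socks, nsocks, bpf_ok, hash_expr)	\
({									\
	struct sock *__sk = run_bpf_filter((socks), (nsocks), (bpf_ok)); \
	if (!__sk)							\
		__sk = select_by_hash((socks), (nsocks), (hash_expr));	\
	__sk;								\
})

static unsigned int expensive_hash(unsigned int saddr, unsigned int daddr)
{
	printf("hash computed\n"); /* visible only on the fallback path */
	return saddr ^ daddr;
}

int main(void)
{
	struct sock socks[4] = { {0}, {1}, {2}, {3} };

	/* BPF path succeeds: expensive_hash() is never evaluated. */
	printf("bpf ok:   sock %d\n",
	       reuseport_select_sock_sketch(socks, 4, true,
					    expensive_hash(1, 2))->id);

	/* BPF path fails: only now is the hash expression evaluated. */
	printf("fallback: sock %d\n",
	       reuseport_select_sock_sketch(socks, 4, false,
					    expensive_hash(1, 2))->id);
	return 0;
}

The GNU C statement expression ({ ... }), used throughout the kernel,
is what lets the macro both return a value and evaluate hash_expr only
on the fallback path; a plain function parameter would always be
evaluated at the call site.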