On Tue, 2016-12-06 at 19:31 +0100, Paolo Abeni wrote:

> cacheline 2 boundary (128 bytes) is 8 bytes before sk_lock: cacheline 2
> includes also skc_refcnt and skc_rxhash from __sk_common (I use 'pahole
> -E ...' to get the full blown output). skc_rxhash is read for each
> packet in inet_recvmsg()/sock_rps_record_flow() if CONFIG_RPS is set. I
> get a cache miss per packet there and inet_recvmsg() in my test takes
> about 8% of the whole u/s processing time.



Wait a minute, sk->sk_rxhash should only be read on a connected
socket. Relying on it being 0 was okay only as long as we did not care
about false sharing. And UDP sockets used to grab the socket refcount,
so we had a _lot_ of false sharing in the past.
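
To illustrate the idea, here is a minimal userspace sketch, not kernel code.
struct model_sock, enum model_state, model_flow_hash() and the rxhash_reads
counter are all hypothetical names made up for this example. It assumes the
state field sits in a cache line that is rarely written, while the hash sits
next to a frequently written field, so testing the state first lets an
unconnected socket skip the hash read entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; this is NOT the kernel's struct sock. */
enum model_state { MODEL_CLOSE = 0, MODEL_ESTABLISHED = 1 };

struct model_sock {
	enum model_state state;	/* stands in for sk->sk_state */
	uint32_t rxhash;	/* stands in for sk->sk_rxhash */
};

/* Counts how often the rxhash field is actually read (illustration only). */
static unsigned int rxhash_reads;

/*
 * Store the flow hash in *hash and return true only for a connected
 * socket with a non-zero hash. For an unconnected socket the rxhash
 * field is never touched, so no cache line holding it is pulled in.
 */
static bool model_flow_hash(const struct model_sock *sk, uint32_t *hash)
{
	if (sk->state != MODEL_ESTABLISHED)
		return false;

	rxhash_reads++;
	*hash = sk->rxhash;
	return *hash != 0;
}
```

With this shape, the receive path of an unconnected UDP socket would not
read the hash at all, instead of reading it and relying on it being 0.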

We must fix this if it has not already been fixed properly.

Can you take care of this problem?

Thanks!



