Stephen Hemminger wrote:
On Thu, 01 Nov 2007 11:16:20 +0100
Eric Dumazet <[EMAIL PROTECTED]> wrote:
As done two years ago on the IP route cache table (commit
22c047ccbc68fa8f3fa57f0e8f906479a062c426), we can avoid using one lock per
hash bucket for the huge TCP/DCCP hash tables.
On a typical x86_64 platform, this saves about 2MB or 4MB of RAM, for little
performance difference. (We hit a different cache line for the rwlock, but
then the bucket cache line has a better sharing factor among CPUs, since we
dirty it less often.)
Using a 'small' table of hashed rwlocks should be more than enough to provide
correct SMP concurrency between different buckets, without using too much
memory. Sizing of this table depends on NR_CPUS and various CONFIG settings.
This patch provides a locking abstraction that may ease future work on
a different model for the TCP/DCCP tables.
Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>
include/net/inet_hashtables.h | 40 ++++++++++++++++++++++++++++----
net/dccp/proto.c | 16 ++++++++++--
net/ipv4/inet_diag.c | 9 ++++---
net/ipv4/inet_hashtables.c | 7 +++--
net/ipv4/inet_timewait_sock.c | 13 +++++-----
net/ipv4/tcp.c | 11 +++++++-
net/ipv4/tcp_ipv4.c | 11 ++++----
net/ipv6/inet6_hashtables.c | 19 ++++++++-------
8 files changed, 89 insertions(+), 37 deletions(-)
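The idea in the patch description (a small table of hashed locks shared by a much larger bucket array) can be sketched in userspace C. This is only an illustration, not the kernel code: every name and size below is hypothetical, the patch sizes its lock table from NR_CPUS and CONFIG settings rather than a constant, and a plain C11 spinlock stands in for the kernel rwlock to keep the example dependency-free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
#define NBUCKETS 65536u  /* hash buckets, power of two */
#define NLOCKS   256u    /* shared locks, power of two, NLOCKS <= NBUCKETS */

struct node {
    struct node *next;
    unsigned int key;
};

struct hashed_table {
    struct node *buckets[NBUCKETS];  /* chain heads, no lock stored per bucket */
    atomic_flag locks[NLOCKS];       /* small table of hashed locks */
};

static void table_init(struct hashed_table *t)
{
    /* The masks below are only valid for power-of-two sizes. */
    assert((NBUCKETS & (NBUCKETS - 1)) == 0);
    assert((NLOCKS & (NLOCKS - 1)) == 0 && NLOCKS <= NBUCKETS);
    for (size_t i = 0; i < NBUCKETS; i++)
        t->buckets[i] = NULL;
    for (size_t i = 0; i < NLOCKS; i++)
        atomic_flag_clear(&t->locks[i]);
}

static unsigned int bucket_of(unsigned int hash)
{
    return hash & (NBUCKETS - 1);
}

/* Locking abstraction: callers ask for "the lock covering this hash" and
 * never index the lock array themselves. Because NLOCKS divides NBUCKETS,
 * a given bucket always maps to the same lock. */
static atomic_flag *lock_of(struct hashed_table *t, unsigned int hash)
{
    return &t->locks[hash & (NLOCKS - 1)];
}

static void stripe_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;  /* spin until the holder releases the lock */
}

static void stripe_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

static void table_insert(struct hashed_table *t, struct node *n,
                         unsigned int hash)
{
    atomic_flag *l = lock_of(t, hash);

    stripe_lock(l);
    n->next = t->buckets[bucket_of(hash)];
    t->buckets[bucket_of(hash)] = n;
    stripe_unlock(l);
}

/* Returns the first node with a matching key, or NULL. Real code would take
 * a reference on the node before dropping the lock. */
static struct node *table_lookup(struct hashed_table *t, unsigned int key,
                                 unsigned int hash)
{
    atomic_flag *l = lock_of(t, hash);
    struct node *n;

    stripe_lock(l);
    for (n = t->buckets[bucket_of(hash)]; n != NULL; n = n->next)
        if (n->key == key)
            break;
    stripe_unlock(l);
    return n;
}
```

With these illustrative sizes, 256 buckets share each lock, so two unrelated buckets can occasionally contend, but the lock storage shrinks from one lock per bucket to a fixed small array. Keeping the lock lookup behind a helper such as lock_of() is what lets the locking model change later without touching every caller.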
Long term, is there any chance of using RCU for this? It seems like
it could be a big win.
This was discussed in the past, and I believe a patch was even proposed,
but some people (including David) objected that RCU is well suited only to
'mostly read' structures.
On some web server workloads, the TCP hash table is constantly accessed in
write mode (socket creation, socket move to timewait state, socket
deletion...), and RCU added overhead and poor cache reuse (because sockets
must be placed on an RCU queue before reuse).
On these typical workloads, a hash table without RCU is still the best.
Long-term changes would rather be based on Robert Olsson's suggestion from
last year (trie-based lookups and a unified IP/TCP cache).
A short-term change would be to make the TCP hash table resizable (small
at boot, and able to grow if necessary). Its current size on
modern machines is just insane.