Stephen Hemminger wrote:
On Thu, 01 Nov 2007 11:16:20 +0100
Eric Dumazet <[EMAIL PROTECTED]> wrote:

As was done two years ago for the IP route cache table (commit 22c047ccbc68fa8f3fa57f0e8f906479a062c426), we can avoid using one lock per hash bucket for the huge TCP/DCCP hash tables.

On a typical x86_64 platform, this saves about 2MB or 4MB of RAM, for little performance difference. (We hit a different cache line for the rwlock, but then the bucket cache line has a better sharing factor among CPUs, since we dirty it less often.)

Using a 'small' table of hashed rwlocks should be more than enough to provide correct SMP concurrency between different buckets, without using too much memory. Sizing of this table depends on NR_CPUS and various CONFIG settings.

This patch provides some locking abstraction that may ease future work using a different model for the TCP/DCCP table.

Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]>

  include/net/inet_hashtables.h |   40 ++++++++++++++++++++++++++++----
  net/dccp/proto.c              |   16 ++++++++++--
  net/ipv4/inet_diag.c          |    9 ++++---
  net/ipv4/inet_hashtables.c    |    7 +++--
  net/ipv4/inet_timewait_sock.c |   13 +++++-----
  net/ipv4/tcp.c                |   11 +++++++-
  net/ipv4/tcp_ipv4.c           |   11 ++++----
  net/ipv6/inet6_hashtables.c   |   19 ++++++++-------
  8 files changed, 89 insertions(+), 37 deletions(-)


Long term, is there any chance of using RCU for this? Seems like
it could be a big win.


This was discussed in the past, and I even believe a patch was proposed, but some people (including David) objected that RCU is well suited only to 'mostly read' structures.

On some web server workloads, the TCP hash table is constantly accessed in write mode (socket creation, socket move to timewait state, socket deletion...), and RCU added overhead and poor cache reuse (because sockets must be placed on an RCU queue before reuse).

On these typical workloads, a hash table without RCU is still the best choice.

Long-term changes would rather be based on Robert Olsson's suggestion from last year (trie-based lookups and a unified IP/TCP cache).

A short-term change would be to make the TCP hash table resizable (small at boot, and able to grow if necessary). Its current size on modern machines is just insane.
