From: Jesper Dangaard Brouer <[email protected]>
On servers with many IPv4 addresses, __ip_dev_find() becomes visible in
perf profiles on the unconnected UDP sendmsg path. The call chain is:

  udpv6_sendmsg / udp_sendmsg
    ip_route_output_flow
      ip_route_output_key_hash_rcu
        __ip_dev_find    <-- source address validation

__ip_dev_find() calls inet_lookup_ifaddr_rcu(), which walks a hash chain
in inet_addr_lst. With the current fixed table size of 256 buckets, a
host with ~700 IPv4 addresses averages ~2.7 entries per chain, adding
avoidable cache misses under RCU on every unconnected send.
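
As a rough illustration of the chain-length estimate, the sketch below
assumes the hash spreads addresses uniformly, so the average chain length
is simply addresses / buckets. The helper names are hypothetical and exist
only for this example; they are not part of the patch or the kernel.

```c
#include <assert.h>
#include <stdio.h>

/* Average chain length under the assumption of a uniform hash. */
double avg_chain_len(unsigned int nr_addrs, unsigned int nr_buckets)
{
	assert(nr_buckets != 0);
	return (double)nr_addrs / nr_buckets;
}

/* Print the estimate for a few candidate table sizes. */
void print_chain_estimates(unsigned int nr_addrs)
{
	static const unsigned int sizes[] = { 256, 1024, 4096 };
	unsigned int i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("%u addrs, %u buckets: %.2f entries/chain\n",
		       nr_addrs, sizes[i], avg_chain_len(nr_addrs, sizes[i]));
}
```

Under that assumption, 700 addresses in 256 buckets come to roughly 2.7
entries per chain, while 1024 buckets bring the average below one.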

Add CONFIG_INET_ADDR_HASH_BUCKETS (default 256, range 64-16384, EXPERT)
so hosts with many addresses can size the table appropriately. The value
is rounded up to the next power of 2 at compile time via
order_base_2(). The memory cost is one hlist_head (a single pointer) per
bucket per network namespace.
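
To make the rounding and memory cost concrete, here is a minimal userspace
sketch. order_base_2_sketch() is a hypothetical stand-in that mimics the
ceil(log2(n)) behaviour of the kernel's order_base_2(); it is not the
kernel macro itself. table_bytes() assumes each bucket costs exactly one
pointer, as stated above.

```c
#include <stddef.h>

/* Smallest order such that (1 << order) >= n; 0 for n <= 1. */
unsigned int order_base_2_sketch(unsigned long n)
{
	unsigned int order = 0;

	while ((1UL << order) < n)
		order++;
	return order;
}

/* Bucket count actually used for a configured value. */
unsigned long rounded_buckets(unsigned long configured)
{
	return 1UL << order_base_2_sketch(configured);
}

/* Table size in bytes per namespace, assuming one pointer per bucket. */
size_t table_bytes(unsigned long configured)
{
	return rounded_buckets(configured) * sizeof(void *);
}
```

With these helpers, a configured value of 700 ends up as 1024 buckets, and
the maximum of 16384 buckets costs 16384 pointers per namespace.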
Reported-by: Ivan Babrou <[email protected]>
Signed-off-by: Jesper Dangaard Brouer <[email protected]>
---
net/ipv4/Kconfig | 16 ++++++++++++++++
net/ipv4/devinet.c | 2 +-
2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index df922f9f5289..3c5e5e74b3e4 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -402,6 +402,22 @@ config INET_IPCOMP
If unsure, say Y.
+config INET_ADDR_HASH_BUCKETS
+ int "IPv4 address hash table size" if EXPERT
+ range 64 16384
+ default 256
+ help
+ Number of hash buckets for looking up local IPv4 addresses,
+ e.g. during route output to validate the source address via
+ __ip_dev_find(). Rounded up to the next power of 2.
+
+ Hosts with many IPv4 addresses benefit from a larger table to reduce
+ hash chain lengths. This is particularly relevant when sending using
+ unconnected UDP sockets.
+
+ The default of 256 is fine for most systems. A value of 1024
+ suits hosts with ~500+ addresses.
+
config INET_TABLE_PERTURB_ORDER
int "INET: Source port perturbation table size (as power of 2)" if EXPERT
default 16
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 58fe7cb69545..9e3da06fb618 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -108,7 +108,7 @@ static const struct nla_policy ifa_ipv4_policy[IFA_MAX+1] = {
[IFA_PROTO] = { .type = NLA_U8 },
};
-#define IN4_ADDR_HSIZE_SHIFT 8
+#define IN4_ADDR_HSIZE_SHIFT order_base_2(CONFIG_INET_ADDR_HASH_BUCKETS)
#define IN4_ADDR_HSIZE (1U << IN4_ADDR_HSIZE_SHIFT)
static u32 inet_addr_hash(const struct net *net, __be32 addr)
--
2.43.0