This series attempts to improve xfrm policy lookup performance when many
(several hundred or even thousands of) inexact policies exist on a system.
On insert, a policy is either placed in the hash table (all direct
policies: /32 for ipv4, /128 for ipv6, plus all policies matching a
user-configured prefix-length threshold).  All other policies get inserted
into the inexact list in priority order.  Lookup then scans the inexact
list for the first matching entry.

This series instead makes each inexact policy go into exactly one of four
different search list classes:

1. The "any:any" list, containing policies where both saddr and daddr are
   wildcards or have very coarse prefixes, e.g. 10.0.0.0/8 and the like.

2. The "saddr:any" list, containing policies with a fixed saddr/prefixlen,
   but without destination restrictions.  These lists are stored in rbtree
   nodes; each node contains those policies matching saddr/prefixlen.

3. The "any:daddr" list.  Similar to 2), except for policies where only
   the destinations are specified.

4. The "saddr:daddr" lists, containing policies that match the given
   source/destination network.  The root of the saddr/daddr tree is
   stored in the nodes of the 'daddr' tree.

This diagram illustrates the list classes, and their placement in the
lookup hierarchy:

   xfrm_pol_inexact_bin = hash(dir,type,family,if_id);
    |
    +---- root_d: sorted by daddr:prefix
    |      |
    |      xfrm_pol_inexact_node
    |           |
    |           +- root: sorted by saddr/prefix
    |           |    |
    |           |    xfrm_pol_inexact_node
    |           |         |
    |           |         + root: unused
    |           |         |
    |           |         + hhead: saddr:daddr policies
    |           |
    |           +- coarse policies and all any:daddr policies
    |
    +---- root_s: sorted by saddr:prefix
    |      |
    |      xfrm_pol_inexact_node
    |           |
    |           + root: unused
    |           |
    |           + hhead: saddr:any policies
    |
    +---- coarse policies and all any:any policies

First we obtain the inexact bin by hashing direction, family and the
other 'must match' criteria.  Next, we search root_d using the packet's
destination address.  If we find a matching node, we search that node's
source address tree using the packet's source address.  We also search
the bin's root_s tree using the packet's source address.
This initial lookup creates a 'candidate set' consisting of up to four
hlist heads, one per search class.  Each list gets scanned for the first
matching entry.  Out of those, the one with the highest priority is
chosen.  In case several policies have the same priority, the most
recent one is used, to match old behaviour.

Locking rules:
Insert requires the net->xfrm.xfrm_policy_lock spinlock held with BH off
(no change from the current kernel).  Insert/removal of a tree node
needs to increment its sequence count.
Lookup requires the rcu read lock; lookups in a tree additionally need
to sample the sequence count of the bin the tree is located in, and
retry in case it has changed.

Results on my kvm test setup, using 4 netns:

# 1.2   1.1    3.1   3.10      2.1   2.2
# eth1  eth1   veth0 veth0     eth1  eth1
# ns1 ---- ns3 ----- ns4 ---- ns2

ns3 and ns4 are connected via an ipsec tunnel.  'Dummy policies' below
means that I created 1000 non-matching policies in both ns3 and ns4.

20 parallel netperf (mbits, unidirectional TCP_STREAM):
normal:  993.518 (no dummy policies)  796.320 (with dummy policies)
patched: 991.335 (no dummy policies)  989.684 (with dummy policies)

20 parallel TCP_RR, transactions per second:
normal:  32413.93 (no dummy policies)  17725.97 (with dummy policies)
patched: 32314.51 (no dummy policies)  32326.19 (with dummy policies)

Caveats:

1. This won't help unless a large part of the policies have no common
   saddr/daddr pairs.  Extreme example: adding 10k
   0.0.0.0/0 -> 0.0.0.0/0 policies may require a search of all
   policies, just like the current kernel.

2. Consider the policy add sequence
      a  daddr 10.0.0.0/26
      b  daddr 10.0.0.64/26
      c  daddr 10.0.0.0/24
   When c gets added, the search nodes for a and b have to be merged
   into the new node, as c's subnet contains both.  This also means
   that a single policy with a small subnet value can cause severe tree
   degradation and thus longer search lists.

3. Causes a (small) performance degradation when the setup has only a
   few policies.

4. Policies need an additional hlist, because MIGRATE doesn't consider
   if_id, while the patchset makes if_id part of the initial hash
   lookup.  This could be avoided by not considering if_id in the
   initial bin lookup, but that increases list lengths a lot when there
   are many xfrm interfaces.

5. In order to solve the 'two matching candidates have the same
   priority, which takes precedence?' problem, I added a number to
   struct xfrm_policy that maps to its index in the complete inexact
   list.  This is necessary to make sure the same sequence of
   policy-add commands keeps returning the same lookup result as older
   kernels.

This work doesn't prevent us from re-adding a flow caching scheme, but
so far I have found no good solution (all have various DoS or
degradation issues).

Comments or questions welcome.

Florian Westphal (11):
  selftests: add xfrm policy test script
  xfrm: security: iterate all, not inexact lists
  xfrm: policy: split list insertion into a helper
  xfrm: policy: return NULL when inexact search needed
  xfrm: policy: store inexact policies in an rhashtable
  xfrm: policy: consider if_id when hashing inexact policy
  xfrm: policy: add inexact policy search tree infrastructure
  xfrm: policy: store inexact policies in a tree ordered by destination address
  xfrm: policy: check reinserted policies match their node
  xfrm: policy: store inexact policies in a tree ordered by source address
  xfrm: policy: add 2nd-level saddr trees for inexact policies

 include/net/netns/xfrm.h                   |    2 
 include/net/xfrm.h                         |    3 
 net/xfrm/xfrm_policy.c                     | 1234 ++++++++++++++++++++++++++---
 tools/testing/selftests/net/Makefile       |    3 
 tools/testing/selftests/net/xfrm_policy.sh |  276 ++++++
 5 files changed, 1391 insertions(+), 127 deletions(-)