On Tue, 31 Mar 2026 23:41:12 +0200 Maxime Leroy <[email protected]> wrote:
> This RFC proposes an optional shared tbl8 pool for FIB/FIB6,
> to address the difficulty of sizing num_tbl8 upfront.
>
> In practice, tbl8 usage depends on prefix distribution and
> evolves over time. In multi-VRF environments, some VRFs are
> elephants (full table, thousands of tbl8 groups) while others
> consume very little (mostly /24 or shorter). Per-FIB sizing
> forces each instance to provision for its worst case, leading
> to significant memory waste.
>
> A shared pool solves this: all FIBs draw from the same tbl8
> memory, so elephant VRFs use what they need while light VRFs
> cost almost nothing. The sharing granularity is flexible: one pool per
> VRF, per address family, a global pool, or no sharing at all.
>
> This series adds:
>
> - A shared tbl8 pool, replacing per-backend allocation
>   (bitmap in dir24_8, stack in trie) with a common
>   refcounted O(1) stack allocator.
> - An optional resizable mode (grow via alloc + copy + QSBR
>   synchronize), removing the need to guess peak usage at
>   creation time.
> - A stats API (rte_fib_tbl8_pool_get_stats()) exposing
>   used/total/max counters.
>
> All features are opt-in:
>
> - Existing per-FIB allocation remains the default.
> - Shared pool is enabled via the tbl8_pool config field.
> - Resize is enabled by setting max_tbl8 > 0 with QSBR.
>
> Shrinking (reducing pool capacity after usage drops) is not
> part of this series. It would always be best-effort since
> there is no compaction: if any tbl8 group near the end of the
> pool is still in use, the pool cannot shrink. The current LIFO
> free-list makes this less likely by immediately reusing freed
> high indices, which prevents a contiguous free tail from
> forming. A different allocation strategy (e.g. a min-heap
> favoring low indices) could improve shrink opportunities, but
> is better addressed separately.
>
> A working integration in Grout is available:
> https://github.com/DPDK/grout/pull/581 (still a draft)
>
> Maxime Leroy (5):
>   test/fib6: zero-initialize config struct
>   fib: share tbl8 definitions between fib and fib6
>   fib: add shared tbl8 pool
>   fib: add resizable tbl8 pool
>   fib: add tbl8 pool stats API
>
>  app/test/test_fib6.c        |  10 +-
>  lib/fib/dir24_8.c           | 234 ++++++++++---------------
>  lib/fib/dir24_8.h           |  17 +-
>  lib/fib/fib_tbl8.h          |  50 ++++++
>  lib/fib/fib_tbl8_pool.c     | 337 ++++++++++++++++++++++++++++++++++++
>  lib/fib/fib_tbl8_pool.h     | 113 ++++++++++++
>  lib/fib/meson.build         |   5 +-
>  lib/fib/rte_fib.h           |   3 +
>  lib/fib/rte_fib6.h          |   3 +
>  lib/fib/rte_fib_tbl8_pool.h | 149 ++++++++++++++++
>  lib/fib/trie.c              | 230 +++++++++---------------
>  lib/fib/trie.h              |  15 +-
>  12 files changed, 844 insertions(+), 322 deletions(-)
>  create mode 100644 lib/fib/fib_tbl8.h
>  create mode 100644 lib/fib/fib_tbl8_pool.c
>  create mode 100644 lib/fib/fib_tbl8_pool.h
>  create mode 100644 lib/fib/rte_fib_tbl8_pool.h
>

Brief AI review

Review of [RFC 0/5] fib: shared and resizable tbl8 pool

Good series overall. The motivation for shared tbl8 pools in
multi-VRF environments is clear and the cover letter is
well-written. A few issues below, mostly in the resize path.

Patch 4/5: fib: add resizable tbl8 pool
--------------------------------------------

Error: Uses C11 <stdatomic.h> directly instead of DPDK atomic
wrappers. New DPDK code must use rte_atomic_thread_fence() with
rte_memory_order_* constants, not C11 atomic_thread_fence() with
memory_order_*.

In fib_tbl8_pool.c:

    #include <stdatomic.h>
    ...
    atomic_thread_fence(memory_order_release);

Should be:

    #include <rte_stdatomic.h>
    ...
    rte_atomic_thread_fence(rte_memory_order_release);

Warning: The plain store to consumer tbl8 pointers during resize
(*c->tbl8_ptr = new_tbl8) and the data-plane readers' plain load
of dp->tbl8 in the lookup functions have no acquire/release
annotation.
This works today because the RCU synchronize prevents
use-after-free of the old array, and both old and new arrays
contain identical data during the transition. However, the
release fence before pool->tbl8 = new_tbl8 does not cover the
subsequent consumer pointer stores. Consider using
rte_atomic_store_explicit() with release ordering on the consumer
pointer stores, or at minimum adding a comment explaining why
plain stores are safe here.

Warning: rte_fib_tbl8_pool_resize() is declared in the public
header and exported, but it is also called automatically from
fib_tbl8_pool_alloc() as an internal fallback. Having an
auto-resize path that calls rte_rcu_qsbr_synchronize() means a
route add can block for an unbounded time waiting for all reader
threads to go quiescent. This should be documented prominently,
or the resize should be separated from the alloc path so the
caller can control when blocking is acceptable.

Patch 3/5: fib: add shared tbl8 pool
--------------------------------------------

Warning: The rte_fib_tbl8_pool struct and the free_list array are
allocated with rte_zmalloc_socket but are only used on the control
path. Standard calloc/malloc would avoid consuming limited
hugepage memory. The tbl8 data array is correctly allocated with
rte_zmalloc_socket since it is accessed in the data plane.

Warning: install_to_fib() in dir24_8.c has an error path that
calls fib_tbl8_pool_cleanup_and_free() to return tbl8_idx when
tmp_tbl8_idx allocation fails:

    } else if (tmp_tbl8_idx < 0) {
        fib_tbl8_pool_cleanup_and_free(dp->pool, tbl8_idx);
        return -ENOSPC;
    }

This is correct (cleans the initialized tbl8 group before
returning it), but note this is a behavior change from the
previous patch in the series where tbl8_put() was used without
cleanup. The change is an improvement but should be mentioned in
the commit message since it affects error-path semantics.
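
To make the install_to_fib() error path concrete, here is a toy,
self-contained sketch of the pattern. It is not the code from the
patch: every toy_* name, the fixed capacity, and the two-group
install are hypothetical and exist only for illustration. It
assumes a LIFO free-list allocator in the spirit of the cover
letter and shows why the first group is wiped before it goes back
on the free list when the second allocation fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_TBL8_ENTRIES 256 /* entries per tbl8 group */
#define TOY_MAX_GROUPS   4   /* hypothetical toy capacity */

/* Hypothetical stand-in for a shared tbl8 pool. */
struct toy_pool {
	uint64_t tbl8[TOY_MAX_GROUPS][TOY_TBL8_ENTRIES];
	uint32_t free_list[TOY_MAX_GROUPS]; /* LIFO stack of free indices */
	uint32_t nb_free;
	uint32_t nb_groups;
};

static void toy_pool_init(struct toy_pool *p, uint32_t nb_groups)
{
	assert(nb_groups <= TOY_MAX_GROUPS);
	memset(p, 0, sizeof(*p));
	p->nb_groups = nb_groups;
	/* Fill the stack so that index 0 is handed out first. */
	for (uint32_t i = 0; i < nb_groups; i++)
		p->free_list[i] = nb_groups - 1 - i;
	p->nb_free = nb_groups;
}

/* O(1) pop; returns a group index or -ENOSPC when the pool is empty. */
static int32_t toy_pool_get(struct toy_pool *p)
{
	if (p->nb_free == 0)
		return -ENOSPC;
	return (int32_t)p->free_list[--p->nb_free];
}

/* Zero the group, then push it back (O(1)). */
static void toy_pool_cleanup_and_free(struct toy_pool *p, uint32_t idx)
{
	assert(idx < p->nb_groups);
	memset(p->tbl8[idx], 0, sizeof(p->tbl8[idx]));
	p->free_list[p->nb_free++] = idx;
}

/*
 * Toy install that needs two groups. If the second allocation
 * fails, the already initialized first group is cleaned and
 * returned, so no stale next-hop data stays in the pool.
 */
static int toy_install(struct toy_pool *p, uint64_t nh)
{
	int32_t tbl8_idx = toy_pool_get(p);
	if (tbl8_idx < 0)
		return -ENOSPC;
	for (uint32_t i = 0; i < TOY_TBL8_ENTRIES; i++)
		p->tbl8[tbl8_idx][i] = nh;

	int32_t tmp_tbl8_idx = toy_pool_get(p);
	if (tmp_tbl8_idx < 0) {
		toy_pool_cleanup_and_free(p, (uint32_t)tbl8_idx);
		return -ENOSPC;
	}
	for (uint32_t i = 0; i < TOY_TBL8_ENTRIES; i++)
		p->tbl8[tmp_tbl8_idx][i] = nh;
	return 0;
}
```

With a plain put (no cleanup) the failed install would leave the
first group populated while sitting on the free list, and the next
user of that index would inherit its contents unless every
allocation path re-initializes the group.
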

Patches 3/5, 4/5, 5/5: New public API
--------------------------------------------

Warning: Five new public API functions are added across these
patches (rte_fib_tbl8_pool_create, rte_fib_tbl8_pool_free,
rte_fib_tbl8_pool_rcu_qsbr_add, rte_fib_tbl8_pool_resize,
rte_fib_tbl8_pool_get_stats) but no tests are added. New APIs
need test coverage, at minimum exercising:

- create/free lifecycle
- shared pool between two FIB instances
- resize with RCU configured
- stats accuracy after alloc/free cycles

Warning: No release notes for the new APIs and features. These
will be needed before the series moves past RFC.

Reviewed-by: Stephen Hemminger <[email protected]>

