Thanks for sharing this suggestion.
We actually evaluated the heap-based approach before implementing this patch.
It can help in some scenarios, but unfortunately it does not fully solve our
use cases. Specifically:
1. **Heap count / scalability**
Our application maintains at least ~200 rte_acl_ctx instances (due to the
total rule count and multi-tenant isolation). Allowing a dedicated heap per
context would exceed the practical limits of the current rte_malloc heap
model. The number of heaps that can be created is not unlimited, and
maintaining hundreds of separate heaps would introduce considerable
management overhead.
2. **Temporary allocations in build stage**
During `rte_acl_build`, a significant portion of memory is allocated through
`calloc()` for internal temporary structures. These allocations are freed
right after the build completes. Even if runtime memory could come from a
custom heap, these temporary allocations would still need an independent
allocator or callback mechanism to avoid fragmentation and repeated
malloc/free cycles.
The goal of this patch is to provide allocator callbacks so that applications
can apply their own memory model consistently (a static region for runtime
and a resettable pool for build) without relying on uncontrolled internal
allocations.
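
To make the "resettable pool for build" idea concrete, here is a minimal
sketch of a bump allocator that could back such callbacks. Everything in it
is an assumption for illustration: `struct bump_arena`, `arena_alloc()`,
`arena_reset()` and `arena_demo()` are hypothetical names, and the signatures
are not the callback prototypes from the patch. On exhaustion it jumps with
`siglongjmp`, loosely following the failure handling described for
`temp_alloc`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena over a pre-allocated region (not the patch's API). */
struct bump_arena {
	uint8_t *base;    /* start of the pre-allocated region */
	size_t size;      /* total bytes in the region */
	size_t used;      /* bytes handed out since the last reset */
	sigjmp_buf *fail; /* jump target on exhaustion, or NULL */
};

/* Hand out len bytes aligned to align (must be a power of two). */
static void *
arena_alloc(size_t len, size_t align, void *ctx)
{
	struct bump_arena *a = ctx;
	uintptr_t cur = (uintptr_t)a->base + a->used;
	uintptr_t aligned = (cur + align - 1) & ~((uintptr_t)align - 1);
	size_t start = (size_t)(aligned - (uintptr_t)a->base);

	if (start > a->size || len > a->size - start) {
		if (a->fail != NULL)
			siglongjmp(*a->fail, 1); /* assumed failure path */
		return NULL;
	}
	a->used = start + len;
	return a->base + start;
}

/* Release every allocation at once; the region itself is kept. */
static void
arena_reset(void *ctx)
{
	struct bump_arena *a = ctx;

	a->used = 0;
}

/* Returns 1 if exhaustion was reported via siglongjmp and reset worked. */
static int
arena_demo(void)
{
	static _Alignas(64) uint8_t buf[256];
	static struct bump_arena a; /* static: survives the jump intact */
	sigjmp_buf env;

	a.base = buf;
	a.size = sizeof(buf);
	a.used = 0;
	a.fail = &env;

	if (sigsetjmp(env, 0) != 0) {
		/* Reached only through siglongjmp from arena_alloc(). */
		arena_reset(&a);
		return (a.used == 0 && arena_alloc(16, 8, &a) == buf) ? 1 : -1;
	}
	arena_alloc(100, 8, &a);
	arena_alloc(100, 8, &a);
	arena_alloc(100, 8, &a); /* does not fit in 256 bytes: jumps */
	return 0;
}
```

In this sketch a build would call `arena_alloc()` many times and
`arena_reset()` once at the end, so no individual free is needed and the
region can be reused for the next context.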
On 11/26/2025 2:01 AM, Dmitry Kozlyuk wrote:
On 11/25/25 15:14, mannywang(王永峰) wrote:
Reduce memory fragmentation caused by dynamic memory allocations
by allowing users to provide custom memory allocator.
Add new members to struct rte_acl_config to allow passing custom
allocator callbacks to rte_acl_build:
- running_alloc: allocator callback for run-time internal memory
- running_free: free callback for run-time internal memory
- running_ctx: user-defined context passed to running_alloc/free
- temp_alloc: allocator callback for temporary memory during ACL build
- temp_reset: reset callback for temporary allocator
- temp_ctx: user-defined context passed to temp_alloc/reset
These callbacks allow users to provide their own memory pools or
allocators for both persistent runtime structures and temporary
build-time data.
A typical approach is to pre-allocate a static memory region
for rte_acl_ctx, and to provide a global temporary memory manager
that supports multiple allocations and a single reset during ACL build.
Since tb_mem_pool handles allocation failures using siglongjmp,
temp_alloc follows the same approach for failure handling.
If a static memory region would suffice for runtime memory,
could you have solved the issue using existing API as follows?
1. Allocate memory in any way, may even use `rte_malloc_*()`.
2. Create a new heap using `rte_malloc_heap_create()`.
3. Attach the memory to the heap using `rte_malloc_heap_memory_add()`.
4. Get the heap "socket ID" using `rte_malloc_heap_get_socket()`.
5. Pass the heap "socket ID" to `rte_acl_create()`.
In https://inbox.dpdk.org/dev/[email protected]/
you said that the issue is runtime memory fragmentation,
but also did "propose extending the ACL API to support
external memory buffers for the build process".
What is the issue with build-time allocations?