> From: Bruce Richardson [mailto:[email protected]]
> Sent: Thursday, 19 March 2026 10.41
> 
> On Thu, Mar 19, 2026 at 09:13:00AM +0000, Morten Brørup wrote:
> > The descriptions for the mempool creation functions contained advice
> > for choosing the optimum (in terms of memory usage) number of elements
> > and cache size.
> > The advice was based on implementation details, which were changed long
> > ago, making the advice completely irrelevant.
> >
> >
> 
> The comment is still correct in most cases, since the default backing
> storage remains an rte_ring. If passing a power-of-2 size to mempool
> create, one will get a backing rte_ring store which is twice as large
> as requested, leading to lots of ring slots being wasted. For example,
> for a pool with 16k elements, the actual ring size allocated will be
> 32k, wasting 128k of RAM, and potentially cache too. The latter will
> occur because the ring iterates through all mempool/ring entries,
> meaning that even if only 16k of the 32k slots will ever be used, all
> 32k slots will be passed through the CPU cache if it works on the
> mempool directly and not just from the per-core cache.

You are right about the waste of memory in the ring driver. And good point 
about the CPU cache!
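To put numbers on it, here is a minimal sketch of the sizing rule, assuming the default ring driver allocates the next power of two >= n + 1 slots (one slot must stay empty) with 8-byte object pointers. This is illustrative arithmetic, not DPDK code, and the helper names are mine:

```python
def next_pow2(x: int) -> int:
    """Smallest power of two >= x."""
    p = 1
    while p < x:
        p <<= 1
    return p

def ring_slots(n: int) -> int:
    # Assumption: a ring with 2^k slots holds at most 2^k - 1 entries,
    # so n usable entries need next_pow2(n + 1) slots.
    return next_pow2(n + 1)

n = 16 * 1024                    # 16k mempool elements
slots = ring_slots(n)
wasted = (slots - n) * 8         # 8-byte pointer per unused slot
print(slots, wasted)             # 32768 slots, 131072 B (128 KB) wasted
```

So a power-of-2 element count is exactly the worst case: the ring doubles, and half its pointer slots go unused.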

However, only pointer entries (8 bytes each) are wasted, not object 
entries (which are much larger). This is not 100 % clear from the advice.

Furthermore, with 16k mbufs of 2368 bytes each, the mempool itself consumes 37 
MB worth of memory, so do we really care about wasting 128 KB?

IMHO, removing the advice improves the quality of the documentation.
I don't think a detail about saving 0.3 % of the memory used by the mempool 
should be presented so prominently in the documentation.

Obviously, the memory waste percentage for a mempool holding smaller objects 
will be larger.
With 64-byte objects (+ a 64-byte object header), the waste is 8 bytes in the ring 
driver per 128 bytes in the mempool itself, ca. 6 %.
Still relatively small.
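The percentages above can be checked with simple arithmetic; a quick sketch using the sizes stated in this mail (the helper name is mine):

```python
def waste_pct(n_objs: int, obj_bytes: int, extra_ptrs: int, ptr_bytes: int = 8):
    """Return (pool size, ring waste, waste as a percentage of the pool)."""
    pool = n_objs * obj_bytes        # memory held by the objects themselves
    waste = extra_ptrs * ptr_bytes   # unused pointer slots in the ring
    return pool, waste, 100.0 * waste / pool

# 16k mbufs of 2368 bytes each: a 37 MB pool vs. 128 KB of wasted ring slots
pool, waste, pct = waste_pct(16 * 1024, 2368, 16 * 1024)
print(pool // (1 << 20), "MB pool,", waste // 1024, "KB waste,", round(pct, 2), "%")
# -> 37 MB pool, 128 KB waste, 0.34 %

# Small-object case: 8 wasted bytes per 128-byte object (64 B object + 64 B header)
print(100.0 * 8 / (64 + 64), "%")  # -> 6.25 %
```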

Alternatively, we could keep the advice and change the wording. Something like:

If the mempool uses a ring driver, the optimum size (in terms of memory) is 
when n is a power of two minus one, n = (2^q - 1); otherwise the ring's array 
of pointers to the objects will be larger, as its size is rounded up to the next 
power of 2.

It's difficult to keep the advice short, if we add the fine print to it.
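Under the same assumption as above (the ring needs n + 1 slots, rounded up to a power of two), the difference between n = 2^q and n = 2^q - 1 is easy to show; a hypothetical sketch:

```python
def slots_for(n: int) -> int:
    """Smallest power of two >= n + 1 (assumes one ring slot stays empty)."""
    p = 1
    while p < n + 1:
        p <<= 1
    return p

print(slots_for(16 * 1024))      # 32768: a power-of-two n doubles the ring
print(slots_for(16 * 1024 - 1))  # 16384: n = 2^q - 1 fits exactly
```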

But then the advice is still missing for all the other mempool drivers using 
nonlinear memory allocation.
I'm not sure, but I think the "bucket" driver also uses nonlinear memory 
allocation.

Another alternative:
The mempool drivers or underlying libraries could log a debug message when 
allocating an oversized array for alignment reasons.

> 
> On the other hand, I'd be in favour of removing this text if we
> switched the default mempool in DPDK to being stack-based. While the
> stack may not be lock-free like the ring, with per-lcore caches the
> number of accesses to the stack should be small, and it gives much
> better cache utilization overall - especially in cases where buffers
> are allocated on one core and freed on a different one! Even in cases
> where we are not transferring between cores, in a single-core case we
> still will get better reuse of "hot" buffers than in an rte_ring-backed
> case.

<jokingly>
The lock-free stack driver's reference to each object uses 2 pointers, so 
there is guaranteed memory waste here.
</jokingly>

Let's discuss changing the default mempool driver some other day, and focus on 
the documentation for now.

I get your point, though.
It makes sense that the documentation for a library considers the underlying 
default implementation.

But if it does, the advice should come with a disclaimer mentioning when it 
applies.
