On Fri, Apr 25, 2025 at 10:27:23AM +0200, Vlastimil Babka wrote:
> Add functions for efficient guaranteed allocations e.g. in a critical
> section that cannot sleep, when the exact number of allocations is not
> known beforehand, but an upper limit can be calculated.
>
> kmem_cache_prefill_sheaf() returns a sheaf containing at least given
> number of objects.
>
> kmem_cache_alloc_from_sheaf() will allocate an object from the sheaf
> and is guaranteed not to fail until depleted.
>
> kmem_cache_return_sheaf() is for giving the sheaf back to the slab
> allocator after the critical section. This will also attempt to refill
> it to cache's sheaf capacity for better efficiency of sheaves handling,
> but it's not strictly necessary to succeed.
>
> kmem_cache_refill_sheaf() can be used to refill a previously obtained
> sheaf to requested size. If the current size is sufficient, it does
> nothing. If the requested size exceeds cache's sheaf_capacity and the
> sheaf's current capacity, the sheaf will be replaced with a new one,
> hence the indirect pointer parameter.
>
> kmem_cache_sheaf_size() can be used to query the current size.
>
> The implementation supports requesting sizes that exceed cache's
> sheaf_capacity, but it is not efficient - such "oversize" sheaves are
> allocated fresh in kmem_cache_prefill_sheaf() and flushed and freed
> immediately by kmem_cache_return_sheaf(). kmem_cache_refill_sheaf()
> might be especially ineffective when replacing a sheaf with a new one of
> a larger capacity. It is therefore better to size cache's
> sheaf_capacity accordingly to make oversize sheaves exceptional.
>
> CONFIG_SLUB_STATS counters are added for sheaf prefill and return
> operations. A prefill or return is considered _fast when it is able to
> grab or return a percpu spare sheaf (even if the sheaf needs a refill to
> satisfy the request, as those should amortize over time), and _slow
> otherwise (when the barn or even sheaf allocation/freeing has to be
> involved). sheaf_prefill_oversize is provided to determine how many
> prefills were oversize (counter for oversize returns is not necessary as
> all oversize refills result in oversize returns).
>
> When slub_debug is enabled for a cache with sheaves, no percpu sheaves
> exist for it, but the prefill functionality is still provided simply by
> all prefilled sheaves becoming oversize. If percpu sheaves are not
> created for a cache due to not passing the sheaf_capacity argument on
> cache creation, the prefills also work through oversize sheaves, but
> there's a WARN_ON_ONCE() to indicate the omission.
>
> Signed-off-by: Vlastimil Babka <[email protected]>
> Reviewed-by: Suren Baghdasaryan <[email protected]>
> ---

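For anyone following along, the calling pattern described above can be
sketched in plain userspace C. This is only a model, not the kernel
implementation: every mock_* name is a hypothetical stand-in, malloc()
plays the role of the slab allocator, and the return step simply flushes
and frees instead of trying to hand the sheaf back to a percpu spare or
the barn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct slab_sheaf; not the kernel layout. */
struct mock_sheaf {
	unsigned int size;	/* objects currently held */
	unsigned int capacity;	/* slots available */
	void *objects[];	/* preallocated objects */
};

/* Models kmem_cache_prefill_sheaf(): may fail, so call it before the
 * critical section, where sleeping and failure are still acceptable. */
struct mock_sheaf *mock_prefill_sheaf(size_t obj_size, unsigned int count)
{
	struct mock_sheaf *sheaf;

	sheaf = malloc(sizeof(*sheaf) + count * sizeof(void *));
	if (!sheaf)
		return NULL;

	sheaf->size = 0;
	sheaf->capacity = count;
	while (sheaf->size < count) {
		void *obj = malloc(obj_size);

		if (!obj)
			break;
		sheaf->objects[sheaf->size++] = obj;
	}
	if (sheaf->size < count) {
		/* all or nothing: undo a partial fill */
		while (sheaf->size)
			free(sheaf->objects[--sheaf->size]);
		free(sheaf);
		return NULL;
	}
	return sheaf;
}

/* Models kmem_cache_alloc_from_sheaf(): cannot fail until depleted. */
void *mock_alloc_from_sheaf(struct mock_sheaf *sheaf)
{
	if (sheaf->size == 0)
		return NULL;
	return sheaf->objects[--sheaf->size];
}

/* Models kmem_cache_sheaf_size(). */
unsigned int mock_sheaf_size(const struct mock_sheaf *sheaf)
{
	return sheaf->size;
}

/* Models kmem_cache_return_sheaf(); this mock only flushes and frees. */
void mock_return_sheaf(struct mock_sheaf *sheaf)
{
	while (sheaf->size)
		free(sheaf->objects[--sheaf->size]);
	free(sheaf);
}

/* Hypothetical caller: at most 'upper' objects may be needed, 'used' of
 * them are actually taken. Returns the leftover count, or -1 if the
 * prefill failed. */
int mock_critical_section(unsigned int upper, unsigned int used)
{
	struct mock_sheaf *sheaf = mock_prefill_sheaf(64, upper);
	unsigned int i, leftover;

	if (!sheaf)
		return -1;

	for (i = 0; i < used; i++) {
		void *obj = mock_alloc_from_sheaf(sheaf);

		assert(obj);	/* guaranteed while used <= upper */
		free(obj);
	}

	leftover = mock_sheaf_size(sheaf);
	mock_return_sheaf(sheaf);
	return (int)leftover;
}
```

The point of the pattern is that the only step allowed to fail is the
prefill, which happens before the critical section; unused objects go
back with the sheaf afterwards.
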
Looks good to me,
Reviewed-by: Harry Yoo <[email protected]>

with a nit below.

> +/*
> + * Use this to return a sheaf obtained by kmem_cache_prefill_sheaf()
> + *
> + * If the sheaf cannot simply become the percpu spare sheaf, but there's space
> + * for a full sheaf in the barn, we try to refill the sheaf back to the cache's
> + * sheaf_capacity to avoid handling partially full sheaves.
> + *
> + * If the refill fails because gfp is e.g. GFP_NOWAIT, or the barn is full, the
> + * sheaf is instead flushed and freed.
> + */
> +void kmem_cache_return_sheaf(struct kmem_cache *s, gfp_t gfp,
> +			     struct slab_sheaf *sheaf)
> +{
> +	struct slub_percpu_sheaves *pcs;
> +	bool refill = false;
> +	struct node_barn *barn;
> +
> +	if (unlikely(sheaf->capacity != s->sheaf_capacity)) {
> +		sheaf_flush_unused(s, sheaf);
> +		kfree(sheaf);
> +		return;
> +	}
> +
> +	local_lock(&s->cpu_sheaves->lock);
> +	pcs = this_cpu_ptr(s->cpu_sheaves);
> +
> +	if (!pcs->spare) {
> +		pcs->spare = sheaf;
> +		sheaf = NULL;
> +		stat(s, SHEAF_RETURN_FAST);
> +	} else if (data_race(pcs->barn->nr_full) < MAX_FULL_SHEAVES) {
> +		barn = pcs->barn;
> +		refill = true;
> +	}
> +
> +	local_unlock(&s->cpu_sheaves->lock);
> +
> +	if (!sheaf)
> +		return;
> +
> +	stat(s, SHEAF_RETURN_SLOW);
> +
> +	/*
> +	 * if the barn is full of full sheaves or we fail to refill the sheaf,
> +	 * simply flush and free it
> +	 */
> +	if (!refill || refill_sheaf(s, sheaf, gfp)) {
> +		sheaf_flush_unused(s, sheaf);
> +		free_empty_sheaf(s, sheaf);
> +		return;
> +	}
> +
> +	/* we racily determined the sheaf would fit, so now force it */
> +	barn_put_full_sheaf(barn, sheaf);
> +	stat(s, BARN_PUT);
> +}

nit: as accessing pcs->barn outside local_lock is safe (it does not go away
until the cache is destroyed...), this could be simplified a little bit:

diff --git a/mm/slub.c b/mm/slub.c
index 2bf83e2b85b2..4e1daba4d13e 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5043,7 +5043,6 @@ void kmem_cache_return_sheaf(struct kmem_cache *s, gfp_t gfp,
 			     struct slab_sheaf *sheaf)
 {
 	struct slub_percpu_sheaves *pcs;
-	bool refill = false;
 	struct node_barn *barn;
 
 	if (unlikely(sheaf->capacity != s->sheaf_capacity)) {
@@ -5059,9 +5058,6 @@ void kmem_cache_return_sheaf(struct kmem_cache *s, gfp_t gfp,
 		pcs->spare = sheaf;
 		sheaf = NULL;
 		stat(s, SHEAF_RETURN_FAST);
-	} else if (data_race(pcs->barn->nr_full) < MAX_FULL_SHEAVES) {
-		barn = pcs->barn;
-		refill = true;
 	}
 
 	local_unlock(&s->cpu_sheaves->lock);
@@ -5071,17 +5067,20 @@ void kmem_cache_return_sheaf(struct kmem_cache *s, gfp_t gfp,
 
 	stat(s, SHEAF_RETURN_SLOW);
 
+	/* Accessing pcs->barn outside local_lock is safe */
+	barn = pcs->barn;
+
 	/*
 	 * if the barn is full of full sheaves or we fail to refill the sheaf,
 	 * simply flush and free it
 	 */
-	if (!refill || refill_sheaf(s, sheaf, gfp)) {
+	if (data_race(barn->nr_full) >= MAX_FULL_SHEAVES ||
+	    refill_sheaf(s, sheaf, gfp)) {
 		sheaf_flush_unused(s, sheaf);
 		free_empty_sheaf(s, sheaf);
 		return;
 	}
 
-	/* we racily determined the sheaf would fit, so now force it */
 	barn_put_full_sheaf(barn, sheaf);
 	stat(s, BARN_PUT);
 }

-- 
Cheers,
Harry / Hyeonggon
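
P.S. The decision order in the simplified return path can be written
down as a small userspace model, which makes it easier to check that
each branch ends where it should. This is a sketch under assumptions:
the mock_* names and the enum are hypothetical, MOCK_MAX_FULL_SHEAVES
is a placeholder value rather than the kernel's constant, and the
outcome of refill_sheaf() is reduced to a boolean. What it illustrates
is that the flush-and-free branch has to end the function, so a freed
sheaf is never passed on to barn_put_full_sheaf().

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder limit; stands in for MAX_FULL_SHEAVES (value assumed). */
#define MOCK_MAX_FULL_SHEAVES 10

/* Hypothetical labels for where a returned sheaf ends up. */
enum mock_return_path {
	MOCK_RETURN_OVERSIZE,	/* capacity mismatch: flushed and freed */
	MOCK_RETURN_FAST,	/* became the percpu spare sheaf */
	MOCK_RETURN_FLUSHED,	/* slow path: flushed and freed */
	MOCK_RETURN_BARN,	/* slow path: refilled, put into the barn */
};

/* Inputs the real function reads from the cache, the percpu state and
 * the barn, flattened into one hypothetical struct. */
struct mock_return_state {
	bool oversize;		/* sheaf->capacity != s->sheaf_capacity */
	bool have_spare;	/* pcs->spare is already set */
	unsigned int nr_full;	/* barn->nr_full (read racily in the kernel) */
	bool refill_ok;		/* whether the refill would succeed */
};

enum mock_return_path
mock_return_decision(const struct mock_return_state *st)
{
	if (st->oversize)
		return MOCK_RETURN_OVERSIZE;

	if (!st->have_spare)
		return MOCK_RETURN_FAST;

	/* barn already full of full sheaves, or the refill failed */
	if (st->nr_full >= MOCK_MAX_FULL_SHEAVES || !st->refill_ok)
		return MOCK_RETURN_FLUSHED;	/* must not fall through */

	return MOCK_RETURN_BARN;
}
```

Folding the barn check into the slow-path condition gives the same
outcomes as the version with the separate 'refill' flag; only the place
where nr_full is sampled moves from inside the locked section to after
it.
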

