On Thu, Dec 18, 2025 at 02:01:56PM +0100, David Hildenbrand (Red Hat) wrote:
> On 12/15/25 06:30, Barry Song wrote:
> > From: Barry Song <[email protected]>
> >
> > In many cases, the pages passed to vmap() may include high-order
> > pages allocated with __GFP_COMP flags. For example, the systemheap
> > often allocates pages in descending order: order 8, then 4, then 0.
> > Currently, vmap() iterates over every page individually—even pages
> > inside a high-order block are handled one by one.
> >
> > This patch detects high-order pages and maps them as a single
> > contiguous block whenever possible.
> >
> > An alternative would be to implement a new API, vmap_sg(), but that
> > change seems to be large in scope.
> >
> > When vmapping a 128MB dma-buf using the systemheap, this patch
> > makes system_heap_do_vmap() roughly 17× faster.
> >
> > W/ patch:
> > [ 10.404769] system_heap_do_vmap took 2494000 ns
> > [ 12.525921] system_heap_do_vmap took 2467008 ns
> > [ 14.517348] system_heap_do_vmap took 2471008 ns
> > [ 16.593406] system_heap_do_vmap took 2444000 ns
> > [ 19.501341] system_heap_do_vmap took 2489008 ns
> >
> > W/o patch:
> > [ 7.413756] system_heap_do_vmap took 42626000 ns
> > [ 9.425610] system_heap_do_vmap took 42500992 ns
> > [ 11.810898] system_heap_do_vmap took 42215008 ns
> > [ 14.336790] system_heap_do_vmap took 42134992 ns
> > [ 16.373890] system_heap_do_vmap took 42750000 ns
> >
> That's quite a speedup.
>
> > Cc: David Hildenbrand <[email protected]>
> > Cc: Uladzislau Rezki <[email protected]>
> > Cc: Sumit Semwal <[email protected]>
> > Cc: John Stultz <[email protected]>
> > Cc: Maxime Ripard <[email protected]>
> > Tested-by: Tangquan Zheng <[email protected]>
> > Signed-off-by: Barry Song <[email protected]>
> > ---
> > * diff with rfc:
> > Many code refinements based on David's suggestions, thanks!
> > Refine comment and changelog according to Uladzislau, thanks!
> > rfc link:
> > https://lore.kernel.org/linux-mm/[email protected]/
> >
> >  mm/vmalloc.c | 45 +++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 39 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 41dd01e8430c..8d577767a9e5 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -642,6 +642,29 @@ static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
> >  	return err;
> >  }
> > +static inline int get_vmap_batch_order(struct page **pages,
> > +		unsigned int stride, unsigned int max_steps, unsigned int idx)
> > +{
> > +	int nr_pages = 1;
>
> unsigned int, maybe
>
> Why are you initializing nr_pages when you overwrite it below?
>
> > +
> > +	/*
> > +	 * Currently, batching is only supported in vmap_pages_range
> > +	 * when page_shift == PAGE_SHIFT.
>
> I don't know the code, so realizing how we go from page_shift to stride
> took me a second. Maybe only talk about stride here?
>
> OTOH, is "stride" really the right terminology?
>
> we calculate it as
>
> stride = 1U << (page_shift - PAGE_SHIFT);
>
> page_shift - PAGE_SHIFT should give us an "order". So is this a
> "granularity" in nr_pages?
>
> Again, I don't know this code, so sorry for the question.
>
To me "stride" also sounds unclear.

--
Uladzislau Rezki
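
[Editor's note: for readers unfamiliar with the code under discussion, below is a rough, self-contained userspace sketch of the batching idea the patch and review talk about: starting at index idx of a page array, find the largest order o such that idx is aligned to 1 << o, the run stays within max_steps entries, and all 1 << o pages are physically contiguous. This is not the patch's actual get_vmap_batch_order(); the helper name batch_order(), the pfns array standing in for struct page pointers, and the exact meaning of max_steps (remaining entries from idx) are assumptions made for illustration only.]

/* batch_order_sketch.c — standalone illustration, not kernel code. */
#include <stdio.h>

/*
 * Return the largest order 'o' such that 'idx' is aligned to 1 << o,
 * 1 << o does not exceed 'max_steps', and the 1 << o page frame
 * numbers starting at 'idx' are contiguous.
 */
static unsigned int batch_order(const unsigned long *pfns,
				unsigned int max_steps, unsigned int idx)
{
	unsigned int order = 0;

	while ((1u << (order + 1)) <= max_steps &&
	       (idx & ((1u << (order + 1)) - 1)) == 0) {
		unsigned int nr = 1u << (order + 1);
		unsigned int i;

		/*
		 * The lower half of the doubled run was verified in the
		 * previous iteration; only check the upper half here.
		 */
		for (i = 1u << order; i < nr; i++) {
			if (pfns[idx + i] != pfns[idx] + i)
				return order;
		}
		order++;
	}
	return order;
}

int main(void)
{
	/* An order-2 block (pfns 100..103) followed by an unrelated page. */
	unsigned long pfns[] = { 100, 101, 102, 103, 999 };

	printf("order at idx 0: %u\n", batch_order(pfns, 5, 0)); /* prints 2 */
	printf("order at idx 4: %u\n", batch_order(pfns, 1, 4)); /* prints 0 */
	return 0;
}

[On the terminology question raised above: stride = 1U << (page_shift - PAGE_SHIFT) is a count of base pages per mapping granule, which is why describing it as a granularity in pages, or passing an explicit order, may read more clearly than "stride".]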
