Convert vma_alloc_anon_folio_pmd() to pass __GFP_ZERO instead of zeroing
at the callsite.

Pass the exact fault address (not PMD-aligned) to vma_alloc_folio() to
ensure the cache locality optimization in folio_zero_user() works
correctly. The NUMA interleave index computation already shifts by
PAGE_SHIFT + order, so the unmasked address gives the same result.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Assisted-by: Claude:claude-opus-4-6
---
 mm/huge_memory.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8e2746ea74ad..3f2a868cf9e9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1256,11 +1256,11 @@ EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
 static struct folio *vma_alloc_anon_folio_pmd(struct vm_area_struct *vma,
 		unsigned long addr)
 {
-	gfp_t gfp = vma_thp_gfp_mask(vma);
+	gfp_t gfp = vma_thp_gfp_mask(vma) | __GFP_ZERO;
 	const int order = HPAGE_PMD_ORDER;
 	struct folio *folio;
 
-	folio = vma_alloc_folio(gfp, order, vma, addr & HPAGE_PMD_MASK);
+	folio = vma_alloc_folio(gfp, order, vma, addr);
 
 	if (unlikely(!folio)) {
 		count_vm_event(THP_FAULT_FALLBACK);
@@ -1279,14 +1279,6 @@ static struct folio *vma_alloc_anon_folio_pmd(struct vm_area_struct *vma,
 	}
 	folio_throttle_swaprate(folio, gfp);
 
-	/*
-	 * When a folio is not zeroed during allocation (__GFP_ZERO not used)
-	 * or user folios require special handling, folio_zero_user() is used to
-	 * make sure that the page corresponding to the faulting address will be
-	 * hot in the cache after zeroing.
-	 */
-	if (user_alloc_needs_zeroing())
-		folio_zero_user(folio, addr);
 	/*
 	 * The memory barrier inside __folio_mark_uptodate makes sure that
 	 * folio_zero_user writes become visible before the set_pmd_at()
-- 
MST
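
A small userspace sketch of the interleave-index argument from the commit
message may help reviewers check it. This is not kernel code: the formula
(addr - vm_start) >> (PAGE_SHIFT + order) is an assumed model of the index
computation, and the SKETCH_* constants (4 KiB pages, order-9 PMD) and the
sketch_* helpers are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for a 4 KiB page / 2 MiB PMD configuration. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_ORDER  9
#define SKETCH_PMD_SIZE   ((uint64_t)1 << (SKETCH_PAGE_SHIFT + SKETCH_PMD_ORDER))
#define SKETCH_PMD_MASK   (~(SKETCH_PMD_SIZE - 1))

/*
 * Hypothetical model of the interleave index: the offset of addr into
 * the mapping, counted in units of (PAGE_SIZE << order).
 */
static uint64_t sketch_ilx(uint64_t vm_start, uint64_t addr, int order)
{
	return (addr - vm_start) >> (SKETCH_PAGE_SHIFT + order);
}

/* True if the exact and the PMD-masked address yield the same index. */
static bool sketch_same_index(uint64_t vm_start, uint64_t addr)
{
	return sketch_ilx(vm_start, addr, SKETCH_PMD_ORDER) ==
	       sketch_ilx(vm_start, addr & SKETCH_PMD_MASK, SKETCH_PMD_ORDER);
}
```

With vm_start = 0x40000000 and addr = 0x40601234, both the exact and the
masked address fall in the fourth 2 MiB unit of the mapping, so the shift
discards the low bits either way. The sketch only exercises a PMD-aligned
vm_start; because the subtraction happens before the shift, an unaligned
start would need to be checked separately.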

