> 
>> Whatever leaves the buddy shall be zeroed out. If double-zeroing
>> happens, the latter zeroing could get optimized out by checking
>> something like user_alloc_needs_zeroing().
>>
>> See mm/huge_memory.c:vma_alloc_anon_folio_pmd() as an example where we
>> avoid double-zeroing.
> 
> It isn't just reducing double-zeroing to single zeroing. It's about
> avoiding zeroing such pages at all. If a domU is started with
> populate-on-demand, many (sometimes most) of its pages are populated in
> EPT. The idea of PoD is to start the guest with a high static memory
> size but a low actual allocation, and fake it until the balloon driver
> kicks in and makes the domU really not use more pages than it has. When
> the balloon driver tries to return those pages to the hypervisor,
> normally it would just take unallocated pages one by one and make Linux
> not use them. But if _any_ zeroing is happening, each page first needs
> to be mapped to the guest by the hypervisor (one trip through EPT), just
> to be removed again a moment later...

The same is true for most balloon drivers, including virtio-balloon.

So far nobody really cared about that, though, as init_on_free usually
comes with such a high performance price tag that people in cheap VMs
(where you overcommit etc) don't enable it.

__GFP_BALLOON_OUT is just nasty.

We could probably have a special allocation interface (not exposed to
arbitrary kernel modules) and have things like mm/balloon.c consume that.


IIUC, xen balloon does not use the memory balloon infrastructure,
though. So we'd need some EXPORT_SYMBOL_FOR_MODULES() magic.


Like an

        struct page *alloc_balloon_pages(gfp_t gfp, unsigned int order);

Where we only support a subset of gfp flags, for example, to not have
to deal with mempolicy.
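
Purely to illustrate the "subset of gfp flags" idea, a userspace sketch
could look like the following. Everything prefixed DEMO_ (the flag values,
the allowed mask, the order limit) and free_balloon_pages() are made up
for the example and are not existing kernel API; malloc() stands in for
the page allocator, and struct page here is a toy type, not the kernel's.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; the real gfp flags are defined by the kernel. */
typedef unsigned int gfp_t;
#define DEMO_GFP_NOWARN   0x1u
#define DEMO_GFP_NORETRY  0x2u
#define DEMO_GFP_ZERO     0x4u  /* rejected: the point is to skip zeroing */
#define DEMO_GFP_THISNODE 0x8u  /* rejected: no placement policy handling */

/* Assumed whitelist of flags a balloon-only interface might accept. */
#define DEMO_BALLOON_GFP_ALLOWED (DEMO_GFP_NOWARN | DEMO_GFP_NORETRY)
#define DEMO_PAGE_SIZE 4096u
#define DEMO_MAX_ORDER 10u

/* Toy page descriptor for the sketch only. */
struct page {
        void *mem;
        unsigned int order;
};

struct page *alloc_balloon_pages(gfp_t gfp, unsigned int order)
{
        struct page *page;

        /* Refuse anything outside the supported subset up front. */
        if (gfp & ~DEMO_BALLOON_GFP_ALLOWED)
                return NULL;
        if (order > DEMO_MAX_ORDER)
                return NULL;

        page = malloc(sizeof(*page));
        if (!page)
                return NULL;

        /* Deliberately not zeroed: the memory is handed straight back. */
        page->mem = malloc((size_t)DEMO_PAGE_SIZE << order);
        if (!page->mem) {
                free(page);
                return NULL;
        }
        page->order = order;
        return page;
}

void free_balloon_pages(struct page *page)
{
        if (!page)
                return;
        free(page->mem);
        free(page);
}
```

A caller passing DEMO_GFP_ZERO or DEMO_GFP_THISNODE simply gets NULL back,
which is the kind of restriction that would keep such an interface small.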

But it needs a bit of code to make it fly, so I am not sure if the page
allocator wants to support that.

-- 
Cheers,

David
