ers.
Your proposal here suggests modifying hugetlb so that it can be used in
a new way (use case) by KVM's guest_mem. As such it really seems like
something that should be done in a separate filesystem/driver/allocator.
You will likely not get much support for modifying hugetlb.
--
Mike Krav
te hugetlb pages. This will impose different alignment
and size requirements on the UDMABUF_CREATE API.
[1] https://lore.kernel.org/linux-mm/20230512072036.1027784-1-junxiao.ch...@intel.com/
Fixes: 16c243e99d33 ("udmabuf: Add support for mapping hugepages (v4)")
Cc:
Signed-off-by: Mike K
art to fail.
>> I assume that is acceptable/expected behavior. Correct?
>> On some systems, hugetlb pages are a precious resource and the sysadmin
>> carefully configures the number needed by applications. Removing a hugetlb
>> page (even for a very short period of time) could cause serious application
>> failure.
>
> That's true, especially for 1G pages. Any suggestions?
> Let the hugepage allocator be aware of this situation and retry?
I would hate to add that complexity to the allocator.
This question is likely based on my lack of understanding of virtio-balloon
usage and this reporting mechanism. But, why do the hugetlb pages have to
be 'temporarily' allocated for reporting purposes?
--
Mike Kravetz
hat can be allocated from the buddy is
(MAX_ORDER - 1). So, the check should be '>='.
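As a minimal sketch of that comparison (not the code under review): the
value 11 for MAX_ORDER is assumed only for illustration, and
order_exceeds_buddy() is a hypothetical helper name. It follows the
convention stated above, where the largest order the buddy allocator can
satisfy is (MAX_ORDER - 1).

```c
#include <stdbool.h>

/* Assumed value for illustration; the real constant depends on kernel config. */
#define MAX_ORDER 11

/*
 * Hypothetical helper: returns true when 'order' is too large for the
 * buddy allocator, i.e. larger than (MAX_ORDER - 1).
 */
static bool order_exceeds_buddy(unsigned int order)
{
	return order >= MAX_ORDER;	/* '>' would wrongly accept order == MAX_ORDER */
}
```

With '>' instead of '>=', an order equal to MAX_ORDER would slip through
even though the buddy allocator cannot provide it.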
--
Mike Kravetz
zone lock and resume processing */
> + spin_lock_irq(&hugetlb_lock);
> +
> + /* flush reported pages from the sg list */
> + hugepage_reporting_drain(prdev, h, sgl,
> + HUGEPAGE_REPORTING_CAPACITY, !ret);
> +
> + /*
> + * Reset next to first entry, the old next isn't valid
> + * since we dropped the lock to report the pages
> + */
> + next = list_first_entry(list, struct page, lru);
> +
> + /* exit on error */
> + if (ret)
> + break;
> + }
> +
> + /* Rotate any leftover pages to the head of the freelist */
> + if (&next->lru != list && !list_is_first(&next->lru, list))
> + list_rotate_to_front(&next->lru, list);
> +
> + spin_unlock_irq(&hugetlb_lock);
> +
> + return ret;
> +}
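For anyone unfamiliar with the rotation step at the end of the quoted loop,
here is a userspace sketch of what rotating a circular list to a given entry
does. The struct and helpers are simplified stand-ins written for
illustration, not the kernel's <linux/list.h> code, and every sketch_ name
is hypothetical.

```c
/* Simplified stand-in for a circular doubly linked list node. */
struct sketch_list {
	struct sketch_list *next, *prev;
};

static void sketch_list_init(struct sketch_list *head)
{
	head->next = head;
	head->prev = head;
}

static void sketch_list_add_tail(struct sketch_list *node,
				 struct sketch_list *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Move 'head' so it sits just before 'entry'; 'entry' becomes first. */
static void sketch_list_rotate_to_front(struct sketch_list *entry,
					struct sketch_list *head)
{
	/* Unlink the head from the ring. */
	head->prev->next = head->next;
	head->next->prev = head->prev;

	/* Re-insert it immediately before 'entry'. */
	head->prev = entry->prev;
	head->next = entry;
	entry->prev->next = head;
	entry->prev = head;
}
```

Starting from head -> a -> b -> c, rotating to b gives head -> b -> c -> a,
so entries that were already walked end up at the tail and the next pass
starts with the leftover ones.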
--
Mike Kravetz
mory accesses and notifying QEMU, which makes using mprotect
>> unacceptable.
>>
>> Protected memory accesses tracking can be done via userfaultfd's WP mode
>> which isn't available right now.
>>
>> So, the reasonable conclusion is to wait until the WP mode is available and
>> build the background snapshot on top of userfaultfd-wp.
>> But work on adding WP mode has been pending for quite a long time already.
>>
>> Is there any way to estimate when it could be available?
>
> I think a question is whether anyone is actively working on it; I
> suspect really it's on a TODO list rather than moving at the moment.
>
I am not working on it, and it is not on my TODO list.
However, if someone starts making progress I will jump in and work on
hugetlbfs support. My intention would be to not let hugetlbfs support
'fall behind' general uffd support.
--
Mike Kravetz
es something similar today.
When the copy is done (or aborted) we then create/convert a new vma for
the huge page and merge it into the target vma(s).
Not sure if that would be any easier. It was just the first thing that
popped into my head.
--
Mike Kravetz
On 03/14/2017 11:37 AM, Andrea Arcangeli wrote:
> Hello,
>
> On Wed, Mar 08, 2017 at 05:30:55PM -0800, Mike Kravetz wrote:
>> On 01/10/2017 03:02 PM, Mike Kravetz wrote:
>>> Another more concrete topic is hugetlb reservations. Michal Hocko
>>> proposed the t
for hugetlbfs was there
to make the code common with the anon version. The use case I had was to
simply 'catch' no-page hugetlbfs faults, private -or- shared. That is why
you can register hugetlbfs shared regions.
I can take a look at what it would take to enable copy, and agree with Andrea
that it should be relatively easy.
--
Mike Kravetz