I have a memory subsystem design question that I'm hoping someone can answer.

I've been looking at a machine that is completely out of memory, as in:

 v_free_count = 0
 v_cache_count = 0

I wondered how a machine could completely run out of memory like this, 
especially since I found no interrupt storms or other pathologies that 
would tend to overcommit memory. So I started investigating.

Most allocators come down to vm_page_alloc(), which has this guard:

        if ((curproc == pageproc) && (page_req != VM_ALLOC_INTERRUPT)) {
                page_req = VM_ALLOC_SYSTEM;
        };

        if (cnt.v_free_count + cnt.v_cache_count > cnt.v_free_reserved ||
            (page_req == VM_ALLOC_SYSTEM && 
            cnt.v_free_count + cnt.v_cache_count > cnt.v_interrupt_free_min) ||
            (page_req == VM_ALLOC_INTERRUPT &&
            cnt.v_free_count + cnt.v_cache_count > 0)) {

The key observation is that if page_req is VM_ALLOC_INTERRUPT, the only 
requirement is that the free plus cached count be nonzero, so it will 
allocate every last page.
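
To make the three thresholds concrete, here is a small user-space model of 
that guard. It is only a sketch: struct vm_counters, the REQ_* names and 
may_allocate() are hypothetical stand-ins I made up for illustration, not 
kernel definitions, and it ignores everything else vm_page_alloc() does.

```c
#include <stdbool.h>

/* Hypothetical request classes mirroring the VM_ALLOC_* classes above. */
enum page_req { REQ_NORMAL, REQ_SYSTEM, REQ_INTERRUPT };

/* Hypothetical stand-in for the relevant counters in the kernel's cnt. */
struct vm_counters {
    unsigned free_count;         /* models cnt.v_free_count */
    unsigned cache_count;        /* models cnt.v_cache_count */
    unsigned free_reserved;      /* models cnt.v_free_reserved */
    unsigned interrupt_free_min; /* models cnt.v_interrupt_free_min */
};

/* Returns true if the quoted guard would let this request take a page. */
static bool
may_allocate(const struct vm_counters *c, enum page_req req)
{
    unsigned avail = c->free_count + c->cache_count;

    if (avail > c->free_reserved)
        return true;    /* plenty of memory: anyone may allocate */
    if (req == REQ_SYSTEM && avail > c->interrupt_free_min)
        return true;    /* system requests may dip into the reserve */
    if (req == REQ_INTERRUPT && avail > 0)
        return true;    /* interrupt requests may take the last page */
    return false;
}
```

With one page left and a reserve of 100, may_allocate() refuses REQ_NORMAL 
and REQ_SYSTEM but still grants REQ_INTERRUPT, which is the behavior 
described above.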

From the name one might expect VM_ALLOC_INTERRUPT to be somewhat rare, perhaps 
only used from interrupt threads. Not so: see kmem_malloc() or 
uma_small_alloc(), which both contain this mapping:

        if ((flags & (M_NOWAIT|M_USE_RESERVE)) == M_NOWAIT)
                pflags = VM_ALLOC_INTERRUPT | VM_ALLOC_WIRED;
        else
                pflags = VM_ALLOC_SYSTEM | VM_ALLOC_WIRED;

Note that M_USE_RESERVE has been deprecated and is used in just a handful of 
places. Also note that lots of code paths come through these routines.

What this means is essentially _any_ allocation using M_NOWAIT will bypass 
whatever reserves have been held back and will take every last page available.
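
The mapping can be exercised in isolation with a sketch like the one below. 
The flag values and the names enum page_class and class_for_flags() are 
assumptions chosen for illustration; the real flag definitions live in the 
kernel headers and may differ.

```c
/* Illustrative flag values only; not copied from the kernel headers. */
#define M_NOWAIT      0x0001
#define M_WAITOK      0x0002
#define M_USE_RESERVE 0x0400

/* Hypothetical names for the two page request classes in the mapping. */
enum page_class { CLASS_SYSTEM, CLASS_INTERRUPT };

/* Same test as the quoted mapping, minus the VM_ALLOC_WIRED bit. */
static enum page_class
class_for_flags(int flags)
{
    /* M_NOWAIT set and M_USE_RESERVE clear selects the interrupt class. */
    if ((flags & (M_NOWAIT | M_USE_RESERVE)) == M_NOWAIT)
        return CLASS_INTERRUPT;
    return CLASS_SYSTEM;
}
```

Under these assumptions a plain M_NOWAIT request lands in the interrupt 
class, while M_WAITOK, or M_NOWAIT combined with M_USE_RESERVE, lands in the 
system class.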

There is no documentation stating that M_NOWAIT has the side effect of 
essentially being privileged, so any innocuous piece of code that can't block 
will use it. And of course M_NOWAIT is used all over the place.

It looks to me like the design of the BSD allocators is focused on recovery: 
they will give all pages away, knowing the system can recover.

Am I missing anything? I would have expected some small number of pages to be 
held in reserve just in case. And I didn't expect M_NOWAIT to be a sort of back 
door for grabbing memory.


Thanks,

-Steve

_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
