Hi Samuel,
On 12/04/2026 21:55, Samuel Thibault wrote:
> Hello,
>
> Samuel Thibault, on Fri. 27 Mar 2026 01:28:59 +0100, wrote:
> > It's hard to tell. I have tried to run the mypy build; it's still quite
> > slow, but it seems faster. Possibly its working set is simply really
> > large.
> >
> > I guess it'd be simpler to just test with synthetic benchmarks which
> > exhibit simple memory access patterns.
>
> It did help. I noticed that on a box with 2G, of which a few hundred MB
> are in highmem, the memory loaded in directmem stays there and doesn't
> get swapped out, and thus makes the available memory much smaller for
> the working set. This was because the LRU list is per-segment, and thus
> as long as eviction finds inactive pages that it can evict from the
> highest segment, it will do so, without noticing that far less active
> pages might be in a lower segment. I have reworked this to keep a global
> LRU list, which avoids the issue entirely.
I've thought for a while that something significant had to change in the
pageout path, and this is certainly a significant change. struct vm_page
has grown by two pointers, I think. That equates to an extra 32M used
per 8G of RAM, which doesn't seem like much if system performance
benefits significantly.
I had a prototype from a few months ago that combined all the segments
into a single free list. I shelved it "for the future" because I
couldn't come up with a sensible way of finding free pages in specific
segments (e.g. for DMA) without adding new list pointers to vm_page. My
test case didn't require allocating memory from anywhere other than
HIGHMEM, so I was able to run without worrying about that requirement,
but I did see a very significant performance increase. I'd expect your
complete solution to perform much better too. Have you tried it with
the heavy build load?
I've finally got back to running the mypy build again to try to
understand more about the cause of the vm_map deadlock. I'll continue to
use the old code for that purpose.
Cheers,
Mike.