[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-18 Thread Tim Peters
[Tim] >> - For truly effective RAM releasing, we would almost certainly need to >> make major changes, to release RAM at an OS page level. 256K arenas >> were already too fat a granularity. [also Tim] > We can approximate that closely right now by using 4K pools _and_ 4K > arenas: one pool per

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-18 Thread Tim Peters
[Tim] > - For truly effective RAM releasing, we would almost certainly need to > make major changes, to release RAM at an OS page level. 256K arenas > were already too fat a granularity. We can approximate that closely right now by using 4K pools _and_ 4K arenas: one pool per arena, and mmap()/

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-18 Thread Tim Peters
And one more random clue. The memcrunch.py attached to the earlier-mentioned bug report does benefit a lot from changing to a "use the arena with the smallest address" heuristic, leaving 86.6% of allocated bytes in use by objects at the end (this is with the arena-thrashing fix, and the current 25

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-18 Thread Antoine Pitrou
On Mon, 17 Jun 2019 13:44:29 -0500 Tim Peters wrote: > > To illustrate, I reverted that change in my PR and ran exactly the same > thing. Wow - _then_ the 1M-arena-16K-pool PR reclaimed 1135(!) arenas > instead of none. Almost all worse than uselessly. The only one that > "paid" was the last: the

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-17 Thread Tim Peters
[Tim]
> ...
> Now under 3.7.3. First when phase 10 is done building:
>
> phase 10 adding 9953410
> phase 10 has 16743920 objects
>
> # arenas allocated total = 14,485
> # arenas reclaimed = 2,020
> # arenas highwater mark=

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-17 Thread Tim Peters
Heh. I wasn't intending to be nasty, but this program makes our arena recycling look _much_ worse than memcrunch.py does. It cycles through phases. In each phase, it first creates a large randomish number of objects, then deletes half of all objects in existence. Except that every 10th phase, i
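The phase structure described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Tim's program: the function name `run_phases`, the object type (one-element lists), the phase sizes, and the choice of a random half to delete are all made up for illustration. The every-10th-phase exception is not reproduced, since its details are not given above.

```python
import random


def run_phases(num_phases=9, max_new=20_000, seed=12345):
    """Grow and shrink a population of small objects, phase by phase.

    Returns a list of (phase, added, alive_after_delete) tuples.
    All parameters are hypothetical; they are not taken from the thread.
    """
    rng = random.Random(seed)   # seeded so runs are repeatable
    alive = []                  # every object currently in existence
    history = []
    for phase in range(1, num_phases + 1):
        # Create a large, randomish number of small objects.
        added = rng.randrange(max_new // 2, max_new)
        alive.extend([i] for i in range(added))
        # Delete half of all objects in existence. Which half is an
        # assumption: a random half, so the freed blocks are scattered.
        rng.shuffle(alive)
        del alive[len(alive) // 2:]
        history.append((phase, added, len(alive)))
    return history


if __name__ == "__main__":
    for phase, added, alive in run_phases():
        print(f"phase {phase} adding {added}, {alive} objects survive")
```

On CPython, calling `sys._debugmallocstats()` after each phase prints obmalloc's arena and pool counts to stderr, which is one way to watch how many arenas such a workload keeps pinned.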

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-17 Thread Tim Peters
[Tim]
> ...
> Here are some stats from running [memcrunch.py] under
> my PR, but using 200 times the initial number of objects
> as the original script:
>
> n = 2000 #number of things
>
> At the end, with 1M arena and 16K pool:
>
> 3362 arenas * 1048576 bytes/arena = 3,525,312,512
> # b

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-17 Thread Tim Peters
[Inada Naoki] >> Increasing pool size is one obvious way to fix these problems. >> I think 16KiB pool size and 2MiB (huge page size of x86) arena size is >> a sweet spot for recent web servers (typically, about 32 threads, and >> 64GiB), but there is no evidence about it. [Antoine] > Note that the

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-17 Thread Inada Naoki
On Mon, Jun 17, 2019 at 6:14 PM Antoine Pitrou wrote: > But it's not enabled by default... And there are reasons for that (see > the manpage I quoted). Uh, then if people want to use huge pages, they need to enable them system-wide, or add madvise() in obmalloc.c. > > In web applications, it's co

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-17 Thread Antoine Pitrou
On 17/06/2019 at 10:55, Inada Naoki wrote: > On Mon, Jun 17, 2019 at 5:18 PM Antoine Pitrou wrote: >> >> On Mon, 17 Jun 2019 11:15:29 +0900 >> Inada Naoki wrote: >>> >>> Increasing pool size is one obvious way to fix these problems. >>> I think 16KiB pool size and 2MiB (huge page size of x86)

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-17 Thread Inada Naoki
On Mon, Jun 17, 2019 at 5:18 PM Antoine Pitrou wrote: > > On Mon, 17 Jun 2019 11:15:29 +0900 > Inada Naoki wrote: > > > > Increasing pool size is one obvious way to fix these problems. > > I think 16KiB pool size and 2MiB (huge page size of x86) arena size is > > a sweet spot for recent web serve

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-17 Thread Antoine Pitrou
On Mon, 17 Jun 2019 11:15:29 +0900 Inada Naoki wrote: > > Increasing pool size is one obvious way to fix these problems. > I think 16KiB pool size and 2MiB (huge page size of x86) arena size is > a sweet spot for recent web servers (typically, about 32 threads, and > 64GiB), but there is no evide

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-16 Thread Tim Peters
[Inada Naoki] > obmalloc is very nice at allocating small (~224 bytes) memory blocks. > But it seems current SMALL_REQUEST_THRESHOLD (512) is too large to me. For the "unavoidable memory waste" reasons you spell out here, Vladimir deliberately set the threshold to 256 at the start. As things tur

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-16 Thread Inada Naoki
obmalloc is very nice at allocating small (~224 bytes) memory blocks. But it seems current SMALL_REQUEST_THRESHOLD (512) is too large to me. ``` >>> pool_size = 4096 - 48 # 48 is pool header size >>> for bs in range(16, 513, 16): ... n,r = pool_size//bs, pool_size%bs + 48 ... print(bs, n,
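The per-size-class waste calculation started in the message above can be written out as a small standalone script. This is a sketch, not Inada's exact code: the helper name `pool_waste` and the printed columns are assumptions. The 4096-byte pool, the 48-byte header, the 16-byte step, and the 512-byte threshold come from the message itself.

```python
POOL_SIZE = 4096                 # bytes per pool, as in the message
POOL_HEADER = 48                 # pool header size, as in the message
ALIGNMENT = 16                   # step between size classes in the loop
SMALL_REQUEST_THRESHOLD = 512    # largest request obmalloc handles


def pool_waste(block_size, pool_size=POOL_SIZE, header=POOL_HEADER):
    """Return (blocks_per_pool, wasted_bytes, wasted_fraction).

    Wasted bytes are the pool header plus the tail that is too short
    to hold one more block of this size class.
    """
    usable = pool_size - header
    blocks = usable // block_size
    wasted = usable % block_size + header
    return blocks, wasted, wasted / pool_size


if __name__ == "__main__":
    print("size blocks wasted  frac")
    for bs in range(ALIGNMENT, SMALL_REQUEST_THRESHOLD + 1, ALIGNMENT):
        blocks, wasted, frac = pool_waste(bs)
        print(f"{bs:4d} {blocks:6d} {wasted:6d} {frac:6.1%}")
```

Passing `pool_size=16384` shows why larger pools help: assuming the header stays at 48 bytes, a 512-byte size class wastes 512 bytes either way, but that is 12.5% of a 4 KiB pool and about 3.1% of a 16 KiB pool.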

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-16 Thread Tim Peters
[Antoine] > We moved from malloc() to mmap() for allocating arenas because of user > requests to release memory more deterministically: > > https://bugs.python.org/issue11849 Which was a good change! As was using VirtualAlloc() on Windows. None of that is being disputed. The change under discuss

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-16 Thread Antoine Pitrou
On Sat, 15 Jun 2019 22:02:35 -0600 Neil Schemenauer wrote: > On 2019-06-15, Antoine Pitrou wrote: > > We should evaluate what problem we are trying to solve here, instead > > of staring at micro-benchmark numbers on an idle system. > > I think a change to obmalloc is not going to get accepted u

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-16 Thread Antoine Pitrou
On Sat, 15 Jun 2019 19:56:58 -0500 Tim Peters wrote: > > At the start, obmalloc never returned arenas to the system. The vast > majority of users were fine with that. A relative few weren't. Evan > Jones wrote all the (considerable!) code to change that, and I > massaged it and checked it in -

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-15 Thread Tim Peters
[Tim] >> At the start, obmalloc never returned arenas to the system. The vast >> majority of users were fine with that. [Neil] > Yeah, I was totally fine with that back in the day. However, I > wonder now if there is a stronger reason to try to free memory back > to the OS. Years ago, people wo

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-15 Thread Neil Schemenauer
On 2019-06-15, Tim Peters wrote: > At the start, obmalloc never returned arenas to the system. The vast > majority of users were fine with that. Yeah, I was totally fine with that back in the day. However, I wonder now if there is a stronger reason to try to free memory back to the OS. Years ag

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-15 Thread Neil Schemenauer
On 2019-06-15, Antoine Pitrou wrote: > We should evaluate what problem we are trying to solve here, instead > of staring at micro-benchmark numbers on an idle system. I think a change to obmalloc is not going to get accepted unless we can show it doesn't hurt these micro-benchmarks. To displace t

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-15 Thread Tim Peters
[Tim, to Neil] >> Moving to bigger pools and bigger arenas are pretty much no-brainers >> for us, [...] [Antoine] > Why "no-brainers"? We're running tests, benchmarks, the Python programs we always run, Python programs that are important to us, staring at obmalloc stats ... and seeing nothing bad

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-15 Thread Antoine Pitrou
On Sat, 15 Jun 2019 01:15:11 -0500 Tim Peters wrote: > > > ... > > My feeling right now is that Tim's obmalloc-big-pool is the best > > design at this point. Using 8 KB or 16 KB pools seems to be better > > than 4 KB. The extra complexity added by Tim's change is not so > > nice. obmalloc is a

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-14 Thread Tim Peters
[Neil Schemenauer] > ... > BTW, the current radix tree doesn't even require that pools are > aligned to POOL_SIZE. We probably want to keep pools aligned > because other parts of obmalloc rely on that. obmalloc relies on it heavily. Another radix tree could map block addresses to all the necess

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-14 Thread Neil Schemenauer
Here are benchmark results for 64 MB arenas and 16 kB pools. I ran without the --fast option and on a Linux machine in single user mode. The "base" column is the obmalloc-big-pools branch with ARENA_SIZE = 64 MB and POOL_SIZE = 16 kB. The "radix" column is obmalloc_radix_tree (commit 5e00f6041)

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-14 Thread Neil Schemenauer
On 2019-06-14, Tim Peters wrote: > However, last I looked there Neil was still using 4 KiB obmalloc > pools, all page-aligned. But using much larger arenas (16 MiB, 16 > times bigger than my branch, and 64 times bigger than Python currently > uses). I was testing it versus your obmalloc-big-pool

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-14 Thread Neil Schemenauer
On 2019-06-15, Inada Naoki wrote: > Oh, do you mean your branch doesn't have headers in each page? That's right. Each pool still has a header but pools can be larger than the page size. Tim's obmalloc-big-pool idea writes something to the head of each page within a pool. The radix tree doesn't

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-14 Thread Tim Peters
[Inada Naoki, to Neil S] > Oh, do you mean your branch doesn't have headers in each page? That's probably right ;-) Neil is using a new data structure, a radix tree implementing a sparse set of arena addresses. Within obmalloc pools, which can be of any multiple-of-4KiB (on a 64-bit box) size,
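The idea of a radix tree holding a sparse set of arena addresses can be illustrated with a toy model. This is not Neil's implementation, which is C inside obmalloc. The class name `ArenaMap`, the bit widths, and the assumption that every arena is aligned to its own size are all hypothetical, chosen only to keep the sketch short.

```python
ARENA_BITS = 20                  # assumed: 1 MiB arenas
ARENA_SIZE = 1 << ARENA_BITS
ADDRESS_BITS = 48                # assumed: usable virtual-address bits
LEVEL_BITS = 14                  # assumed: address bits consumed per level


class ArenaMap:
    """Sparse set of arena base addresses, stored as nested dicts."""

    def __init__(self):
        self.root = {}

    @staticmethod
    def _keys(address):
        # Drop the offset within the arena, then split the remaining
        # high bits into one key per tree level, most significant first.
        index = address >> ARENA_BITS
        remaining = ADDRESS_BITS - ARENA_BITS
        keys = []
        while remaining > 0:
            take = min(LEVEL_BITS, remaining)
            remaining -= take
            keys.append((index >> remaining) & ((1 << take) - 1))
        return keys

    def add(self, arena_base):
        if arena_base % ARENA_SIZE:
            raise ValueError("arena base must be ARENA_SIZE-aligned")
        *path, leaf = self._keys(arena_base)
        node = self.root
        for key in path:
            node = node.setdefault(key, {})   # interior nodes made lazily
        node[leaf] = True

    def discard(self, arena_base):
        *path, leaf = self._keys(arena_base)
        node = self.root
        for key in path:
            node = node.get(key)
            if node is None:
                return
        node.pop(leaf, None)                  # empty nodes are not pruned

    def __contains__(self, address):
        # True if the address falls inside any registered arena.
        node = self.root
        for key in self._keys(address):
            node = node.get(key)
            if node is None:
                return False
        return True
```

The lookup reads nothing at or near the queried address; it only walks the tree. That is what lets pools be larger than a page without writing anything to the head of each page.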

[Python-Dev] Re: radix tree arena map for obmalloc

2019-06-14 Thread Inada Naoki
Oh, do you mean your branch doesn't have headers in each page? https://bugs.python.org/issue32846 As far as I remember, this bug was caused by cache thrashing (page header is aligned by 4K, so cache line can conflict often.) Or this bug can be caused by O(N) free() which is fixed already. I'll s