On Mon, Jun 5, 2017 at 8:10 PM, Tim Peters wrote:
> [Tim]
> >> So at most 9 arenas ("highwater mark") were ever simultaneously
> >> allocated [by the time the REPL prompt appeared in a 64-bit 3.6.1].
>
> > ... though not completely off-base.
>
> Yes, 9 is in the ballpark of 16.
>
>
> I think
[Larry]
> ...
> Oh! I thought it also allocated the arenas themselves, in a loop. I
> thought I saw that somewhere. Happy to be proved wrong...
There is a loop in `new_arena()`, but it doesn't do what a casual
glance may assume it's doing ;-) It's actually looping over the
newly-allocated arena descriptors, linking them into a free list.
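To make that concrete, here is a compressed sketch of that part of
new_arena(), paraphrased from Objects/obmalloc.c (fields and overflow
checks trimmed; 16 is obmalloc's INITIAL_ARENA_OBJECTS). The loop only
initializes fresh descriptors and links them into a free list; no arena
memory is allocated:

    #include <stdint.h>
    #include <stdlib.h>

    /* Simplified arena descriptor, modeled on struct arena_object
     * in Objects/obmalloc.c (fields trimmed). */
    struct arena_object {
        uintptr_t address;               /* 0: backs no arena yet */
        struct arena_object *nextarena;  /* free-list link        */
    };

    static struct arena_object *arenas = NULL;  /* descriptor vector */
    static unsigned int maxarenas = 0;          /* slots in vector   */
    static struct arena_object *unused_arena_objects = NULL;

    /* The loop in question: initialize the newly-allocated
     * descriptors and chain them into a free list. */
    static int
    grow_arena_vector(void)
    {
        unsigned int i;
        unsigned int numarenas = maxarenas ? maxarenas << 1 : 16;
        struct arena_object *p = realloc(arenas, numarenas * sizeof(*p));
        if (p == NULL)
            return -1;
        arenas = p;
        for (i = maxarenas; i < numarenas; ++i) {
            arenas[i].address = 0;  /* not backing any real arena */
            arenas[i].nextarena = i < numarenas - 1 ? &arenas[i + 1] : NULL;
        }
        unused_arena_objects = &arenas[maxarenas];
        maxarenas = numarenas;
        return 0;
    }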
On 06/04/2017 01:18 PM, Tim Peters wrote:
[Larry Hastings ]
...
Yet CPython's memory consumption continues to grow. By the time a current
"trunk" build of CPython reaches the REPL prompt it's already allocated 16
arenas.
I'd be surprised if that's true ;-) The first time `new_arena()` is
called, it allocates space for a vector of arena descriptors.
On Fri, Jun 2, 2017 at 12:33 PM Larry Hastings wrote:
>
> On 06/02/2017 02:46 AM, Victor Stinner wrote:
>
> I would be curious of another test: use pymalloc for objects larger
> than 512 bytes. For example, allocate up to 4 KB?
>
> In the past, we already changed the maximum size from 256 to 512
[Larry Hastings ]
> ...
> Yet CPython's memory consumption continues to grow. By the time a current
> "trunk" build of CPython reaches the REPL prompt it's already allocated 16
> arenas.
I'd be surprised if that's true ;-) The first time `new_arena()` is
called, it allocates space for a vector of arena descriptors.
[Tim]
>> ... That is, it's up to the bit vector implementation
>> to intelligently represent what's almost always going to be a
>> relatively tiny slice of a theoretically massive address space.
[Antoine]
> True. That works if the operating system doesn't go too wild in
> address space randomization.
On Sun, 4 Jun 2017 09:46:10 -0500
Tim Peters wrote:
> [Tim]
> >> A virtual address space span of a terabyte could hold 1M pools, so
> >> would "only" need a 1M/8 = 128KB bit vector. That's minor compared to
> >> a terabyte (one bit per megabyte).
>
> [Antoine]
> > The virtual address space currently supported by x86-64 is 48 bits wide.
[Tim]
>> A virtual address space span of a terabyte could hold 1M pools, so
>> would "only" need a 1M/8 = 128KB bit vector. That's minor compared to
>> a terabyte (one bit per megabyte).
[Antoine]
> The virtual address space currently supported by x86-64 is 48 bits
> wide (spanning an address space of 256 TB).
On Sat, 3 Jun 2017 21:46:09 -0500
Tim Peters wrote:
>
> A virtual address space span of a terabyte could hold 1M pools, so
> would "only" need a 1M/8 = 128KB bit vector. That's minor compared to
> a terabyte (one bit per megabyte).
The virtual address space currently supported by x86-64 is 48 bits
wide (spanning an address space of 256 TB).
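Tim's numbers check out; for anyone who wants to see the arithmetic,
a tiny standalone check (assuming the 1MB pools proposed just below):

    #include <stdio.h>

    int main(void)
    {
        unsigned long long span  = 1ULL << 40;   /* 1 TiB span        */
        unsigned long long pool  = 1ULL << 20;   /* 1 MiB pools       */
        unsigned long long pools = span / pool;  /* pools in the span */
        printf("%llu pools -> %llu KiB bit vector\n",
               pools, pools / 8 / 1024);
        /* prints: 1048576 pools -> 128 KiB bit vector */
        return 0;
    }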
For fun, let's multiply everything by 256:
- A "pool" becomes 1 MB.
- An "arena" becomes 64 MB.
As briefly suggested before, for any given size class a pool could
then pass out hundreds of times more objects before needing to fall
back on the slower code that creates new pools or new arenas.
As an a
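To put a number on "hundreds of times more objects", a back-of-envelope
check for the largest (512-byte) size class; the 48-byte pool header is
an assumption based on current 64-bit builds:

    #include <stdio.h>

    int main(void)
    {
        const unsigned long header = 48;   /* assumed pool_header size */
        const unsigned long block  = 512;  /* largest size class       */
        unsigned long small = (4096UL - header) / block;       /* 4 KB */
        unsigned long big   = ((1UL << 20) - header) / block;  /* 1 MB */
        printf("4KB pool: %lu blocks; 1MB pool: %lu blocks (~%lux)\n",
               small, big, big / small);
        /* prints: 4KB pool: 7 blocks; 1MB pool: 2047 blocks (~292x) */
        return 0;
    }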
On Fri, 2 Jun 2017 12:31:19 -0700
Larry Hastings wrote:
>
> Anyway, I'm not super excited by the prospect of using obmalloc for
> larger objects. There's an inverse relation between the size of
> allocation and the frequency of allocation. In Python there are lots of
> tiny allocations, but
On 06/02/2017 02:38 AM, Antoine Pitrou wrote:
I hope those are not the actual numbers you're intending to use ;-)
I still think that allocating more than 1 or 2MB at once would be
foolish. Remember this is data that's going to be carved up into
(tens of) thousands of small objects. Large objec
On 06/02/2017 12:09 PM, Tim Peters wrote:
I should note that Py_ADDRESS_IN_RANGE is my code - this isn't a
backhanded swipe at someone else.
One minor note. During the development of 3.6, CPython started
permitting some C99-isms, including static inline functions.
Py_ADDRESS_IN_RANGE was then rewritten as a static inline function,
address_in_range().
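For readers following along, the check being discussed boils down to
something like this paraphrase (field names simplified; not the
verbatim CPython code). The subtle part is that pool->arenaindex may be
read from memory obmalloc never wrote, so the three tests must together
reject any value that garbage could take:

    #include <stdint.h>

    #define ARENA_SIZE (256 << 10)            /* current 256 KiB arenas */

    struct arena_object { uintptr_t address; /* ... */ };
    struct pool_header  { unsigned int arenaindex; /* ... */ };

    extern struct arena_object *arenas;       /* the descriptor vector */
    extern unsigned int maxarenas;

    /* Does pointer p live inside an arena we manage? */
    static inline int
    address_in_range(void *p, struct pool_header *pool)
    {
        unsigned int i = pool->arenaindex;    /* possibly garbage */
        return i < maxarenas
            && (uintptr_t)p - arenas[i].address < (uintptr_t)ARENA_SIZE
            && arenas[i].address != 0;
    }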
On 06/02/2017 02:46 AM, Victor Stinner wrote:
I would be curious of another test: use pymalloc for objects larger
than 512 bytes. For example, allocate up to 4 KB?
In the past, we already changed the maximum size from 256 to 512 to
support most common Python objects on 64-bit platforms. Since
[Tim]
>> While I would like to increase the pool size, it's fraught with
>> danger.
[Antoine Pitrou ]
> What would be the point of increasing the pool size? Apart from being
> able to allocate 4KB objects out of it, I mean.
>
> Since 4KB+ objects are relatively uncommon (I mean we don't allocate
On Fri, 2 Jun 2017 13:23:05 -0500
Tim Peters wrote:
>
> While I would like to increase the pool size, it's fraught with
> danger.
What would be the point of increasing the pool size? Apart from being
able to allocate 4KB objects out of it, I mean.
Since 4KB+ objects are relatively uncommon (I
[INADA Naoki ]
> ...
> Since the current pool size is 4KB and there is a pool_header in each pool,
> we can't allocate a 4KB block from a pool.
> And if we supported 1KB blocks, only 3KB of each 4KB pool could actually be used.
> I think 512 bytes / 4KB (1/8) is a good ratio.
>
> Do you mean increase pool size?
>
> How about adding
> I would be curious of another test: use pymalloc for objects larger
> than 512 bytes. For example, allocate up to 4 KB?
Since the current pool size is 4KB and there is a pool_header in each pool,
we can't allocate a 4KB block from a pool.
And if we supported 1KB blocks, only 3KB of each 4KB pool could actually be used.
I think
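INADA's 3KB-of-4KB figure, spelled out (again assuming the 48-byte
pool header of 64-bit builds):

    #include <stdio.h>

    int main(void)
    {
        const unsigned pool = 4096, header = 48;  /* assumed header */
        unsigned usable = pool - header;          /* 4048 bytes     */
        printf("1KB blocks per 4KB pool: %u -> %u of %u bytes used\n",
               usable / 1024, (usable / 1024) * 1024, pool);
        /* prints: 1KB blocks per 4KB pool: 3 -> 3072 of 4096 bytes used */
        return 0;
    }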
I would be curious of another test: use pymalloc for objects larger
than 512 bytes. For example, allocate up to 4 KB?
In the past, we already changed the maximum size from 256 to 512 to
support most common Python objects on 64-bit platforms. Since Python
objects contain many pointers: switching fr
On Thu, 1 Jun 2017 20:21:12 -0700
Larry Hastings wrote:
> On 06/01/2017 02:50 AM, Antoine Pitrou wrote:
> > Another possible strategy is: allocate several arenas at once (using a
> > larger mmap() call), and use MADV_DONTNEED to relinquish individual
> > arenas.
>
> Thus adding a *fourth* layer
On 06/01/2017 02:50 AM, Antoine Pitrou wrote:
Another possible strategy is: allocate several arenas at once (using a
larger mmap() call), and use MADV_DONTNEED to relinquish individual
arenas.
Thus adding a *fourth* layer of abstraction over memory we get from the OS?
block -> pool -> arena -> ...
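For the record, a minimal Linux-only sketch of Antoine's quoted
strategy (sizes illustrative, error handling trimmed): one mmap()
reserves a run of arenas, and MADV_DONTNEED later relinquishes a single
arena's physical pages without unmapping it.

    #include <sys/mman.h>
    #include <stddef.h>

    #define ARENA_SIZE     (256 * 1024)  /* illustrative */
    #define ARENAS_PER_MAP 16

    int main(void)
    {
        size_t len = (size_t)ARENA_SIZE * ARENAS_PER_MAP;
        char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
            return 1;

        /* ...later, arena 3 becomes entirely empty: drop its physical
         * pages but keep its address range reserved for reuse. */
        madvise(base + 3 * ARENA_SIZE, ARENA_SIZE, MADV_DONTNEED);

        munmap(base, len);  /* eventually release the whole region */
        return 0;
    }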
On Thursday 01 June 2017 01:27 PM, Victor Stinner wrote:
> The GNU libc malloc uses a variable threshold to choose between sbrk()
> (heap memory) or mmap(). It starts at 128 kB or 256 kB, and then is
> adapted depending on the workload (I don't know how exactly).
The threshold starts at 128K and is adjusted upward: whenever a block
that was allocated with mmap() is freed, the threshold is raised to
that block's size, up to a hard maximum (32MB on 64-bit glibc).
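If that adaptive behavior is unwanted, glibc lets a program pin the
threshold instead; a minimal glibc-only example:

    #include <malloc.h>   /* glibc-specific */

    int main(void)
    {
        /* Pin the mmap threshold at 1 MiB.  Per mallopt(3), setting
         * M_MMAP_THRESHOLD explicitly also disables the dynamic
         * adjustment described above. */
        mallopt(M_MMAP_THRESHOLD, 1024 * 1024);
        return 0;
    }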
On 06/01/2017 02:03 AM, Victor Stinner wrote:
2017-06-01 10:41 GMT+02:00 Larry Hastings :
On 06/01/2017 01:19 AM, Antoine Pitrou wrote:
If you'd like to go that way anyway, I would suggest 1MB as a starting
point in 3.7.
I understand the desire for caution. But I was hoping maybe we could
It seems very complex and not portable at all to "free" a part of an
arena. We already support freeing a whole arena using munmap(). It was
a huge enhancement in our memory allocator. Change made in Python 2.5?
I don't recall, ask Evan Jones:
http://www.evanjones.ca/memoryallocator/ :-)
I'm not sur
On Thursday 01 June 2017 01:53 PM, INADA Naoki wrote:
> * On Linux, madvise(..., MADV_DONTNEED) can be used.
madvise does not reduce the commit charge in the Linux kernel, so in
high consumption scenarios (and where memory overcommit is disabled or
throttled) you'll see programs dying with OOM des
On Thu, 1 Jun 2017 18:37:17 +0900
INADA Naoki wrote:
> x86's hugepage is 2MB.
> And some Linux systems enable the "Transparent Huge Page" feature.
>
> Maybe a 2MB arena size is better for TLB efficiency,
> especially for servers with massive memory.
But, since Linux is able to merge pages transparently,
For ARENA_SIZE, would it be better to set it via ./configure,
instead of hard-coding it in the C files?
2017-06-01 17:37 GMT+08:00 INADA Naoki :
> x86's hugepage is 2MB.
> And some Linux systems enable the "Transparent Huge Page" feature.
>
> Maybe a 2MB arena size is better for TLB efficiency.
> Especiall
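One common pattern for making the size configurable (hypothetical
here; obmalloc.c currently hard-codes the value): let the C file
provide only a default, so ./configure can override it, e.g. via
CPPFLAGS="-DARENA_SIZE=2097152".

    /* In obmalloc.c: keep the historical default but allow the
     * build system to override it. (A sketch of the pattern, not
     * what the file does today.) */
    #ifndef ARENA_SIZE
    #define ARENA_SIZE (256 << 10)   /* 256 KiB default */
    #endif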
x86's hugepage is 2MB.
And some Linux systems enable the "Transparent Huge Page" feature.
Maybe a 2MB arena size is better for TLB efficiency,
especially for servers with massive memory.
On Thu, Jun 1, 2017 at 4:38 PM, Larry Hastings wrote:
>
>
> When CPython's small block allocator was first merged in
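On the hugepage point: with 2MB arenas one could also hint the kernel
explicitly; a Linux-only sketch (requires CONFIG_TRANSPARENT_HUGEPAGE;
alignment caveat in the comment):

    #include <sys/mman.h>

    #define ARENA_SIZE (2 * 1024 * 1024)   /* one x86-64 hugepage */

    int main(void)
    {
        char *p = mmap(NULL, ARENA_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return 1;
        /* Ask THP to back this range with huge pages.  The kernel can
         * only do so for 2MB-aligned 2MB spans, so a real allocator
         * would over-allocate and align the arena start. */
        madvise(p, ARENA_SIZE, MADV_HUGEPAGE);
        munmap(p, ARENA_SIZE);
        return 0;
    }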
On Thu, Jun 1, 2017 at 10:45 AM, Larry Hastings wrote:
> On 06/01/2017 01:41 AM, Larry Hastings wrote:
>
> On 06/01/2017 01:19 AM, Antoine Pitrou wrote:
>
> malloc() you said? Arenas are allocated using mmap() nowadays, right?
>
> malloc() and free(). See _PyObject_ArenaMalloc (etc) in
> Objects/obmalloc.c.
Thanks for the detailed info.
But I don't think it's a big problem.
Arenas are returned to the system by chance, so other processes
shouldn't rely on that.
And I don't propose to stop returning arenas to the system.
I just mean that per-pool (part of an arena) MADV_DONTNEED can reduce RSS.
If we use very large are
> * On Linux, madvise(..., MADV_DONTNEED) can be used.
Recent Linux has MADV_FREE. It is faster than MADV_DONTNEED:
https://lwn.net/Articles/591214/
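A sketch of how an allocator might use that, preferring MADV_FREE
where the kernel supports it (MADV_FREE appeared in Linux 4.5; EINVAL
signals an unsupported advice value):

    #include <sys/mman.h>
    #include <stddef.h>
    #include <errno.h>

    /* Prefer the lazy MADV_FREE, falling back to the eager
     * MADV_DONTNEED on kernels that reject it. */
    static int
    release_pages(void *addr, size_t len)
    {
    #ifdef MADV_FREE
        if (madvise(addr, len, MADV_FREE) == 0)
            return 0;
        if (errno != EINVAL)      /* EINVAL: advice not supported */
            return -1;
    #endif
        return madvise(addr, len, MADV_DONTNEED);
    }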
2017-06-01 10:41 GMT+02:00 Larry Hastings :
> On 06/01/2017 01:19 AM, Antoine Pitrou wrote:
> If you'd like to go that way anyway, I would suggest 1MB as a starting
> point in 3.7.
>
> I understand the desire for caution. But I was hoping maybe we could
> experiment with 4mb in trunk for a while?
On Thu, 1 Jun 2017 01:41:15 -0700
Larry Hastings wrote:
> On 06/01/2017 01:19 AM, Antoine Pitrou wrote:
> > If you'd like to go that way anyway, I would suggest 1MB as a starting
> > point in 3.7.
>
> I understand the desire for caution. But I was hoping maybe we could
> experiment with 4mb i
On 06/01/2017 01:41 AM, Larry Hastings wrote:
On 06/01/2017 01:19 AM, Antoine Pitrou wrote:
malloc() you said? Arenas are allocated using mmap() nowadays, right?
malloc() and free(). See _PyObject_ArenaMalloc (etc) in
Objects/obmalloc.c.
Oh, sorry, I forgot how to read. If ARENAS_USE_MMAP
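For reference, the shape of that code, paraphrased (the real functions
are _PyObject_ArenaMalloc and _PyObject_ArenaMmap in
Objects/obmalloc.c; this sketch abbreviates them and their selection):

    #include <stdlib.h>
    #include <stddef.h>
    #ifdef ARENAS_USE_MMAP
    #include <sys/mman.h>
    #endif

    /* Paraphrase of the arena allocator selection: anonymous mmap()
     * where available (ARENAS_USE_MMAP), plain malloc() otherwise. */
    static void *
    arena_alloc(size_t size)
    {
    #ifdef ARENAS_USE_MMAP
        void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? NULL : p;
    #else
        return malloc(size);
    #endif
    }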
On 06/01/2017 01:19 AM, Antoine Pitrou wrote:
If you'd like to go that way anyway, I would suggest 1MB as a starting
point in 3.7.
I understand the desire for caution. But I was hoping maybe we could
experiment with 4mb in trunk for a while? We could change it to 1mb--or
even 256k--before b
2017-06-01 10:23 GMT+02:00 INADA Naoki :
> AFAIK, allocating an arena doesn't eat real (physical) memory.
>
> * On Windows, VirtualAlloc is used for arenas. A real memory page is assigned
> when the page is first used.
> * On Linux and some other *nix, anonymous mmap is used. A real page is
> assigned on first touch, as on Windows.
2017-06-01 10:19 GMT+02:00 Antoine Pitrou :
> Yes, this is the same kind of reason the default page size is still 4KB
> on many platforms today, despite typical memory size having grown by a
> huge amount. Apart from the cost of fragmentation as you mentioned,
> another issue is when many small Py
Hello.
AFAIK, allocating an arena doesn't eat real (physical) memory.
* On Windows, VirtualAlloc is used for arenas. A real memory page is assigned
when the page is first used.
* On Linux and some other *nix, anonymous mmap is used. A real page is
assigned on first touch, as on Windows.
Aren
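That lazy behavior is easy to observe on Linux (compare VmRSS in
/proc/self/status before and after the touch); a minimal Linux-only
demo:

    #include <sys/mman.h>
    #include <string.h>
    #include <stdio.h>

    int main(void)
    {
        size_t len = (size_t)1 << 30;  /* reserve 1 GiB of addresses */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return 1;
        /* RSS has barely moved here: pages exist only as mappings. */
        memset(p, 1, 16 * 4096);  /* first touch: 16 pages become real */
        printf("mapped %zu bytes, touched %d\n", len, 16 * 4096);
        munmap(p, len);
        return 0;
    }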
On Thu, 1 Jun 2017 00:38:09 -0700
Larry Hastings wrote:
> * CPython programs would use more memory. How much? Hard to say. It
> depends on their allocation strategy. I suspect most of the time it
> would be < 3mb additional memory. But for pathological allocation
> strategies th
On 06/01/2017 12:57 AM, Victor Stinner wrote:
I would prefer to have an adaptative arena size. For example start at
256 kB and then double the arena size while the memory usage grows.
The problem is that pymalloc is currently designed for a fixed arena
size. I have no idea how hard it would be to
2017-06-01 9:38 GMT+02:00 Larry Hastings :
> When CPython's small block allocator was first merged in late February 2001,
> it allocated memory in gigantic chunks it called "arenas". These arenas
> were a massive 256 KILOBYTES apiece.
The arena size defines the strict minimum memory usage of Python.
On 06/01/2017 12:38 AM, Larry Hastings wrote:
I propose we make the arena size larger. By how much? I asked Victor
to run some benchmarks with arenas of 1mb, 2mb, and 4mb. The results
with 1mb and 2mb were mixed, but his benchmarks with a 4mb arena size
showed measurable (>5%) speedups on t
When CPython's small block allocator was first merged in late February
2001, it allocated memory in gigantic chunks it called "arenas". These
arenas were a massive 256 KILOBYTES apiece.
This tunable parameter has not been touched in the intervening 16
years. Yet CPython's memory consumption continues to grow. By the time
a current "trunk" build of CPython reaches the REPL prompt it's already
allocated 16 arenas.