Re: [Cython] [cython-users] freelist benchmarks

2013-02-24 Thread Stefan Behnel
mark florisson, 24.02.2013 15:50:
> On 24 February 2013 13:52, Stefan Behnel wrote:
>> for those who haven't notice my other e-mail, I implemented a new extension
>> type decorator "@cython.freelist(N)" that replaces the normal object
>> creation and deallocation with a freelist of N recently freed objects.
>> Currently, it's only supported for types that do not have a base class (and
>> lifting that restriction is not all that easy).
> 
> Very cool! I've been wanting that for a while now :)

So did I.


> What's the hurdle with base classes?

The problem is that the current way types are being instantiated is
recursive. The top-most base class calls tp_alloc() and then each step in
the hierarchy adds its bit of initialisation. If you want to introduce a
freelist into this scheme, then it's still the top-most class that does the
allocation, so it would need to manage all freelists of all of its children
in order to return the right object struct for a given instantiation request.

This cannot really be done at compile time. Imagine a subtype in a
different module, for example, for which the code requests a freelist. The
compilation of the base type wouldn't even know that it's supposed to
manage a freelist at all, only the subtype knows it.

There are a couple of ways to deal with this. One is to replicate the
freelist in the base type for all subtypes that it finds at runtime. That
might actually be the easiest way to do it, but it requires a bit of memory
management in order to add a new freelist when a new subtype is found at
runtime. It also means that we'd have to find the right freelist before we
can get an object from it (or not, if it's empty), which would likely be an
operation that's linear with the number of subtypes. And the freelist set
would better be bounded in size to prevent users from flooding it with lots
and lots of subtypes.

Another option would be to split the initialisation up into two functions,
one that allocates *and* initialises the instance and one that *only*
initialises it. That would allow each hierarchy level to manage its own
freelists and to take its own decision about where to get the object from.
This approach comes with a couple of tricky details, as CPython doesn't
provide support for this. So we'd need to find a way to handle type
hierarchies that are implemented across modules.

Maybe the best approach would be to let the base type manage everything and
just statically limit the maximum number of subtypes for which it provides
separate freelists, at a first come, first serve basis. And the freelist
selection could be based on the object struct size (tp_basicsize) instead
of the specific type. As long as we don't support inheriting from variable
size objects (like tuple/bytes), that would cut down the problem quite
nicely. I think I should just give it a try at some point.

Stefan

___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] [cython-users] freelist benchmarks

2013-02-24 Thread mark florisson
On 24 February 2013 17:50, mark florisson  wrote:
> On 24 February 2013 15:58, Stefan Behnel  wrote:
>> mark florisson, 24.02.2013 15:50:
>>> On 24 February 2013 13:52, Stefan Behnel wrote:
 for those who haven't notice my other e-mail, I implemented a new extension
 type decorator "@cython.freelist(N)" that replaces the normal object
 creation and deallocation with a freelist of N recently freed objects.
 Currently, it's only supported for types that do not have a base class (and
 lifting that restriction is not all that easy).
>>>
>>> Very cool! I've been wanting that for a while now :)
>>
>> So did I.
>>
>>
>>> What's the hurdle with base classes?
>>
>> The problem is that the current way types are being instantiated is
>> recursive. The top-most base class calls tp_alloc() and then each step in
>> the hierarchy adds its bit of initialisation. If you want to introduce a
>> freelist into this scheme, then it's still the top-most class that does the
>> allocation, so it would need to manage all freelists of all of its children
>> in order to return the right object struct for a given instantiation request.
>>
>> This cannot really be done at compile time. Imagine a subtype in a
>> different module, for example, for which the code requests a freelist. The
>> compilation of the base type wouldn't even know that it's supposed to
>> manage a freelist at all, only the subtype knows it.
>>
>> There are a couple of ways to deal with this. One is to replicate the
>> freelist in the base type for all subtypes that it finds at runtime. That
>> might actually be the easiest way to do it, but it requires a bit of memory
>> management in order to add a new freelist when a new subtype is found at
>> runtime. It also means that we'd have to find the right freelist before we
>> can get an object from it (or not, if it's empty), which would likely be an
>> operation that's linear with the number of subtypes. And the freelist set
>> would better be bounded in size to prevent users from flooding it with lots
>> and lots of subtypes.
>>
>> Another option would be to split the initialisation up into two functions,
>> one that allocates *and* initialises the instance and one that *only*
>> initialises it. That would allow each hierarchy level to manage its own
>> freelists and to take its own decision about where to get the object from.
>> This approach comes with a couple of tricky details, as CPython doesn't
>> provide support for this. So we'd need to find a way to handle type
>> hierarchies that are implemented across modules.
>
> Thanks for the explanation Stefan, this is the one I was thinking of,
> but I suppose it'd need an extra pointer to the pure init function in
> the type.

Hm, since extension types don't do multiple inheritance (and excluding
Python subclasses), couldn't you import those init functions across
modules through capsules?

>> Maybe the best approach would be to let the base type manage everything and
>> just statically limit the maximum number of subtypes for which it provides
>> separate freelists, at a first come, first serve basis. And the freelist
>> selection could be based on the object struct size (tp_basicsize) instead
>> of the specific type. As long as we don't support inheriting from variable
>> size objects (like tuple/bytes), that would cut down the problem quite
>> nicely. I think I should just give it a try at some point.
>
> What about using pyextensible type from SEP 200 and using a custom
> freelist entry on the type?
>
>> Stefan
>>
>> --
>>
>> ---
>> You received this message because you are subscribed to the Google Groups 
>> "cython-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to cython-users+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>>
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] [cython-users] freelist benchmarks

2013-02-24 Thread mark florisson
On 24 February 2013 15:58, Stefan Behnel  wrote:
> mark florisson, 24.02.2013 15:50:
>> On 24 February 2013 13:52, Stefan Behnel wrote:
>>> for those who haven't notice my other e-mail, I implemented a new extension
>>> type decorator "@cython.freelist(N)" that replaces the normal object
>>> creation and deallocation with a freelist of N recently freed objects.
>>> Currently, it's only supported for types that do not have a base class (and
>>> lifting that restriction is not all that easy).
>>
>> Very cool! I've been wanting that for a while now :)
>
> So did I.
>
>
>> What's the hurdle with base classes?
>
> The problem is that the current way types are being instantiated is
> recursive. The top-most base class calls tp_alloc() and then each step in
> the hierarchy adds its bit of initialisation. If you want to introduce a
> freelist into this scheme, then it's still the top-most class that does the
> allocation, so it would need to manage all freelists of all of its children
> in order to return the right object struct for a given instantiation request.
>
> This cannot really be done at compile time. Imagine a subtype in a
> different module, for example, for which the code requests a freelist. The
> compilation of the base type wouldn't even know that it's supposed to
> manage a freelist at all, only the subtype knows it.
>
> There are a couple of ways to deal with this. One is to replicate the
> freelist in the base type for all subtypes that it finds at runtime. That
> might actually be the easiest way to do it, but it requires a bit of memory
> management in order to add a new freelist when a new subtype is found at
> runtime. It also means that we'd have to find the right freelist before we
> can get an object from it (or not, if it's empty), which would likely be an
> operation that's linear with the number of subtypes. And the freelist set
> would better be bounded in size to prevent users from flooding it with lots
> and lots of subtypes.
>
> Another option would be to split the initialisation up into two functions,
> one that allocates *and* initialises the instance and one that *only*
> initialises it. That would allow each hierarchy level to manage its own
> freelists and to take its own decision about where to get the object from.
> This approach comes with a couple of tricky details, as CPython doesn't
> provide support for this. So we'd need to find a way to handle type
> hierarchies that are implemented across modules.

Thanks for the explanation Stefan, this is the one I was thinking of,
but I suppose it'd need an extra pointer to the pure init function in
the type.

> Maybe the best approach would be to let the base type manage everything and
> just statically limit the maximum number of subtypes for which it provides
> separate freelists, at a first come, first serve basis. And the freelist
> selection could be based on the object struct size (tp_basicsize) instead
> of the specific type. As long as we don't support inheriting from variable
> size objects (like tuple/bytes), that would cut down the problem quite
> nicely. I think I should just give it a try at some point.

What about using pyextensible type from SEP 200 and using a custom
freelist entry on the type?

> Stefan
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups 
> "cython-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to cython-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] [cython-users] freelist benchmarks

2013-02-24 Thread Stefan Behnel
mark florisson, 24.02.2013 18:56:
> On 24 February 2013 17:50, mark florisson  wrote:
>> On 24 February 2013 15:58, Stefan Behnel  wrote:
>>> mark florisson, 24.02.2013 15:50:
 On 24 February 2013 13:52, Stefan Behnel wrote:
> for those who haven't notice my other e-mail, I implemented a new 
> extension
> type decorator "@cython.freelist(N)" that replaces the normal object
> creation and deallocation with a freelist of N recently freed objects.
> Currently, it's only supported for types that do not have a base class 
> (and
> lifting that restriction is not all that easy).

 Very cool! I've been wanting that for a while now :)
>>>
>>> So did I.
>>>
>>>
 What's the hurdle with base classes?
>>>
>>> The problem is that the current way types are being instantiated is
>>> recursive. The top-most base class calls tp_alloc() and then each step in
>>> the hierarchy adds its bit of initialisation. If you want to introduce a
>>> freelist into this scheme, then it's still the top-most class that does the
>>> allocation, so it would need to manage all freelists of all of its children
>>> in order to return the right object struct for a given instantiation 
>>> request.
>>>
>>> This cannot really be done at compile time. Imagine a subtype in a
>>> different module, for example, for which the code requests a freelist. The
>>> compilation of the base type wouldn't even know that it's supposed to
>>> manage a freelist at all, only the subtype knows it.
>>>
>>> There are a couple of ways to deal with this. One is to replicate the
>>> freelist in the base type for all subtypes that it finds at runtime. That
>>> might actually be the easiest way to do it, but it requires a bit of memory
>>> management in order to add a new freelist when a new subtype is found at
>>> runtime. It also means that we'd have to find the right freelist before we
>>> can get an object from it (or not, if it's empty), which would likely be an
>>> operation that's linear with the number of subtypes. And the freelist set
>>> would better be bounded in size to prevent users from flooding it with lots
>>> and lots of subtypes.
>>>
>>> Another option would be to split the initialisation up into two functions,
>>> one that allocates *and* initialises the instance and one that *only*
>>> initialises it. That would allow each hierarchy level to manage its own
>>> freelists and to take its own decision about where to get the object from.
>>> This approach comes with a couple of tricky details, as CPython doesn't
>>> provide support for this. So we'd need to find a way to handle type
>>> hierarchies that are implemented across modules.
>>
>> Thanks for the explanation Stefan, this is the one I was thinking of,
>> but I suppose it'd need an extra pointer to the pure init function in
>> the type.
> 
> Hm, since extension types don't do multiple inheritance (and excluding
> Python subclasses), couldn't you import those init functions across
> modules through capsules?

Well, yes, I suppose you could. However, that's quite some overhead. I
think it's way easier to just provision a couple of freelists in advance
and assign them to different subtype sizes as they come in. Even in
somewhat large hierarchies, I doubt that the object structs will have all
that many different sizes. Remember, the size only changes when you add
cdef attributes, and only once when you start adding cdef methods. And even
structs that appear in different subtrees of the hierarchy and that carry
different attributes may end up having the same struct size due to layout
coincidences. I would expect that even a type hierarchy of, say, 20 types,
would have at most some 4-8 different struct sizes. Most of the time,
subtypes are there to change behaviour, not state.

The only real drawback is that you need to enable the base type to do all
that's necessary, which you may not have control over in a few cases. But
then again, if it's worth using a freelist on one subtype, it's probably
worth using it in general, so it's best to fix the base type in any way.


>>> Maybe the best approach would be to let the base type manage everything and
>>> just statically limit the maximum number of subtypes for which it provides
>>> separate freelists, at a first come, first serve basis. And the freelist
>>> selection could be based on the object struct size (tp_basicsize) instead
>>> of the specific type. As long as we don't support inheriting from variable
>>> size objects (like tuple/bytes), that would cut down the problem quite
>>> nicely. I think I should just give it a try at some point.

I changed the current type pointer check to look at tp_basicsize instead.
That made it work for almost all classes in lxml's own Element hierarchy,
with only a couple of exceptions in lxml.objectify that have one additional
object field. So, just extending the freelist support to use two different
lists for different struct sizes instead of just one would make it work for
all 

Re: [Cython] [cython-users] freelist benchmarks

2013-02-24 Thread David Roe
I changed the current type pointer check to look at tp_basicsize instead.

> That made it work for almost all classes in lxml's own Element hierarchy,
> with only a couple of exceptions in lxml.objectify that have one additional
> object field. So, just extending the freelist support to use two different
> lists for different struct sizes instead of just one would make it work for
> all of lxml already. Taking a look at Sage to see how the situation appears
> over there would be interesting, I guess.
>

I found some chains of length 5.  This could be shortened to 4 by putting
the freelist at the level of Element (which is where you most care about
speed of object creation).

SageObject
-> Element (_parent attribute and cdef methods)
-> Vector (_degree)
-> FreeModuleElement (_is_mutable)
-> FreeModuleElement_generic_dense (_entries)

SageObject
-> Element (_parent attribute and cdef methods)
->sage.structure.element.Matrix (_nrows)
-> sage.matrix.matrix.Matrix (_base_ring)
-> Matrix_integer_dense (_entries)

This does look cool to have though.
David
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel