On Mon, Feb 25, 2013 at 1:17 AM, Stefan Behnel <stefan...@behnel.de> wrote:
> Hi,
>
> thanks for looking through it.
>
> David Roe, 25.02.2013 00:00:
>> I changed the current type pointer check to look at tp_basicsize instead.
>>
>>> That made it work for almost all classes in lxml's own Element hierarchy,
>>> with only a couple of exceptions in lxml.objectify that have one
>>> additional object field. So, just extending the freelist support to use
>>> two different lists for different struct sizes instead of just one would
>>> make it work for all of lxml already. Taking a look at Sage to see how
>>> the situation appears over there would be interesting, I guess.
>>
>> I found some chains of length 5. This could be shortened to 4 by putting
>> the freelist at the level of Element (which is where you most care about
>> speed of object creation).
>
> It's substantially easier to keep it in the top-level base class, though.
> Otherwise, we'd need a new protocol between inheriting types, as I
> previously described. That adds a *lot* of complexity.
>
>> SageObject
>> -> Element (_parent attribute and cdef methods)
>> -> Vector (_degree)
>> -> FreeModuleElement (_is_mutable)
>> -> FreeModuleElement_generic_dense (_entries)
>>
>> SageObject
>> -> Element (_parent attribute and cdef methods)
>> -> sage.structure.element.Matrix (_nrows)
>> -> sage.matrix.matrix.Matrix (_base_ring)
>> -> Matrix_integer_dense (_entries)
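(The "freelist support" referred to above is presumably the @cython.freelist
decorator that landed in Cython 0.18. A minimal sketch of how it meets a
hierarchy with differently sized subclasses -- class names here are purely
illustrative, not lxml's or Sage's actual types:

    cimport cython

    @cython.freelist(8)            # keep up to 8 freed instances for reuse
    cdef class Element:
        cdef object _parent        # one object field

    cdef class ModuleElement(Element):
        pass                       # no new C fields, so tp_basicsize is
                                   # unchanged; the tp_basicsize check
                                   # mentioned above lets these instances
                                   # share Element's freelist

    cdef class DenseElement(ModuleElement):
        cdef object _entries       # extra field => larger tp_basicsize;
                                   # this is the case that would need a
                                   # second, per-size freelist

The chains below show how many such size steps a realistic Sage hierarchy
actually contains.)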
I don't know that (expensive) matrices are the best example, and often the
chains are longer for the elements one really cares about.

sage: def base_tree(x): return [] if x is None else [x] + base_tree(x.__base__)
...:
sage: base_tree(Integer)
[<type 'sage.rings.integer.Integer'>,
 <type 'sage.structure.element.EuclideanDomainElement'>,
 <type 'sage.structure.element.PrincipalIdealDomainElement'>,
 <type 'sage.structure.element.DedekindDomainElement'>,
 <type 'sage.structure.element.IntegralDomainElement'>,
 <type 'sage.structure.element.CommutativeRingElement'>,
 <type 'sage.structure.element.RingElement'>,
 <type 'sage.structure.element.ModuleElement'>,
 <type 'sage.structure.element.Element'>,
 <type 'sage.structure.sage_object.SageObject'>,
 <type 'object'>]
sage: base_tree(RealDoubleElement)
[<type 'sage.rings.real_double.RealDoubleElement'>,
 <type 'sage.structure.element.FieldElement'>,
 <type 'sage.structure.element.CommutativeRingElement'>,
 <type 'sage.structure.element.RingElement'>,
 <type 'sage.structure.element.ModuleElement'>,
 <type 'sage.structure.element.Element'>,
 <type 'sage.structure.sage_object.SageObject'>,
 <type 'object'>]
sage: base_tree(type(mod(1, 10)))
[<type 'sage.rings.finite_rings.integer_mod.IntegerMod_int'>,
 <type 'sage.rings.finite_rings.integer_mod.IntegerMod_abstract'>,
 <type 'sage.rings.finite_rings.element_base.FiniteRingElement'>,
 <type 'sage.structure.element.CommutativeRingElement'>,
 <type 'sage.structure.element.RingElement'>,
 <type 'sage.structure.element.ModuleElement'>,
 <type 'sage.structure.element.Element'>,
 <type 'sage.structure.sage_object.SageObject'>,
 <type 'object'>]

> Ok, so even for something as large as Sage, we'd apparently end up with
> just a couple of freelists for a given base type. That really makes it
> appear reasonable to make that number a compile time constant as well. I
> mean, even if you *really* oversize it, all you lose is the static memory
> for a couple of pointers. On a 64 bit system, if you use a freelist size of
> 8 objects and provision freelists for 8 differently sized subtypes, that's
> 8*8*8 bytes in total, or half a KB, statically allocated. Even a hundred
> times that size shouldn't hurt anyone. Unused subtype freelists really take
> almost no space and won't hurt performance either.

Elements in Sage are typically larger than 8 bytes, and our experiments for
Integer showed that the benefit (for this class) extended well beyond 8
items. On the other hand, lots of elements are so expensive to create that
they don't merit this at all.

I think one thing to keep in mind is that Python's heap is essentially a
"freelist" of objects of every size up to 128(?) bytes, so what are we
trying to save by putting it at the base type and walking up and down the
__cinit__/__dealloc__ chain? I suppose we save zeroing out the memory and a
function call or two, but that's not the expensive part. For our Integer
freelist, we save walking the __cinit__/__dealloc__ chain, initializing a
couple of members, and setting the vtable pointers, which turns out to be
the bulk of the cost.

I'd love to see something like this work, even if just for final classes.
It may require introducing new functions to determine exactly how much
cleanup/setup we want to do when inserting/removing objects from the pool
rather than giving up the memory completely.

>> This does look cool to have though.
>
> It definitely is.

Yes!

- Robert
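P.S. A rough sketch of the kind of hand-rolled, single-class pool described
for Integer above (illustrative names, with a Python-level list standing in
for the pool; Sage's actual implementation works on raw object pointers at
the C level and hooks tp_dealloc so recycling happens automatically):

    cdef list _pool = []         # spare, fully constructed instances
    cdef int _POOL_SIZE = 100    # the benefit for Integer extended well
                                 # beyond 8 items, so keep the pool larger

    cdef class IntegerLike:      # stand-in for a "final" element class
        cdef object _parent
        cdef long _value

    cdef IntegerLike new_element(long value):
        cdef IntegerLike x
        if _pool:
            # Fast path: type pointer and vtable are already set, so the
            # __cinit__ chain, memory zeroing and most member
            # initialization are skipped -- the savings described above.
            x = <IntegerLike>_pool.pop()
        else:
            x = IntegerLike.__new__(IntegerLike)
        x._value = value
        return x

    cdef void recycle(IntegerLike x):
        # Called instead of letting the object die; keeps at most
        # _POOL_SIZE instances alive for reuse.
        if len(_pool) < _POOL_SIZE:
            _pool.append(x)

The open question raised above is what a generic, Cython-generated version
of new_element/recycle should and should not redo on the reuse path.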