[Python-Dev] Re: Using vectorcall for tp_new and tp_init

2019-06-09 Thread Mark Shannon




On 07/06/2019 11:41 am, Jeroen Demeyer wrote:

Hello,

I'm starting this thread to brainstorm for using vectorcall to speed up 
creating instances of Python classes.


Currently the following happens when creating an instance of a Python 
class X using X(...) and assuming that __new__ and __init__ are Python 
functions and that the metaclass of X is simply "type":


1. type_call (the tp_call wrapper for type) is invoked with arguments 
(X, args, kwargs).


2. type_call calls slot_tp_new with arguments (X, args, kwargs).

3. slot_tp_new calls X.__new__, prepending X to the args tuple. A new 
object obj is returned.


4. type_call calls slot_tp_init with arguments (obj, args, kwargs).

5. slot_tp_init calls type(obj).__init__, prepending obj to the 
args tuple. __init__ must return None; type_call then returns obj.
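
The five steps above can be modelled roughly in pure Python. This is only a sketch of the control flow, not the C code in typeobject.c: the name type_call is reused for readability, Point is a made-up example class, and the special case type(x) with a single argument is ignored.

```python
def type_call(cls, *args, **kwargs):
    """Rough Python model of what calling a class does (steps 1-5)."""
    # Steps 2-3: the tp_new slot calls cls.__new__ with cls prepended.
    obj = cls.__new__(cls, *args, **kwargs)
    # If __new__ returned something that is not an instance of cls,
    # __init__ is skipped and the object is returned as-is.
    if not isinstance(obj, cls):
        return obj
    # Steps 4-5: the tp_init slot calls type(obj).__init__ with obj prepended.
    type(obj).__init__(obj, *args, **kwargs)
    return obj


class Point:  # hypothetical example class with Python-level __new__ and __init__
    def __new__(cls, x, y=0):
        return super().__new__(cls)

    def __init__(self, x, y=0):
        self.x, self.y = x, y


p = type_call(Point, 1, y=2)
q = Point(1, y=2)
print((p.x, p.y) == (q.x, q.y))
```

Note that the same (args, kwargs) pair is forwarded twice, once to __new__ and once to __init__, which is where the repeated repacking comes from.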


In the worst case, no fewer than 6 temporary objects are needed just to 
pass arguments around:


1. An args tuple and kwargs dict for tp_call

3. An args array with X prepended and a kwnames tuple for __new__

5. An args array with obj prepended and a kwnames tuple for __init__

This is clearly not as efficient as it could be.

An obvious solution would be to introduce variants of tp_new and tp_init 
using the vectorcall protocol. Assuming PY_VECTORCALL_ARGUMENTS_OFFSET 
is used, all 6 temporary allocations could be dropped. The 
implementation could be in the form of two new slots tp_vector_new and 
tp_vector_init. Since we're just dealing with type slots here (as 
opposed to offsets in an object structure), this should be easier to 
implement than PEP 590 itself.
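
To illustrate why the offset flag helps, here is a small Python model of the vectorcall argument layout: one flat array holding the positional arguments followed by the keyword values, plus a tuple of keyword names. The helper names pack and call_with_prepended are invented for this sketch, and the slicing at the end allocates in Python; in C the callee would just read the same array from one slot earlier. The point is only that the caller leaves a spare slot in front, which the callee may overwrite temporarily and must restore.

```python
def pack(args, kwargs):
    """Flatten (args, kwargs) into a vectorcall-style layout.

    Slot 0 is the spare slot that PY_VECTORCALL_ARGUMENTS_OFFSET
    promises the callee may borrow temporarily.
    """
    kwnames = tuple(kwargs)
    buf = [None, *args, *kwargs.values()]
    return buf, len(args), kwnames


def call_with_prepended(func, first, buf, nargs, kwnames):
    """Call func(first, *positional, **keywords) by reusing the spare slot."""
    saved = buf[0]
    buf[0] = first  # "prepend" without building a new argument array
    try:
        positional = buf[:nargs + 1]
        keywords = dict(zip(kwnames, buf[nargs + 1:]))
        return func(*positional, **keywords)
    finally:
        buf[0] = saved  # the callee must restore the borrowed slot


def show(*a, **k):
    return a, k


buf, nargs, kwnames = pack((1, 2), {"z": 3})
result_new = call_with_prepended(show, "X", buf, nargs, kwnames)
result_init = call_with_prepended(show, "obj", buf, nargs, kwnames)
print(result_new, result_init)
```

The same buffer serves both the __new__-style call and the __init__-style call, which corresponds to dropping the separate temporaries listed for steps 3 and 5.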


Relatively few classes override __new__, which means that object.__new__ 
can be inlined. Something like this (which needs a bit of cleaning up):

https://github.com/markshannon/cpython/commit/9ff46e3ba0747f386f9519933910d63d5caae6ee
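
A rough Python-level illustration of the idea follows. It is not the code in the linked commit, which works in C on the type slots. The function name fast_type_call and the two example classes are made up for this sketch. When a class does not override __new__, the lookup yields object.__new__ itself, so the allocation can be done directly and the arguments only need to be forwarded once, to __init__.

```python
def fast_type_call(cls, *args, **kwargs):
    """Sketch: skip the generic __new__ dispatch when it is not overridden."""
    if cls.__new__ is object.__new__:
        # Common case: allocate directly, no arguments forwarded to __new__.
        obj = object.__new__(cls)
    else:
        obj = cls.__new__(cls, *args, **kwargs)
        if not isinstance(obj, cls):
            return obj
    type(obj).__init__(obj, *args, **kwargs)
    return obj


class Plain:  # hypothetical class that only defines __init__
    def __init__(self, value):
        self.value = value


class Custom:  # hypothetical class that overrides __new__ as well
    def __new__(cls, value):
        return super().__new__(cls)

    def __init__(self, value):
        self.value = value


a = fast_type_call(Plain, 10)
b = fast_type_call(Custom, 20)
print(Plain.__new__ is object.__new__, Custom.__new__ is object.__new__)
```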

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/M6PJSOSQUE4YRYWJPMFO6K2NIH7OQGAP/


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-09 Thread Tim Peters
[Tim]
> For the current obmalloc, I have in mind a different way ...
> Not ideal, but ... captures the important part (more objects
> in a pool -> more times obmalloc can remain in its
> fastest "all within the pool" paths).

And now there's a PR that removes obmalloc's limit on pool sizes, and,
for a start, quadruples pool (and arena!) sizes on 64-bit boxes:

https://github.com/python/cpython/pull/13934

https://bugs.python.org/issue37211
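
For a feel of what "more objects in a pool" means, here is a back-of-the-envelope calculation. The numbers are assumptions rather than values read from the PR: a 4 KiB pool as the starting point, four times that after the change, and a 48-byte pool header on a 64-bit build. The helper name blocks_per_pool is made up for the sketch.

```python
POOL_OVERHEAD = 48  # assumed pool header size in bytes on a 64-bit build

OLD_POOL = 4 * 1024     # assumed current pool size in bytes
NEW_POOL = 4 * OLD_POOL  # "quadruples" as described above


def blocks_per_pool(pool_size, block_size, overhead=POOL_OVERHEAD):
    """Number of fixed-size blocks that fit in one pool after the header."""
    return (pool_size - overhead) // block_size


table = {
    size: (blocks_per_pool(OLD_POOL, size), blocks_per_pool(NEW_POOL, size))
    for size in (16, 64, 512)
}
for size, (old, new) in table.items():
    print(f"{size:4d}-byte blocks: {old:4d} per pool -> {new:4d} per pool")
```

Under these assumptions the gain is largest in relative terms for the biggest size classes, where the fixed header eats a larger share of a small pool.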

As the PR says,

"""
It would be great to get feedback from 64-bit apps that do massive
amounts of small-object allocations and deallocations.
"""

:-)
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Z4YIHDGNZLP4WX5HVEBXDSFIDIWPTTYK/