> Then the correct endpoint would more likely be PyObject_New(), but
> there's no way to intercept such calls for statistical analysis
> currently. And as you wrote, if some code decides to use PyMalloc()
> directly, then that memory won't be tracked.

The function that CPython uses in debug mode to track all references for
sys.getobjects() is PyObject_Init(), which is heavily inlined for performance
reasons (as are many of the other calls in the allocation chain), so
unfortunately it is not possible to intercept it using LD_PRELOAD or even
GOT patching.
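
For contrast, what can be hooked today is the allocator level, which is what
the stdlib tracemalloc module does. The sketch below is only an illustration
(the helper name and the workload are made up for this example); note that it
reports memory blocks grouped by the source line that allocated them, not
objects, which is exactly the limitation discussed in this thread.

```python
import tracemalloc


def top_allocation_sites(workload, limit=3):
    """Run `workload` under tracemalloc and return the largest allocation sites.

    The statistics describe memory blocks, grouped by allocating source line.
    They say nothing about whether a block holds a PyObject or a raw buffer.
    """
    tracemalloc.start()
    try:
        # Keep the result referenced so its memory is still live at snapshot time.
        result = workload()
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    del result
    return snapshot.statistics("lineno")[:limit]


def build_lists():
    # Hypothetical workload: many small list objects.
    return [[i] for i in range(10_000)]


if __name__ == "__main__":
    for stat in top_allocation_sites(build_lists):
        print(stat)
```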

> It sounds like the provided C API is a bit too low level for this,
> preventing any kind of statistical analysis of the allocation patterns.
> :(

Yes, allowing interception or customization would prevent inlining and other
optimizations, and would therefore involve a considerable performance hit. As
an experiment, I forced PyObject_Init and _PyObject_Init to not be inlined, and
that caused a 7-13% slowdown overall in the performance test suite.

If you want to track all object creation, your best bet IMHO is a debug
build and sys.getobjects().
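
A minimal sketch of what that could look like, assuming an interpreter where
sys.getobjects() is actually present (on recent CPython versions that means a
build configured with --with-trace-refs; on a regular build the attribute does
not exist). The helper name is made up for this example, and it simply returns
None when the function is unavailable.

```python
import sys
from collections import Counter


def count_live_objects_by_type(limit=0):
    """Count live objects per type name using sys.getobjects().

    `limit` is passed straight to sys.getobjects(); 0 asks for all objects.
    Returns None on builds that do not provide sys.getobjects().
    """
    getobjects = getattr(sys, "getobjects", None)
    if getobjects is None:
        return None
    return Counter(type(obj).__name__ for obj in getobjects(limit))


if __name__ == "__main__":
    counts = count_live_objects_by_type()
    if counts is None:
        print("sys.getobjects() is not available on this build")
    else:
        for name, count in counts.most_common(10):
            print(f"{name}: {count}")
```

This only gives a snapshot of the objects alive at the time of the call, so
any allocation statistics have to be derived by comparing snapshots.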

Regards from sunny London,
Pablo Galindo Salgado


On Fri, 15 Jan 2021 at 12:17, Julien Danjou <jul...@danjou.info> wrote:

> On Fri, Jan 15 2021, Pablo Galindo Salgado wrote:
>
> >> Exactly, which is a bit of a bummer. Considering Python provides 3
> >> different memory allocators, it'd be great if there was some ability to
> >> be sure that PyObject_Malloc pointers are actually PyObject, not
> >> Py_GC_HEAD.
> >
> > The allocators are specialized based on the allocation strategy
> > and efficiency, not based on what you are going to use the memory
> > for. If you want to allocate a buffer using the object allocation
> > strategy because <insert reason> then nobody is preventing you
> > from using PyObject_Malloc(). Even if we sanitize the whole stdlib to
> > conform to "only objects are allocated using PyObject_Malloc()",
> > 3rd party extension modules and other beasts can do whatever, so you
> > can still crash if you decide to interpret the output as an object.
>
> Agreed.
>
> Then the correct endpoint would more likely be PyObject_New(), but
> there's no way to intercept such calls for statistical analysis
> currently. And as you wrote, if some code decides to use PyMalloc()
> directly, then that memory won't be tracked.
>
> It sounds like the provided C API is a bit too low level for this,
> preventing any kind of statistical analysis of the allocation patterns.
> :(
>
> --
> Julien Danjou
> # Free Software hacker
> # https://julien.danjou.info
>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/A6FCIFGSWL6POYZNIEYPZVCQMJDOES7P/