[Python-Dev] PyGC and PyObject_Malloc introspection
Hi there,

I've spent quite some time on memory profiling for Python now. I'm struggling to get more information from the allocated memory right now for what looks like a sad reason. :(

Supposedly, PyObject_Malloc() returns some memory space to store a PyObject. If that were true all the time, it would allow anyone to introspect the allocated memory and understand why it's being used.

Unfortunately, this is not the case. Objects whose types are tracked by the GC go through _PyObject_GC_Alloc(), which changes the underlying memory structure to be (PyGC_Head + PyObject).

This is a bummer, as there is then no safe way that I can think of to know whether an allocated memory space is gc-tracked or gc-untracked. It is therefore impossible to introspect the memory allocated by PyObject_Malloc().

There are multiple ways to solve this, but I don't see any way right now of doing this without slightly changing CPython. Does anyone have any idea how to work around this, or what kind of change could be acceptable to mitigate the issue?

Thanks!

Cheers,
--
Julien Danjou
# Free Software hacker
# https://julien.danjou.info
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QRX6U5XBXMHMT6YKIXERS3UT64ALYV27/
Code of Conduct: http://python.org/psf/codeofconduct/
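The extra header described above can be observed from pure Python, without touching the allocator. A minimal sketch, assuming a default (non-free-threaded) CPython build, where sys.getsizeof() is documented to add the garbage collector overhead on top of what the object's own __sizeof__() reports. The helper name gc_header_overhead is made up for this illustration.

```python
import sys


def gc_header_overhead(obj):
    """Bytes that sys.getsizeof() adds on top of obj.__sizeof__().

    On a default CPython build this difference is the GC header that
    sits in front of objects whose type supports garbage collection.
    """
    return sys.getsizeof(obj) - obj.__sizeof__()


if __name__ == "__main__":
    # Containers carry the header; plain ints, strings and bytes do not.
    for sample in ([], {}, 42, "text", b"bytes"):
        print(type(sample).__name__, gc_header_overhead(sample))
```

This only works because the caller already holds the object. Given nothing but the raw pointer returned by the allocator, the same question cannot be answered, which is the point of the message above.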
[Python-Dev] Re: PyGC and PyObject_Malloc introspection
On Thu, Jan 14 2021, Pablo Galindo Salgado wrote:

Hi Pablo,

> Isn't this a similar problem that you have with regular malloc? When you
> call malloc() with some size, malloc actually will reserve more than that
> for alignment purposes and for bookkeeping and apart from some
> platform-specific APIs like malloc_usable_size() you cannot query that
> value.

Not really. It's not a real problem if malloc reserves more memory than the size you requested, as long as you still know the original size. When working with the Python memory allocator API, you do have this original size, so you're able to read the memory from its beginning (the allocated pointer address) to its end.

> My answer would be that that's because memory itself cannot be gc tracked,
> only objects can and those belonging to different categories. For example,
> notice that the tracemalloc module does not report objects, it only
> reports memory blocks and you cannot ask tracemalloc to give you a list of
> all the objects that were allocated because it does not have the notion of
> what an object is.

Exactly, which is a bit of a bummer. Considering Python provides 3 different memory allocators, it'd be great if there was some way to be sure that PyObject_Malloc pointers are actually PyObject, not PyGC_Head.

--
Julien Danjou
// Free Software hacker
// https://julien.danjou.info
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/MMCIEUV62ER6DWYG7FVIRCRD2KI3E4EZ/
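To make the tracemalloc point concrete, here is a small sketch of what a snapshot actually contains: sizes and block counts grouped by source line, with no notion of which Python object (if any) lives in each block. The function name largest_allocation_sites and the payload sizes are arbitrary choices for this illustration.

```python
import tracemalloc


def largest_allocation_sites(limit=3):
    """Return (filename, lineno, total_bytes, block_count) for the top sites."""
    tracemalloc.start()
    # Allocate something recognisable: 256 bytes objects of 1 KiB each.
    payload = [bytes(1024) for _ in range(256)]
    snapshot = tracemalloc.take_snapshot()
    tracemalloc.stop()
    del payload

    # Statistics are sorted from the biggest to the smallest total size.
    stats = snapshot.statistics("lineno")[:limit]
    return [
        (stat.traceback[0].filename, stat.traceback[0].lineno, stat.size, stat.count)
        for stat in stats
    ]


if __name__ == "__main__":
    for filename, lineno, size, count in largest_allocation_sites():
        print(f"{filename}:{lineno}: {size} bytes in {count} blocks")
```

Each entry says where memory was requested and how much, but nothing about whether a block holds a list, a bytes payload, or a GC header followed by an object.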
[Python-Dev] Re: PyGC and PyObject_Malloc introspection
On Thu, Jan 14 2021, Tim Peters wrote:

> I'm not clear on exactly what it is you're after, but CPython faces
> the same question all the time: _given_ a pointer to an object, is
> there, or is there not, a GC header prepended? That's answered by
> this C API function:
>
> """
> int PyObject_IS_GC(PyObject *obj)
>
> Returns non-zero if the object implements the garbage collector
> protocol, otherwise returns 0.
>
> The object cannot be tracked by the garbage collector if this function
> returns 0.
> """
>
> FYI, the implementation usually resolves it by looking at whether
> obj's type object has the Py_TPFLAGS_HAVE_GC flag set.

Right, but that only works if you have the PyObject address. If all you've got is a pointer returned by PyObject_Malloc(), you don't know whether the PyObject is at this pointer address or after the PyGC_Head header that is prepended. :(

--
Julien Danjou
/* Free Software hacker
   https://julien.danjou.info */
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Z6GJUNKOFE2IN6EKQKDBBMX4SUZ36ITU/
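The flag check Tim describes can be mirrored from Python through the type's __flags__ attribute. This is only a sketch of the common case: the constant below is copied from CPython's Include/object.h (1 << 14), the helper name type_has_gc_header is invented here, and the real PyObject_IS_GC() may also consult the type's tp_is_gc slot.

```python
# Value of Py_TPFLAGS_HAVE_GC as defined in CPython's Include/object.h.
Py_TPFLAGS_HAVE_GC = 1 << 14


def type_has_gc_header(tp):
    """True if instances of `tp` are allocated with a GC header in front."""
    return bool(tp.__flags__ & Py_TPFLAGS_HAVE_GC)


if __name__ == "__main__":
    for tp in (list, dict, tuple, int, str, float):
        print(tp.__name__, type_has_gc_header(tp))
```

Note that the check starts from a type, which in C means starting from a valid PyObject pointer. That is exactly the information missing when all you hold is an address handed out by the allocator.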
[Python-Dev] Re: PyGC and PyObject_Malloc introspection
On Fri, Jan 15 2021, Antoine Pitrou wrote:

> Also note that PyObject_Malloc() may also be used to allocate
> non-objects, for example a bytearray's payload, IIRC.

Interesting. What's the rationale for not using PyMem_Malloc() in such cases?

--
Julien Danjou
# Free Software hacker
# https://julien.danjou.info
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/6GM4IOWIQWAQNBRAAPZBC5TRD2HZ7VTL/
[Python-Dev] Re: PyGC and PyObject_Malloc introspection
On Fri, Jan 15 2021, Pablo Galindo Salgado wrote:

>> Exactly, which is a bit of a bummer. Considering Python provides 3
>> different memory allocators, it'd be great if there was some way to
>> be sure that PyObject_Malloc pointers are actually PyObject, not
>> PyGC_Head.
>
> The allocators are specialized based on the allocation strategy
> and efficiency, not based on what you are going to use the memory
> for. If you want to allocate a buffer using the object allocation
> strategy, then nobody is preventing you from using PyObject_Malloc().
> Even if we sanitize the whole stdlib to be conforming to "only objects
> are allocated using PyObject_Malloc()", 3rd party extension modules and
> other beasts can do whatever, so you can still crash if you decide to
> interpret the output as an object.

Agreed. Then the correct endpoint would more likely be PyObject_New(), but there's currently no way to intercept such calls for statistical analysis. And as you wrote, if some code decides to use PyMalloc() directly, then that memory won't be tracked.

It sounds like the provided C API is a bit too low-level for this, preventing any kind of statistical analysis of the allocation patterns. :(

--
Julien Danjou
# Free Software hacker
# https://julien.danjou.info
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/UAWA2ADOQNI5IAGI3YVNE7EYKGSBIHZ5/
[Python-Dev] Thread argument for exc_info and public API
Hi there,

I've been poking around the thread state API and error/exception handling, and there's something missing I'd like to see happening.

The only way to retrieve the current exception is via sys.exc_info() or PyErr_GetExcInfo in C. However, the issue is that they don't take a PyThreadState as argument, but use _PyThreadState_GET() to retrieve the thread state. That makes it impossible to retrieve the exception information for a different thread than the one calling the function.

In order to retrieve the exception from *any* PyThreadState, the caller has to use _PyErr_GetTopmostException, which takes a PyThreadState as argument, though that function is private and therefore not documented or usable in an external module (in theory at least).

Should we make _PyErr_GetTopmostException public, or implement something different to retrieve the topmost exception from a PyThreadState?

Cheers,
--
Julien Danjou
# Free Software hacker
# https://julien.danjou.info
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QVMFP76R35SXUIM2WPPVPV5XCVMKPDEB/
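The limitation is easy to reproduce from Python, since sys.exc_info() has the same "calling thread only" behaviour as the C function. A minimal sketch: a worker thread sits inside an except block while the main thread asks for the current exception. The function name and the 5-second timeouts are arbitrary choices for this illustration.

```python
import sys
import threading


def exc_info_seen_by_each_thread():
    """Show that sys.exc_info() only reports the calling thread's exception."""
    in_handler = threading.Event()
    release = threading.Event()
    seen = {}

    def worker():
        try:
            raise ValueError("only visible in the worker")
        except ValueError:
            # Inside the handler, this thread sees its own exception.
            seen["worker"] = sys.exc_info()[0]
            in_handler.set()
            # Stay inside the handler until the main thread has looked.
            release.wait(timeout=5)

    thread = threading.Thread(target=worker)
    thread.start()
    in_handler.wait(timeout=5)
    # The worker is handling an exception right now, yet the main thread
    # has no way to see it through sys.exc_info().
    seen["main"] = sys.exc_info()
    release.set()
    thread.join()
    return seen


if __name__ == "__main__":
    print(exc_info_seen_by_each_thread())
```

A profiler or debugger running in its own thread is in the position of the main thread here, which is why a variant taking an explicit thread state is being asked for.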
[Python-Dev] Re: Thread argument for exc_info and public API
On Fri, Dec 20 2019, Victor Stinner wrote:

Hi Victor,

> If we add the following function, does it solve your use case?
>
> void
> _PyErr_GetExcInfo(PyThreadState *tstate,
>                   PyObject **p_type, PyObject **p_value,
>                   PyObject **p_traceback)

Yes, it would.

>> In order to retrieve the exception from *any* PyThreadState, the caller
>> has to use _PyErr_GetTopmostException, which takes a PyThreadState as
>> argument, though that function is private and therefore not documented
>> or usable in an external module (in theory at least).
>
> What if this function is exported as a private function? Usually,
> functions in Python header files are prefixed by PyAPI_FUNC(), but for
> an unknown reason, this one is not:
>
> _PyErr_StackItem *_PyErr_GetTopmostException(PyThreadState *tstate);
>
> Maybe it's because the author didn't want to expose the private
> _PyErr_StackItem structure?

This is my guess too. It'd make sense, since this struct was introduced around 3.7 and those values were just stored inside PyThreadState before. I guess nobody wanted to commit to an API.

I don't think exposing _PyErr_StackItem is worth it. It's just a placeholder for the usual exception triplet (tp, value, tb).

>> Should we make _PyErr_GetTopmostException public, or implement something
>> different to retrieve the topmost exception from a PyThreadState?
>
> IMHO a _PyErr_GetExcInfo() function taking a tstate parameter is a
> better idea than exposing the private _PyErr_StackItem structure.

I agree.

> Private functions *can* be used for debuggers, but we don't provide
> any warranty that the function is not going to disappear. Such
> functions have been moved to the internal API for example. The
> internal C API *can* be used, but you have to opt-in so you know that
> you get unstable APIs :-)

Alright! I didn't know that you could use private functions. I'm totally in the debugger case, so it's fine with me. :)

I'll come up with a patch that exposes _PyErr_GetExcInfo() as you suggested.

Thanks Victor!

--
Julien Danjou
// Free Software hacker
// https://julien.danjou.info
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NQMQVVWMURHR7VFVCD43F2LC2JZZEX3F/
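For readers less familiar with the "usual exception triplet (tp, value, tb)" mentioned above, this is what it looks like from Python for the calling thread. A small sketch; current_triplet is a name made up for this illustration, and the proposed C function would fill three output pointers with the same kind of values for a given thread state.

```python
import sys


def current_triplet():
    """Return the (type, value, traceback) triplet of a handled exception."""
    try:
        raise KeyError("demo")
    except KeyError:
        # sys.exc_info() must be called while the exception is being handled.
        return sys.exc_info()


if __name__ == "__main__":
    tp, value, tb = current_triplet()
    print(tp.__name__, repr(value), tb.tb_lineno)
```

The three items are redundant with one another in practice: the type is the class of the value, and the traceback is also reachable as the value's __traceback__ attribute.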
[Python-Dev] ABI breakage in >= 3.7.4 ?
Hi fellows,

I've been hit by a segfault recently in a binary wheel running on Python 3.7.3, but that worked fine on Python 3.7.5. The code was compiled on Python 3.7.5 using manylinux1 Docker images and transformed into a manylinux1 wheel.

The segmentation fault backtrace did not make any sense, so I guessed it was an ABI change. After running git bisect for a few hours, it turns out that the culprit is the following commit:

https://github.com/python/cpython/commit/8766cb74e186d3820db0a855ccd780d6d84461f7

From my understanding, that commit indeed changes the ABI by modifying the header file and the size of the PyGC_Head, and breaks code compiled on Python >= 3.7.4 when it runs on earlier Python 3.7 versions.

I don't know if anything can be done about this now that it's done and released. It might be something we want to double-check in the future. Having some kind of warning when doing backports could be worth it.

Cheers,
--
Julien Danjou
// Free Software Hacker
// https://julien.danjou.info
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/J2FGZPS5PS7473TONJTPAVSNXRGV3TFL/
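Why a size change in PyGC_Head breaks compiled extensions can be modelled with ctypes. This is a sketch under an explicit assumption: that the change amounts to raising the alignment of the header's padding member from double to long double. The field layout and both class names below are illustrative stand-ins, not copied from the commit; check the linked diff for the actual definition.

```python
import ctypes


class GCLinks(ctypes.Structure):
    # Assumed payload of the GC header: two list pointers and a counter.
    _fields_ = [
        ("gc_next", ctypes.c_void_p),
        ("gc_prev", ctypes.c_void_p),
        ("gc_refs", ctypes.c_ssize_t),
    ]


class HeadPaddedWithDouble(ctypes.Union):
    # Hypothetical "before" layout: aligned like a double.
    _fields_ = [("gc", GCLinks), ("dummy", ctypes.c_double)]


class HeadPaddedWithLongDouble(ctypes.Union):
    # Hypothetical "after" layout: aligned like a long double.
    _fields_ = [("gc", GCLinks), ("dummy", ctypes.c_longdouble)]


def object_offsets():
    """Offset from the start of the allocation to the object, per layout."""
    return (
        ctypes.sizeof(HeadPaddedWithDouble),
        ctypes.sizeof(HeadPaddedWithLongDouble),
    )


if __name__ == "__main__":
    before, after = object_offsets()
    print("offset with double padding:     ", before)
    print("offset with long double padding:", after)
```

On platforms where long double is aligned more strictly than double (x86-64 Linux, for example), the second offset is larger. Extension code that was compiled with one header size baked in, and then run on an interpreter built with the other, will compute the wrong address whenever it steps between the header and the object.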
[Python-Dev] Re: ABI breakage in >= 3.7.4 ?
On Fri, Feb 07 2020, Victor Stinner wrote:

> You can find the rationale for this change in two issues:
>
> * https://bugs.python.org/issue27987
> * https://bugs.python.org/issue36618
>
> First, it was a warning in clang ubsan (Undefined Behavior Sanitizer).
> Then ctypes started to crash when Python was compiled with clang. It
> means that compiling Python 3.7 with clang also had the issue.
>
> The quick fix was to compile Python with -fmax-type-align=8 when clang
> was detected.
>
> But it was a signal that something was wrong in Python on x86-64:
> Python didn't respect the x86-64 ABI.

Thanks for the detailed explanation!

>> I've been hit by a segfault recently in a binary wheel running on Python
>> 3.7.3, but that worked fine on Python 3.7.5.
>
> Do you know which wheel package caused the issue? Which part of Python
> caused the problem? Well, open an issue and try to provide as much
> information as possible ;-)

This happened in a custom module I wrote. I've opened https://bugs.python.org/issue39599 with as many details as I can right now. It's a bit fuzzy even for me how to reproduce it with a minimal test case.

--
Julien Danjou
;; Free Software hacker
;; https://julien.danjou.info
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/AG2JQ4RSQCHCSKY5YPRRY6H76K3ABO27/
Re: [Python-Dev] [issue1644818] Allow importing built-in submodules
At 1261178549 time_t, Martin v. Löwis wrote:

> > Is there a chance to see this *bug* fixed someday?
>
> Please ask on python-dev. I may be willing to revive my five-for-one offer.

Not sure I really understand, but I can ask again here: will that patch, which seems correct, finally be merged and this bug closed?

Cheers,
--
Julien Danjou
// ᐰhttp://julien.danjou.info
// 9A0D 5FD9 EB42 22F6 8974 C95C A462 B51E C2FE E5CD
// Trust no one.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [issue1644818] Allow importing built-in submodules
At 1261181806 time_t, "Martin v. Löwis" wrote:

> Only if some committer finds time for it, which is *really* scarce.
> If you want to push that issue, I promise to review it after you have
> reviewed five other patches.

Well, that's disappointing. I work on several other free software projects, and my time is really scarce too.

I understand blackmailing me to close a bug can be seen as a nice game. Honestly, if I had more time to get involved in that area, I'd take it as a game and would do it with pleasure.

But in my current position and with "I-do-software-development-too", you are just pissing me off for not fixing a bug in your program with a 10-line patch written by someone else 3 years ago. Something that should take 5 minutes, probably the time we both lost by writing our respective emails.

Or give me commit access, and I'll do it for you.

Anyway, thanks.

Cheers,
--
Julien Danjou
// ᐰhttp://julien.danjou.info
// 9A0D 5FD9 EB42 22F6 8974 C95C A462 B51E C2FE E5CD
// This is the end of my signature.