[Python-Dev] static variables in CPython - duplicated _Py_IDENTIFIERs?
I just ran an analysis of static variable definitions in CPython code, using clang, on Ubuntu and Windows. The results should be available here:

https://cpython.red-dove.com/

As I understand it, _Py_IDENTIFIER instances are supposed to hold constant strings that are used in Python - e.g. "__class__", "__dict__" and so on. I noticed that there are numerous duplicates of these - e.g. 8 instances of __name__, 11 instances of __dict__, and so on - each instance is defined as static in its source file and so completely distinct from the others.

I realise the overall amount of memory used by these structures isn't large, but is there any particular benefit to having these multiple copies? The current situation seems a little untidy, at least. What would be the disadvantage of making them extern in the headers and allocating them once in some consts.c module? After all, they seem for the most part to be well-known constant strings that don't need to be encapsulated in any particular C compilation unit.

Regards,

Vinay Sajip
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GVQOMWXUDYLBXZ2MMIDX5D6X7X42VQV7/
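To make the duplication concrete, here is a minimal sketch in plain C. It is not CPython code: `demo_identifier` is a hypothetical, simplified stand-in for the real identifier struct, and the two accessor functions merely simulate two separate .c files that each hold their own static copy. `Demo_shared_dict_id` illustrates the alternative raised above, a single definition that would live in something like consts.c and be declared extern in a header.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for CPython's identifier struct;
   the real struct and the _Py_IDENTIFIER macro differ. */
typedef struct {
    const char *string;    /* the constant text, e.g. "__dict__" */
    void *cached_object;   /* placeholder for a lazily created str object */
} demo_identifier;

/* Simulates one source file holding its own static copy of "__dict__". */
const demo_identifier *module_a_dict_id(void)
{
    static demo_identifier id = { "__dict__", NULL };
    return &id;
}

/* Simulates a second source file with another, unrelated static copy. */
const demo_identifier *module_b_dict_id(void)
{
    static demo_identifier id = { "__dict__", NULL };
    return &id;
}

/* The alternative: one definition with external linkage. In a real
   layout this would sit in a single .c file, with a matching
   "extern demo_identifier Demo_shared_dict_id;" in a header. */
demo_identifier Demo_shared_dict_id = { "__dict__", NULL };
```

The two static copies carry the same text but are distinct objects at distinct addresses, so anything cached in one is invisible to the other. With the shared definition, every file that includes the header would refer to the same object.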
[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?
> From my understanding, _Py_IDENTIFIER was designed for static usage.
> The overhead is quite low; keeping it as a private (static)
> module-level internal variable helps to decouple things.

I understand the importance of decoupling in general, but meanings of __name__, __dict__ etc. aren't going to change. What practical difficulties (in terms of coupling) do you foresee if e.g. all C modules know that somewhere there is a constant value representing the constant string "__dict__", and so on? There's no suggestion that some C modules will use a different constant string value to refer to e.g. the "__dict__" attribute than other C modules, is there? I'm not suggesting that all _Py_IDENTIFIERs need to be extern, just the ones whose meanings are public and can't be changed.

I'm not coming from a worried-about-the-memory-overhead perspective - it's more about the manageability of ad hoc static variables which can be "considered harmful" in certain scenarios - such as sub-interpreters, for example. _Py_IDENTIFIER seems to account for a fair proportion of all static variable declarations.

> This target is very important for keeping public API as small as possible.

I'm not suggesting that _Py_IDENTIFIERs become part of the public API. Surely it's not the case that any and every extern identifier becomes part of the public API? (I know people can misuse things and use unsupported internal APIs, but then they're on their own, right?)

Regards,

Vinay Sajip

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZTLSGAHCKHLQK5BBKYHLEQGJHCU4OAFB/
[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?
> Right, I'm pretty sure that right now Python doesn't have any way to
> share symbols between .c files without also exposing them in the C
> API.

On other C projects I've worked on, the public API is expressed in one set of header files, and internal APIs that need to be exposed across modules are described in a different set of internal header files, and developers who incorrectly use internal APIs by including the internal headers could see breakage when the internals change ... excuse my naïveté, as I haven't done much at Python's C level - does this discipline/approach not apply to CPython?

Regards,

Vinay Sajip

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XI4Y4M7WBXSMPY2ZDMD6IQWBAOFR6WYE/
[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?
> Moving some of the especially common identifier references into the
> interpreter state struct may make sense.
> Adding more process globals wouldn't be desirable though, as they're one of
> the more common ways of breaking encapsulation between subinterpreters
> (hence Eric's efforts to eliminate as many of them as he reasonably can).

Well, another way of breaking encapsulation is by having ad hoc statics (as opposed to globals) which should really be in the interpreter state - it prevents compartmentalizing the GIL into per-subinterpreter GILs, for example. That's my motivation for looking at this area - I spent a bit of time working with Eric at the recent core dev sprint, and wanted to explore the problems in this area.

Regards,

Vinay

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/43BYLJIZXPTVBGZTTQUTSPICZVWIF6HR/
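As a rough illustration of the "identifiers in the interpreter state" idea, here is a minimal sketch in plain C. All names are hypothetical (`demo_interp_state`, `demo_interp_init`, `demo_cache_dict` are invented for this example) and the layout is not CPython's actual interpreter state struct. The assumption is simply that each interpreter owns its identifier slots instead of sharing file-level statics.

```c
#include <stddef.h>

/* Hypothetical, simplified identifier record (not CPython's). */
typedef struct {
    const char *string;    /* constant text, e.g. "__dict__" */
    void *cached_object;   /* placeholder for a lazily created str object */
} demo_identifier;

/* Hypothetical per-interpreter state: the commonly used identifiers
   live here rather than in static variables scattered across files. */
typedef struct {
    int id;
    demo_identifier name_id;
    demo_identifier dict_id;
} demo_interp_state;

/* Give an interpreter its own, initially empty, identifier slots. */
void demo_interp_init(demo_interp_state *interp, int id)
{
    interp->id = id;
    interp->name_id = (demo_identifier){ "__name__", NULL };
    interp->dict_id = (demo_identifier){ "__dict__", NULL };
}

/* Record a cached object for "__dict__" in one interpreter only. */
void demo_cache_dict(demo_interp_state *interp, void *object)
{
    interp->dict_id.cached_object = object;
}
```

With this layout, caching an object in one interpreter leaves every other interpreter's slot untouched, which is the kind of isolation a process-wide static cannot give.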
[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?
> requires some newer tools like -fvisibility=hidden that work
> differently across different platforms, and so far no-one's done the
> work to sort out the details.

I've started looking at this, but quite apart from the specifics of applying -fvisibility=hidden, there are some things that aren't yet clear to me about the intent behind some of our symbol definitions. For example, the file Include/fileutils.h contains the definitions

    PyAPI_FUNC(wchar_t *) Py_DecodeLocale(const char *arg, size_t *size);

and

    PyAPI_FUNC(int) _Py_DecodeLocaleEx(const char *arg, wchar_t **wstr, size_t *wlen, const char **reason, int current_locale, _Py_error_handler errors);

However, only the first of these is documented, though the definition via PyAPI_FUNC implies that both are part of the public API. If this is the case, why aren't both documented? If _Py_DecodeLocaleEx is not part of the public API (and the leading underscore suggests so), should it be polluting the symbol space?

The comment for PyAPI_FUNC is "Declares a public Python API function and return type". Is this really the case, or has PyAPI_FUNC been coopted to provide external linkage for use by Python-internal code in different compilation units? _Py_DecodeLocaleEx is called in Modules/_testcapimodule.c and also in Objects/unicodeobject.c.

If we want to take steps to restrict symbol visibility, it will potentially affect all of the code base - so presumably, a PEP would be required, even though it's an implementation detail from the point of view of the language itself?

Regards,

Vinay Sajip

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/H726K4PDW6PI34VVBX6HN26MAE5IZUNG/
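For readers unfamiliar with the mechanism being discussed, here is a minimal sketch of how an export macro can separate "public" from "shared between compilation units but not exported". The macros `DEMO_API_FUNC` and `DEMO_INTERNAL_FUNC` and both functions are hypothetical; they only mirror the idea behind PyAPI_FUNC and are not CPython's actual definitions. The sketch assumes GCC or Clang visibility attributes on Linux/macOS and `__declspec(dllexport)` on Windows.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macros, not CPython's. When a shared library is built
   with -fvisibility=hidden, only symbols marked "default" are exported. */
#if defined(_WIN32)
#  define DEMO_API_FUNC(RTYPE) __declspec(dllexport) RTYPE
#  define DEMO_INTERNAL_FUNC(RTYPE) RTYPE
#elif defined(__GNUC__)
#  define DEMO_API_FUNC(RTYPE) __attribute__((visibility("default"))) RTYPE
#  define DEMO_INTERNAL_FUNC(RTYPE) __attribute__((visibility("hidden"))) RTYPE
#else
#  define DEMO_API_FUNC(RTYPE) RTYPE
#  define DEMO_INTERNAL_FUNC(RTYPE) RTYPE
#endif

/* Public entry point: would be documented and exported. */
DEMO_API_FUNC(int) Demo_Decode(const char *arg);

/* Internal helper: callable from other .c files of the same library,
   but not exported from the shared object. It plays the role of an
   underscore-prefixed private function in this sketch. */
DEMO_INTERNAL_FUNC(int) demo_private_decode_ex(const char *arg, int strict);

/* Returns the length of arg, or -1 for NULL (or for "" when strict). */
int demo_private_decode_ex(const char *arg, int strict)
{
    if (arg == NULL || (strict && arg[0] == '\0')) {
        return -1;
    }
    return (int)strlen(arg);
}

/* The public function is a thin, non-strict wrapper over the helper. */
int Demo_Decode(const char *arg)
{
    return demo_private_decode_ex(arg, 0);
}
```

Under this kind of scheme, a function needed by several compilation units does not have to go through the public export macro just to get external linkage, which is the distinction the questions above are probing.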
[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?
Thanks for the pointer. According to that information, everything in Include/fileutils.h should be the portable public C API. But there are definitions in there which start with _Py - and the information relating to Include/cpython/*.h suggests that API prefixed with _Py is conventionally private.

Is there a reason why (for example) _Py_DecodeLocaleEx appears in Include/*.h with a prefix conventionally suggesting it's private, and which is backed up by the fact that it's not mentioned in the C API documentation? If it's just there for the moment because no-one got around to moving it to Include/cpython/*.h, it would be useful to know that. I'm not trying to nit-pick here - if we're going to apply visibility rules, then things like this need to be absolutely clear.

Just for info, I tried configuring and compiling with -fvisibility=hidden and declared default visibility on PyAPI_FUNC, PyAPI_DATA and PyMODINIT_FUNC, as well as one or two other declarations. The resulting build passes all tests run locally, except test_tools (which is currently also failing for me on master). I haven't yet pushed these changes to my public CPython fork, as it's still work in progress - which is why I'm asking these questions.

Regards,

Vinay Sajip

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/G6P4RSVKFF7E6QYHBO4ZRLNHSNJWA6MT/
[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?
Nathaniel Smith wrote:
> Windows already has working symbol visibility handling, and PyAPI_FUNC is
> what controls it. So adding symbol visibility handling to Linux/macOS is
> just about making all the platforms consistent. There might be some weird
> choices being made, but I don't think you need to sort all those out as
> part of this.

Well, _Py_DecodeLocaleEx is declared with PyAPI_FUNC, so would you expect it to be exposed on Windows? I haven't a Windows machine handy right now, but I would expect it not to be exposed, as it doesn't appear in PC/python3.def. I will check when I get a chance.

Regards,

Vinay Sajip

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Q6H72YBNCKQYUQ4O4TMP65R4VEPEVLMT/
[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?
Ah - I checked, and it's there OK ... (head scratch)

Regards,

Vinay Sajip

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/POLJVZ7UJGQ74E5YYZ3G5HPNDLP6G5T5/
[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?
OK - but that's just one I picked at random. There are others like it - what would be the process for deciding which ones need to be made private and moved? Should an issue be raised to track this?

Regards,

Vinay Sajip

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/USKL36S4SRDEK7TY46C2HIUHXFVRDQUQ/
[Python-Dev] Re: static variables in CPython - duplicated _Py_IDENTIFIERs?
Fair enough. Pull request created:

https://github.com/python/cpython/pull/16347

Regards,

Vinay Sajip

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IQGK5HFMV6ESBF7CS32U2BWVM23DIKXF/