[Python-Dev] Importing a submodule doesn't always set an attribute on its parent
Hello,

I came across what seems like either a bug in the import system or a gap in its documentation, so I'd like to run it by folks here to see if I should submit a bug report. If there's somewhere else more appropriate to discuss this, please let me know.

If you import A.B, then remove A from sys.modules and import A.B again, the newly-loaded version of A will not contain an attribute referring to B. Using "collections.abc" as an example submodule from the standard library:

    >>> import sys
    >>> import collections.abc
    >>> del sys.modules['collections']
    >>> import collections.abc
    >>> collections.abc
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: module 'collections' has no attribute 'abc'

This behavior seems quite counter-intuitive to me: why should the fact that B is already loaded prevent adding a reference to it to A? It also goes against the general principle that "import FOO" makes the expression "FOO" well-defined; for example PLR 5.7 states that "'import XXX.YYY.ZZZ' should expose 'XXX.YYY.ZZZ' as a usable expression". Finally, it violates the "invariant" stated in PLR 5.4.2 that if 'A' and 'A.B' both appear in sys.modules, then A.B must be defined and refer to sys.modules['A.B'].

On the other hand, PLR 5.4.2 also states that "when a submodule is loaded using any mechanism... a binding is placed in the parent module's namespace to the submodule object", which is consistent with the behavior above, since the second import of A.B does not actually "load" B (only retrieve it from the sys.modules cache). So perhaps Python is working as intended here, and there is an unwritten assumption that if you unload a module from the cache, you must also unload all of its submodules. If so, I think this needs to be added to the documentation (which currently places no restrictions on how you can modify sys.modules, as far as I can tell).
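The same sequence can be written as a standalone script that does not disturb a standard-library package. This is only a sketch under stated assumptions: the package name `demo_pkg`, its submodule `sub`, and the function name are hypothetical, and the package is created in a temporary directory just for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def reimport_after_uncaching_parent():
    """Import demo_pkg.sub, uncache only demo_pkg, then import demo_pkg.sub again.

    Returns a tuple (parent_reloaded, has_attr, same_submodule).
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical throwaway package: demo_pkg/__init__.py and demo_pkg/sub.py
        pkg_dir = Path(tmp) / "demo_pkg"
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "sub.py").write_text("VALUE = 1\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            # __import__ is what the import statement calls; for a dotted name
            # with no fromlist it returns the top-level package.
            first_parent = __import__("demo_pkg.sub")
            old_sub = sys.modules["demo_pkg.sub"]
            assert first_parent.sub is old_sub  # normal case: binding exists

            del sys.modules["demo_pkg"]  # uncache the parent only
            fresh_parent = __import__("demo_pkg.sub")

            parent_reloaded = fresh_parent is not first_parent
            has_attr = hasattr(fresh_parent, "sub")
            same_submodule = sys.modules["demo_pkg.sub"] is old_sub
            return parent_reloaded, has_attr, same_submodule
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg", None)
            sys.modules.pop("demo_pkg.sub", None)


if __name__ == "__main__":
    print(reimport_after_uncaching_parent())
```

If the behavior described above holds, the parent is reloaded, the cached submodule is reused as-is, and the fresh parent has no `sub` attribute.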

This may be an obscure corner case that is unlikely to come up in practice (I imagine few people need to modify sys.modules), but it did actually cause a bug in a project I work on, where it is necessary to uncache certain modules so that they can be reloaded. I was able to fix the bug some other way, but I think it would still be worthwhile to either make the import behavior more consistent (so that 'import A.B' always sets the B attribute of A) or add a warning in the documentation about this case.

I'd appreciate any thoughts on this!

Thanks,
Daniel
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VIPXZRK3OJNSVNSZSAJ7CO6QFC2RX27W/
Code of Conduct: http://python.org/psf/codeofconduct/

[Python-Dev] Re: Importing a submodule doesn't always set an attribute on its parent
Thanks, Brett. I understand why the behavior happens, I just don't understand the decision to implement imports this way. Since there's no warning in the documentation that removing items from sys.modules can break the fact that "import X.Y" defines "X.Y" (note that the "behind the curtain" stuff happens *before* the second import, so it's still the case that the second import does not define "X.Y" as implied by the docs), and there's also no warning that submodules must be removed at the same time as their parent, I would expect my example code to work.

I don't see any downside to having "import X.Y" always set the Y attribute of X (instead of only setting it if 'X.Y' is not already in sys.modules), but if you think it's a bad idea, here's a suggestion for a paragraph to add at the end of PLR 5.4.2:

    "Note that the binding to the submodule object in the parent module's namespace is only added when the submodule is actually *loaded*. If the submodule is already present in `sys.modules` when it is imported (through any of the mechanisms above), then it will not be loaded again and no binding will be added to the parent module."

If removing a module but not its submodules from sys.modules is considered "cheating" and could potentially break other parts of the import system, that should also be documented, e.g. by adding the sentence "If you delete a key for a module in `sys.modules`, you must also delete the keys for all submodules of that module." at the end of the 3rd paragraph of PLR 5.3.1. However, I would much rather not impose this restriction, since it seems unnecessarily restrictive (indeed, my code violates it but works fine, and changing it to transitively remove all submodules would necessitate reloading many modules which do not actually need to be reloaded).

(Terry, thanks for your suggestion. My concern about adding such a vague warning is that to me, it reads as saying that all bets are off if you modify sys.modules by hand, which means it would never be safe to do so, i.e., the behavior might change arbitrarily in a future Python version. But in my opinion there are legitimate cases where it is necessary to ensure a module will be reloaded the next time it is imported, and the documented way to do that is to remove entries from sys.modules.)

Daniel

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/N763W6AGD6NQ4IXVWMNGDL4DBN3LXBJ7/

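The two options weighed in the message above (removing a module together with all of its submodules, or keeping the cached submodules and restoring the missing bindings by hand) could be sketched roughly as follows. Both helper names are hypothetical; neither is part of importlib or of the project mentioned in the thread, and the second helper is only one possible way to restore the attribute, not the fix actually used there.

```python
import sys


def uncache_with_submodules(name):
    """Remove `name` and every 'name.*' entry from sys.modules.

    This is the transitive removal the proposed documentation sentence
    would require. Returns the sorted list of removed keys.
    """
    prefix = name + "."
    doomed = [key for key in sys.modules if key == name or key.startswith(prefix)]
    for key in doomed:
        del sys.modules[key]
    return sorted(doomed)


def rebind_cached_submodules(name):
    """Bind still-cached direct submodules as attributes of sys.modules[name].

    Intended to be called after the parent alone was uncached and re-imported,
    so that 'name.child' is a usable expression again without reloading the
    children. Returns the sorted list of attribute names that were set.
    """
    parent = sys.modules[name]
    prefix = name + "."
    bound = []
    for key, module in list(sys.modules.items()):
        if not key.startswith(prefix) or module is None:
            continue
        child = key[len(prefix):]
        if "." in child:
            continue  # grandchildren belong to their own parent
        setattr(parent, child, module)
        bound.append(child)
    return sorted(bound)
```

The first helper trades extra reloading for a consistent cache; the second avoids the reloads but relies on mutating sys.modules in a way the documentation does not explicitly bless.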
[Python-Dev] Re: Importing a submodule doesn't always set an attribute on its parent
Brett Cannon wrote:
> So we don't want to strengthen the definition at
> all; best we are comfortable with is put up a warning that you don't want
> to do stuff with sys.modules unless you know what you're doing.

OK, thanks for the clarification. Having read through the source of importlib one too many times, I guess I will declare that I "know what [I'm] doing" for now and keep on mutating sys.modules, since the alternative (intercepting all imports) seems more painful to me. If my code breaks in a future Python version I'll only blame myself :)

Best,
Daniel

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HZEPOAI3YME4GD2M6RPWG2KG4OTSB5KX/