[Python-Dev] Importing a submodule doesn't always set an attribute on its parent

2022-04-08 Thread dfremont--- via Python-Dev
Hello,

I came across what seems like either a bug in the import system or a gap in its 
documentation, so I'd like to run it by folks here to see if I should submit a 
bug report. If there's somewhere else more appropriate to discuss this, please 
let me know.

If you import A.B, then remove A from sys.modules and import A.B again, the 
newly-loaded version of A will not contain an attribute referring to B. Using 
"collections.abc" as an example submodule from the standard library:

>>> import sys
>>> import collections.abc
>>> del sys.modules['collections']
>>> import collections.abc
>>> collections.abc
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'collections' has no attribute 'abc'

This behavior seems quite counter-intuitive to me: why should the fact that B 
is already loaded prevent adding a reference to it to A? It also goes against 
the general principle that "import FOO" makes the expression "FOO" 
well-defined; for example, section 5.7 of the Python Language Reference (PLR) 
states that "'import XXX.YYY.ZZZ' should expose 'XXX.YYY.ZZZ' as a usable 
expression". Finally, it violates the "invariant" stated in PLR 5.4.2 that if 
'A' and 'A.B' both appear in sys.modules, then A.B must be defined and refer to 
sys.modules['A.B'].
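
The invariant quoted from PLR 5.4.2 can be checked mechanically. The following 
is a minimal sketch, not part of the import system; the helper name 
find_unbound_submodules is made up for illustration. It reads each parent's 
__dict__ directly so that a module-level __getattr__ is never triggered.

```python
import sys


def find_unbound_submodules():
    """Return names 'A.B' where 'A' and 'A.B' are both cached in
    sys.modules but A's namespace does not bind B to sys.modules['A.B'].

    Hypothetical helper for illustration only.
    """
    broken = []
    # Copy the items so the scan is not disturbed by concurrent imports.
    for name, module in list(sys.modules.items()):
        parent_name, _, child = name.rpartition(".")
        if not parent_name or module is None:
            continue  # top-level module, or a None placeholder entry
        parent = sys.modules.get(parent_name)
        if parent is None:
            continue  # parent not cached, so the invariant does not apply
        if getattr(parent, "__dict__", {}).get(child) is not module:
            broken.append(name)
    return sorted(broken)
```

Calling find_unbound_submodules() right after the session above would be 
expected to list 'collections.abc', assuming the import behavior described in 
this message.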

On the other hand, PLR 5.4.2 also states that "when a submodule is loaded using 
any mechanism... a binding is placed in the parent module's namespace to the 
submodule object", which is consistent with the behavior above, since the 
second import of A.B does not actually "load" B (it only retrieves B from the 
sys.modules cache). So perhaps Python is working as intended here, and there is 
an unwritten assumption that if you unload a module from the cache, you must 
also unload all of its submodules. If so, I think this needs to be added to the 
documentation (which currently places no restrictions on how you can modify 
sys.modules, as far as I can tell).
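
If that unwritten assumption does hold, uncaching a package would have to look 
roughly like the sketch below. The name uncache_tree is hypothetical, and this 
is only one way to follow the rule, not a documented recipe.

```python
import sys


def uncache_tree(name):
    """Remove `name` and every cached submodule of it from sys.modules.

    Hypothetical helper: returns the sorted list of removed keys.
    """
    prefix = name + "."
    # Collect the keys first; deleting while iterating a dict is an error.
    doomed = [key for key in list(sys.modules)
              if key == name or key.startswith(prefix)]
    for key in doomed:
        del sys.modules[key]
    return sorted(doomed)
```

After uncache_tree('collections'), a later "import collections.abc" has to 
load both modules again, so the parent gets its 'abc' binding as usual. The 
cost is that every cached submodule is reloaded, whether it needed it or not.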

This may be an obscure corner case that is unlikely to come up in practice (I 
imagine few people need to modify sys.modules), but it did actually cause a bug 
in a project I work on, where it is necessary to uncache certain modules so 
that they can be reloaded. I was able to fix the bug some other way, but I 
think it would still be worthwhile to either make the import behavior more 
consistent (so that 'import A.B' always sets the B attribute of A) or add a 
warning in the documentation about this case. I'd appreciate any thoughts on 
this!
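
For illustration, the "always set the attribute" behavior can be approximated 
in user code. This is a sketch under stated assumptions: import_and_bind is a 
made-up name, it relies only on importlib.import_module, and it deliberately 
leaves alone any attribute the parent package already defines under that name.

```python
import importlib


def import_and_bind(name):
    """Import `name` and make sure each parent package binds its child.

    Hypothetical workaround, not how the import system itself behaves.
    """
    module = importlib.import_module(name)
    parent_name, _, child = name.rpartition(".")
    if parent_name:
        # Recursing also (re)imports the parent if it was uncached and
        # repairs the bindings further up the dotted chain.
        parent = import_and_bind(parent_name)
        if child not in vars(parent):
            setattr(parent, child, module)
    return module
```

With this helper, import_and_bind('collections.abc') after deleting 
sys.modules['collections'] leaves sys.modules['collections'].abc defined.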

Thanks,
Daniel
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VIPXZRK3OJNSVNSZSAJ7CO6QFC2RX27W/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Importing a submodule doesn't always set an attribute on its parent

2022-04-09 Thread dfremont--- via Python-Dev
Thanks, Brett. I understand why the behavior happens; I just don't understand 
the decision to implement imports this way. The documentation gives no warning 
that removing items from sys.modules can break the guarantee that "import X.Y" 
defines "X.Y", and no warning that submodules must be removed at the same time 
as their parent, so I would expect my example code to work. (Note that the 
"behind the curtain" stuff happens *before* the second import, so it's still 
the case that the second import does not define "X.Y" as implied by the docs.)

I don't see any downside to having "import X.Y" always set the Y attribute of X 
(instead of only setting it if 'X.Y' is not already in sys.modules), but if you 
think it's a bad idea, here's a suggestion for a paragraph to add at the end of 
PLR 5.4.2:

"Note that the binding to the submodule object in the parent module's namespace 
is only added when the submodule is actually *loaded*. If the submodule is 
already present in `sys.modules` when it is imported (through any of the 
mechanisms above), then it will not be loaded again and no binding will be 
added to the parent module."
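
The distinction between loading and a cache hit can be seen in a toy model. 
This is a deliberately simplified sketch, not the importlib source: toy_import 
and the plain dict standing in for sys.modules are made up for illustration, 
and finders, loaders, and specs are omitted entirely.

```python
import types


def toy_import(name, cache):
    """Toy model of importing `name`; `cache` stands in for sys.modules."""
    if name in cache:
        return cache[name]  # cache hit: nothing is loaded, nothing is bound
    parent_name, _, child = name.rpartition(".")
    parent = toy_import(parent_name, cache) if parent_name else None
    module = types.ModuleType(name)  # stands in for finding and executing
    cache[name] = module
    if parent is not None:
        setattr(parent, child, module)  # the binding exists only on this path
    return module


# "import A.B" is modeled as importing 'A.B' and then fetching 'A'.
cache = {}
toy_import("A.B", cache)
print(hasattr(toy_import("A", cache), "B"))  # True

del cache["A"]
toy_import("A.B", cache)                     # cache hit, parent untouched
print(hasattr(toy_import("A", cache), "B"))  # False: fresh A, no binding
```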

If removing a module but not its submodules from sys.modules is considered 
"cheating" and could potentially break other parts of the import system, that 
should also be documented, e.g. by adding the sentence "If you delete a key for 
a module in `sys.modules`, you must also delete the keys for all submodules of 
that module." at the end of the 3rd paragraph of PLR 5.3.1. However, I would 
much rather not impose this restriction, since it seems unnecessary: my code 
violates it but works fine, and changing it to transitively remove all 
submodules would force reloading many modules that do not actually need to be 
reloaded.

(Terry, thanks for your suggestion. My concern about adding such a vague 
warning is that to me, it reads as saying that all bets are off if you modify 
sys.modules by hand, which means it would never be safe to do so, i.e., the 
behavior might change arbitrarily in a future Python version. But in my opinion 
there are legitimate cases where it is necessary to ensure a module will be 
reloaded the next time it is imported, and the documented way to do that is to 
remove entries from sys.modules.)

Daniel
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/N763W6AGD6NQ4IXVWMNGDL4DBN3LXBJ7/


[Python-Dev] Re: Importing a submodule doesn't always set an attribute on its parent

2022-04-11 Thread dfremont--- via Python-Dev
Brett Cannon wrote:
> So we don't want to strengthen the definition at
> all; best we are comfortable with is put up a warning that you don't want
> to do stuff with sys.modules unless you know what you're doing.

OK, thanks for the clarification. Having read through the source of importlib 
one too many times, I guess I will declare that I "know what [I'm] doing" for 
now and keep on mutating sys.modules, since the alternative (intercepting all 
imports) seems more painful to me. If my code breaks in a future Python version 
I'll only blame myself :)

Best,
Daniel
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HZEPOAI3YME4GD2M6RPWG2KG4OTSB5KX/