Re: [Python-Dev] Implementing PEP 382, Namespace Packages
At 03:44 PM 5/29/2010 -0700, Brett Cannon wrote:
> On Sat, May 29, 2010 at 12:29, "Martin v. Löwis" wrote:
>> On 29.05.2010 21:06, P.J. Eby wrote:
>>> At 08:45 PM 5/29/2010 +0200, Martin v. Löwis wrote:
>>>>> In it he says that PEP 382 is being deferred until it can
>>>>> address PEP 302 loaders. I can't find any follow-up to this. I
>>>>> don't see any discussion in PEP 382 about PEP 302 loaders, so I
>>>>> assume this issue was never resolved. Does it need to be before
>>>>> PEP 382 is implemented? Are we wasting our time by designing and
>>>>> (eventually) coding before this issue is resolved?
>>>>
>>>> Yes, and yes.
>>>
>>> Is there anything we can do to help regarding that?
>>
>> You could comment on the proposal I made back then, or propose a
>> different solution.
>
> [sorry for the fundamental PEP questions, but I think PEP 382 came
> about while I was on my python-dev sabbatical last year]
>
> I have some questions about the PEP which might help clarify how to
> handle the API changes.
>
> For finders, their search algorithm is changed in a couple of ways.
> One is that modules are given priority over packages (is that
> intentional, Martin, or just an oversight?). Two, the package search
> requires checking for a .pth file on top of an __init__.py. This will
> change finders that could before simply do an existence check on an
> __init__ "file" (or whatever the storage back-end happened to be) and
> make it into a list-and-search which one would hope wasn't costly, but
> in some cases might be if the paths to files are not stored in a
> hierarchical fashion (e.g. zip files list entire file paths in their
> TOC, or a sqlite3 DB which uses a path for keys will have to list
> **all** keys, sort them to just the relevant directory, and then look
> for .pth or some such approach). Are we worried about possible
> performance implications of this search?

No.
First, an importer would not be required to implement it in a
precisely analogous way; you could have database entries or a special
consolidated index in a zipfile, if you wanted to do it like that.
(In practice, Python's zipimporter has a memory cache of the TOC, and a
simple database index on paths would make a search for .pth's in a
subdirectory trivial for the database case.)

> I say no, but I just want to make sure that we are not, and that
> people are aware of the design shift required in finders. This entire
> worry would be alleviated if only .pth files named after the package
> were supported, much like *.pkg files in pkgutil.

Which would completely break one of the major use cases of the PEP,
which is precisely to ensure that you can install two pieces of code to
the same namespace without either one overwriting the other's files.

> And then the search for the __init__.py begins on the newly modified
> __path__, which I assume ends with the first __init__ found on
> __path__, but if no file is found it's okay and essentially an empty
> module with just module-specific attributes is used? In other words,
> can a .pth file replace an __init__ file in delineating a package?

Yes.

> Or is it purely additive? I assume the latter for compatibility
> reasons,

Nope. The idea is specifically to allow separately installed projects
to create a package without overwriting any files (causing conflicts
for system installers).

> but the PEP says "a directory is considered a package if it **either**
> contains a file named __init__.py, **or** a file whose name ends with
> ".pth"" (emphasis mine). Otherwise I assume that the search will be
> done simply with ``os.path.isdir(os.path.join(sys_path_entry,
> top_level_package_name))`` and all existing paths will be added to
> __path__. Will they come before or after the directory where the *.pth
> was found? And will any subsequent *.pth files found in other
> directories also be executed?
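
As a rough illustration of the rule quoted from the PEP above (a
directory counts as a package if it holds an __init__.py or any file
ending in ".pth"), here is a minimal sketch for the plain file-system
case only. The names is_pep382_package_dir and candidate_package_dirs
are hypothetical; they come from neither the PEP nor the stdlib, and the
sketch says nothing about the ordering questions raised above.

```python
import os


def is_pep382_package_dir(directory):
    """Return True if directory holds an __init__.py or any *.pth file."""
    if not os.path.isdir(directory):
        return False
    names = os.listdir(directory)
    return "__init__.py" in names or any(n.endswith(".pth") for n in names)


def candidate_package_dirs(package_name, search_path):
    """Yield each <entry>/<package_name> directory that qualifies as a package.

    search_path is any sys.path-style list of directory names.
    """
    for entry in search_path:
        candidate = os.path.join(entry, package_name)
        if is_pep382_package_dir(candidate):
            yield candidate
```

Note that the check lists the directory instead of testing for one known
file name, which is the list-and-search cost discussed earlier in this
message.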
> As for how "*" works, is this limited to top-level packages, or will
> sub-packages participate as well?

Sub-packages as well.

> I assume the former, but it is not directly stated in the PEP. If the
> latter, is a dotted package name changed to
> ``os.path.join(sys_path_entry, package_name.replace(".", os.sep))``?
> For sys.path_hooks, I am assuming import will simply skip over passing
> that as it is a marker that __path__ represents a namespace package
> and not in any way functional. Although with sys.namespace_packages,
> is leaving the "*" in __path__ truly necessary?

I'm going to leave these to Martin to answer.

> For the search of paths to use to extend, are we limiting ourselves to
> actual file system entries on sys.path (as pkgutil does),

pkgutil doesn't have such a limitation, except in the case of
extend_path, and that limitation is one that PEP 382 intends to remove.

> or do we want to support other storage back-ends? To do the latter I
> would suggest having a successful path discovery be when a finder can
> be created for the hypothetical directory from sys.path_hooks.

The downside to that is that NullImporter is the default importer, so
you'd still have to spe
Re: [Python-Dev] Bugfix releases should not change APIs
On 5/29/2010 6:39 AM, Antoine Pitrou wrote:
> It is not the product of oversight.

I am actually glad, in a sense, that it was not casual whim. ;-)
I do not like the change, since it moves streams back further away from
Python's sequence model, but I withdraw the request for reversion in
3.1.3. I will add further comments on the docs to the issue.

> What it does teach us is that Python 3.1 sees some real use,

It is an odd 'coincidence' that the method changed was one of the only
two stdlib methods I have so far used directly. But with enough users,
such happens.

What it teaches *me* is that before I install another release, I
should, as planned, automate the running of all module tests together
so I can easily test everything before and after a new installation.
When I do release sample chapters and code, I will try to remember to
specify the version and platform I tested with.

> we have entered a phase where backwards compatibility will become as
> important as it was in the 2.x line.

I have assumed that there might be a few stdlib API tweaks in 3.2 --
and that they would be well announced.

Terry Jan Reedy

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugfix releases should not change APIs
On 5/28/2010 11:41 PM, Nick Coghlan wrote:
> However, it may be worth modifying the policy to ensure that such
> exceptional bug fixes be mentioned prominently in the release notes
> and on the download page for that maintenance release.

A sentence like "The behavior of it.X.truncate has been intentionally
changed from ... to ... .", if I read and cognized it, would have
helped me, in this case, get to the problem and fix much more quickly.

Is it possible with svn or hg to get a list of the commits that changed
version x to version y?

Would it not be possible to get a diff between at least the .rst
versions of the docs for version x and version y?

Terry Jan Reedy
Re: [Python-Dev] variable name resolution in exec is incorrect
On 5/29/2010 6:20 AM, Colin H wrote:
> Perhaps the next step is to re-open the issue? If it is seen as a bug,
> it would be great to see a fix in 2.6+ -

For the purpose of bugfix releases, a 'bug' is a discrepancy between
doc and behavior. Every new feature is seen as a 'design bug' by
someone.

> a number of options which will not break backward compatibility have
> been put forward - cheers,

Code that uses a new x.y.z feature does not work in previous x.y
versions. Problems with such micro-release additions led to the current
policy.

The 3.2 feature addition deadline is about 5 months away. It will
probably take 1 or more people at least a couple of months to write a
PEP listing the rationale for a change, the options and possible pros
and cons, possibly test one or more patches, solicit opinions on which
is best, select one, write new test cases and docs, and get the final
patch committed.

Terry Jan Reedy
Re: [Python-Dev] Implementing PEP 382, Namespace Packages
On Sat, May 29, 2010 at 15:56, P.J. Eby wrote:
> At 09:29 PM 5/29/2010 +0200, Martin v. Löwis wrote:
>> On 29.05.2010 21:06, P.J. Eby wrote:
>>> At 08:45 PM 5/29/2010 +0200, Martin v. Löwis wrote:
>>>>> In it he says that PEP 382 is being deferred until it can
>>>>> address PEP 302 loaders. I can't find any follow-up to this. I
>>>>> don't see any discussion in PEP 382 about PEP 302 loaders, so I
>>>>> assume this issue was never resolved. Does it need to be before
>>>>> PEP 382 is implemented? Are we wasting our time by designing and
>>>>> (eventually) coding before this issue is resolved?
>>>>
>>>> Yes, and yes.
>>>
>>> Is there anything we can do to help regarding that?
>>
>> You could comment on the proposal I made back then, or propose a
>> different solution.
>
> Looking at that proposal, I don't follow how changing *loaders* (vs.
> importers) would help. If an importer's find_module doesn't natively
> support PEP 382, then there's no way to get a loader for the package
> in the first place. Today, namespace packages work fine with PEP 302
> loaders, because the namespace-ness is really only about setting up
> the __path__, and detecting that you need to do this in the first
> place.
>
> In the PEP 302 scheme, then, it's either importers that have to
> change, or the process that invokes them. Being able to ask an
> importer the equivalents of os.path.join, listdir, and get_data would
> suffice to make an import process that could do the trick.
>
> Essentially, you'd ask each importer to first attempt to find the
> module, and then ask it (or the loader, if the find worked) whether
> packagename/*.pth exists, and then process their contents.
>
> I don't think there's a need to have a special method for executing a
> package __init__, since what you'd do in the case where there are .pth
> but no __init__, is to simply continue the search to the end of
> sys.path (or the parent package __path__), and *then* create the
> module with an appropriate __path__.
>
> If at any point the find_module() call succeeds, then subsequent
> importers will just be asked for .pth files, which can then be
> processed into the __path__ of the now-loaded module.
>
> IOW, something like this (very rough draft):
>
>     pth_contents = []
>     module = None
>
>     for pathitem in syspath_or_parent__path__:
>
>         importer = pkgutil.get_importer(pathitem)
>         if importer is None:
>             continue
>
>         if module is None:
>             try:
>                 loader = importer.find_module(fullname)
>             except ImportError:
>                 pass
>             else:
>                 # errors here should propagate
>                 module = loader.load_module(fullname)
>                 if not hasattr(module, '__path__'):
>                     # found, but not a package
>                     return module
>
>         pc = get_pth_contents(importer)
>         if pc is not None:
>             subpath = os.path.join(pathitem, modulebasename)
>             pth_contents.append(subpath)
>             pth_contents.extend(pc)
>             if '*' not in pth_contents:
>                 # got a package, but not a namespace
>                 break
>
>     if pth_contents:
>         if module is None:
>             # No __init__, but we have paths, so make an empty package
>             module = # new module object w/empty __path__
>         modify__path__(module, pth_contents)
>
>     return module
>

Is it wise to modify __path__ post-import? Today people can make sure
that __path__ is set to what they want before potentially reading it in
their __init__ module by making the pkgutil.extend_path() call first.
This would actually defer to after the import and thus not allow any
__init__ code to rely on what __path__ eventually becomes.

> Obviously, the details are all in the 'get_pth_contents()', and
> 'modify__path__()' functions, and the above process would do extra
> work in the case where an individual importer implements PEP 382 on
> its own (although why would it?).
>
> It's also the case that this algorithm will be slow to fail imports
> when implemented as a meta_path hook, since it will be doing an extra
> pass over sys.path or the parent __path__, in addition to the one
> that's done by the normal __import__ machinery.
> (Though that's not an issue for Python 3.x, since this can be built
> into the core __import__).
>
> (Technically, the 3.x version should probably ask meta_path hooks for
> their .pth files as well, but I'm not entirely sure that that's a
> meaningful thing to ask.)
>
> The PEP 302 questions all boil down to how get_pth_contents() is
> implemented, and whether 'subpath' really should be created with
> os.path.join. Simply adding a get_pth_contents() method to the
> importer protocol (that returns None or a list of lines), and maybe a
> get_subpath(modulename) method that returns the path string that
> should be used for a subdirectory importer (i.e. __path__ entry), or
> None if no s
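
To make the quoted draft easier to follow, here is a runnable sketch of
just its .pth-collection half, restricted to real directories. It is
not the draft itself: the find_module()/load_module() step is left out,
so the result is always an empty package module; get_pth_contents()
takes a directory instead of an importer; and dropping the "*" marker
from __path__ is an assumption made here, since the thread leaves that
question open. build_namespace_package is a hypothetical name.

```python
import os
import types


def get_pth_contents(package_dir):
    """Return the non-blank lines of every *.pth file in package_dir.

    Returns None when the directory is missing or holds no *.pth files.
    (Hypothetical helper; the thread leaves its real form open.)
    """
    if not os.path.isdir(package_dir):
        return None
    pth_files = sorted(n for n in os.listdir(package_dir) if n.endswith(".pth"))
    if not pth_files:
        return None
    lines = []
    for name in pth_files:
        with open(os.path.join(package_dir, name)) as f:
            lines.extend(line.strip() for line in f if line.strip())
    return lines


def build_namespace_package(name, search_path):
    """Scan search_path and return an empty package module, or None."""
    pth_contents = []
    for pathitem in search_path:
        subpath = os.path.join(pathitem, name)
        pc = get_pth_contents(subpath)
        if pc is None:
            continue
        pth_contents.append(subpath)
        pth_contents.extend(pc)
        if "*" not in pth_contents:
            # A package, but not a namespace: stop at the first portion.
            break
    if not pth_contents:
        return None
    module = types.ModuleType(name)
    # Assumption: "*" is only a marker, so it is dropped from __path__ here.
    module.__path__ = [p for p in pth_contents if p != "*"]
    return module
```

With two sys.path-style entries that each contain zope/<project>.pth
holding a single "*" line, the returned module's __path__ lists both
zope directories in search order.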
Re: [Python-Dev] Implementing PEP 382, Namespace Packages
On Sun, May 30, 2010 at 00:40, P.J. Eby wrote:
> At 03:44 PM 5/29/2010 -0700, Brett Cannon wrote:
>> On Sat, May 29, 2010 at 12:29, "Martin v. Löwis" wrote:
>>> On 29.05.2010 21:06, P.J. Eby wrote:
>>>> At 08:45 PM 5/29/2010 +0200, Martin v. Löwis wrote:
>>>>>> In it he says that PEP 382 is being deferred until it can
>>>>>> address PEP 302 loaders. I can't find any follow-up to this. I
>>>>>> don't see any discussion in PEP 382 about PEP 302 loaders, so I
>>>>>> assume this issue was never resolved. Does it need to be before
>>>>>> PEP 382 is implemented? Are we wasting our time by designing
>>>>>> and (eventually) coding before this issue is resolved?
>>>>>
>>>>> Yes, and yes.
>>>>
>>>> Is there anything we can do to help regarding that?
>>>
>>> You could comment on the proposal I made back then, or propose a
>>> different solution.
>>
>> [sorry for the fundamental PEP questions, but I think PEP 382 came
>> about while I was on my python-dev sabbatical last year]
>>
>> I have some questions about the PEP which might help clarify how to
>> handle the API changes.
>>
>> For finders, their search algorithm is changed in a couple of ways.
>> One is that modules are given priority over packages (is that
>> intentional, Martin, or just an oversight?). Two, the package search
>> requires checking for a .pth file on top of an __init__.py. This will
>> change finders that could before simply do an existence check on an
>> __init__ "file" (or whatever the storage back-end happened to be) and
>> make it into a list-and-search which one would hope wasn't costly, but
>> in some cases might be if the paths to files are not stored in a
>> hierarchical fashion (e.g. zip files list entire file paths in their
>> TOC, or a sqlite3 DB which uses a path for keys will have to list
>> **all** keys, sort them to just the relevant directory, and then look
>> for .pth or some such approach). Are we worried about possible
>> performance implications of this search?
>
> No.
> First, an importer would not be required to implement it in a
> precisely analogous way; you could have database entries or a special
> consolidated index in a zipfile, if you wanted to do it like that.
> (In practice, Python's zipimporter has a memory cache of the TOC, and
> a simple database index on paths would make a search for .pth's in a
> subdirectory trivial for the database case.)

Sure, for the two examples this works, but who knows about other odd
back-ends people might be using. Granted, this is all hypothetical and
why I figured we wouldn't worry about it.

>> I say no, but I just want to make sure that we are not, and that
>> people are aware of the design shift required in finders. This entire
>> worry would be alleviated if only .pth files named after the package
>> were supported, much like *.pkg files in pkgutil.
>
> Which would completely break one of the major use cases of the PEP,
> which is precisely to ensure that you can install two pieces of code
> to the same namespace without either one overwriting the other's
> files.

The PEP says the goal is to span packages across directories. If you
split something like zope into multiple directories, does having a
separate zope.pth file in each of those directories really cause a
problem here? You are not importing them so it isn't like you are
worrying about precedence. If you specify that all .pth files found are
run then using the same file name in all package directories isn't an
issue. But I guess packages that do this want to keep the
unique-files-per-directory separation that they support and not have to
fix the file names at distribution time.

>> And then the search for the __init__.py begins on the newly modified
>> __path__, which I assume ends with the first __init__ found on
>> __path__, but if no file is found it's okay and essentially an empty
>> module with just module-specific attributes is used? In other words,
>> can a .pth file replace an __init__ file in delineating a package?
>
> Yes.
>
>> Or is it purely additive? I assume the latter for compatibility
>> reasons,
>
> Nope. The idea is specifically to allow separately installed projects
> to create a package without overwriting any files (causing conflicts
> for system installers).
>
>> but the PEP says "a directory is considered a package if it
>> **either** contains a file named __init__.py, **or** a file whose
>> name ends with ".pth"" (emphasis mine). Otherwise I assume that the
>> search will be done simply with
>> ``os.path.isdir(os.path.join(sys_path_entry,
>> top_level_package_name))`` and all existing paths will be added to
>> __path__. Will they come before or after the directory where the
>> *.pth was found? And will any subsequent *.pth files found in other
>> directories also be executed?
>>
>> As for how "*" works, is this limited to top-level packages, or will
>> sub-packages participate as well?
>
> Sub-packages as well.
>
>> I assume
Re: [Python-Dev] Implementing PEP 382, Namespace Packages
At 05:59 PM 5/30/2010 -0700, Brett Cannon wrote:
> Is it wise to modify __path__ post-import? Today people can make sure
> that __path__ is set to what they want before potentially reading it
> in their __init__ module by making the pkgutil.extend_path() call
> first. This would actually defer to after the import and thus not
> allow any __init__ code to rely on what __path__ eventually becomes.

Well, that's what the other lines in the .pth files are for. Keep in
mind that only *one* project can contain the namespace package's
__init__ module, so it's only sane for that __init__ to import things
that are bundled with the __init__ module.

AFAIK, most uses of namespace packages today are via setuptools' API,
which doesn't support having a non-empty __init__.py at all (apart from
the namespace declaration), so this limitation is unlikely to cause
problems in practice.

When the code I gave is refactored into a proper importer/loader pair,
it can actually be structured such that the full __path__ is set
*before* the low-level loader is called; however, if the loader itself
chooses to overwrite __path__ at that point, there's little that can be
done about it.

In the Python 3.x case, the loader protocol could be revised to require
only *adding* a non-duplicate entry to __path__ if it's present, and
the stdlib loaders updated accordingly. For my backport, OTOH, I'd have
to do some sort of workaround to wrap the regular importers, so I'd
just as soon leave it undefined by PEP 382 what an __init__ module sees
in __path__ during its execution.

(And for a backport whose sole purpose is to cut down on setuptools'
funky .pth manipulations, that might suffice anyway.)
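
The suggested loader rule (only *add* a non-duplicate entry to
__path__ when one is already present, instead of overwriting it) could
look roughly like the sketch below. add_package_dir is a hypothetical
helper name, not part of PEP 302 or the stdlib, and what a loader does
when no __path__ exists yet is an assumption made here.

```python
import types


def add_package_dir(module, package_dir):
    """Add package_dir to module.__path__ without clobbering other entries."""
    path = getattr(module, "__path__", None)
    if path is None:
        # Assumption: with no pre-set __path__, behave like a classic loader.
        module.__path__ = [package_dir]
    elif package_dir not in path:
        path.append(package_dir)
    return module.__path__


# A namespace-aware import process pre-populates __path__ ...
pkg = types.ModuleType("zope")
pkg.__path__ = ["/site-a/zope", "/site-b/zope"]
# ... and the low-level loader then only contributes its own directory.
add_package_dir(pkg, "/site-b/zope")   # already present, nothing changes
add_package_dir(pkg, "/site-c/zope")   # appended at the end
```

Appending (as opposed to inserting at the front) is an arbitrary choice
in this sketch; the thread does not settle the ordering.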
Re: [Python-Dev] Implementing PEP 382, Namespace Packages
At 06:18 PM 5/30/2010 -0700, Brett Cannon wrote:
> On Sun, May 30, 2010 at 00:40, P.J. Eby wrote:
>> Which would completely break one of the major use cases of the PEP,
>> which is precisely to ensure that you can install two pieces of code
>> to the same namespace without either one overwriting the other's
>> files.
>
> The PEP says the goal is to span packages across directories.

The goal of namespace packages is to allow separately-distributed
pieces of code to live in the same package namespace. That this is
sometimes achieved by installing them to different paths is an
implementation detail.

In the case of e.g. Linux distributions and other system packaging
scenarios, the code will all be installed to the *same* directory -- so
there cannot be any filename collisions among the
separately-distributed modules. That's why we want to get rid of the
need for an __init__.py to mark the directory as a package: it's a
collision point for system package management tools.

>> pkgutil doesn't have such a limitation, except in the case of
>> extend_path, and that limitation is one that PEP 382 intends to
>> remove.
>
> It's because pkgutil.extend_path has that limitation that I am asking,
> as that's what the PEP refers to. If the PEP wants to remove the
> limitation it should clearly state how it is going to do that.

I'm flexible on it either way. The only other importer I know of that
does anything else is one that actually allows (unsafely) importing
from URLs.

If we allow for other things, then we need to extend the PEP 302
protocol to have a way to ask an importer for a subpath string.

> As for adding to the PEP 302 protocols, it's a question of how much we
> want importer implementors to have control over this versus us. I
> personally would rather keep any protocol extensions simple and have
> import handle as many of the details as possible.
I lean the other way a bit, in that the more of the importer internals
you expose, the harder you make it for an importer to be anything other
than a mere virtual file system. (As it is, I think there is too much
"file-ness" coupling in the protocol already, what with file extensions
and the like.)

Indeed, now that I'm thinking about it, it actually seems to make more
sense to just require the importers to implement PEP 382, and provide
some common machinery in imp or pkgutil for reading .pth strings,
setting up __path__, and hunting down all the other directories.
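
To illustrate what "require the importers to implement PEP 382" might
mean for a back-end that is not a file system, here is a sketch of an
importer over a flat mapping of path strings to text, such as a
database table keyed by path. The class name FlatStoreImporter is
hypothetical. The method names get_pth_contents and get_subpath come
from the earlier message in this thread, but their arguments, and the
condition under which get_subpath returns None, are assumptions made
for this sketch and not an agreed protocol.

```python
import posixpath


class FlatStoreImporter:
    """Hypothetical importer over a flat {path: text} mapping."""

    def __init__(self, prefix, store):
        self.prefix = prefix   # the sys.path-style entry this importer serves
        self.store = store     # flat mapping, e.g. rows of a database table

    def get_subpath(self, modulename):
        """Return the __path__ entry for a package, or None if nothing is stored under it."""
        subpath = posixpath.join(self.prefix, modulename)
        marker = subpath + "/"
        if any(key.startswith(marker) for key in self.store):
            return subpath
        return None

    def get_pth_contents(self, modulename):
        """Return the lines of all *.pth entries directly in the package, or None."""
        marker = posixpath.join(self.prefix, modulename) + "/"
        lines = None
        # A flat store has no directory listing, so every key is examined.
        for key in sorted(self.store):
            rest = key[len(marker):] if key.startswith(marker) else None
            if rest and "/" not in rest and rest.endswith(".pth"):
                lines = (lines or []) + self.store[key].splitlines()
        return lines


store = {
    "db:/zope/zope.interface.pth": "*\n",
    "db:/zope/interface/__init__.py": "",
    "db:/other/mod.py": "",
}
importer = FlatStoreImporter("db:", store)
```

The full scan in get_pth_contents is the cost raised earlier in the
thread for flat key stores; an index on the directory part of the key
would avoid it.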