Re: [Python-Dev] Implementing PEP 382, Namespace Packages

2010-05-30 Thread P.J. Eby

At 03:44 PM 5/29/2010 -0700, Brett Cannon wrote:

On Sat, May 29, 2010 at 12:29, "Martin v. Löwis"  wrote:
> Am 29.05.2010 21:06, schrieb P.J. Eby:
>>
>> At 08:45 PM 5/29/2010 +0200, Martin v. Löwis wrote:

 In it he says that PEP 382 is being deferred until it can address PEP
 302 loaders. I can't find any follow-up to this. I don't see any
 discussion in PEP 382 about PEP 302 loaders, so I assume this issue was
 never resolved. Does it need to be before PEP 382 is implemented? Are we
 wasting our time by designing and (eventually) coding before this issue
 is resolved?
>>>
>>> Yes, and yes.
>>
>> Is there anything we can do to help regarding that?
>
> You could comment on the proposal I made back then, or propose a different
> solution.

[sorry for the fundamental PEP questions, but I think PEP 382 came
about while I was on my python-dev sabbatical last year]

I have some questions about the PEP which might help clarify how to
handle the API changes.

For finders, their search algorithm is changed in a couple of ways.
One is that modules are given priority over packages (is that
intentional, Martin, or just an oversight?). Two, the package search
requires checking for a .pth file on top of an __init__.py. This will
change finders that before could simply do an existence check on an
__init__ "file" (or whatever the storage back-end happened to be),
turning that check into a list-and-search which one would hope isn't
costly, but in some cases might be if the paths to files are not stored
in a hierarchical fashion (e.g. zip files list entire file paths in
their TOC, and a sqlite3 DB which uses a path for keys will have to
list **all** keys, filter them down to just the relevant directory, and
then look for .pth, or some such approach). Are we worried about possible
performance implications of this search?


No.  First, an importer would not be required to implement it in a 
precisely analogous way; you could have database entries or a special 
consolidated index in a zipfile, if you wanted to do it like 
that.  (In practice, Python's zipimporter has a memory cache of the 
TOC, and a simple database index on paths would make a search for 
.pth's in a subdirectory trivial for the database case.)
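(To make that concrete -- purely as an illustration, not anything in the 
PEP or in an existing importer -- a sqlite3-backed store that keys its 
contents by path could answer "which *.pth files sit directly under 
zope/?" with a single query instead of listing every key in Python.  The 
table name and schema below are invented for the example:

    import sqlite3

    def pth_files_in(db_file, package_subdir):
        # package_subdir is e.g. 'zope/'; assumes a table
        # files(path TEXT PRIMARY KEY, data BLOB)
        conn = sqlite3.connect(db_file)
        try:
            rows = conn.execute(
                "SELECT path FROM files WHERE path LIKE ?",
                (package_subdir + '%.pth',))
            # keep only direct children, not entries in nested directories
            return [p for (p,) in rows
                    if '/' not in p[len(package_subdir):]]
        finally:
            conn.close()
)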




 I say no, but I just want to
make sure we are not, and that people are aware of the design
shift required in finders. This entire worry would be alleviated if
only .pth files named after the package were supported, much like
*.pkg files in pkgutil.


Which would completely break one of the major use cases of the PEP, 
which is precisely to ensure that you can install two pieces of code 
to the same namespace without either one overwriting the other's files.
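(As a concrete illustration -- the file and project names are made up -- 
two distributions can then share site-packages/zope/ without shipping 
any file in common:

    site-packages/
        zope/
            zope_interface.pth    # from the zope.interface distribution
            interface/
                __init__.py
            zope_component.pth    # from the zope.component distribution
            component/
                __init__.py

Neither distribution has to own zope/__init__.py or a single shared 
zope.pth, so a system installer never sees two packages claiming the 
same file.)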




And then the search for the __init__.py begins on the newly modified
__path__, which I assume ends with the first __init__ found on
__path__, but if no file is found it's okay and essentially an empty
module with just module-specific attributes is used? In other words,
can a .pth file replace an __init__ file in delineating a package?


Yes.



Or is it purely additive? I assume the latter for compatibility reasons,


Nope.  The idea is specifically to allow separately installed 
projects to create a package without overwriting any files (causing 
conflicts for system installers).




but the PEP says "a directory is considered a package if it **either**
contains a file named __init__.py, **or** a file whose name ends with
".pth"" (emphasis mine). Otherwise I assume that the search will be
done simply with ``os.path.isdir(os.path.join(sys_path_entry,
top_level_package_name))`` and all existing paths will be added to
__path__. Will they come before or after the directory where the *.pth
was found? And will any subsequent *.pth files found in other
directories also be executed?

As for how "*" works, is this limited to top-level packages, or will
sub-packages participate as well?


Sub-packages as well.



 I assume the former, but it is not
directly stated in the PEP. If the latter, is a dotted package name
changed to ``os.sep.join([sys_path_entry, package_name.replace(".",
os.sep)])``?

For sys.path_hooks, I am assuming import will simply skip over passing
that as it is a marker that __path__ represents a namespace package
and not in any way functional. Although with sys.namespace_packages,
is leaving the "*" in __path__ truly necessary?


I'm going to leave these to Martin to answer.



For the search of paths to use to extend, are we limiting ourselves to
actual file system entries on sys.path (as pkgutil does),


pkgutil doesn't have such a limitation, except in the case of 
extend_path, and that limitation is one that PEP 382 intends to remove.




or do we
want to support other storage back-ends? To do the latter I would
suggest having a successful path discovery be when a finder can be
created for the hypothetical directory from sys.path_hooks.


The downside to that is that NullImporter is the default importer, so 
you'd still have to spe

Re: [Python-Dev] Bugfix releases should not change APIs

2010-05-30 Thread Terry Reedy

On 5/29/2010 6:39 AM, Antoine Pitrou wrote:

It is not the product of oversight.


I am actually glad, in a sense, that it was not casual whim. ;-)
I do not like the change, since it moves streams back further away from 
Python's sequence model, but I withdraw the request for reversion in 3.1.3.


I will add further comments on the docs to the issue.


What it does teach us is that Python 3.1 sees some real use,


It is an odd 'coincidence' that the method changed was one of the only 
two stdlib methods I have so far used directly. But with enough 
users, such things happen.


What it teaches *me* is that before I install another release, I should, 
as planned, automate the running of all module tests together so I can 
easily test everything before and after a new installation.


When I do release sample chapters and code, I will try to remember to 
specify the version and platform I tested with.



we have entered a phase where backwards compatibility will become as
important as it was in the 2.x line.


I have assumed that there might be a few stdlib API tweaks in 3.2 -- and 
that they would be well announced.


Terry Jan Reedy



Re: [Python-Dev] Bugfix releases should not change APIs

2010-05-30 Thread Terry Reedy

On 5/28/2010 11:41 PM, Nick Coghlan wrote:


However, it may be worth modifying the policy to ensure that such
exceptional bug fixes be mentioned prominently in the release notes and
on the download page for that maintenance release.


A sentence like "The behavior of io.X.truncate has been intentionally 
changed from ... to ... .", if I read and cognized it, would have helped 
me, in this case, get to the problem and fix much more quickly.


Is it possible with svn or hg to get a list of the commits that changed 
version x to version y?


Would it not be possible to get a diff between at least the .rst 
versions of the docs for version x and version y?
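Presumably something along these lines would cover both, though I have 
not tried it (the revision numbers, tags, and branch URL are 
placeholders):

    svn log  -r REV_X:REV_Y $BRANCH_URL
    svn diff -r REV_X:REV_Y $BRANCH_URL/Doc
    hg log -r TAG_X:TAG_Y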


Terry Jan Reedy




Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-30 Thread Terry Reedy

On 5/29/2010 6:20 AM, Colin H wrote:

Perhaps the next step is to re-open the issue? If it is seen as a bug,
it would be great to see a fix in 2.6+ -


For the purpose of bugfix releases, a 'bug' is a discrepancy between doc 
and behavior. Every new feature is seen as a 'design bug' by someone.


> a number of options which
> will not break backward compatibility have been put forward - cheers,


Code that uses a new x.y.z feature does not work in earlier x.y micro 
releases. Problems with such micro-release additions led to the current 
policy.


The 3.2 feature addition deadline is about 5 months away. It will 
probably take 1 or more people at least a couple of months to write a 
PEP listing the rationale for a change, the options and possible pros 
and cons, possibly test one or more patches, solicit opinions on which 
is best, select one, write new test cases and docs, and get the final 
patch committed.


Terry Jan Reedy



Re: [Python-Dev] Implementing PEP 382, Namespace Packages

2010-05-30 Thread Brett Cannon
On Sat, May 29, 2010 at 15:56, P.J. Eby  wrote:
> At 09:29 PM 5/29/2010 +0200, Martin v. Löwis wrote:
>>
>> Am 29.05.2010 21:06, schrieb P.J. Eby:
>>>
>>> At 08:45 PM 5/29/2010 +0200, Martin v. Löwis wrote:
>
> In it he says that PEP 382 is being deferred until it can address PEP
> 302 loaders. I can't find any follow-up to this. I don't see any
> discussion in PEP 382 about PEP 302 loaders, so I assume this issue was
> never resolved. Does it need to be before PEP 382 is implemented? Are
> we
> wasting our time by designing and (eventually) coding before this issue
> is resolved?

 Yes, and yes.
>>>
>>> Is there anything we can do to help regarding that?
>>
>> You could comment on the proposal I made back then, or propose a different
>> solution.
>
> Looking at that proposal, I don't follow how changing *loaders* (vs.
> importers) would help.  If an importer's find_module doesn't natively
> support PEP 382, then there's no way to get a loader for the package in the
> first place.  Today, namespace packages work fine with PEP 302 loaders,
> because the namespace-ness is really only about setting up the __path__, and
> detecting that you need to do this in the first place.
>
> In the PEP 302 scheme, then, it's either importers that have to change, or
> the process that invokes them.  Being able to ask an importer the
> equivalents of os.path.join, listdir, and get_data would suffice to make an
> import process that could do the trick.
>
> Essentially, you'd ask each importer to first attempt to find the module,
> and then ask it (or the loader, if the find worked) whether
> packagename/*.pth exists, and then process their contents.
>
> I don't think there's a need to have a special method for executing a
> package __init__, since what you'd do in the case where there are .pth but
> no __init__, is to simply continue the search to the end of sys.path (or the
> parent package __path__), and *then* create the module with an appropriate
> __path__.
>
> If at any point the find_module() call succeeds, then subsequent importers
> will just be asked for .pth files, which can then be processed into the
> __path__ of the now-loaded module.
>
> IOW, something like this (very rough draft):
>
>    import imp, os, pkgutil  # imports the fragment relies on
>
>    pth_contents = []
>    module = None
>
>    for pathitem in syspath_or_parent__path__:
>
>        importer = pkgutil.get_importer(pathitem)
>        if importer is None:
>            continue
>
>        if module is None:
>            try:
>                loader = importer.find_module(fullname)
>            except ImportError:
>                pass
>            else:
>                # errors here should propagate
>                module = loader.load_module(fullname)
>                if not hasattr(module, '__path__'):
>                    # found, but not a package
>                    return module
>
>        pc = get_pth_contents(importer)
>        if pc is not None:
>            subpath = os.path.join(pathitem, modulebasename)
>            pth_contents.append(subpath)
>            pth_contents.extend(pc)
>            if '*' not in pth_contents:
>                # got a package, but not a namespace
>                break
>
>    if pth_contents:
>        if module is None:
>            # No __init__, but we have paths, so make an empty package
>            module = imp.new_module(fullname)  # new module object
>            module.__path__ = []               # w/empty __path__
>        modify__path__(module, pth_contents)
>
>    return module
>

Is it wise to modify __path__ post-import? Today people can make sure
that __path__ is set to what they want before potentially reading it
in their __init__ module by making the pkgutil.extend_path() call
first. This would actually defer it until after the import and thus not
allow any __init__ code to rely on what __path__ eventually becomes.
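(For reference, the status-quo idiom being referred to is the
extend_path() call at the top of the package's __init__.py:

    # zope/__init__.py today: extend_path() runs first, so the rest of
    # this __init__ module already sees the combined __path__.
    from pkgutil import extend_path
    __path__ = extend_path(__path__, __name__)
)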

> Obviously, the details are all in the 'get_pth_contents()', and
> 'modify__path__()' functions, and the above process would do extra work in
> the case where an individual importer implements PEP 382 on its own
> (although why would it?).
>
> It's also the case that this algorithm will be slow to fail imports when
> implemented as a meta_path hook, since it will be doing an extra pass over
> sys.path or the parent __path__, in addition to the one that's done by the
> normal __import__ machinery.  (Though that's not an issue for Python 3.x,
> since this can be built into the core __import__).
>
> (Technically, the 3.x version should probably ask meta_path hooks for their
> .pth files as well, but I'm not entirely sure that that's a meaningful thing
> to ask.)
>
> The PEP 302 questions all boil down to how get_pth_contents() is
> implemented, and whether 'subpath' really should be created with
> os.path.join.  Simply adding a get_pth_contents() method to the importer
> protocol (that returns None or a list of lines), and maybe a
> get_subpath(modulename) method that returns the path string that should be
> used for a subdirectory importer (i.e. __path__ entry), or None if no s

Re: [Python-Dev] Implementing PEP 382, Namespace Packages

2010-05-30 Thread Brett Cannon
On Sun, May 30, 2010 at 00:40, P.J. Eby  wrote:
> At 03:44 PM 5/29/2010 -0700, Brett Cannon wrote:
>>
>> On Sat, May 29, 2010 at 12:29, "Martin v. Löwis" 
>> wrote:
>> > Am 29.05.2010 21:06, schrieb P.J. Eby:
>> >>
>> >> At 08:45 PM 5/29/2010 +0200, Martin v. Löwis wrote:
>> 
>>  In it he says that PEP 382 is being deferred until it can address PEP
>>  302 loaders. I can't find any follow-up to this. I don't see any
>>  discussion in PEP 382 about PEP 302 loaders, so I assume this issue
>>  was
>>  never resolved. Does it need to be before PEP 382 is implemented? Are
>>  we
>>  wasting our time by designing and (eventually) coding before this
>>  issue
>>  is resolved?
>> >>>
>> >>> Yes, and yes.
>> >>
>> >> Is there anything we can do to help regarding that?
>> >
>> > You could comment on the proposal I made back then, or propose a
>> > different
>> > solution.
>>
>> [sorry for the fundamental PEP questions, but I think PEP 382 came
>> about while I was on my python-dev sabbatical last year]
>>
>> I have some questions about the PEP which might help clarify how to
>> handle the API changes.
>>
>> For finders, their search algorithm is changed in a couple of ways.
>> One is that modules are given priority over packages (is that
>> intentional, Martin, or just an oversight?). Two, the package search
>> requires checking for a .pth file on top of an __init__.py. This will
>> change finders that before could simply do an existence check on an
>> __init__ "file" (or whatever the storage back-end happened to be),
>> turning that check into a list-and-search which one would hope isn't
>> costly, but in some cases might be if the paths to files are not stored
>> in a hierarchical fashion (e.g. zip files list entire file paths in
>> their TOC, and a sqlite3 DB which uses a path for keys will have to
>> list **all** keys, filter them down to just the relevant directory, and
>> then look for .pth, or some such approach). Are we worried about possible
>> performance implications of this search?
>
> No.  First, an importer would not be required to implement it in a precisely
> analogous way; you could have database entries or a special consolidated
> index in a zipfile, if you wanted to do it like that.  (In practice,
> Python's zipimporter has a memory cache of the TOC, and a simple database
> index on paths would make a search for .pth's in a subdirectory trivial for
> the database case.)

Sure, for the two examples this works, but who knows about other odd
back-ends people might be using. Granted, this is all hypothetical,
which is why I figured we wouldn't worry about it.

>
>
>>  I say no, but I just want to
>> make sure we are not, and that people are aware of the design
>> shift required in finders. This entire worry would be alleviated if
>> only .pth files named after the package were supported, much like
>> *.pkg files in pkgutil.
>
> Which would completely break one of the major use cases of the PEP, which is
> precisely to ensure that you can install two pieces of code to the same
> namespace without either one overwriting the other's files.

The PEP says the goal is to span packages across directories. If you
split something like zope into multiple directories, does having a
separate zope.pth file in each of those directories really cause a
problem here? You are not importing them so it isn't like you are
worrying about precedence. If you specify that all .pth files found
are run, then using the same file name in all package directories isn't
an issue. But I guess packages that do this want to keep uniquely named
files for each directory split they support, rather than having to fix
up the file names at distribution time.

>
>
>> And then the search for the __init__.py begins on the newly modified
>> __path__, which I assume ends with the first __init__ found on
>> __path__, but if no file is found it's okay and essentially an empty
>> module with just module-specific attributes is used? In other words,
>> can a .pth file replace an __init__ file in delineating a package?
>
> Yes.
>
>
>> Or is it purely additive? I assume the latter for compatibility reasons,
>
> Nope.  The idea is specifically to allow separately installed projects to
> create a package without overwriting any files (causing conflicts for system
> installers).
>
>
>> but the PEP says "a directory is considered a package if it **either**
>> contains a file named __init__.py, **or** a file whose name ends with
>> ".pth"" (emphasis mine). Otherwise I assume that the search will be
>> done simply with ``os.path.isdir(os.path.join(sys_path_entry,
>> top_level_package_name))`` and all existing paths will be added to
>> __path__. Will they come before or after the directory where the *.pth
>> was found? And will any subsequent *.pth files found in other
>> directories also be executed?
>>
>> As for how "*" works, is this limited to top-level packages, or will
>> sub-packages participate as well?
>
> Sub-packages as well.
>
>
>>  I assume 

Re: [Python-Dev] Implementing PEP 382, Namespace Packages

2010-05-30 Thread P.J. Eby

At 05:59 PM 5/30/2010 -0700, Brett Cannon wrote:

Is it wise to modify __path__ post-import? Today people can make sure
that __path__ is set to what they want before potentially reading it
in their __init__ module by making the pkgutil.extend_path() call
first. This would actually defer it until after the import and thus not
allow any __init__ code to rely on what __path__ eventually becomes.


Well, that's what the other lines in the .pth files are for.  Keep in 
mind that only *one* project can contain the namespace package's 
__init__ module, so it's only sane for that __init__ to import things 
that are bundled with the __init__ module.


AFAIK, most uses of namespace packages today are via setuptools' API, 
which doesn't support having a non-empty __init__.py at all (apart 
from the namespace declaration), so this limitation is unlikely to 
cause problems in practice.
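(Specifically, the supported setuptools-style __init__.py is nothing but 
the declaration itself:

    # the entire __init__.py of a setuptools namespace package
    __import__('pkg_resources').declare_namespace(__name__)
)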


When the code I gave is refactored into a proper importer/loader 
pair, it can actually be structured such that the full __path__ is 
set *before* the low-level loader is called; however, if the loader 
itself chooses to overwrite __path__ at that point, there's little 
that can be done about it.
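A minimal sketch of that structuring, with invented names and no claim 
to being the PEP's eventual API:

    import imp, sys

    def load_with_prepared_path(fullname, loader, pth_entries):
        # Build the finished __path__ from the collected .pth entries and
        # register the module *before* running any __init__ code; a
        # conforming PEP 302 loader reuses the module it finds in
        # sys.modules, so __init__ executes with the full __path__ already
        # in place -- unless the loader overwrites __path__ itself, as
        # noted above.
        module = sys.modules.setdefault(fullname, imp.new_module(fullname))
        module.__path__ = list(pth_entries)
        if loader is None:      # *.pth files found, but no __init__ anywhere
            return module
        return loader.load_module(fullname)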


In the Python 3.x case, the loader protocol could be revised to 
require only *adding* a non-duplicate entry to __path__ if it's 
present, and the stdlib loaders updated accordingly.  For my 
backport, OTOH, I'd have to do some sort of workaround to wrap the 
regular importers, so I'd just as soon leave it undefined by PEP 382 
what an __init__ module sees in __path__ during its execution.  (And 
for a backport whose sole purpose is to cut down on setuptools' funky 
.pth manipulations, that might suffice anyway.)




Re: [Python-Dev] Implementing PEP 382, Namespace Packages

2010-05-30 Thread P.J. Eby

At 06:18 PM 5/30/2010 -0700, Brett Cannon wrote:

On Sun, May 30, 2010 at 00:40, P.J. Eby  wrote:
>
> Which would completely break one of the major use cases of the PEP, 
> which is precisely to ensure that you can install two pieces of code 
> to the same namespace without either one overwriting the other's files.

The PEP says the goal is to span packages across directories.


The goal of namespace packages is to allow separately-distributed 
pieces of code to live in the same package namespace.  That this is 
sometimes achieved by installing them to different paths is an 
implementation detail.


In the case of e.g. Linux distributions and other system packaging 
scenarios, the code will all be installed to the *same* directory -- 
so there cannot be any filename collisions among the 
separately-distributed modules.  That's why we want to get rid of the 
need for an __init__.py to mark the directory as a package: it's a 
collision point for system package management tools.




> pkgutil doesn't have such a limitation, except in the case of extend_path, and
> that limitation is one that PEP 382 intends to remove.

It's because pkgutil.extend_path has that limitation that I am asking, as
that's what the PEP refers to. If the PEP wants to remove the
limitation it should clearly state how it is going to do that.


I'm flexible on it either way.  The only other importer I know of 
that does anything else is one that actually allows (unsafely) 
importing from URLs.


If we allow for other things, then we need to extend the PEP 302 
protocol to have a way to ask an importer for a subpath string.
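For a plain filesystem path entry, that extension could be as small as 
the get_subpath() idea mentioned earlier in the thread; a hypothetical 
sketch (the method name and semantics are not part of PEP 302 or of the 
PEP as written):

    import os

    class DirectoryImporterSketch(object):
        # Hypothetical importer for a single filesystem directory,
        # showing only the proposed extension method.
        def __init__(self, path_entry):
            self.path_entry = path_entry

        def get_subpath(self, fullname):
            # Path string to contribute to the package's __path__,
            # or None if this importer has nothing for 'fullname'.
            subdir = os.path.join(self.path_entry,
                                  fullname.rpartition('.')[2])
            return subdir if os.path.isdir(subdir) else None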





As for adding to the PEP 302 protocols, it's a question of how much we
want importer implementors to have control over this versus us. I
personally would rather keep any protocol extensions simple and have
import handle as many of the details as possible.


I lean the other way a bit, in that the more of the importer 
internals you expose, the harder you make it for an importer to be 
anything other than a mere virtual file system.  (As it is, I think 
there is too much "file-ness" coupling in the protocol already, what 
with file extensions and the like.)


Indeed, now that I'm thinking about it, it actually seems to make 
more sense to just require the importers to implement PEP 382, and 
provide some common machinery in imp or pkgutil for reading .pth 
strings, setting up __path__, and hunting down all the other directories.
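A rough, filesystem-only sketch of that common machinery (the name is 
invented; a real version would talk to importers rather than os and 
glob, and the PEP would pin down how '*' and relative lines are treated):

    import glob
    import os

    def extend_path_382(path, name, parent_path):
        # For each entry in parent_path (sys.path or a parent __path__),
        # treat <entry>/<basename> as belonging to package 'name' if it
        # contains any *.pth file; append the directory itself plus any
        # non-'*' lines from its .pth files ('*' being the namespace
        # marker rather than a path entry).
        basename = name.rpartition('.')[2]
        for entry in parent_path:
            subdir = os.path.join(entry, basename)
            pth_files = sorted(glob.glob(os.path.join(subdir, '*.pth')))
            if not pth_files:
                continue
            if subdir not in path:
                path.append(subdir)
            for pth in pth_files:
                with open(pth) as f:
                    for line in f:
                        line = line.strip()
                        if line and line != '*' and line not in path:
                            path.append(line)
        return path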

