Re: [Python-Dev] Reviving restricted mode?
At 08:45 AM 2/22/2009 -0800, Guido van Rossum wrote: I've received some enthusiastic emails from someone who wants to revive restricted mode. He started out with a bunch of patches to the CPython runtime using ctypes, which he attached to an App Engine bug: http://code.google.com/p/googleappengine/issues/detail?id=671 Based on his code (the file secure.py is all you need, included in secure.tar.gz) it seems he believes the only security leaks are __subclasses__, gi_frame and gi_code. (I have since convinced him that if we add "restricted" guards to these attributes, he doesn't need the functions added to sys.) I don't recall the exploits that Samuele once posted that caused the death of rexec.py -- does anyone recall, or have a pointer to the threads? Just a question, but, if you just need a pure-python restricted environment for App Engine, why not just use the RestrictedPython package (i.e., http://pypi.python.org/pypi/RestrictedPython )? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Reviving restricted mode?
At 07:56 PM 2/22/2009 -0800, Guido van Rossum wrote: On Sun, Feb 22, 2009 at 7:39 PM, P.J. Eby wrote: > Just a question, but, if you just need a pure-python restricted environment > for App Engine, why not just use the RestrictedPython package (i.e., > http://pypi.python.org/pypi/RestrictedPython )? How does that work? Remember, app engine doesn't support certain things, and bytecode manipulations (if that's what RestrictedPython does) are one of the unsupported things. It doesn't modify bytecode, it modifies an AST. It basically replaces prints, and attribute/item read/writes with function calls. Unfortunately, it does this AST modification by running as a traversal against the stdlib compiler package's AST, not a modern AST. So, I suppose it might not be usable as-is on app engine. It does, however, have the advantage of having been used in Zope for oh, six or seven years now? ISTM that it first came out around the same time as Python 2.3, and the latest version just dropped support for Python 2.1 and 2.2. So, if you want something that wasn't thrown together in an afternoon, it might be a good thing to take a look at. ;-) The other reason I can think of is that Tav is a capabilities purist. :-) You can implement capabilities on top of RestrictedPython; it's simply a policy-neutral enforcement framework. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
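To make the kind of rewrite concrete: a guarded-attribute-access transformation, in the spirit of what RestrictedPython's AST pass does, looks roughly like this (the _getattr_ guard name and its policy here are illustrative assumptions, not RestrictedPython's exact API):

    def _getattr_(ob, name):
        # sample guard: the hosting framework supplies the real policy
        if name.startswith('_'):
            raise AttributeError("restricted attribute: %r" % name)
        return getattr(ob, name)

    # untrusted code as written:         x = obj.attr
    # what the rewritten code executes:  x = _getattr_(obj, 'attr')

Item reads/writes and print are handled the same way, by routing them through replaceable guard functions.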
Re: [Python-Dev] Challenge: Please break this! [Now with blog post]
At 03:52 PM 2/24/2009 +0100, Victor Stinner wrote: On Tuesday 24 February 2009 15:46:04, Andrew Dalke wrote: > A goal is to use this in App Engine, yes? Which uses cgitb to report > errors? Which needs these restricted frame attributes to report the > values of variables when the error occurred? We should be able to restore the original environment. Example:

    ...
    jail(evil_func)  # called in the jail
    # unsafe environment with __subclasses__, f_code, etc.
    ...

Of course, you'll have to ensure that anything you do with data from the jail is also jailed... that callbacks run in the jail, etc. (This is one advantage of the RestrictedPython approach -- the jailing of the restricted code isn't dependent on some global state; it's wired right into the restricted code.)
Re: [Python-Dev] ABCs and MRO
At 10:41 AM 3/2/2009, Paul Moore wrote: 2009/3/2 Benjamin Peterson : > 2009/3/1 Paul Moore : >> >> Is it worth getting simplegeneric exposed in 3.1 >> (http://bugs.python.org/issue5135)? If it's going to be in 2.7, I'd >> like to see it hit 3.1. The patch is against trunk (for 2.7) at the >> moment, I'm not sure what the process would be for forward-porting it >> (do I generate a new patch against the py3k branch, or should it be >> applied to trunk and merged in?) By the way guys, are you aware of: http://pypi.python.org/pypi/simplegeneric There might be a bit of name confusion by exposing pkgutils' internal simplegeneric there. Perhaps it should be called "trivialgeneric", as it's even tinier than simplegeneric. ;-) The key problem with the patch is that ABCs do not play well with the type of introspection required to implement a generic function - namely enumeration of the superclasses of a class. The MRO of the class is fine for normal inheritance, but for ABCs it is possible to register classes which don't inherit from the ABC, so that you have a situation where issubclass(C, MyABC) can be true without MyABC being in C.__mro__:

    >>> import abc
    >>> class MyABC(object):
    ...     __metaclass__ = abc.ABCMeta
    ...
    >>> class C(object):
    ...     pass
    ...
    >>> MyABC.register(C)
    >>> issubclass(C, MyABC)
    True
    >>> C.__mro__
    (<class '__main__.C'>, <type 'object'>)
    >>>

More generally, there is NO WAY to determine the list of classes for which issubclass(C, x) is true. This could be considered a limitation of, or a bug in, ABCs, I don't have a particular opinion on that, but it does mean that no code which relies on being able to traverse the class inheritance graph will see ABCs. One particular case of this is (any implementation I can think of, of) generic functions. In my view, this implies one of the following:

1) It should be a documented limitation of such code that it doesn't work with ABCs (and conversely, this limitation of ABCs should be documented in the ABC documentation)
2) Generic functions, and any other code requiring this type of introspection, is essentially useless unless it can support ABCs, and should not be used in the light of this limitation.
3) This is a bug in ABCs and should be fixed.
4) Something else I didn't think of :-)

In my view, (2) is an unreasonable position to take, given the fact that (as I understand it) ABCs are supposed to be largely optional and shouldn't affect code that doesn't care about them... It's not clear to me how (3) should be addressed. Adding a slot to all classes to hold a list of ABCs they are registered against seems to be a large overhead for a relatively rarely used feature. I guess having a global registry of ABC registrations could work, but it seems clumsy. Any other suggestions? This isn't really a new problem; if you base your generic function methods off of interfaces implemented by a type or instance, you have the same basic issues. For systems that use a cache based on object type (like Guido's tuple-dispatch prototype, and my enhanced version in PEAK-Rules), the actual lookup is not a big deal. You have a type-based test and you cache the result for the type. PEAK-Rules' predicate dispatching is a bit more complex, because you need a rather more complex type test; the tree generator has to look at whether a type test is an ABC, and effectively translate it to "oldstyleisinstance(arg, ABC) or not oldstyleisinstance(arg, ABC) and ABC.__instancecheck__(arg)". (Where oldstyleisinstance represents an __instancecheck__-free version of isinstance.)
This isn't a major problem either, just a bit of a bore/pain to implement. The hairier issue for these types of systems is method precedence, though. Since __mro__'s have to be consistently ordered, you can straightforwardly determine whether one class is "more specific" than another in a static way. But with dynamic registration, the question could be more complex. Personally, I'd like to see some way to subscribe to changes in ABC registration, so that generic functions or other tools can update their caches. With that feature, you might even be able to implement full ABC support for simplegeneric, by treating ABC registrations as equivalent to mass registration of the ABC's registrants. That is, if "AnABC.register(X)" and "afunc.register(AnABC, meth)" then "afunc.register(X, meth)". So each time AnABC gets a new registrant, you automatically register the ABC method for the new registrant, as long as there's not already a method registered for that specific type. That would probably be sufficient for what simplegeneric is doing. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
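A rough sketch of that subscription idea, since ABCMeta currently offers no notification hook (everything below is assumed, not existing API):

    import abc

    _callbacks = []                          # parties interested in new registrations
    _orig_register = abc.ABCMeta.register

    def _notifying_register(cls, subclass):
        result = _orig_register(cls, subclass)
        for cb in _callbacks:
            cb(cls, subclass)                # tell subscribers about AnABC.register(X)
        return result

    abc.ABCMeta.register = _notifying_register

    def mirror_abc_methods(gf):
        # assumes the generic function exposes its type->method mapping as gf.registry
        def on_register(an_abc, new_subclass):
            method = gf.registry.get(an_abc)
            if method is not None and new_subclass not in gf.registry:
                gf.register(new_subclass, method)    # i.e. afunc.register(X, meth)
        _callbacks.append(on_register)

That is, each "AnABC.register(X)" would trigger "afunc.register(X, meth)" for any generic function that had "afunc.register(AnABC, meth)", unless X already had a method of its own.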
Re: [Python-Dev] ABCs and MRO
At 09:14 PM 3/2/2009, Paul Moore wrote: 2009/3/2 Jeffrey Yasskin : > I tend to think it's a bug in ABCs. You seem to have thought of > several possible ways to fix it, and I don't have strong preferences > between them. I've discussed ways of fixing simplegeneric, but not of fixing the issue with ABCs. I'm not sure the ABC "issue" is fixable - after all, it's by design that ABCs can implement __issubclass__ and "magically" become superclasses of arbitrary classes as a result. I'm not happy about fixing simplegeneric, though, as the whole point was just to expose an existing implementation, because it might be generally useful. If we start expanding and enhancing it, there are better implementations already available on PyPI (better in the sense of having seen real-world use). And if they don't handle ABCs, then that might indicate that needing to handle ABCs isn't as vital as this discussion would seem to imply (but I have no real-world data myself to make such a claim). I would say it's mainly an annoyance, as long as you have a way to access the method registered for a given ABC, because then you can explicitly register it for types that are being dynamically detected. Now, how widespread an annoyance/FAQ issue it'll be, will depend on:

1. whether the stdlib itself is using GF's and
2. whether those GF's are meaningfully used against ABCs, and
3. how common it is for classes to be "implicit subclasses" of one of those ABCs *and* used with one of those GFs.

This sounds like a lot of conditions that have to line up, for it to become an (easily worked-around) annoyance. My inclination is simply to document that it works on explicit type relationships, and note that ABC's and other exotic users of __*check__ methods may require explicit method registrations to work correctly.
Re: [Python-Dev] PEP 377 - allow __enter__() methods to skip the statement body
At 06:28 AM 3/16/2009 +1000, Nick Coghlan wrote: There are some practical hurdles to that idea (specifically, creating a callable which uses its parent's namespace rather than having its own), but the basic concept seems sound. Actually, that bit's pretty simple -- they're just "nonlocal" variables. But the cost of creating that function, and the cost of having cell variables in the surrounding function is potentially high. (In CPython at least, function calls are optimized in certain ways when the function only has "fast" locals, and no "cell" locals.) The cost of creating the function (though not the code object) could be held till runtime, since the interpreter could wait until it's sure there's a __with__ method before doing a MAKE_CLOSURE on the code object. Of course, at that point, what's the difference between:

    with foo() as bar:
        baz

and...

    @foo
    def bar():
        baz

except for being slightly less verbose? (due to missing nonlocal statements, etc.)
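To make the decorator spelling concrete, here is a minimal sketch of what a 'foo' written for that style could look like; the file name and the "should we run the body?" test are placeholders, not anything taken from the PEP:

    import os

    def run_if_present(body):
        # 'body' holds what would have been the with-statement's suite;
        # simply not calling it is the effect PEP 377 wants __enter__()
        # to be able to request.
        if os.path.exists('config.txt'):        # placeholder skip test
            f = open('config.txt')
            try:
                return body(f)                  # passed in, instead of an 'as' binding
            finally:
                f.close()
        # otherwise the "body" is silently skipped

    @run_if_present
    def config_text(f):
        return f.read()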
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 04:45 PM 3/21/2009 +1000, Nick Coghlan wrote: I really like the PEP - it's a solid extension of the ideas introduced by PEP 342. (Replying to you since I haven't seen any other thread on this) My concern is that allowing 'return value' in generators is going to be confusing, since it effectively causes the return value to "disappear" if you're not using it in this special way with some framework that takes advantage of it. However, if you *do* have some framework that takes advantage of generators to do microthreads, then it is most likely already written so as to have things like 'yield Return(value)' to signal a return, and to handle 'yield subgenerator()' without the use of additional syntax. So, I don't really see the point of the PEP. 'yield from' seems marginally useful, but I really dislike making it an expression, rather than a statement. The difference seems just a little too subtle, considering how radically different the behavior is. Overall, it has the feel of jamming a framework into the language, when doing the same thing in a library is pretty trivial. I'd almost rather see a standard or "reference" trampoline added to the stdlib (preferably with a way to register handling for specialized yielded types, e.g. IO/scheduling hooks), than try to cram half a trampoline into the language itself.
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 10:21 AM 3/22/2009 +1200, Greg Ewing wrote: P.J. Eby wrote: My concern is that allowing 'return value' in generators is going to be confusing, since it effectively causes the return value to "disappear" if you're not using it in this special way with some framework that takes advantage. But part of all this is that you *don't* need a special framework to get the return value -- all you need is a caller that uses a yield-from statement. There are uses for that besides threading systems. Such as? I've been wracking my brain trying to come up with any *other* occasion where I'd need -- or even find it useful -- to have one generator yield the contents of another generator to its caller, and then use a separate return value in itself. (I'm thus finding it hard to believe there's a non-contrived example that's not doing I/O, scheduling, or some other form of co-operative multitasking.) In any case, you didn't address the confusion issue: the inability of generators to return a value is there for a good reason, and adding a return value that doesn't actually return anywhere unless you use it in a yield-from expression -- an expression that both looks like a statement and has control-flow side-effects -- seems both over-complex and an invitation to confusion. This is different from plain yield expressions, in that plain yield expressions are *symmetric*: the value returned from the yield expression comes from the place where control flow is passed by the yield. That is, 'x = yield y' takes value y, passes control flow to the caller, and then returns a result from the caller. It's like an inverse function call. 'x = yield from y', on the other hand, first passes control to y, then the caller, then y, then the caller, an arbitrary number of times, and then finally returns a value from y, not the caller. This is an awful lot of difference in control flow for only a slight change in syntax -- much more of a difference than the difference between yield statements and yield expressions. So at present (for whatever those opinions are worth), I'd say -0 on a yield-from *statement* (somewhat useful but maybe not worth bothering with), +0 on a reference trampoline in the stdlib (slightly better than doing nothing at all, but not by much), and -1 on yield-from expressions and return values (confusing complication with very narrowly focused benefit, reasonably doable with library code). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
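For reference, the symmetry of a plain yield expression looks like this in practice (a standalone example, not taken from the thread):

    def averager():
        total = 0.0
        count = 0
        average = None
        while True:
            value = yield average    # 'average' goes out to the caller...
            total += value           # ...and the caller's send() argument comes back in
            count += 1
            average = total / count

    avg = averager()
    avg.next()           # prime it: runs up to the first yield
    print avg.send(10)   # 10.0
    print avg.send(20)   # 15.0

Each send() is a single, symmetric exchange with the generator, which is exactly the "inverse function call" behaviour described above.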
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 08:11 PM 3/22/2009 +1200, Greg Ewing wrote: P.J. Eby wrote: (I'm thus finding it hard to believe there's a non-contrived example that's not doing I/O, scheduling, or some other form of co-operative multitasking.) Have you seen my xml parser example? http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/ Whether you'll consider it contrived or not I don't know (contrivedness being such a subjective property) but it illustrates the style of programming I'm trying to support with the return-value feature. I find the parser *without* yield-from to be much easier to follow what's going on, actually... and don't see what benefit was obtained by the additional complication of using send(). In any case, you didn't address the confusion issue: the inability of generators to return a value is there for a good reason, It's there because formerly there was nowhere for the return value to go. If there is somewhere for it to go, the restriction will no longer be needed. But that's begging the question (in the original meaning of the phrase) of why we *want* to have two ways to return data from a generator. As for confusion, we ignore the return values of function calls all the time, without worrying that someone might be confused by the fact that their return value doesn't go anywhere. And that's the right way to think of a yield-from expression -- as a kind of function call, not a kind of yield. But it's not a function call -- it's multiple *inverted* function calls, followed by special handling of the last iteration of the iterator it takes. The control flow is also hard to explain, as is the implementation. If there's anything confusing, it's the presence of the word 'yield'. Its only virtue is that it gives a clue that the construct has something to do with generators, but you'll have to RTM to find out exactly what. Nobody has thus far suggested any better name, however. Perhaps this is because it's not that interesting of a feature. As I said, I wouldn't fight a yield-from statement without all this return-value stuff, although it still seems like too much trouble to me. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 10:22 PM 3/24/2009 -0400, Steve Holden wrote: There is one non-trivial extension that I've been chewing over for a while. What if you want to yield not the values from the generator but some function of those values? The present proposal appears to have no way to specify that. What about extending the syntax somewhat to

    yield expr for x from X

The idea is that x should be a bound variable in expr, but the "expr for x" could be optional to yield the existing proposal as a degenerate case. That would be spelled:

    yield from (expr for x in X)

And the compiler could optionally optimize away the genexpr. Assuming, of course, that this is considered valuable enough to implement in the first place, which I don't think it is... especially not with the return bit factored in. Now, if somebody came up with a different way to spell the extra value return, I wouldn't object as much to that part. I can just see people inadvertently writing 'return x' as a shortcut for 'yield x; return', and then having what seem like mysterious off-by-one errors, or being confused by receiving a generator object instead of their desired non-generator return value. It also seems weird that the only syntactically-supported way to get the generator's "return value" is to access it inside *another* generator... which *also* can't return the return value to anyone! But if it were spelled 'raise Return(value)' or 'raise StopIteration(value)' or something similar (or even had its own syntax!), I wouldn't object, as it would then be obvious how to get the value, and there could be no possible confusion with a regular return value. The unusual spelling would also signal that something unusual (i.e., multitasking) is taking place, similar to the way some frameworks use things like 'yield Return(value)' to signal the end of a task and its return value, in place of a value in the stream.
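For comparison, the degenerate "function of the values" case is already spellable today, setting aside the send()/throw() forwarding and the return value that 'yield from' would add; with a concrete stand-in for expr:

    def doubled(X):
        for x in X:
            yield x * 2      # 'yield expr' for each x drawn from X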
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 06:03 PM 3/25/2009 +1200, Greg Ewing wrote: I wanted a way of writing suspendable functions that can call each other easily. (You may remember I originally wanted to call it "call".) Then I noticed that it would also happen to provide the functionality of earlier "yield from" suggestions, so I adopted that name. I still don't see what you gain from making this syntax, vs. putting something like this in the stdlib (rough sketch):

    import sys

    class Task(object):

        def __init__(self, geniter):
            self.stack = [geniter]

        def __iter__(self):
            return self

        def send(self, value=None):
            if not self.stack:
                raise RuntimeError("Can't resume completed task")
            return self._step(value)

        next = send

        def _step(self, value=None, exc_info=()):
            while self.stack:
                try:
                    it = self.stack[-1]
                    if exc_info:
                        try:
                            rv = it.throw(*exc_info)
                        finally:
                            exc_info = ()
                    elif value is not None:
                        rv = it.send(value)
                    else:
                        rv = it.next()
                except:
                    value = None
                    exc_info = sys.exc_info()
                    if exc_info[0] is StopIteration:
                        exc_info = ()   # not really an error
                    self.pop()
                else:
                    value, exc_info = yield_to(rv, self)
            else:
                if exc_info:
                    raise exc_info[0], exc_info[1], exc_info[2]
                else:
                    return value

        def throw(self, *exc_info):
            if not self.stack:
                raise RuntimeError("Can't resume completed task")
            return self._step(None, exc_info)

        def push(self, geniter):
            self.stack.append(geniter)
            return None, ()

        def pop(self, value=None):
            if self.stack:
                it = self.stack.pop()
                if hasattr(it, 'close'):
                    try:
                        it.close()
                    except:
                        return None, sys.exc_info()
            return value, ()

        @classmethod
        def factory(cls, func):
            def decorated(*args, **kw):
                return cls(func(*args, **kw))
            return decorated

    def yield_to(rv, task):
        # This could/should be a generic function, to allow yielding to
        # deferreds, sockets, timers, and other custom objects
        if hasattr(rv, 'next'):
            return task.push(rv)
        elif isinstance(rv, Return):
            return task.pop(rv.value)
        else:
            return rv, ()

    class Return(object):
        def __init__(self, value=None):
            self.value = value

    @Task.factory
    def sample_task(arg1, another_arg):
        # blah blah
        something = (yield subtask(...))
        yield Return(result)

    def subtask(...):
        ...
        yield Return(myvalue)

The trampoline (the _step() method) handles the co-operative aspects, and modifying the yield_to() function allows you to define how yielded values are processed. By default, they're sent back into the generator that yields them, but you can pass a Return() to terminate the generator and pass the value up to the calling generator. Yielding another generator, on the other hand, "calls" that generator within the current task, and the same rules apply. Is there some reason why this won't do what you want, and can't be modified to do so? If so, that should be part of the PEP, as IMO it otherwise lacks motivation for a language feature vs. say, a stdlib module. If 'yield_to' is a generic function or at least supports registration of some kind, a feature like this would be interoperable with a wide variety of frameworks -- you could register deferreds and delayed calls and IO objects from Twisted, for example. So it's not like the feature would be creating an entire new framework of its own. Rather, it'd be a front-end to whatever framework (or no framework) you're using.
Re: [Python-Dev] "setuptools has divided the Python community"
At 12:25 PM 3/25/2009, Antoine Pitrou wrote: Paul Moore writes: > > 3. Setuptools, unfortunately, has divided the Python distribution > community quite badly. Wait a little bit, and it's gonna be even worse, now that buildout and pip seem to become popular. For example, the TurboGears people are considering switching from setuptools to pip... You mean from *easy_install* to pip. pip still uses and supports setuptools.
Re: [Python-Dev] "setuptools has divided the Python community"
At 08:32 AM 3/25/2009 -0500, Olemis Lang wrote: Sometimes it also happens that, once one such build/packaging systems is adopted, it is difficult to switch to using another one, since apps (... and plugins systems ...) are frequently hard-coupled to the pkg sys «exotic features» and support (... what comes to my mind right now is Trac plugins which are a little bit dependant on setuptools entry points and so on ...). Please note that entry points are not coupled to easy_install. They have a documented file format and API that's *distributed* with setuptools, but is not dependent on it and does not require .egg files, either. There's nothing stopping an independent implementation in the stdlib, ala PEP 376's metadata API, which is designed to be backward compatible with setuptools. It could be extended to support entry points as well. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
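For reference, the file in question is the INI-style entry_points.txt installed with the rest of a project's metadata, and reading it back needs only pkg_resources; the group and entry names below are examples in Trac's style, not anything prescribed:

    # entry_points.txt, e.g. inside an .egg-info directory:
    #
    #     [trac.plugins]
    #     myplugin = myplugin.module:MyPluginClass

    import pkg_resources

    for ep in pkg_resources.iter_entry_points('trac.plugins'):
        plugin_class = ep.load()     # imports myplugin.module and returns MyPluginClass
        print plugin_class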
Re: [Python-Dev] "setuptools has divided the Python community"
At 07:40 AM 3/25/2009 -0700, Guido van Rossum wrote: Well, sorry, but this complex layered interdependent architecture is one of the *causes* of confusion -- apart from you nobody knows what is what exactly, I'll pick a minor nit here... buildout, pip, and a wide variety of other tools and frameworks out there wouldn't exist if I were really the only person who knew "what's what". And I believe most of those people will be at the summit. That having been said, the setuptools documentation definitely sucks for the casual reader or curious observer, as opposed to those who have serious itches to scratch in the area of dependencies or plugins. Combine that with the practical-but-impure choices I made in easy_install to get *something* working, and you have a recipe for the current situation. pkg_resources, for example, is only bundled with setuptools because it couldn't go in the stdlib when it was written. easy_install, OTOH, is bundled with setuptools because *setuptools* isn't in the stdlib! (And of course, both use pkg_resources.) So ironically, setuptools is bundled in the way that it is, precisely *because* there's no support for dependencies in the stdlib... and nicely illustrates why smaller libraries (and less bundling) is a *good* thing. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 10:11 AM 3/25/2009 -0500, Olemis Lang wrote: ... but Trac plugins *do require* egg files ... (AFAIK after reading Trac docs and implementation of plugin upload from /admin/plugins, egg cache for plugins ... and so on ...) and this is what I was talking about ... apps (e.g. Trac) depending *today* on setuptools (e.g. egg files ...) Trac uses entry points to access plugins. Those plugins do not require .egg files, unless Trac specifically wrote code of their own that requires .egg files. If you install a plugin from source using pip or a system package manager, no .egg file is involved -- you'll have an .egg-info directory instead, alongside the installed code. The pkg_resources API for accessing entry points works with either .egg files or .egg-info directories. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 01:29 PM 3/25/2009 -0400, Terry Reedy wrote: Paul Moore wrote: 2009/3/25 Tarek Ziadé : Since setuptools came on the scene, I can state with some certainty that many packages which would otherwise have been distributed as bdist_wininst installers, now aren't. In some cases, only source packages are provided (on the basis that easy_install will build what you need). In those cases, I can accept that maybe the developer would not have built Windows installers even before setuptools arrived. But in a significant number of cases - including setuptools itself - binary, version-specific eggs for Windows are provided, but no bdist_wininst installers. If the developer is willing to build an egg, he could just as easily have built an installer - but he now has to choose - build one or the other, or both. And not everyone chooses the same way. Is it possible to write an egg to bdist converter (or vice versa)? easy_install can convert bdist_wininst installers to .egg files; I'm not sure if the reverse is possible in practice, although it certainly is in principle. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 06:08 PM 3/25/2009 -0500, Barry Warsaw wrote: I've found setuptools entry points difficult to work with for plugins, I'd be interested in hearing more about your specific difficulties, although it's probably off-topic for Python-Dev. Perhaps via the distutils-sig, since we don't have a plugins sig? ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 11:35 AM 3/25/2009 -0500, Olemis Lang wrote: Yes you're right, Trac requires .egg files for local plugins installs (... in /plugins folder ;) so that not all environments but only one be able to use the plugin ... but that's not exactly what I'm saying, since setuptools AFAIK *MUST* be already there ... so Trac depends on setuptools. It depends on pkg_resources, which you can manually extract from the setuptools package if you like. I believe there may be some distros that actually do this, with separate system packages for "setuptools" (really pkg_resources) and setuptools-devel (the setuptools.* code) -- in which case you can have pkg_resources installed without the rest of setuptools being present. > If you install a plugin from source using pip Like you said setuptools is already there ... pip depends on setuptools ... isn't it? Yes, but that doesn't make *Trac* dependent on setuptools, particularly in the system package manager case. > or a system package manager, > no .egg file is involved -- you'll have an .egg-info directory instead, > alongside the installed code. The pkg_resources API for accessing entry > points works with either .egg files or .egg-info directories. > Like I just said setuptools AFAIK *MUST* be there ... isnt it ? Am I still wrong ? Not on the system using the package manager, no. (By 'setuptools' I'm refering to the setuptools.* modules, not the setuptools distribution... which is just another byproduct of our globally confusing terminology for these things.) BTW ... if you (or anyone else ;) suggests me another pkg «like» setuptools (supporting entry points and so on ...) so that I can try how to install and use Trac plugins without setuptools + pkg_* in the middle ... I'll be open to try it out ... and learn a little ;) You'd need pkg_resources, and if you're not using setuptools to initially generate the files to install, you'd have to create the entry point files by hand, following the EggFormats documentation I linked to previously. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 09:24 PM 3/25/2009 -0700, Guido van Rossum wrote: ISTR that the motivation for adding new syntax is that the best you can do using a trampoline library is still pretty cumbersome to use when you have to write a lot of tasks and subtasks, and when using tasks is just a tool for getting things done rather than an end goal in itself. I agree that the motivation and the comparison should be added to the PEP (perhaps moving the trampoline sample *implementation* to a reference or an appendix, since it is only the appearance of the trampoline-*using* code that matters). In particular, it should explain why these choices are so costly as to justify new syntax and a complex implementation:

* decorator clearly identifying intent to create a task vs. no indication of task-ness
* "yield Return(value)" vs. "return value"
* "result = yield subgenerator()" vs. "result = yield from subgenerator()"

Especially since the first two differences arguably make the non-"yield from" code clearer, and the third makes it more compact.

> with judicious use of decorators and helper classes you can get a reasonable approximation, and I think Twisted uses something like this, so it's not just theory. I think the best you can do without new syntax though is still pretty cumbersome and brittle, which is why I have encouraged your PEP.

On the "cumbersome" side, there are only three differences, as I've shown above... and one of them uses less syntax than the PEP. I'm not sure what you mean by brittle. Twisted and PEAK have both had generator-based tasks for ages, and have been used in production for years, even before generators had send()/throw() -- the addition of yield expressions made it possible to get rid of the one previous brittleness, where you needed to do things like:

    yield subgenerator(); result = resume()

in order to pass values or exceptions in. Since send()/throw() was added, the need to call a function after each yield was eliminated, at least from PEAK and my newer Trellis library; haven't looked at Twisted's tasks in a while. I believe there are other generator-based task libraries available on PyPI, but can't comment on their robustness. (The existence of throw(), by the way, makes it possible to produce tracebacks that look *exactly* as if you had called a series of functions, rather than iterating over a bunch of generators. The sample code I gave should do this correctly, as it was created by yanking out working, tested code from my Trellis library, and cutting out all the Trellis-specific bits.) If someone can find any semantic differences between the code I posted and the yield-from proposal (apart from the absence of the "for x in y: yield x" part of the functionality), I'd like to know about it... and it should go in the PEP.
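In code, the three differences line up roughly like this, using the Task and Return helpers from the rough sketch posted earlier in this thread (the PEP 380 spellings are shown as comments, since they aren't valid syntax today):

    def child(arg):
        yield Return(arg + 1)              # PEP 380: return arg + 1

    @Task.factory                          # PEP 380: no marker of task-ness
    def parent(arg):
        result = yield child(arg)          # PEP 380: result = yield from child(arg)
        yield Return(result * 2)           # PEP 380: return result * 2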
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 08:43 PM 3/26/2009 +1200, Greg Ewing wrote: Trying to think of a better usage example that combines send() with returning values, I've realized that part of the problem is that I don't actually know of any realistic uses for send() in the first place. Can anyone point me to any? Maybe it will help to inspire a better example. Er, well, I don't know what anybody *else* wanted them for, but I wanted them to implement improved trampoline functions, vs. earlier Python versions. ;-) The trampoline example I gave uses send() in order to pass the return values from one generator back into another. Of course, the task object also has a send(), so if you do find another use case for send() in a co-operative context, it should be equally doable with the trampoline. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 10:56 AM 3/26/2009, Antoine Pitrou wrote: Guido van Rossum writes: > > That's stating it a little too strongly. Phillip has shown how with > judicious use of decorators and helper classes you can get a > reasonable approximation, and I think Twisted uses something like > this, so it's not just theory. I think the best you can do without new > syntax though is still pretty cumbersome and brittle, which is why I > have encouraged your PEP. It remains to be seen whether Twisted and other libraries (Kamaelia?) can benefit from this PEP. They don't get any new features, and would require (possibly significant) changes in order to be able to take advantage of the syntax. And they *still* wouldn't be able to do away with their trampolines -- the new trampolines would just be able to avoid the need for a generator stack, if they previously had one to begin with. From your description, it sounds like Twisted's version of this doesn't even use a stack. (Note: by "trampoline" I mean, "thing that processes yielded values and manages the resumption of the generator", which need not be global. The example trampoline I posted earlier is also implemented as a decorator, and could be trivially extended via a lookup table to handle deferreds, delayed calls, or whatever else you wanted it to support as yield targets.)
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 12:20 PM 3/26/2009 -0700, Guido van Rossum wrote: By brittle I meant again having to be aware of those details of the mechanism that exist because of syntactic limitations, e.g. accidentally writing "return X" instead of "yield Return(X)". In that case, you'd either have a syntax error under the current rules (because you're using yield in the same function), or else you'd have a normal function (no yields) that worked in the way you expect it to. IIRC, the example I gave would treat a non-Return(), non-generator value as a value to be passed back into the current generator, such that if you defined a function f, and did: g = yield f() and f was not a generator or did not return one, then the above is equivalent to: g = f() and is not an error. Granted, this can fail if f() can return some other sort of iterator, but arguably that's an error in the *caller*, which should not use yield to call such a function -- and the need to distinguish the type of function you're calling is still present in PEP 380, i.e., you still need to know whether the function you're calling is a generator with a special return mechanism. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 12:27 PM 3/26/2009 -0700, Guido van Rossum wrote: There is some clear low-hanging fruit for Greg's proposal where no trampoline or helpers are needed -- but where currently refactoring complex code containing many yield statements is cumbersome due to the need to write each "subroutine" call as "for x in subroutine(): yield x" -- being able to replace this with "yield from subroutine()" is a conceptual advantage to me that is not proportional to the number of characters saved. Right - I don't object to the concept of "yield from" -- I'm -0 on that. What I don't like is the confusion of adding "return values" to generators, at least using the 'return' statement.
Re: [Python-Dev] "setuptools has divided the Python community"
At 03:28 PM 3/26/2009 -0500, Guido van Rossum wrote: 2009/3/26 Barry Warsaw : > BTW, under a better name, I would support putting pkg_resources in the > stdlib. Last time I looked it was an incredibly complicated piece of code that would have to be refactored considerably before it would be maintainable by the core developers. I never did manage to get a good understanding of the code, but I expect that a lot of the complexity exists so that it works for all Python versions. The stdlib version shouldn't need this -- it should only care about providing a stable API that works with the current version. As someone else suggested, moving some of the functionality to PEP 302 interfaces would also help. Most of the code, though, deals with locating/inspecting installed distributions, resolving version requirements, and managing sys.path. And most of the nastiest complexity comes from trying to support true filename access to resources -- if that were dropped from the stdlib, there'd be no need for egg caches and the like, along with all the complexity entailed. Application environments such as Chandler, Trac, Zope, etc. that want their plugins to live in .egg files wouldn't necessarily be able to use such an API, but the independent pkg_resources wouldn't be disappearing. (Of course, they could also implement application-specific file extraction, if the stdlib API included the ability to inspect and open zipped resources.) The other significant source of complexity is dynamic management of namespace packages; specifically, trying to handle the situation where new sys.path entries (e.g. .egg files added as plugins) need to have their contents added to existing sys.modules __path__ entries. This is perhaps another feature that could be dropped from the stdlib version, given a way to interop with pkg_resources or a replacement. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 11:27 PM 3/26/2009, Paul Moore wrote: What I'd really like is essentially some form of "virtual filesystem" access to stuff addressed relative to a Python package name, Note that relative to a *Python package name* isn't quite as useful, due to namespace packages. To be unambiguous as to the targeted resource, one needs to be able to reference a specific project, and that requires you to go off the name of a module *within* a package. For example, 'zope.somemodule' rather than just 'zope'. To put it another way, you don't want to use a module that comes from an __init__.py as your origin point, but rather, some other .py file within the package. Then you can talk to that module's loader without fear of ambiguity. rather than to a filesystem path. How much filesystem functionality is required is somewhat up for debate, but the basic listdir, get (text/binary) contents, open (text/binary) stream would be a good start (as Guido pointed out). "isdir" and "exists" would also be nice; pkg_resources provides those, plus listdir and string/stream.
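For reference, the corresponding pkg_resources calls are these (the module and resource names are made up):

    import pkg_resources

    pkg = 'zope.somemodule'    # hypothetical module used as the origin point

    pkg_resources.resource_exists(pkg, 'templates')              # "exists"
    pkg_resources.resource_isdir(pkg, 'templates')               # "isdir"
    pkg_resources.resource_listdir(pkg, 'templates')             # listdir
    pkg_resources.resource_string(pkg, 'templates/base.html')    # contents as a string
    pkg_resources.resource_stream(pkg, 'templates/base.html')    # file-like stream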
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 10:39 PM 3/26/2009 -0500, Guido van Rossum wrote: That +0 could turn into a +1 if there was a way to flag this as an error (at runtime), at least if the return is actually executed:

    def g():
        yield 42
        return 43

    for x in g():
        print x   # probably expected to print 42 and then 43

Perhaps the exception used in this case could be a different exception than StopIteration? Regular iteration could either just pass this exception through or explicitly check for it (a single pointer comparison could usually suffice), depending on whether it would be a subclass of StopIteration. Could we at least have some syntax like 'return from yield with 43', to distinguish it from a regular return, clarify that it's returning a value to a yield-from statement, and emphasize that you need a yield-from to call it? If it doesn't have some sort of variant syntax, the error message for the return exception is going to need to be rather verbose in order to be clear. However, if there is a variant syntax, then an error message like "'return from yield' without 'yield from'" might be clear enough, and we can keep the current error for returning values in generators. That way, the paired special syntax is clearly identifiable as coroutine/microthread control flow, in a way that's both TOOOWTDI and EIBTI. One remaining quirk or missing piece: ISTM there needs to be a way to extract the return value without using a yield-from statement. I mean, you could write a utility function like:

    def unyield(geniter):
        try:
            while 1:
                geniter.next()
        except GeneratorReturn as v:
            return v.value

OTOH, I suppose this function is still a trampoline, just one that doesn't actually do anything except return an eventual exit value. I suppose you could do a slightly improved one thus:

    def unyield(geniter, value=None, func=lambda v: v):
        try:
            while 1:
                value = func(geniter.send(value))
        except GeneratorReturn as v:
            return v.value

And drop it into itertools or some such. It's sort of like an all-purpose map/reduce for generators, so that all you need to do is pass in a function to do whatever processing you need (e.g. I/O waiting) on the values yielded. You could also use another generator's send() method as the function passed in, in which case you'd basically have a pair of coroutines... and whichever returned a value first would end up as the return value of the overall function. That'd probably be pretty useful for the sort of simple (non I/O) coroutines Greg seems to have in mind. Or, these could just be examples in the PEP, I suppose. They're not terribly difficult to write... but then I might be biased since I've written a ridiculous number of coroutine trampolines for Python generators over the last how-many-ever years Python has had generators.
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 04:08 PM 3/27/2009 +1300, Greg Ewing wrote: You can't expect to improve something like that by stuffing yield-from into the existing framework, because the point of yield-from is to render the framework itself unnecessary. But it doesn't. You still need *something* that processes the yielded values, since practical frameworks have various things to yield "to" - i/o, time, mouse clicks, whatever. Correctly dealing with the call stack part is tedious to implement, sure, but it's not really the focal point of a microthreading framework. Usually, you need to have some way to control which microthreads are actually to be executing, vs. the ones that are waiting for a particular time, an I/O operation, or some other sort of event. None of that stuff goes away just by taking care of the call stack. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 01:49 PM 3/27/2009 +0100, M.-A. Lemburg wrote: (*) I've had a go at this a few months ago and then found out that the egg format itself is not documented anywhere. It's been documented for just under three years now. Here's where you quoted the email where I announced that documentation, provided links to it, and asked you to let me know if there's anything else you'd need in it: http://mail.python.org/pipermail/python-dev/2006-April/064496.html ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] version compare function into main lib
At 05:08 PM 3/27/2009 +0100, M.-A. Lemburg wrote: On 2009-03-27 17:01, Eric Smith wrote: > Martin v. Löwis wrote: >>> Correct me if I wrong, but shouldn't Python include function for >>> version comparisons? >> >> On the packaging summit yesterday, people agreed that yes, we should >> have something like that in the standard library, and it should be more >> powerful than what distutils currently offers. > > Yes. > >> There was no conclusion of how specifically that functionality should >> be offered; several people agreed that Python should mandate a standard >> format, which it is then able to compare. So you might not be able to >> spell it "10.3.40-beta", but perhaps "10.3.40b1" or "10.3.40~beta". > > I got the impression that people are generally happy with what > setuptools provides for version parsing and comparison. > > Does anyone think that's not a good model? Instead of trying to parse some version string, distutils should require defining the version as tuple with well-defined entries - much like what we have in sys.version_info for Python. The developer can then still use whatever string format s/he wants. The version compare function would then work on this version tuple and probably be called cmp() (at least in Python 2.x ;-). By the way, pkg_resources.parse_version of course returns a tuple that can be compared with cmp(). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
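For example, with the setuptools of that era:

    from pkg_resources import parse_version

    assert parse_version('1.0a1') < parse_version('1.0')      # pre-releases sort first
    assert parse_version('1.0')   < parse_version('1.0.1')
    assert cmp(parse_version('1.0'), parse_version('1.0')) == 0
    assert parse_version('10.3.40b1') < parse_version('10.3.40')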
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 03:28 AM 3/27/2009 -0400, Scott Dial wrote: P.J. Eby wrote: > One remaining quirk or missing piece: ISTM there needs to be a way to > extract the return value without using a yield-from statement. I mean, > you could write a utility function like: > >def unyield(geniter): >try: >while 1: geniter.next() >except GeneratorReturn as v: >return v.value My first thought was to ask why it was not equivalent to say: x = yield g x = yield from g This would seem like a more obvious lack of parallelism to pick on wrt. return values. Because yield-from means you're "inlining" the generator, such that sends go into that generator, rather than into the current generator. This unyield() operation seems contrived. Never before have you been able to write a generator that returns a value, why would these suddenly become common practice? The only place a return value seems useful is when refactoring a generator and you need to mend having loss of a shared scope. What other use is there for a return value? The use case which these things are being proposed for is to replace most of the stack-management code that's currently needed for coroutine trampolines. In such a case, you're likely using generators to perform long-running asynchronous operations, or else coroutines where two functions are co-operating to produce a result, each with its own control flow. For example, you might have a generator that yields socket objects to wait for them to be ready to read or write, then returns a line of text read from the socket. You would unyield this if you wanted to write top-level code that was *not* also such a task. Similarly, you might write coroutines where one reads data from a file and sends it to a parser, and then the parser sends data back to a main program. In either case, an unyield would either be the synchronous top-level loop of the program, or part of the top-level code. Either you need to get the finished top-level object from your parser at the end of its operation, or you are waiting for all your asynchronous I/O tasks to complete. It would seem unfortunate for it to be considered a runtime error since this would prevent sharing a generator amongst "yield from" and non-"yield from" use cases. Has anyone shown a use case for doing so? I might be biased due to previous experience with these things, but I don't see how you write a function where both the yielded values *and* the return value are useful... and if you did, you'd still need some sort of unyield operation. Notice that in both the I/O and coroutine use cases, the point of yielding is primarily *to allow other code to execute*, and possibly pass a value back IN to the generator. The values passed *out* by the generator are usually either ignored, an indicator of what the generator wants to be passed back in, or what sort of event it is waiting for before it's to be resumed. In other words, they're usually not data -- they're just something that gets looped over as the task progresses. As Greg has said a number of times, we allow functions to return values with them silently being ignored all the time. Sure. But right now, the return value of a generator function *is the generator*. And you're free to ignore that, sure. But this is a "second" return value that only goes to a special place with special syntax -- without that syntax, you can't access it. 
But in the use cases where you'd actually want to make such a function return a value to begin with, it's because that value is the value you *really* want from the function -- the only reason it's a generator is because it needs to be paused and resumed along the way to getting that return value. If you're writing a function that yields values for other than control flow reasons, it's probably a bad idea for it to also have a "return" value because then you'd need an unyield operation to get at the data. And it seems to me that people are saying, "but that's no problem, I'll just use yield-from to get the value". But that doesn't *work*, because it turns the function where you use it into another generator! The generators have to *stop* somewhere, in order for you to *use* their return values -- which makes the return feature ONLY relevant to co-routine use cases -- i.e., places where you have trampolines or a top-level loop to handle the yields... And conversely, if you *have* such a generator, its real return value is the special return value, so you're not going to be able to use it outside the coroutine structure... so "ignoring its return value" doesn't make any sense. You'd have to write a loop over the generator, *just* to ignore the value.
Re: [Python-Dev] "setuptools has divided the Python community"
At 11:37 PM 3/26/2009 -0500, Eric Smith wrote: P.J. Eby wrote: > As someone else suggested, moving some of the functionality to PEP 302 > interfaces would also help. Most of the code, though, deals with > locating/inspecting installed distributions, resolving version > requirements, and managing sys.path. And most of the nastiest > complexity comes from trying to support true filename access to > resources -- if that were dropped from the stdlib, there'd be no need > for egg caches and the like, along with all the complexity entailed. > > Application environments such as Chandler, Trac, Zope, etc. that want > their plugins to live in .egg files wouldn't necessarily be able to use > such an API, but the independent pkg_resources wouldn't be > disappearing. (Of course, they could also implement > application-specific file extraction, if the stdlib API included the > ability to inspect and open zipped resources.) Could you comment on why they couldn't use such an API? If a plugin includes C code (.so/.dll), or uses a library that operates on filenames rather than bytes in memory (e.g. gettext), then the resources would need to be extracted from the .egg. pkg_resources transparently extracts such resources to a cache directory when you ask for a resource's filename, rather than asking for a stream or string of its contents. This feature represents a significant chunk of the complexity and code size of pkg_resources -- and I was proposing ways to cut down on that complexity and code size, for a (limited) stdlib version of the functionality. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 08:12 PM 3/27/2009 +0100, M.-A. Lemburg wrote: On 2009-03-27 17:19, P.J. Eby wrote: > At 01:49 PM 3/27/2009 +0100, M.-A. Lemburg wrote: >> (*) I've had a go at this a few months ago and then found out >> that the egg format itself is not documented anywhere. > > It's been documented for just under three years now. Here's where you > quoted the email where I announced that documentation, provided links to > it, and asked you to let me know if there's anything else you'd need in it: > > http://mail.python.org/pipermail/python-dev/2006-April/064496.html Thanks for reminding me. I must have forgotten about that wiki page and instead looked on the setuptools page. For future reference, you can also Google "Python egg formats" or "Python egg file format" or other such phrases, and it always comes up as the first or second hit. Ironically, that's how I found it the other day myself, as I didn't remember right off what the wiki page name was. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 10:22 PM 3/27/2009 +0100, M.-A. Lemburg wrote: Perhaps someone should start working on a tool called "FryingPan" to create "Omelettes", ie. all eggs squashed into a single ZIP file... ;-) They're called baskets actually. ;-) There's no tool to do it, but pkg_resources does support multiple eggs being contained in the same zipfile. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] My summit notes (packaging)
At 03:06 PM 3/27/2009 -0500, Tarek Ziadé wrote: They both aim at the same goal besides a few differences, and they both rely on a new metadata field introduced by setuptools, which is "install_requires". This new metadata extends the metadata described in PEP 314 but is slightly different from what is described in the draft PEP 345 ("Requires"). ... PEP 345 introduces "Requires" and "Provides", which are implemented in Distutils and PyPI, but are not widely used. 40 out of 4000+ if I remember correctly. Martin will correct me here if I am wrong. FYI, the reason setuptools uses a different way of specifying requirements is that the PEP-proposed way could not be used without some kind of indexed repository of packages -- and PyPI did not index "provides" at the time. Also, the PEP-proposed versioning scheme was not compatible with the versioning schemes actually used in the field at the time. These conditions could be considered to have changed now, or be changeable given enough will and volunteer effort. Since setuptools was only a 1.5-person effort back in the day (i.e., me plus occasional contribs from Ian Bicking, Bob Ippolito, and Jim Fulton), and backward compatibility was a critical requirement to get adoption, using RPM-style provides/requires was not an option at that time. - there should be an API to get metadata for a package without actually executing any of the package's installation script. Please note that where dependencies are concerned, practical considerations require that, at least for currently-released versions of Python, a package's dependencies may well be Python-version specific. A library that targets Python 2.4 and 2.5 and uses wsgiref, sqlite, ctypes, or ElementTree, for example, may have different dependencies depending on the version it is being installed in. Of course, if future Python versions declare metadata for what they include, this could be resolved. But for existing versions it's a bit of a sticky issue. Also, it's quite likely that platform-specific dependencies may exist as well. It might be possible to accommodate these things with a sufficiently flexible format, but currently, the only way to handle them with distutils/setuptools is in the setup script. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
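To make the "Python-version-specific dependencies" point concrete, here is an illustrative (entirely hypothetical) setup script of the kind described above; the requirement list can only be computed by running the script, because it depends on which interpreter is doing the installing:

    import sys
    from setuptools import setup

    install_requires = []
    if sys.version_info < (2, 5):
        # wsgiref and ctypes joined the stdlib in 2.5; on older Pythons they
        # would have to come from PyPI instead (names here are illustrative).
        install_requires += ['wsgiref', 'ctypes']

    setup(
        name='ExampleLibrary',          # hypothetical project
        version='1.0',
        py_modules=['examplelib'],
        install_requires=install_requires,
    )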
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 12:53 PM 3/28/2009 +1200, Greg Ewing wrote: Guido van Rossum wrote: Perhaps the crux is that *if* you accidentally use "return " in a vanilla generator expecting the value to show up somewhere, you are probably enough of a newbie that debugging this will be quite hard. I'd like not to have such a newbie trap lying around. Okay, so would you be happy if the for-loop were to raise an exception if it gets a StopIteration whose value is not None? Wouldn't it have to be more than just for-loops? What about list(), map(), ...? It seems a lot simpler to just make it use a different exception, as nothing else has to change for that to work correctly; the new construct can just catch it, and everywhere else it's an error. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
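The "value silently disappears" concern can be seen with a tiny example under the eventually-adopted PEP 380 semantics (Python 3.3+); the function name is invented:

    def averager(data):
        yield                              # makes this a generator
        return sum(data) / len(data)       # becomes StopIteration(value)

    # Ordinary consumers never see the return value -- no error, it just vanishes:
    print(list(averager([1, 2, 3])))       # [None]; the 2.0 is discarded

    # A check added only to for-loops would not cover list(), map(), etc.,
    # which is the objection raised above.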
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 06:52 AM 3/28/2009 -0500, Guido van Rossum wrote: > 2. In addition to the "yield from" syntax for delegating to a > subgenerator, also add new syntax for returning values from > subgenerators so that the basic "return X" can continue to trigger > SyntaxError. > > Since option 2 would most likely lead to a bikeshed discussion of epic > proportions, I'm currently a fan of option 1 ;) Me too. It also seems option 2 doesn't help us decide what it should do: I still think that raising StopIteration(value) would be misleading to vanilla users of the generators. Agreed. (I still think new syntax is a plus, though, as it helps to clearly signal both the intent to make the generator a "subtask" and the need to call it with yield-from. Readability counts.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:32 AM 4/2/2009 -0500, Martin v. Löwis wrote: I propose the following PEP for inclusion to Python 3.1. Please comment. An excellent idea. One thing I am not 100% clear on, is how to get additions to sys.path to work correctly with this. Currently, when pkg_resources adds a new egg to sys.path, it uses its existing registry of namespace packages in order to locate which packages need __path__ fixups. It seems under this proposal that it would have to scan sys.modules for objects with __path__ attributes that are lists that begin with a '*', instead... which is a bit troubling because sys.modules doesn't always only contain module objects. Many major frameworks place lazy module objects, and module proxies or wrappers of various sorts in there, so scanning through it arbitrarily is not really a good idea. Perhaps we could add something like a sys.namespace_packages that would be updated by this mechanism? Then, pkg_resources could check both that and its internal registry to be both backward and forward compatible. Apart from that, this mechanism sounds great! I only wish there was a way to backport it all the way to 2.3 so I could drop the messy bits from setuptools. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
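A hypothetical sketch of the two lookup strategies being contrasted: scanning sys.modules for the '*' path marker described in the draft, versus consulting an explicit registry such as the proposed sys.namespace_packages. Neither helper below is an existing API:

    import sys, types

    def namespace_packages_by_scanning():
        # Fragile: sys.modules may contain lazy modules, proxies, wrappers...
        for name, module in list(sys.modules.items()):
            if not isinstance(module, types.ModuleType):
                continue
            path = getattr(module, '__path__', None)
            if isinstance(path, list) and path[:1] == ['*']:
                yield name

    def namespace_packages_by_registry():
        # What an explicit sys.namespace_packages would reduce the scan to:
        return iter(getattr(sys, 'namespace_packages', ()))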
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:33 PM 4/2/2009 +0200, M.-A. Lemburg wrote: That's going to slow down Python package detection a lot - you'd replace an O(1) test with an O(n) scan. I thought about this too, but it's pretty trivial considering that the only time it takes effect is when you have a directory name that matches the name you're importing, and that it will only happen once for that directory, unless there is no package on sys.path with that name, and the program tries to import the package multiple times. In other words, the overhead isn't likely to be much, compared to the time needed to say, open and marshal even a trivial __init__.py file. Alternative Approach: - Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check ? I thought the same thing (or more precisely, a single .pkg file), but when I got lower in the PEP I saw the reason was to support system packages not having overlapping filenames. The PEP could probably be a little clearer about the connection between needing *.pkg and the system-package use case. One of the namespace packages, the defining namespace package, will have to include a __init__.py file. Note that there is no such thing as a "defining namespace package" -- namespace package contents are symmetrical peers. The above mechanism allows the same kind of flexibility we already have with the existing normal __init__.py mechanism. * It doesn't add yet another .pth-style sys.path extension (which are difficult to manage in installations). * It always uses the same naive sys.path search strategy. The strategy is not determined by some file contents. The above are also true for using only a '*' in .pkg files -- in that event there are no sys.path changes. (Frankly, I'm doubtful that anybody is using extend_path and .pkg files to begin with, so I'd be fine with a proposal that instead used something like '.nsp' files that didn't even need to be opened and read -- which would let the directory scan stop at the first .nsp file found. * The search is only done once - on the first import of the package. I believe the PEP does this as well, IIUC. * It's possible to have a defining package dir and add-one package dirs. Also possible in the PEP, although the __init__.py must be in the first such directory on sys.path. (However, such "defining" packages are not that common now, due to tool limitations.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 03:21 AM 4/3/2009 +0200, Matthias Klose wrote: +1. Speaking as a downstream packaging Python for Debian/Ubuntu, I welcome this approach. The current practice of shipping the very same file (__init__.py) in different packages leads to conflicts for the installation of these packages (this is not specific to dpkg, but is true for rpm packaging as well). Current practice of packaging (for downstreams) so-called "name space packages" is: - either to split out the namespace __init__.py into a separate (Linux distribution) package (needing manual packaging effort for each name space package) - using downstream-specific packaging techniques to handle conflicting files (diversions) - replicating the current behaviour of setuptools, simply overwriting the file conflicts. Following this proposal, (downstream) packaging of namespace packages is made possible independent of any manual downstream packaging decisions or any downstream-specific packaging decisions. A clarification: setuptools does not currently install the __init__.py file when installing in --single-version-externally-managed or --root mode. Instead, it uses a project-version-nspkg.pth file that essentially simulates a variation of Martin's .pkg proposal, by abusing .pth file support. If this PEP is adopted, setuptools would replace its nspkg.pth file with a .pkg file on Python versions that provide native support for .pkg imports, keeping the .pth file only for older Pythons. (.egg files and directories will not be affected by the change, unless the zipimport module also supports .pkg files... and again, only for Python versions that support the new approach.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:15 PM 4/3/2009 +0200, Martin v. Löwis wrote: I should make it clear that this is not the case. I envision it to work this way: import zope - searches sys.path, until finding either a directory zope, or a file zope.{py,pyc,pyd,...} - if it is a directory, it checks for .pkg files. If it finds any, it processes them, extending __path__. - it *then* checks for __init__.py, taking the first hit anywhere on __path__ (just like any module import would) - if no .pkg was found, nor an __init__.py, it proceeds with the next sys.path item (skipping the directory entirely) Ah, I missed that. Maybe the above should be added to the PEP to clarify. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
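Restated as very rough pseudocode (not an implementation, and the .pkg processing itself is elided), assuming the lookup order Martin lists above:

    import os

    def find_pep382_package(name, sys_path):
        for entry in sys_path:
            pkgdir = os.path.join(entry, name)
            if not os.path.isdir(pkgdir):
                continue   # a name.py/name.pyd here would be handled as a plain module
            pkg_files = [f for f in os.listdir(pkgdir) if f.endswith('.pkg')]
            path = [pkgdir]
            # 1) process any .pkg files, extending `path` (details elided)
            # 2) then take __init__.py from the first hit anywhere on the extended path
            has_init = any(os.path.exists(os.path.join(p, '__init__.py')) for p in path)
            if pkg_files or has_init:
                return path            # package found
            # 3) neither .pkg nor __init__.py: skip this directory entirely
        return None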
Re: [Python-Dev] PEP 382: Namespace Packages
At 02:00 PM 4/6/2009 +0100, Chris Withers wrote: Martin v. Löwis wrote: Chris Withers wrote: Would this support the following case: I have a package called mortar, which defines useful stuff: from mortar import content, ... I now want to distribute large optional chunks separately, but ideally so that the following will will work: from mortar.rbd import ... from mortar.zodb import ... from mortar.wsgi import ... Does the PEP support this? That's the primary purpose of the PEP. Are you sure? Does the pep really allow for: from mortar import content from mortar.rdb import something ...where 'content' is a function defined in mortar/__init__.py and 'something' is a function defined in mortar/rdb/__init__.py *and* the following are separate distributions on PyPI: - mortar - mortar.rdb ...where 'mortar' does not contain 'mortar.rdb'. See the third paragraph of http://www.python.org/dev/peps/pep-0382/#discussion ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 02:30 PM 4/7/2009 +0200, M.-A. Lemburg wrote: >> Wouldn't it be better to stick with a simpler approach and look for >> "__pkg__.py" files to detect namespace packages using that O(1) check ? > > Again - this wouldn't be O(1). More importantly, it breaks system > packages, which now again have to deal with the conflicting file names > if they want to install all portions into a single location. True, but since that means changing the package infrastructure, I think it's fair to ask distributors who want to use that approach to also take care of looking into the __pkg__.py files and merging them if necessary. Most of the time the __pkg__.py files will be empty, so that's not really much to ask for. This means your proposal actually doesn't add any benefit over the status quo, where you can have an __init__.py that does nothing but declare the package a namespace. We already have that now, and it doesn't need a new filename. Why would we expect OS vendors to start supporting it, just because we name it __pkg__.py instead of __init__.py? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
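For reference, the status quo referred to here -- an __init__.py that does nothing but declare the namespace -- looks like this today; both spellings are long-standing, documented idioms:

    # pkgutil flavour (stdlib since Python 2.3):
    from pkgutil import extend_path
    __path__ = extend_path(__path__, __name__)

    # setuptools flavour (adds zipped-egg awareness) -- used *instead of* the above:
    # __import__('pkg_resources').declare_namespace(__name__)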
Re: [Python-Dev] PEP 382: Namespace Packages
At 04:58 PM 4/7/2009 +0200, M.-A. Lemburg wrote: On 2009-04-07 16:05, P.J. Eby wrote: > At 02:30 PM 4/7/2009 +0200, M.-A. Lemburg wrote: >> >> Wouldn't it be better to stick with a simpler approach and look for >> >> "__pkg__.py" files to detect namespace packages using that O(1) >> check ? >> > >> > Again - this wouldn't be O(1). More importantly, it breaks system >> > packages, which now again have to deal with the conflicting file names >> > if they want to install all portions into a single location. >> >> True, but since that means changing the package infrastructure, I think >> it's fair to ask distributors who want to use that approach to also take >> care of looking into the __pkg__.py files and merging them if >> necessary. >> >> Most of the time the __pkg__.py files will be empty, so that's not >> really much to ask for. > > This means your proposal actually doesn't add any benefit over the > status quo, where you can have an __init__.py that does nothing but > declare the package a namespace. We already have that now, and it > doesn't need a new filename. Why would we expect OS vendors to start > supporting it, just because we name it __pkg__.py instead of __init__.py? I lost you there. Since when do we support namespace packages in core Python without the need to add some form of magic support code to __init__.py ? My suggestion basically builds on the same idea as Martin's PEP, but uses a single __pkg__.py file as opposed to some non-Python file yaddayadda.pkg. Right... which completely obliterates the primary benefit of the original proposal compared to the status quo. That is, that the PEP 382 way is more compatible with system packaging tools. Without that benefit, there's zero gain in your proposal over having __init__.py files just call pkgutil.extend_path() (in the stdlib since 2.3, btw) or pkg_resources.declare_namespace() (similar functionality, but with zipfile support and some other niceties). IOW, your proposal doesn't actually improve the status quo in any way that I am able to determine, except that it calls for loading all the __pkg__.py modules, rather than just the first one. (And the setuptools implementation of namespace packages actually *does* load multiple __init__.py's, so that's still no change over the status quo for setuptools-using packages.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] decorator module in stdlib?
At 10:51 AM 4/8/2009 -0700, Guido van Rossum wrote: I would like it even less if an API cared about the *actual* signature of a function I pass into it. One notable use of callable argument inspection is Bobo, the 12-years-ago predecessor to Zope, which used argument information to determine form or query string parameter names. (Were Bobo being written for the first time today for Python 3, I imagine it would use argument annotations to specify types, instead of requiring them to be in the client-side field names.) Bobo, of course, is just a single case of the general pattern of tools that expose a callable to some other (possibly explicitly-typed) system. E.g., wrapping Python functions for exposure to C, Java, .NET, CORBA, SOAP, etc. Anyway, it's nice for decorators to be transparent to inspection when the decorator doesn't actually modify the calling signature, so that you can then use your decorated functions with tools like the above. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
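An example of the kind of decorator transparency being discussed, using only the Python 2-era stdlib; the decorated function and its caller are invented. functools.wraps preserves the metadata, but not the actual signature, which is exactly what Bobo-style argument inspection needs:

    import inspect
    from functools import wraps

    def logged(func):
        @wraps(func)                      # copies __name__, __doc__, __module__, ...
        def wrapper(*args, **kwargs):
            print('calling %s' % func.__name__)
            return func(*args, **kwargs)
        return wrapper

    @logged
    def search(query, limit=10):
        "A hypothetical handler exposed to a form-mapping tool."
        return [query] * limit

    print(search.__name__)                # 'search' -- metadata survives
    print(inspect.getargspec(search))     # but the signature is now (*args, **kwargs),
                                          # which breaks signature-based tools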
Re: [Python-Dev] Rethinking intern() and its data structure
At 06:52 PM 4/10/2009 +1000, Nick Coghlan wrote: This problem (slow application startup times due to too many imports at startup, which can in turn can be due to top level imports for library or framework functionality that a given application doesn't actually use) is actually the main reason I sometimes wish for a nice, solid lazy module import mechanism that manages to avoid the potential deadlock problems created by using import statements inside functions. Have you tried http://pypi.python.org/pypi/Importing ? Or more specifically, http://peak.telecommunity.com/DevCenter/Importing#lazy-imports ? It does of course use the import lock, but as long as your top-level module code doesn't acquire locks (directly or indirectly), it shouldn't be possible to deadlock. (Or more precisely, to add any *new* deadlocks that you didn't already have.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
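Usage of the Importing package's lazy imports looks roughly like this (recalled from its documentation, so treat the exact API as unverified); the real import happens on first attribute access, under the import lock:

    from peak.util.imports import lazyModule

    sqlite3 = lazyModule('sqlite3')   # nothing imported yet

    def first_use():
        # Triggers the actual import the first time it runs.
        return sqlite3.connect(':memory:')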
Re: [Python-Dev] PEP 382: Namespace Packages
At 05:02 PM 4/14/2009 +0200, M.-A. Lemburg wrote: I don't see the emphasis in the PEP on Linux distribution support and the remote possibility of them wanting to combine separate packages back into one package as good argument for adding yet another separate hierarchy of special files which Python scans during imports. That said, note that most distributions actually take the other route: they try to split up larger packages into smaller ones, so the argument becomes even weaker. I think you've misunderstood something about the use case. System packaging tools don't like separate packages to contain the *same file*. That means that they *can't* split a larger package up with your proposal, because every one of those packages would have to contain a __pkg__.py -- and thus be in conflict with each other. Either that, or they would have to make a separate system package containing *only* the __pkg__.py, and then make all packages using the namespace depend on it -- which is more work and requires greater co-ordination among packagers. Allowing each system package to contain its own .pkg or .nsp or whatever files, on the other hand, allows each system package to be built independently, without conflict between contents (i.e., having the same file), and without requiring a special pseudo-package to contain the additional file. Also, executing multiple __pkg__.py files means that when multiple system packages are installed to site-packages, only one of them could possibly be executed. (Note that, even though the system packages themselves are not "combined", in practice they will all be installed to the same directory, i.e., site-packages or the platform equivalent thereof.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote: You are missing the point: When breaking up a large package that lives in site-packages into smaller distribution bundles, you don't need namespace packages at all, so the PEP doesn't apply. The way this works is by having a base distribution bundle that includes the needed __init__.py file and a set of extension bundles the add other files to the same directory (without including another copy of __init__.py). The extension bundles include a dependency on the base package to make sure that it always gets installed first. If we're going to keep that practice, there's no point to having the PEP: all three methods (base+extensions, pkgutil, setuptools) all work just fine as they are, with no changes to importing or the stdlib. In particular, without the feature of being able to drop that practice, there would be no reason for setuptools to adopt the PEP. That's why I'm -1 on your proposal: it's actually inferior to the methods we already have today. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 09:51 AM 4/15/2009 +0200, M.-A. Lemburg wrote: On 2009-04-15 02:32, P.J. Eby wrote: > At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote: >> You are missing the point: When breaking up a large package that lives in >> site-packages into smaller distribution bundles, you don't need namespace >> packages at all, so the PEP doesn't apply. >> >> The way this works is by having a base distribution bundle that includes >> the needed __init__.py file and a set of extension bundles the add >> other files to the same directory (without including another copy of >> __init__.py). The extension bundles include a dependency on the base >> package to make sure that it always gets installed first. > > If we're going to keep that practice, there's no point to having the > PEP: all three methods (base+extensions, pkgutil, setuptools) all work > just fine as they are, with no changes to importing or the stdlib. Again: the PEP is about creating a standard for namespace packages. It's not about making namespace packages easy to use for Linux distribution maintainers. Instead, it's targeting *developers* that want to enable shipping a single package in multiple, separate pieces, giving the user the freedom to the select the ones she needs. Of course, this is possible today using various other techniques. The point is that there is no standard for namespace packages and that's what the PEP is trying to solve. > In particular, without the feature of being able to drop that practice, > there would be no reason for setuptools to adopt the PEP. That's why > I'm -1 on your proposal: it's actually inferior to the methods we > already have today. It's simpler and more in line with the Python Zen, not inferior. You are free not to support it in setuptools - the methods implemented in setuptools will continue to work as they are, but continue to require support code and, over time, no longer be compatible with other tools building upon the standard defined in the PEP. In the end, it's the user that decides: whether to go with a standard or not. Up until this point, I've been trying to help you understand the use cases, but it's clear now that you already understand them, you just don't care. That wouldn't be a problem if you just stayed on the sidelines, instead of actively working to make those use cases more difficult for everyone else than they already are. Anyway, since you clearly understand precisely what you're doing, I'm now going to stop trying to explain things, as my responses are apparently just encouraging you, and possibly convincing bystanders that there's some genuine controversy here as well. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 09:10 AM 4/15/2009 -0700, Aahz wrote: For the benefit of us bystanders, could you summarize your vote at this point? Given the PEP's intended goals, if you do not oppose the PEP, are there any changes you think should be made? I'm +1 on Martin's original version of the PEP, subject to the point brought up by someone that .pkg should be changed to a different extension. I'm -1 on all of MAL's proposed revisions, as IMO they are a step backwards: they "standardize" an approach that will create problems that don't need to exist, and don't exist now. Martin's proposal is an improvement on the status quo, Marc's proposal is a dis-improvement. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 06:15 PM 4/15/2009 +0200, M.-A. Lemburg wrote: The much more common use case is that of wanting to have a base package installation which optional add-ons that live in the same logical package namespace. Please see the large number of Zope and PEAK distributions on PyPI as minimal examples that disprove this being the common use case. I expect you will find a fair number of others, as well. In these cases, there is NO "base package"... the entire point of using namespace packages for these distributions is that a "base package" is neither necessary nor desirable. In other words, the "base package" scenario is the exception these days, not the rule. I actually know specifically of only one other such package besides your mx.* case, the logilab ll.* package. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 02:52 PM 4/15/2009 -0400, A.M. Kuchling wrote: On Wed, Apr 15, 2009 at 01:59:34PM -0400, P.J. Eby wrote: > Please see the large number of Zope and PEAK distributions on PyPI as > minimal examples that disprove this being the common use case. I expect > you will find a fair number of others, as well. ... > In other words, the "base package" scenario is the exception these days, > not the rule. I actually know specifically of only one other such > package besides your mx.* case, the logilab ll.* package. Isn't that pretty even, then? zope.* and PEAK are two examples of one approach; and mx.* and ll.* are two examples that use the base package approach. Neither approach seems to be the more common one, and both are pretty rare. If you view the package listings on PyPI, you'll see that the "pure" namespaces currently in use include: alchemist.* amplecode.* atomisator.* bda.* benri.* beyondskins.* bliptv.* bopen.* borg.* bud.* ... This is just going down to the 'b's, looking only at packages whose PyPI project name reflects a nested package name, and only including those with entries that: 1. use setuptools, 2. declare one or more namespace packages, and 3. do not depend on some sort of "base" or "core" package. Technically, setuptools doesn't support base packages anyway, but if the organization appeared to be based on a "core+plugins/addons" model (as opposed to "collection of packages grouped in a namespace") I didn't include it in the list above -- i.e., I'm bending over backwards to be fair in the count. If somebody wants to do a formal count of base vs. pure, it might provide interesting stats. I initially only mentioned Zope and PEAK because I have direct knowledge of the developers' intent regarding their namespace packages. However, now that I've actually looked at a tiny sample of PyPI, it's clear that the actual field use of pure namespace packages has positively exploded since setuptools made it practical to use them. It's unclear, however, who is using base packages besides mx.* and ll.*, although I'd guess from the PyPI listings that perhaps Django is. (It seems that "base" packages are more likely to use a 'base-extension' naming pattern, vs. the 'namespace.project' pattern used by "pure" packages.) Of course, I am certainly not opposed to supporting base packages, and Martin's version of PEP 382 is a plus for setuptools because it would allow setuptools to better support the "base" scenario. But pure packages are definitely not a minority; in fact, a superficial observation of the full PyPI list suggests that there may be almost as many projects using pure-namespace packages, as there are non-namespaced projects! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:20 PM 4/15/2009 +0200, M.-A. Lemburg wrote: Whether base packages are useful or not is really a side aspect of the PEP and my proposal. It's not whether they're useful, it's whether they're required. Your proposal *requires* base packages, and for people who intend to use pure packages, this is NOT a feature: it's a bug. Specifically, it introduces a large number of unnecessary, boilerplate dependencies to their package distribution strategy. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:00 PM 4/15/2009 +0200, Tarek Ziadé wrote: Now for the "base" or "core" package, what people that use setuptools do most of the time: 1 - they use zc.buildout, so they don't need a base package: they list in a configuration file all the packages needed to build the application, and one of these packages happens to have the scripts to launch the application. 2 - they have a "main" package that doesn't use the same namespace, but uses setuptools' install_requires metadata to include namespaced packages. It acts like zc.buildout in some ways. For example, you mentioned atomisator.* in your example; this app has a main package called "Atomisator" (notice the upper A) that uses strategy #2. I think that there is some confusion here. A "main" package or buildout that assembles a larger project from components is not the same thing as having a "base" package for a namespace package. A base or core package is one that is depended upon by most or all of the related projects. In other words, the dependencies are in the *opposite direction* from what you described above. To have a base package in setuptools, you would move the target code from the namespace package __init__.py to another module or subpackage within your namespace, then make all your other projects depend on the project containing that module or subpackage. And I explicitly excluded from my survey any packages that were following this strategy, on the assumption that they might consider switching to an __init__.py or __pkg__.py strategy if some version of PEP 382 were supported by setuptools, since they already have a "base" or "core" project -- in that case, they are only changing ONE of their packages' distribution metadata to adopt the new strategy, because the dependencies already exist. So: - having namespaces natively in Python is a big win (Namespaces are one honking great idea -- let's do more of those!) - being able to still write some code under the primary namespace is something I (and lots of people) wish we could do with setuptools, so it's a big win too. Yes, that's why I support Martin's proposal: it would allow setuptools to support this case in the future, and it would also allow improved startup times for installations with many setuptools-based namespace packages installed in flat form. (Contra MAL's claims of decreased performance: adopting Martin's proposal allows there to be *fewer* .pth files read at startup, because only .pkg files for an actually-imported package need to be read.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
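For contrast with the "main package" pattern described above, a pure namespace-package distribution under today's setuptools is declared roughly like this (project and module names invented); every distribution in the namespace carries the same boilerplate declaration and depends on no "base" project:

    # setup.py of one distribution living in the 'acme' namespace
    from setuptools import setup, find_packages

    setup(
        name='acme.widgets',                 # hypothetical project
        version='0.1',
        packages=find_packages(),
        namespace_packages=['acme'],         # acme/__init__.py holds only the
                                             # declare_namespace() boilerplate
    )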
Re: [Python-Dev] PEP 382: Namespace Packages
At 09:59 AM 4/16/2009 +0900, Stephen J. Turnbull wrote: I think that for this PEP it's more important to unify the various use cases for namespace packages than it is to get rid of the .pth files. Actually, Martin's proposal *does* get rid of the .pth files in site-packages, and replaces them with other files inside the individual packages. (Thereby speeding startup times when many namespace packages are present but only a few are used.) So Martin's proposal is a win for performance and even for decreasing clutter. (The same number of special files will be present, but they will be moved inside the namespace package directories instead of being in the parent directory.) AFAICS the only things PJE has said about base packages is that (a) they aren't a universal use case for namespace packages, and (b) he'd like to be able to support them in setuptools, but admits that at present they aren't. ...and that Martin's proposal would actually permit me to do so, whereas MAL's proposal would not. Replacing __init__.py with a __pkg__.py wouldn't change any of the tradeoffs for how setuptools handles namespace packages, except to add an extra variable to consider (i.e., two filenames to keep track of). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 03:46 AM 4/16/2009 +, gl...@divmod.com wrote: On 15 Apr, 09:11 pm, p...@telecommunity.com wrote: I think that there is some confusion here. A "main" package or buildout that assembles a larger project from components is not the same thing as having a "base" package for a namespace package. I'm certainly confused. Twisted has its own system for "namespace" packages, and I'm not really sure where we fall in this discussion. I haven't been able to follow the whole thread, but my original understanding was that the PEP supports "defining packages", which we now seem to be calling "base packages", just fine. Yes, it does. The discussion since the original proposal, however, has been dominated by MAL's counterproposal, which *requires* a defining package. There is a slight distinction between "base package" and "defining package", although I suppose I've been using them a bit interchangeably. Base package describes a use case: you have a base package which is extended in the same namespace. In that use case, you may want to place your base package in the defining package. In contrast, setuptools does not support a defining package, so if you have a base package, you must place it in a submodule or subpackage of the namespace. Does that all make sense now? MAL's proposal requires a defining package, which is counterproductive if you have a pure package with no base, since it now requires you to create an additional project on PyPI just to hold your defining package. I'd appreciate it if the PEP could also be extended cover Twisted's very similar mechanism for namespace packages, "twisted.plugin.pluginPackagePaths". I know this is not quite as widely used as setuptools' namespace package support, but its existence belies a need for standardization. The PEP also seems a bit vague with regard to the treatment of other directories containing __init__.py and *.pkg files. Do you have a clarification to suggest? My understanding (probably a projection) is that to be a nested namespace package, you have to have a parent namespace package. The concept of a "defining package" seems important to avoid conflicts like this one: http://twistedmatrix.com/trac/ticket/2339 More specifically I don't quite understand the PEP's intentions towards hierarchical packages. It says that all of sys.path will be searched, but what about this case? In Twisted, the suggested idiom to structure a project which wants to provide Twisted plugins is to have a directory structure like this: MyProject/ myproject/ __init__.py twisted/ plugins/ myproject_plugin.py If you then put MyProject on PYTHONPATH, MyProject/twisted/plugins will be picked up automatically by the plugin machinery. Namespaces are not plugins and vice versa. The purpose of a namespace package is to allow projects managed by the same entity to share a namespace (ala Java "package" names) and avoid naming conflicts with other authors. A plugin system, by contrast, is explicitly intended for use by multiple authors, so the use case is rather different... and using namespace packages for plugins actually *increases* the possibility of naming conflicts, unless you add back in another level of hierarchy. (As apparently you are recommending via "myproject_plugin".) However, as "twisted" is *not* a "namespace" package in the same way, .py files in MyProject/twisted/ would not be picked up - this is very much intentional, since the "twisted" namespace is intended to be reserved for packages that we actually produce. 
If either MyProject/twisted or MyProject/twisted/plugins/ had an __init__.py, then no modules in MyProject/twisted/plugins/ would be picked up, because it would be considered a conflicting package. Precisely. Note, however, that neither is twisted.plugins a namespace package, and it should not contain any .pkg files. I don't think it's reasonable to abuse PEP 382 namespace packages as a plugin system. In setuptools' case, a different mechanism is provided for locating plugin code, and of course Twisted already has its own system for the same thing. It would be nice to have a standardized way of locating plugins in the stdlib, but that will need to be a different PEP. I hope this all makes sense. As I understand it, both setuptools and the proposed standard would either still have the bug described by ticket 2339 above, or would ignore twisted/plugins/ as a namespace package because its parent isn't a namespace package. If twisted/ lacked an __init__.py, then setuptools would ignore it. Under PEP 382, the same, unless it had .pkg files. (Again, setuptools explicitly does not support using namespace packages as a plugin mechanism.) P.S.: vendor packaging systems *ARE* a major use case for just about any aspect of Python's package structure. I really liked MvL's coverage of "vendor packages", in the PEP, since this cou
Re: [Python-Dev] PEP 382: Namespace Packages
At 03:58 AM 4/17/2009 +, gl...@divmod.com wrote: Just as a use-case: would the Java "com.*" namespace be an example of a "pure package with no base"? i.e. lots of projects are in it, but no project owns it? Er, I suppose. I was thinking more of the various 'com.foo' and 'org.bar' packages as being the pure namespaces in question. For Python, a "flat is better than nested" approach seems fine at the moment. Just to clarify things on my end: "namespace package" to *me* means "package with modules provided from multiple distributions (the distutils term)". The definition provided by the PEP, that a package is spread over multiple directories on disk, seems like an implementation detail. Agreed. Entries on __path__ slow down import, so my understanding of the platonic ideal of a system python installation is one which has a single directory where all packages reside, and a set of metadata off to the side explaining which files belong to which distributions so they can be uninstalled by a package manager. True... except that part of the function of the PEP is to ensure that if you install those separately-distributed modules to the same directory, it still needs to work as a package and not have any inter-package file conflicts. Of course, for a development installation, easy uninstallation and quick swapping between different versions of relevant dependencies is more important than good import performance. So in that case, you would want to optimize differently by having all of your distributions installed into separate directories, with a long PYTHONPATH or lots of .pth files to point at them. And of course you may want a hybrid of the two. Yep. So another clarification I'd like in the PEP is an explanation of motivation. For example, it comes as a complete surprise to me that the expectation of namespace packages was to provide only single-source namespaces like zope.*, peak.*, twisted.*. As I mentioned above, I implicitly thought this was more for com.*, twisted.plugins.*. Well, aside from twisted.plugins, I wasn't aware of anybody in Python doing that... and as I described, I never really interpreted that through the lens of "namespace package" vs. "plugin finding". Right now it just says that it's a package which resides in multiple directories, and it's not made clear why that's a desirable feature. Good point; perhaps you can suggest some wording on these matters to Martin? Okay. So what I'm hearing is that Twisted should happily continue using our own wacky __path__-calculation logic for twisted.plugins, but that *twisted* should be a namespace package so that our separate distributions (TwistedCore, TwistedWeb, TwistedConch, et. al.) can be installed into separate directories. Yes. Thanks for taking the time to participate in this and add another viewpoint to the mix, not to mention clarifying some areas where the PEP could be clearer. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 05:35 PM 5/1/2009 +0100, Chris Withers wrote: P.J. Eby wrote: It's unclear, however, who is using base packages besides mx.* and ll.*, although I'd guess from the PyPI listings that perhaps Django is. (It seems that "base" packages are more likely to use a 'base-extension' naming pattern, vs. the 'namespace.project' pattern used by "pure" packages.) I'll stress it again in case you missed it the first time: I think the main reason people use "pure namespace" versus "base namespace" packages is because hardly anyone know how to do the latter, not because there is no desire to do so! I didn't say there's *no* desire, however IIRC the only person who *ever* asked on distutils-sig how to do a base package with setuptools was the author of the ll.* packages. And in the case of at least the zope.* peak.* and osaf.* namespace packages it was specifically *not* the intention to have a base __init__. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 07:41 PM 5/1/2009 +0200, Martin v. Löwis wrote: >> It's unclear, however, who is using base packages besides mx.* and >> ll.*, although I'd guess from the PyPI listings that perhaps Django >> is. (It seems that "base" packages are more likely to use a >> 'base-extension' naming pattern, vs. the 'namespace.project' pattern >> used by "pure" packages.) > > I'll stress it again in case you missed it the first time: I think the > main reason people use "pure namespace" versus "base namespace" packages > is because hardly anyone know how to do the latter, not because there is > no desire to do so! > > I, for one, have been trying to figure out how to do "base namespace" > packages for years... You mean, without PEP 382? That won't be possible, unless you can coordinate all addon packages. Base packages are a feature solely of PEP 382. Actually, if you are using only the distutils, you can do this by listing only modules in the addon projects; this is how the ll.* tools are doing it. That only works if the packages are all being installed in the same directory, though, not as eggs. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] .pth files are evil
At 04:18 PM 5/9/2009 +0200, Martin v. Löwis wrote: Zooko O'Whielacronx wrote: > .pth files are why I can't easily use GNU stow with easy_install. > If installing a Python package involved writing new files into the > filesystem, but did not require reading, updating, and re-writing any > extant files such as .pth files, then GNU stow would Just Work with > easy_install the way it Just Works with most things. Please understand that this is the fault of easy_install, not of .pth files. There is no technical need for easy_install to rewrite .pth files on installation. It could just as well have created new .pth files, rather than modifying existing ones. If you always use --single-version-externally-managed with easy_install, it will stop editing .pth files on installation. It's --multi-version (-m) that does that. --single-version-externally-managed is a "setup.py install" option. Both have the effect of not editing .pth files, but they do so in different ways. The "setup.py install" option causes it to install in a distutils-compatible layout, whereas --multi-version simply drops .egg files or directories in the target location and leaves it to the user (or the generated script wrappers) to add them to sys.path. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] .pth files are evil
At 04:42 PM 5/9/2009 +0200, Martin v. Löwis wrote: >> If you always use --single-version-externally-managed with easy_install, >> it will stop editing .pth files on installation. > > It's --multi-version (-m) that does that. > --single-version-externally-managed is a "setup.py install" option. > > Both have the effect of not editing .pth files, but they do so in > different ways. The "setup.py install" option causes it to install in a > distutils-compatible layout, whereas --multi-version simply drops .egg > files or directories in the target location and leaves it to the user > (or the generated script wrappers) to add them to sys.path. Ah, ok. Is there also an easy_install invocation that unpacks the zip file into some location of sys.path (which then wouldn't require editing sys.path)? Not as yet. I'm sort of waiting to see what comes out of PEP 376 discussions re: an installation manifest... but then, if I actually had time to work on it right now, I'd probably just implement something. Currently, you can use pip to do that, though, as long as the packages you want are in source form. pip doesn't unzip eggs as yet. It would be really straightforward, though, for someone to implement an easy_install variant that does this. Just invoke "easy_install -Zmaxd /some/tmpdir packagelist" to get a full set of unpacked .egg directories in /some/tmpdir, and then move the contents of the resulting .egg subdirs to the target location, renaming EGG-INFO subdirs to projectname-version.egg-info subdirs. (Of course, this ignores the issue of uninstalling previous versions, or overwriting of conflicting files in the target -- does pip handle these?) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] how GNU stow is complementary rather than alternative to distutils
At 12:04 PM 5/10/2009 -0600, Zooko Wilcox-O'Hearn wrote: The thing that prevents this from working with setuptools is that setuptools creates a file named easy_install.pth during "python ./setup.py install --prefix=foo". If you build two different Python packages this way, they will each create an easy_install.pth file, and then when you ask GNU stow to link the two resulting packages into your system, it will say "You are asking me to install two different packages which both claim that they need to write a file named '/usr/local/lib/python2.5/site-packages/easy_install.pth'." Adding --record and --single-version-externally-managed to that command line will prevent the .pth file from being used or needed, although I believe you already know this. (What that mode won't do is install dependencies automatically.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] .pth files are evil
At 04:42 PM 5/9/2009 +0200, Martin v. Löwis wrote: >> If you always use --single-version-externally-managed with easy_install, >> it will stop editing .pth files on installation. > > It's --multi-version (-m) that does that. > --single-version-externally-managed is a "setup.py install" option. > > Both have the effect of not editing .pth files, but they do so in > different ways. The "setup.py install" option causes it to install in a > distutils-compatible layout, whereas --multi-version simply drops .egg > files or directories in the target location and leaves it to the user > (or the generated script wrappers) to add them to sys.path. Ah, ok. Is there also an easy_install invocation that unpacks the zip file into some location of sys.path (which then wouldn't require editing sys.path)? No; you'd have to use the -e option to easy_install to download and extract a source version of the package, then run that package's setup.py, e.g.:

    easy_install -eb /some/tmpdir SomeProject
    cd /some/tmpdir/someproject   # subdir is always lowercased/normalized
    setup.py install --single-version-externally-managed --record=...

I suspect that this is basically what pip is doing under the hood, as that would explain why it doesn't support .egg files. I previously posted code to the distutils-sig that was an .egg unpacker with appropriate renaming, though. It was untested, and assumes you already checked for collisions in the target directory, and that you're handling any uninstall manifest yourself. It could probably be modified to take a filter function, though, something like:

    import os
    from setuptools.archive_util import unpack_archive

    def flatten_egg(egg_filename, extract_dir, filter=lambda s, d: d):
        # e.g. 'Foo-1.0-py2.5.egg' -> 'Foo-1.0-py2.5.egg-info'
        eggbase = os.path.basename(egg_filename) + '-info'
        def file_filter(src, dst):
            if src.startswith('EGG-INFO/'):
                # rename the EGG-INFO/ subdir to a .egg-info sibling
                src = eggbase + src[8:]
                dst = os.path.join(extract_dir, *src.split('/'))
            return filter(src, dst)
        return unpack_archive(egg_filename, extract_dir, file_filter)

Then you could pass in a None-returning filter function to check and accumulate collisions and generate a manifest. A second run with the default filter would do the unpacking. (This function should work with either .egg files or .egg directories as input, btw, since unpack_archive treats a directory input as if it were an archive.) Anyway, if you used "easy_install -mxd /some/tmpdir [specs]" to get your target eggs found/built, you could then run this flattening function (with appropriate filter functions) over the *.egg contents of /some/tmpdir to do the actual installation. (The reason for using -mxd instead of -Zmaxd or -zmaxd is that we don't care whether the eggs are zipped or not, and we leave out the -a so that dependencies already present on sys.path aren't copied or re-downloaded to the target; only dependencies we don't already have will get dropped in /some/tmpdir.) Of course, the devil of this is in the details; to handle conflicts and uninstalls properly you would need to know what namespace packages were in the eggs you are installing. But if you don't care about blindly overwriting things (as the distutils does not), then it's actually pretty easy to make such an unpacker. I mainly haven't made one myself because I *do* care about things being blindly overwritten. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
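A hypothetical two-pass use of the sketch above, following the "None-returning filter first, default filter second" recipe it describes; paths and the project name are made up:

    import os

    conflicts = []

    def check_only(src, dst):
        # Returning None tells unpack_archive to skip extraction entirely;
        # this pass just records what *would* collide.
        if os.path.exists(dst):
            conflicts.append(dst)
        return None

    target = '/usr/lib/python2.5/site-packages'      # hypothetical target dir
    egg = '/some/tmpdir/SomeProject-1.0-py2.5.egg'   # hypothetical egg

    flatten_egg(egg, target, check_only)   # pass 1: detect collisions
    if not conflicts:
        flatten_egg(egg, target)           # pass 2: actually unpack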
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 12:21 AM 5/15/2009 +0200, Tarek Ziadé wrote: Hello, I'm proposing this PEP, which has been discussed in Distutils-SIG, for inclusion in Python 2.7 and 3.2 http://www.python.org/dev/peps/pep-0376/ Please comment! I'd like to reiterate my suggestion that the uninstall record include size and checksum information, a la PEP 262's "FILES" section. This would allow the uninstall function to validate whether a file has been modified, and thus prevent uninstalling a locally-modified file, or a file installed in some other way. It may also be that providing an uninstall API that simply yields files to be uninstalled, with data about their existence/modification status, would be more useful than a blind uninstall operation with a filter function. Also, the PEP doesn't document what happens if a single file was installed by more than one package. Ideally, a file with identical size/checksum that belongs to more than one project should be silently left alone, and a file installed by more than one project with *different* size/checksum should be warned about and left alone. Next, the doc for the metadata API functions seems quite sparse. ISTR that I've previously commented on such issues as case- and punctuation-insensitivity of project names, and '/' separation in egg_info subpaths, but these don't seem to have been incorporated into the current version of the PEP. These are important considerations in general, btw, because project name and version canonicalization and escaping are an important part of both generating and parsing .egg-info filenames. At minimum, the relevant setuptools docs that define these standards should be cited. Finally, the "Definitions" section also claims that a project installs one or more packages, but a project may not contain *any* packages; it may have a standalone module, or just a script, data, or metadata. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
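Illustrative only -- not the PEP's format -- but this is the sort of per-file size-and-checksum record (in the spirit of PEP 262's FILES section) that the suggestion above asks for:

    import hashlib, os

    def file_record(path):
        "Return (path, size, md5) for one installed file."
        md5 = hashlib.md5()
        f = open(path, 'rb')
        try:
            for chunk in iter(lambda: f.read(8192), b''):
                md5.update(chunk)
        finally:
            f.close()
        return path, os.path.getsize(path), md5.hexdigest()

    # An uninstaller could then refuse to remove a file whose current
    # size/checksum no longer match the recorded values.
    print(file_record('example.py'))    # hypothetical installed file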
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 08:32 AM 5/15/2009 +0200, Jeroen Ruigrok van der Werven wrote: Agreed. Within FreeBSD's ports the installed package registration gets a MD5 hash per file recorded. Size is less interesting though, since essentially this information is encapsulated within the hash. Remove one byte from the file and your hash is already different. Which also means that in that case you can skip computing the MD5. The size allows you to easily notice an overwrite/corruption without further processing. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 08:32 AM 5/15/2009 +0200, Tarek Ziadé wrote: 2009/5/15 P.J. Eby : > Ideally, a file with identical size/checksum that > belongs to more than one project should be silently left alone, and a file > installed by more than one project with *different* size/checksum should be > warned about and left alone. I think the path is the info that should be looked at. By "a file that belongs to more than one project" I meant a single file on *disk* (i.e., one absolute path). But I don't think you want to leave alone a file with identical size/checksum that belongs to more than one project when it's not the same absolute path. That wouldn't be "a file" then, would it? ;-) Here's an example why : if two different packages includes the "feedparser.py" module (from the FeedParser project) for conveniency, and if you remove one package, you *do* want to remove its "feeparser.py" module even if it exists in the other project. Right, that would be *two files*, though, not one file. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 06:06 PM 5/16/2009 +0200, Tarek Ziadé wrote: Ok I've changed the PEP with all the points you mentioned, if you want to take a look. Some notes: 1. Why ';' separation, instead of tabs as in PEP 262? Aren't semicolons a valid character in filenames? 2. "if the installed file is located in a directory in site-packages" should refer not to site-packages but to the directory containing the .egg-info directory. 3. get_egg_info_file needs to be specified as using '/'-separated paths and converting to OS paths if appropriate. There's also the problem that the mode it opens the file in (binary or text) is unspecified. 4. There should probably be a way to iterate over the projects in a directory, since it's otherwise impossible for an installation tool to find out what project(s) "own" a file that conflicts with something being installed. Alternatively, reshaping the file API to allow querying by path as well as by project might work. 5. If any cache mechanisms are to be used by the API, the API *must* make it possible to bypass or explicitly manage that cache, as otherwise installation tools and tools that manipulate sys.path at runtime may end up using incorrect data. 6. get_files() doesn't document whether the yielded paths are absolute or relative, local or cross-platform, etc. I need to find back your comments for this part, I must have missed them. That's the last part I didn't work out yet on the current PEP revision. Well, if you can't find them, the EggFormats doc explains how these file/dir structures are currently laid out by setuptools, easy_install, pip, etc., and the PEP should probably reference that. Technically, this PEP doesn't so much propose a change to the EggFormats standard, as simply add a RECORD file to it, and propose stdlib support for reading and writing it. So, the PEP really should reference (i.e. link to) the existing standard. The EggFormats doc in turn cites pkg_resources doc for lower-level format issues, such as name and version normalization, filename escaping, file parsing, etc. This PEP should also probably be framed as a replacement for PEP 262, proposing to extend the de-facto standard for an installation database with uninstall support, and blessing selected portions of the de facto standard as an official standard. (Since that's pretty much exactly what it is.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
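On points 3 and 6, a sketch of how a get_egg_info_file-style helper could pin down the two unspecified details (the '/'-separated subpath convention and the open mode); the signature here is an assumption, not the PEP's:

    import os

    def get_egg_info_file(egg_info_dir, subpath, binary=False):
        # The subpath uses '/' regardless of platform; convert it to the
        # local convention and make the text/binary choice explicit.
        local = os.path.join(egg_info_dir, *subpath.split('/'))
        return open(local, 'rb' if binary else 'r')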
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 11:17 AM 5/16/2009 -0700, Glenn Linderman wrote: On approximately 5/16/2009 9:55 AM, came the following characters from the keyboard of P.J. Eby: At 06:06 PM 5/16/2009 +0200, Tarek Ziadé wrote: Ok I've changed the PEP with all the points you mentioned, if you want to take a look. Some notes: 1. Why ';' separation, instead of tabs as in PEP 262? Aren't semicolons a valid character in filenames? Why tabs? Aren't tabs a valid character in filenames? (hint: Both are valid in POSIX filenames, neither are valid in Windows filenames) ";" *is* valid in Windows filenames, actually. Tabs aren't. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 04:04 PM 5/19/2009 +0200, Tarek Ziadé wrote: On Sat, May 16, 2009 at 6:55 PM, P.J. Eby wrote: > > 1. Why ';' separation, instead of tabs as in PEP 262? Aren't semicolons a > valid character in filenames? I am changing this into a . for now. What about Antoine's idea about doing a quote() on the names ? I like the CSV idea better, since the csv module is available in 2.3 and up. We should just pick a dialect with unambiguous quoting rules. From my point of view seems more simple to deal with, if 3rd-party tools want to work on these files without using pkgutil or Python. True, but then CSV files are still pretty common. One other possibility that might work is using a vertical bar as a separator. My preference rank at the moment is probably tabs, CSV, or vertical bar. But I don't really care all that much, so let the people who care decide. Personally, though, I don't see much point to cross-language manipulation of the file. System packaging tools have their own way of keeping track of this stuff. So unless somebody's using it to *build* system packages (e.g. making an RPM builder), they don't need this. Now, about the APIs... > 4. There should probably be a way to iterate over the projects in a > directory, since it's otherwise impossible for an installation tool to find > out what project(s) "own" a file that conflicts with something being > installed. Alternatively, reshaping the file API to allow querying by path > as well as by project might work. I am adding a "get_projects" api: get_projects() -> iterator Provides an iterator that will return (name, path) tuples, where `name` is the name of a registered project and `path` the path to its `egg-info` directory. But for the use case you are mentioning, what about an explicit API: get_owners(paths) -> sequence of project names returns a sequence of tuple. For each path in the "paths" list, a tuple of project names is returned > > 5. If any cache mechanisms are to be used by the API, the API *must* make it > possible to bypass or explicitly manage that cache, as otherwise > installation tools and tools that manipulate sys.path at runtime may end up > using incorrect data. work in progress - (I am afraid I have to write an advanced prototype to be able to know exactly how the cache might work, and so, what API we should have) I think it would be simpler to have explicit object types representing things like a directory, a collection of directories, and individual projects, and these object types should be part of the API. Any function-oriented API should just be exposed as the methods of a default singleton. Other Python modules follow this pattern -- and it's what I copied for the pkg_resources design. It gives a nice tradeoff between keeping the simple things simple, and complex things possible, as well as keeping mechanism and policy separate. Right now, the API design you're trying to do is being burdened by using strings and tuples to represent things that could just as easily be objects with their own methods, instead of things you have to pass back into other APIs. This also makes caching more complex, because you can't just have one main object with stuff hanging off; you've got to have a bunch of dictionaries, tuples, lists, sets, etc. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
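To make "pick a dialect with unambiguous quoting rules" concrete, here is a Python 2-era sketch using the stdlib csv module; the (path, md5, size) column layout is illustrative only, not the PEP's:

    import csv

    def write_record(record_path, rows):
        # rows: an iterable of (path, md5, size) tuples.  Quoting every
        # field means a comma in a filename can never be mistaken for
        # the field separator.
        f = open(record_path, 'wb')  # binary mode, as the 2.x csv module expects
        try:
            writer = csv.writer(f, delimiter=',', quotechar='"',
                                quoting=csv.QUOTE_ALL, lineterminator='\n')
            for row in rows:
                writer.writerow(row)
        finally:
            f.close()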
Re: [Python-Dev] PEP 376
At 05:42 PM 6/22/2009 +0200, Tarek Ziadé wrote: Wouldn't it be better to use the native line terminator on the current platform? (someone might want to edit or at least view the file) Good idea, I'll change that, As long as the file is always *read* with "U" mode, so that you can't mess it up, especially if the install is to a directory shared between platforms. The idea of this API is to find out if a distribution "owns" a file, e.g. is the only distribution that uses it, so it can be safely removed. This could equally well be done by ``owners(path)``, returning a sequence of zero or more items. Any length <> 1 means the file can't be safely removed. Meanwhile, having the data about all the owners of a file would also be useful for tools that just want to inspect a directory's contents, for example, or to detect conflicts and overwrites. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
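A sketch of the ``owners(path)`` idea; get_distributions() and get_files() here stand in for whatever query API the PEP ends up providing:

    import os

    def owners(path, distributions):
        # Return every distribution whose RECORD lists `path`.
        wanted = os.path.normcase(os.path.abspath(path))
        result = []
        for dist in distributions:
            for recorded in dist.get_files():
                if os.path.normcase(os.path.abspath(recorded)) == wanted:
                    result.append(dist)
                    break
        return result

    # A file is only safe to remove when exactly one distribution owns it:
    # safe_to_remove = len(owners(path, get_distributions())) == 1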
Re: [Python-Dev] PEP 376
At 10:38 AM 6/23/2009 +0200, Tarek Ziadé wrote: What about using the same names used in Python's site module: "sitedir" is the name used for a directory we named DistributionDirectory. No, a site dir is a Python-defined directory for site-installed packages, and/or a directory where .pth files are processed. Wrong connotations entirely, since packages may be installed to other directories, and typically are in e.g. shared hosting environments. DistributionDirectory is fine by me. DistributionDirectories sounds like what setuptools calls an Environment. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 07:57 PM 6/29/2009 +0200, Tarek Ziadé wrote: Hello, If no one objects, I'd like to push PEP 376 in the "accepted" status and go ahead with its implementation, with continuous feedback at Distutils-SIG as we did to build it. I do have a question about the current draft... Do zipped distributions use EGG-INFO or a project-version.egg-info? This isn't spelled out in the PEP, although I get the general idea that the EGG-INFO format isn't supported, and thus the PEP and API do not support existing .egg files. This should probably be made clear, if that's the intention. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 01:34 AM 7/1/2009 +0200, Tarek Ziadé wrote: On Wed, Jul 1, 2009 at 12:47 AM, Steven D'Aprano wrote: > I don't see how this proposal will help in the second case. If you > install distribution Spam, containing file spam.py, and then install > distribution Ham, which requires spam.py, what is to prevent you from > removing Spam and breaking Ham? > > If you don't propose a solution for the dependency problem, you should > say so. This problem is solved as described later in the PEP, with the API that allows you to get the list of the distributions that use a given file. (thanks to the RECORD files) If Spam and Ham use spam.py, and if you uninstall Spam, this file will not be removed because the API will tell you it's used in both distributions. That's not the scenario he's talking about. He's talking about the case where Ham has an 'install_requires' of Spam. That is, a runtime dependency, not a shared file. Good question, I have never created such distribution. Aren't they read-only files ? setuptools' bdist_egg command has an option to exclude source from an .egg, but it doesn't do anything special with permissions. I don't think any other current install tools support source-free installation. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 07:11 AM 7/1/2009 +1000, Nick Coghlan wrote: To address PJE's question in the PEP, it may be worth expanding on this in the backwards compatibility section explaining how the new distutils metadata system avoids getting confused by the old pre-standardisation installation formats (e.g. it may be that the directory names and/or filenames all deliberately differ from current approaches precisely so they can coexist without interfering with each other) In particular, it should be explicitly noted that the API does not support .egg files (or directories), and thus cannot substitute for the pkg_resources API, as far as detecting installed packages. That is, users who need to unequivocally detect the presence/version of a project on sys.path, etc. will still have to use pkg_resources. (Because pkg_resources supports the format proposed by the PEP, but the PEP does not support .egg files and directories.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 04:29 PM 7/1/2009 +0200, Tarek Ziadé wrote: - Phase 1 : introduction of the egg-info file in distutils Philipp introduced the creation of a file named xxx.egg-info file in 2006 (see http://bugs.python.org/issue1459476) alongside distutils-installed package, that contains the metadata of the distribution. - Phase 2: two new formats in the setuptools project Then he created two new formats in the setuptools project: This is backwards. The .egg-info file was added to distutils and setuptools *after* support for the other formats. In other words, it was a third format added to allow other software (such as setuptools and friends) to detect the presence of packages installed via the distutils, and thereby prevent other tools from installing duplicates. (.egg-info directories and .egg files/directories already existed.) This means that all the third-party tools out there already conform to that standard, and that packages installed in other formats will not benefit from the new APIs. which means that people that want to work with distributions installed as .egg directories will have to use setuptools APIs. Which makes sense. Yes and no. Not providing uninstall support is reasonable, but the PEP also has features to query packages in general. (There's also no technical reason why comparable manifest and uninstall support can't be provided for .egg files and directories, since they already have an implicit manifest: their contents. However, since I'm not currently possessed of the time to provide a patch myself, I'm not going to lobby for this as a feature.) > If you want, you can then add a cross-reference note, saying something > like "The setuptools structure, as proposed in the EggFormats > documentation for that package [ref], is a subset of this standard. In > order to conform to this PEP, setuptools will have to be amended to > only install .egg-info directories in the format defined by this PEP". I'll work that way. (Technically, it's an extension of a subset of EggFormats, since it adds RECORD and INSTALLER.) > > "However, it will impact the setuptools and pip projects, but given > the fact that..." > > Confusing. Will these tools need to change (I believe so) or not? If > they will need to change, that hardly counts as "no deep consequences" > - there's the whole backward compatibility issue for them to handle. I'll add this in a backward compatibility section, as suggested earlier by someone. They don't need to change, unless they're going to support being uninstalled by the tool (in which case they'll need to add RECORD and INSTALLER). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 09:06 PM 7/2/2009 +0100, Paul Moore wrote: Maybe the answer is that distutils reject *all* uses of absolute paths on Windows. The PEP can then say whatever you want as far as I'm concerned, as it doesn't affect me. If the existing bugs don't get fixed, though, I'd say that the RECORD file should be explicitly disallowed from containing absolute paths on Windows (for safety reasons if nothing else). In non bdist cases (e.g. setup.py with manual options, but also easy_install, pip, etc.) you can specify where data files and scripts go on Windows, and they don't have to be relative locations. So, absolute paths need to be supported, despite the existence of broken bdist targets on Windows. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 02:54 PM 7/3/2009 +0100, Paul Moore wrote: Eggs are fundamentally a PEP 302 zip file format. There are some extra bits of metadata for setuptools/easy_install in there (as I understand things) but essentially they are zip files. When you say "decoupling the egg format", I assume you mean "decoupling the egg metadata" - which is fine, but to properly decouple, you need API level access to the metadata. PEP 376 offers read-only access, but as you rightly point out, it is only for filesystem data (and some form of zip file, which appears to be limited in some way, as it isn't PEP 302 based, and the actual format isn't defined anywhere). The basic point here is that PEP 376 needs to define precisely how pkgutil.get_distributions() scans sys.path looking for ".egg-info directories". What does it do for sys.path entries that don't correspond to filesystem directories? (Note - these may or may not be zip files. Even if they are zip files, an earlier entry on sys.path_hooks could have taken precedence. At the very least, you should only process path entries as zip files if their importer - in sys.path_importer_cache or via an explicit path hook scan - is a zipimporter object.). To be honest, this is a major can of worms. But if PEP 376 is not going to support PEP 302, then it must state that fact explicitly, to avoid giving people false expectations - particularly with Brett's importlib in Python 3.1, which will make it far easier for people to experiment with new packaging formats such as the ones Lukasz mentions above. And it MUST fail gracefully in the face of unsupported importer types. Well, we could always resurrect PEP 365, since pkg_resources already has documented extensible support for arbitrary importers. That solves backward *and* forward compatibility. Then PEP 376's uninstall facilities could be implemented using pkg_resources' existing metadata query features. The primary downside to that, of course, is that it brings in the matter of version specifications and dependencies... which appear to be a contentious topic. (Note that Tarek is proposing to drop the PEP 386 proposal to standardize a much more restrictive scheme than setuptools' version parser, precisely because of the controversy.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
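The guard Paul describes can be expressed directly with stdlib APIs; how a pkgutil.get_distributions() scan would actually use it is an assumption:

    import pkgutil
    import zipimport

    def is_zip_path_entry(path_entry):
        # pkgutil.get_importer() consults sys.path_importer_cache and
        # sys.path_hooks, so an earlier hook that claimed this entry
        # takes precedence over zipimport.
        importer = pkgutil.get_importer(path_entry)
        return isinstance(importer, zipimport.zipimporter)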
Re: [Python-Dev] PEP 376
At 12:28 AM 7/4/2009 +1000, Nick Coghlan wrote: I suspect this limitation of the PEP 302 APIs is the origin of the setuptools format that embeds the metadata inside the distribution - it lets you get at the metadata without having to assume that it exists directly on the filesystem anywhere. I think you have this backwards; it's setuptools that doesn't care where (or whether) the metadata exists on the file system; it delegates metadata operations to a "metadata provider" that's usually an adapter over a PEP 302 "loader". See http://peak.telecommunity.com/DevCenter/PkgResources#supporting-custom-importers for the API details of how to register support for arbitrary PEP 302 importers and loaders. (Which presumably, Lukasz is using. I didn't know that anybody was actually using it, but it's nice to know that the documentation is apparently sufficient for *some* people. ;-) ) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mercurial migration: progress report (PEP 385)
At 12:20 PM 7/4/2009 +0900, Stephen J. Turnbull wrote: IME, Mercurial strongly encourages a non-branching style. Although I can't fully explain in concrete terms what makes me feel that way, it's certainly consistent with your own inclination to advise "subset branches". Part of it comes from the fact that you can't have a single revision on two branches. I would really like the node of a release branch to be on both the branch and the mainline so that the tag appears in the history of both, but that's not possible. I'm a total newbie to Mercurial - I've basically done little more than the tutorial, in fact. So somebody correct me if what I'm about to say is completely stupid. Wouldn't the simple thing to do in Mercurial, just be to use different repositories for long-lived branches? I mean, if you're not merging them that much anyway, what's the point? Isn't it just as easy to pull from another repository, as from another branch within that repository? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - get_egginfo_files
At 05:26 PM 7/5/2009 +0100, Paul Moore wrote: def get_distribution(name): for d in get_distributions(): if d.name == name: return d return None Btw, this is broken code anyway, because it's not handling case-insensitivity or name canonicalization. (I've mentioned these issues previously on the distutils-sig.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
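For comparison, a sketch of the kind of matching being alluded to, modeled loosely on the setuptools convention (runs of non-alphanumeric characters treated as equivalent, case ignored); the authoritative rules are in the pkg_resources documentation:

    import re

    def canonical_name(name):
        # 'Zope.Interface', 'zope-interface' and 'ZOPE_INTERFACE' all
        # compare equal under this normalization.
        return re.sub(r'[^A-Za-z0-9]+', '-', name).lower()

    def get_distribution(name, distributions):
        wanted = canonical_name(name)
        for d in distributions:
            if canonical_name(d.name) == wanted:
                return d
        return None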
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 03:13 PM 7/5/2009 +0200, Tarek Ziadé wrote: The loader would still work with "DIST_NAME.egg-info" directories for compatibility with existing format in the query APIs, but the Distutils install command would rather create "DIST_NAME.metadata" Note that this would then break setuptools without adding any benefit; ".metadata" is less precise and less unique than '.egg-info'. If you want a clearer name, '.pydist' or some such would at least be reasonably specific. (It'd still have a backward compatibility problem, but at least then there'd be some benefit to the name change.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 08:43 PM 7/5/2009 +0200, Tarek Ziadé wrote: But if it's based on PEP 302 protocols and if the pkgutil code works with the sys.meta_path hook, setuptools could then provide its loader, based on its EggFormats and act as a provider without being broken. You misunderstand me. The whole point of putting .egg-info in distutils in the first place was to enable setuptools to detect the presence of distutils-installed packages. That's what's broken by changing the name. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 07:10 AM 7/6/2009 +1000, Nick Coghlan wrote: By using a new name for the directory we *guarantee* that old packaging utilities won't get confused by the new format (they simply won't acknowledge its existence). This is incorrect; they will get confused because they will think that the relevant package is *not* installed, and proceed to install a duplicate. That's why .egg-info was added to the stdlib in the first place. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 10:32 AM 7/6/2009 +0100, Paul Moore wrote: I'm +0 on changing the name, as long as it's the *only* "do it this way because setuptools isn't going to change" issue. Please note that I never said that. I was the one who suggested ".pydist", remember? I just don't want to have to complicate the pkg_resources directory scan to support a stupid name like "metadata", no offense to Tarek. ;-) I'd sooner not make the change at all, just because I doubt I'm the only person who has code that depends on the old name. (Consider the Linux OS packagers who treat .egg-info files specially at the moment.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 04:14 PM 7/6/2009 +0100, Paul Moore wrote: 2009/7/6 Nick Coghlan : > P.J. Eby wrote: >> At 08:43 PM 7/5/2009 +0200, Tarek Ziadé wrote: >>> But if it's based on PEP 302 protocols and if the pkgutil code works >>> with the sys.meta_path hook, >>> setuptools could then provide its loader, based on its EggFormats and >>> act as a provider without being broken. >> >> You misunderstand me. The whole point of putting .egg-info in distutils >> in the first place was to enable setuptools to detect the presence of >> disutils-installed packages. That's what's broken by changing the name. This is getting confusing. Is Phillip saying that setuptools will cope with the file changing to a directory without modification, but it can't handle a change in the name? The existing versions of setuptools will read a file or a directory with no problem; it's the name change that will require a code change, and it's a rather more complex issue than just one name change, because it'll need to support both names. What's more, on the build/install side, it'll have to figure out whether to use the new name or the old name when creating a project's metadata for installation in single-version mode. In other words, this will likely affect pip as well, or at least the parts of setuptools that pip uses. My site-packages has a confusing mix of egginfo directories and files. Note that I NEVER use setuptools other than where an existing package's setup.py requires it. In that case, I still only do python setup.py bdist_wininst and install the generated installer. So is PEP 376 going to be able to cope with the stuff I have installed at the moment? If not, what's the point??? If I understand Tarek's proposal correctly, then no, it will not cope. If setuptools is not going to change to conform to PEP 376, then any tools built using PEP 376 will fail to recognise my coverage install. I'm all in favor of adding RECORD support to setuptools; it was in fact my idea to have the file there in the first place. Adding a RECORD file doesn't introduce any new and weird name migration requirements, which is why I'd rather not change the extension if we can avoid it. Reading both names is painful, writing both is more so, and I'm not sure how many tools/users *besides* setuptools will be affected by a name change. > How much information does setuptools actually need in order to tell that > a distribution is already present? Presumably the existing .egginfo > files generated by distutils are sufficient for that task? It appears so, but setuptools doesn't use (or create!!!) those files in its own installer formats. Setuptools treats an .egg-info file as if it were a PKG-INFO file contained in an .egg-info directory. This allows it to treat distutils-supplied .egg-info files as if they were setuptools-supplied .egg-info directories containing exactly one file. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
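The file-as-directory trick described above amounts to something like the following sketch (names and layout are illustrative, not setuptools' actual code):

    import os

    def read_pkg_info(egg_info_path):
        # A .egg-info directory holds a PKG-INFO file; a bare .egg-info
        # file *is* the PKG-INFO.  Either way, return the metadata text.
        if os.path.isdir(egg_info_path):
            egg_info_path = os.path.join(egg_info_path, 'PKG-INFO')
        f = open(egg_info_path)
        try:
            return f.read()
        finally:
            f.close()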
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 07:18 PM 7/6/2009 +0200, Tarek Ziadé wrote: >> So is PEP 376 going to be able to cope with the stuff I have installed >> at the moment? If not, what's the point??? > > If I understand Tarek's proposal correctly, then no, it will not cope. Why that ? Can you detail ? On a system that uses only plain distutils distributions, it'll work. Paul's question, as I understood it, was whether pkgutil would handle the existing .egg-info's. Your proposal, as I understood it, was that pkgutil would not detect .egg-info's without a plugin of some sort... and you've not proposed a way of loading such a thing, which means that to use pkgutil with that plugin, an application would have to know that the plugin exists and import it. Unless you're planning to have a plugin registration facility (ala entry points) so that plugins can be used without each application needing to discover them on its own? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 07:15 PM 7/6/2009 +0100, Paul Moore wrote: My point was that distributions which use setuptools in their setup.py, *even if there's no runtime dependency on setuptools* end up with non-standard .egg-info's. There is no good reason for this, from my POV as a package user. So if setuptools is breaking otherwise entirely standard packages, so that they won't work with core distutils mechanisms, that's wrong. Setuptools is refusing to play nicely with other packaging systems. Now at the moment, this is a moot point, because the stdlib doesn't have any package discovery mechanisms. Not so; it has .egg-info. (See footnote below.) The PEP 302 hooks I'm writing provide the "plugin" ability for alternative storage formats. But there's no provision for alternative *metadata* formats. What's the point of a standard if people don't follow it? I'm confused. Are you saying pkgutil will recognize .egg-info, or not? The only practical difference between the existing formats and the proposed new one (AFAICR) is the extension of the directory name, and the presence or absence of the RECORD file. My POV is that it would be a shame to discard compatibility with existing installed software, just for a bikeshedding of the directory name. Especially if it means that any tools that currently generate .egg-info's (and I'm not 100% positive setuptools is the only case) will need to grow options for whether to generate the old name, possibly based on what Python version... except you can install new distutils into old Python... etc. If this were strictly a matter of, "from Python 3.x on, generate .pydist", it's a lot less of a problem for me and everybody else, because Python 3 is effectively going to force a fork (or at least a branch) of setuptools anyway. For Python 2.x, having *two* names for .egg-info (which is what I understand the proposal to effectively be) is just asking for trouble, IMO. And my concern is that currently these compatibility issues are being handwaved rather than considered up front. To reiterate, summarize, and attempt to make as concrete as I can: I'm fine with a rename, *if and only if* it can be done in a way that never requires a tool to generate BOTH files for compatibility reasons. AFAICT, this can only be done by making the extension dependent upon the Python version, and this version distinction must be made *even if* the distutils themselves are backported to an earlier Python version. (Alternatively, this could be resolved by *not* backporting the PEP 376-compatible distutils/pkgutil to 2.x.) Currently, there is no official Python 3 release of setuptools, so compatibility can (IMO) be considered moot there. What's more, it is easy to conditionally determine whether to use '.egg-info' or '.pydist' based on Python version. What is not easy and likely to be confusing for everyone, is having the possibility for *both* to exist in the same system, *without* also having bidirectional interoperability. So, I'm +1 for no change (obviously), and +0 for "always .pydist in 3.x and always .egg-info in 2.x", whether the latter part is achieved by making distutils/pkgutil use a version-dependent extension, or by refusing to backport distutils/pkgutil to 2.x. I'm -1 for having .pydist ever appear on a 2.x Python, either via stdlib or backporting. (These comments also apply for any name other than '.pydist'.) Footnote: there has been some discussion that implies that somehow .egg-info was hijacked into the stdlib, and that it should therefore not be supported. 
This is incorrect: at Pycon the year this happened (2005 IIRC), Guido *blessed setuptools for inclusion in the stdlib* in his keynote. Yes, *all* of setuptools, including easy_install, eggs, the whole nine yards. A few months later, when I actually checked in the code, there were protests from a few people on Python-Dev, some of which I agreed were valid. I reverted the checkin, floating instead a compromise proposal to add .egg-info support to distutils (i.e. the install_egg_info command) to allow setuptools and other 3rd party tools to detect the presence of distutils-installed packages. At the time, the assumption was that eggs and setuptools would still, eventually, make it into the stdlib -- at the time, Guido had not yet changed his mind about that. So the ".egg-info" name was not controversial, to the best of my recollection, and the inclusion did not violate any of the expressed concerns of the three persons who objected to including setuptools whole-hog. So, from my POV at least, dropping .egg-info is dropping backward compatibility on a feature that was *intentionally* added to the stdlib -- with approval from Guido and non-protest from Python-Dev -- specifically to provide that compatibility. That doesn't obligate it to stay... but I suggest that it *does* mean Python 3, not 2, is the appropriate place for dropping it (if it is indeed to be dropped).
Re: [Python-Dev] PEP 376 - Open questions
At 07:38 PM 7/6/2009 +0100, Paul Moore wrote: As promised, here are some open questions on PEP 376. - Will the public API names be changed from *egginfo* to *metadata*? +1 (FWIW, 'metadata' is what pkg_resources API refers to this kind of stuff as.) - What precisely are the use cases for absolute path names? Concrete examples are needed. With the current spec, some things can go wrong (e.g., see below), so we need real use cases to know how to address this. The purpose is to be able to remove scripts, data files, documentation, etc. that's stored in some place that's not package-relative, and survive a move of the package directory itself. For example, suppose I install with --prefix=~, and have files in ~/bin and ~/lib. I then move ~/lib/python2.7 to ~/pylib. If the scripts were installed with a relative path, this would break the link to the script still living in ~/bin. Now that I'm thinking about this, it occurs to me that there is another way to address this besides mixing relative and absolute paths in RECORD, and that is to make *all* the paths relative, and include an original installation location in INSTALLER. Then, it would be possible to detect whether the installation location had been moved and handle it accordingly... for some definition of "accordingly". I agree that further discussion is probably needed to resolve this. - How will bdist_wininst/bdist_msi/bdist_rpm be updated? bdist_wininst, bdist_dumb, and various others use the 'install --root' pattern to generate files for installation, which means that they would transparently end up writing a correct RECORD file, except for the inclusion of incorrect absolute paths for non-libdir-relative files. However, if we used the "all relative in RECORD, with a base in INSTALLER" approach, these cases could transparently be treated as another instance of install directory relocation. I don't know if bdist_msi does a --root install before generating the .msi; if it does, then it should work the same way. - Can distutils be made to install files in places the current RECORD file spec can't handle? (I think the answer is "yes"). What happens then? Surely an absolute path is sufficient? - Should distribution names be case insensitive on case insensitive filesystems? For comparison, module/package names are always case sensitive even on case insensitive systems. IMO, they should be case-insensitive *regardless* of filesystem case, and I believe PyPI has already begun disallowing packages with only case-distinct names (e.g. 'Zope' vs. 'ZoPe') from being registered. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
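Under the "all paths relative, plus a recorded base" idea floated above, relocation detection could be as simple as the following sketch (where the base recorded at install time is assumed to be stored in INSTALLER or a similar file):

    import os

    def installation_moved(recorded_base, egg_info_dir):
        # True when the directory that actually contains the .egg-info
        # no longer matches the base path recorded at install time,
        # i.e. the install tree has been relocated or copied since.
        current_base = os.path.dirname(os.path.abspath(egg_info_dir))
        return (os.path.normcase(os.path.normpath(recorded_base))
                != os.path.normcase(current_base))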
[Python-Dev] "Absolute" paths in PEP 376 RECORD files
Just an idea... suppose that instead of using "real" absolute paths in the RECORD file for non-local files (scripts, data, etc) we changed the format to include a "prefix" field, containing something like LIBDIR, SHARE, SCRIPTS, etc., ala bdist_wininst internals? Also, we could include a separate (optional) PREFIXES file defining what those locations translated to at install time. Dumb bdists would still have their build paths there (or delete the file before packaging, or use real paths instead of the --root prefixed versions). Upside: relocation can be detected and handled, RECORD remains cross-platform, and bdists are ok. Downside: more complex API required to read/manipulate paths and delete files, since you need to be able to check that you have the right prefixes, and may have to ask the user for a prefix you don't recognize, if the default in PREFIXES doesn't match. Thoughts? (Also, as a special case, any file that's actually installed to LIBDIR or a subdirectory thereof (even if it's technically a "data" file or script) will probably need to be designated under the LIBDIR prefix to prevent runtime breakage in the event any system package maintainers are tempted to use RECORD files as a way of forcing FHS conformance. These are installation prefixes, *not* content types.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
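One illustrative rendering of the idea; the 'NAME = path' syntax for PREFIXES is an assumption, since no exact layout is specified above:

    # Example PREFIXES contents (assumed layout):
    #
    #   PURELIB = /usr/lib/python2.6/site-packages
    #   SCRIPTS = /usr/local/bin
    #   DATA    = /usr/local/share
    #
    # A RECORD row would then carry a prefix name plus a relative path
    # instead of an absolute path, e.g.:  SCRIPTS, myproject-admin, <md5>, <size>

    def load_prefixes(path):
        prefixes = {}
        for line in open(path):
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            name, sep, value = line.partition('=')
            if sep:
                prefixes[name.strip()] = value.strip()
        return prefixes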
Re: [Python-Dev] PEP 376 - Open questions
At 03:23 PM 7/7/2009 +0200, Tarek Ziadé wrote: When I started to work on this I didn't realize the gigantic amount of work and coordination it requires No one expects the package inquisition. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 11:30 PM 7/7/2009 +0200, Tarek Ziadé wrote: On Tue, Jul 7, 2009 at 10:31 PM, P.J. Eby wrote: > At 03:23 PM 7/7/2009 +0200, Tarek Ziadé wrote: >> >> When I started to work on this I didn't realize the gigantic amount of >> work and coordination it requires > > No one expects the package inquisition. ;-) > > Sorry, i've looked in the english dictionary but I don't get this one. what do you mean ? It's a Monty Python reference, i.e. "no one expects the Spanish Inquisition". In the Spanish Inquisition sketch, a character is asked several questions he can't answer, and then complains that he wasn't expecting some kind of Spanish inquisition, at which point three scarlet-robed cardinals burst into the room and exclaim, "NO ONE expects the Spanish Inquisition!" So I punned on that in order to imply that nobody who takes on the job of packaging expects to be asked the kind of detailed, unanswerable questions that come with the territory of packaging, or to be subjected to torture by third parties merely for trying to do something useful. And, by further implication, I was thus expressing camaraderie with you on your initiation into the wonderful world of trying to satisfy everyone in Pythonland, and our shared experience in the area of "if I'd really known what I was getting into, I probably wouldn't have". ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 01:13 PM 7/8/2009 +0100, Paul Moore wrote: An alternative solution would be for the bdist_xxx commands to ignore the RECORD file generated in the temp area, and build its own on the target machine when the installer is run. This is conceptually far cleaner, it's in line with the implicit assumptions in the PEP, and it avoids a lot of the problems with path formats (the filenames in RECORD could be absolute local pathnames, and that's the end of the story). But - and it's a big but - this dumps RECORD file handling on the bdist_xxx command, and they all need to implement the same code, resulting in a lot of duplication and opportunity for inconsistency. The PREFIXES approach I suggested doesn't have these problems; the bdist installer only needs to write a PREFIXES file, and doesn't need to touch RECORD (which would be generated by the install --root). In fact, since PREFIXES is strictly advisory, it can be generated by the install --root and left alone at installation time. And, if it's generated without the --root prefix included, RPMs would have a correct PREFIXES file without further manipulation. Plus, the bdist_wininst .exe installer can easily write a PREFIXES file, since it has to know the prefixes in order to do an installation in the first place. (We still need to hash out an initial list of prefixes, of course.) At 08:59 AM 7/8/2009 -0400, Eric Smith wrote: I agree with this. For RPM's, there's a whole other database of what files were installed. Is it really the intent that a file will be managed by multiple different installers? That I can install with RPM but remove with some python-installer (or other) tool? That way lies madness. In fact, I see RECORD as an installer-specific detail that doesn't need to be standardized at all. This is a misunderstanding. The purpose is to let an *installer* (like easy_install) know that these files belong to some other installer, and not allow overwriting them. That's why there's also an INSTALLER file. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 03:57 PM 7/8/2009 +0100, Paul Moore wrote: Who's going to use the APIs based around the RECORD file? Anyone? The distutils uninstall facility, for starters. easy_install and pip also will, eventually. In all three cases, the use will be in order to avoid overwriting files belonging to a system package manager, and in order to uninstall/upgrade their own files. (I can't personally commit that pip will need this; but it's currently the only installer besides the distutils that installs things in this format, and would therefore need this to prevent overwriting things.) Setuptools will also need to generate RECORD, INSTALLER, and PREFIXES appropriately for both development and actual installations. Seriously - who is actually going to USE any of the APIs being proposed in PEP 376? If no-one's interested, we should stop right now. Do any of the people from the summit read this list? If not, can anyone contact them and get some feedback? Package management people aren't likely to be doing anything with this stuff directly; if they need a list of installed files they're probably using --root or --record at package build time, not needing to sniff stuff at installation time, and they already have installation facilities. Their only real need in relation to this will be to set an option to generate an INSTALLER file that says you can't uninstall their packages manually, thereby preventing the package from being uninstalled by the distutils utility, or any other well-behaved PEP 376-based package management tool. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 05:22 PM 7/8/2009 +0100, Paul Moore wrote: 2009/7/8 P.J. Eby : > At 03:57 PM 7/8/2009 +0100, Paul Moore wrote: >> >> Who's going to use the APIs based around the RECORD file? Anyone? > > The distutils uninstall facility, for starters. easy_install and pip also > will, eventually. Is pip getting this via setuptools, or will it be independently changing its code? (Who's the pip contact who could confirm this directly?) It would likely be changing its code directly, with respect to supporting overwrite-avoidance and uninstall-before-upgrade scenarios. (Generation support would likely come from setuptools, dependent on who implemented first.) Ian Bicking is pip's author. In case (1), I'd imagine the user is careful enough (given that he's doing everything manually) to avoid installing something that's already present. Obviously, you've never tried doing this yourself. ;-) My guess would actually be that in Unixy environments these days, people either use virtualenv or buildouts to manage such things. Buildout handles uninstall-before-update (for eggs, anyway, I'm not sure about other things) and virtualenv makes it easy to create throwaway Python installs. So if I follow this, there's a use for this in making sure the new distutils uninstall option doesn't break something it shouldn't, but as the uninstall uses RECORD, it won't be (accidentally) usable on distributions that didn't follow PEP 376 and create a RECORD file. That's an interesting point. But for non-distutils tools, it'd be nice to be able to find out whether another package manager is administering some particular file. For that matter, the distutils themselves should (ultimately) support uninstall-before-upgrade and avoid overwriting another package manager's files. If the only driver for this PEP is setuptools, then I'm -1 on it. If it were being driven by setuptools, I'd have just implemented it myself and presented it as a fait accompli. I can't speak to Tarek's motives, but I assume that, as stated in the PEP, the primary driver is supporting the distutils being able to uninstall things, and secondarily to allow other tools to be built on top of the API. a packaging tool *other* than setuptools (or setuptools-derived projects) Is there really such a thing? ;-) AFAIK, every published tool for managing Python projects is either distutils-based or setuptools-based. (Things like scons and pymake and various other project build tools don't seem to fall under "packaging tool" in this sense.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 01:58 PM 7/8/2009 -0400, Eric Smith wrote: I really don't see this use case. Supporting multiple installers for the same file (or even just trying to prevent two installers from writing the same file)? Wouldn't you be better off just saying an installer can't overwrite any existing file? Likewise, I don't see a use case for installing with one installer and uninstalling with another. Put those two together, and the mechanism that an installer uses to record what files it installed is a private implementation detail. You'll have to ask Tarek about that. As I've said in another email, if I were doing this for setuptools I'd just go and do it; it's already on my backlog of things to do. The point of the PEP is to provide an uninstall facility for the distutils, and to expose those records via an API so that other tools can be built on top of it. Whether anything other than distutils-based tools will actually use the format directly is entirely irrelevant to the PEP's merits, since bdist_* tools and platform-specific distutils installers may need to read or write the format. Hence, a PEP is appropriate to describe a format and API for the distutils family of tools to operate on. The specific proposed format is based on an existing de facto standard for installed package metadata, so that the many existing packaging tools based on that other format and existing APIs can easily read or write this data, and so that the API can also be used as a way to detect the presence of already-installed distributions of software -- even ones installed other ways. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 09:38 AM 7/9/2009 +0900, Stephen J. Turnbull wrote: Eric Smith writes: > But I think we've veered into metadata that describes what has been > installed. I don't think that's so useful. As I've said, this is private > to the installers. If 2 installers want to communicate with each other > about what they've installed, then they can agree on that data. I just > don't find it generally useful for all installers, and therefore not > useful for distutils. ISTM that the problem that it solves is uninstall in the absence of the original installer. Or uninstall where the installer is "setup.py install", actually. Am I to understand that you don't think that use case is important? Or that there's another way to do this? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "Absolute" paths in PEP 376 RECORD files
At 06:51 PM 7/8/2009 -0700, Sridhar Ratnakumar wrote: Is there any reason why RECORD file can't be generated at runtime? Applications can be relocatable, yet require plugins and libraries to be upgraded, installed, uninstalled, etc. (Also, RECORD is cross-platform to allow cross-platform installation/testing of cross-platform packages; that's why I propose keeping the prefix information separate.) Also, why should the RECORD file be generated at all by bdist* commands? bdist commands that use "install --root" will get it as a side effect, so they don't have to do anything special. They needn't even generate the PREFIXES, if PREFIXES is also generated by "install --root", and omits the root from the prefixes written. In other words, most bdist tools wouldn't change. bdist_wininst and bdist_msi might need to regenerate PREFIXES at actual installation time, but AFAIK those are the only ones. (And the spec would allow for the possibility that a PREFIXES file can be incorrect.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 11:20 PM 7/8/2009 -0400, Eric Smith wrote: P.J. Eby wrote: ISTM that the problem that it solves is uninstall in the absence of the original installer. Or uninstall where the installer is "setup.py install", actually. I think we need to move away from "setup.py install". It's the antithesis of static metadata. Please note that that's entirely out of scope for the PEP at hand. That being said, the rest of your proposal is strikingly similar to a proposal I previously floated on the distutils-sig for a concept called BUILDS. The main difference is that I suggested that the spec should include a standard interface for running build operations that would produce the manifest (equivalent to your setup.info), and that distutils and setuptools should provide setup.py commands to generate said manifest, to allow for a seamless transition. There was very little comment on the proposal, perhaps because it involves a lot of work that most people are sane enough not to sign up for. ;-) (Or more to the point, it's the sort of thing that never gets off the ground for design by consensus. The best way to make something like your proposal to happen is to go off and build it, and get it to take over. If it can't win substantial market share on its own merits, it probably doesn't deserve to be blessed as a standard.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 11:28 PM 7/8/2009 -0400, Eric Smith wrote: Eventually, I'd like PEP 376 to support system packagers too. So for example, if you did "apt-get install python-pyqt4", then running "pip install python-pyqt4" should return without installing anything .. as RECORD will be part of the .deb previously installed. As for generating the RECORD file, I vote for generating it during install time (w/ absolute paths). I think we should explicitly not support this. What if pip and apt-get (or rpm, or others) install the same package in different places because of system conventions (/usr vs. /usr/local vs. /opt, say)? There's no way we're ever going to get this right, and it's not worth complicating our lives over it. Seriously: Is there some real world use case I'm missing? Does any existing install system support this? The use case described above is supported right now by easy_install and pip; it doesn't require a RECORD file, though. An .egg-info file or an .egg-info directory with a PKG-INFO is enough to prevent duplicate installation. I don't know how pip handles file overwrites; easy_install OTOH never overwrites anything that's not in an .egg or a .pth. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
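A sketch of the duplicate-install check described: before installing, a tool scans each directory it would install into for an .egg-info entry (file or directory) belonging to the project. The filename matching below is simplified; the real name-escaping and version rules live in the pkg_resources docs:

    import os

    def appears_installed(project_name, search_dirs):
        # distutils records e.g. 'My-Project 1.0' under a name like
        # 'My_Project-1.0-py2.6.egg-info'; compare case-insensitively.
        prefix = project_name.lower().replace('-', '_') + '-'
        for d in search_dirs:
            if not os.path.isdir(d):
                continue
            for entry in os.listdir(d):
                if entry.endswith('.egg-info') and entry.lower().startswith(prefix):
                    return True
        return False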
Re: [Python-Dev] "Absolute" paths in PEP 376 RECORD files
At 02:46 PM 7/9/2009 -0400, Tres Seaver wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Nick Coghlan wrote: > P.J. Eby wrote: >>> Also, >>> why should the RECORD file be generated at all by bdist* commands? >> bdist commands that use "install --root" will get it as a side effect, >> so they don't have to do anything special. They needn't even generate >> the PREFIXES, if PREFIXES is also generated by "install --root", and >> omits the root from the prefixes written. In other words, most bdist >> tools wouldn't change. bdist_wininst and bdist_msi might need to >> regenerate PREFIXES at actual installation time, but AFAIK those are the >> only ones. (And the spec would allow for the possibility that a >> PREFIXES file can be incorrect.) > > I haven't seen any suggestions for handling non-relative paths that are > cleaner than this. The distutils based list that Tarek posted also gives > a good "starter set" of supported entries in the PREFIXES file: > > PURELIB= > PLATLIB= > HEADERS= > SCRIPTS= > DATA= > > The question then would be whether relative paths should be handled > directly in the RECORD file (as in the current PEP) or by writing a "." > into PREFIXES to indicate relative paths. Per discussion with Matthias Klose, it would be useful to split out the following prefixes for "special" kinds of data: - - docs - - i18n / locales Unfortunately, the distutils don't currently have a way to specify these. I think we'll need to use only the ones that are distutils install targets at first, but of course the format is extensible. OTOH, one supposed that at least the 'data_files' setup argument could be enhanced to allow a data type flag, that would then be used as a prefix name for those files, with a fallback to the DATA prefix if there isn't one specified. But that's something that will require tagging support in the install_data command. As it is, refactoring the distutils to track prefixes for the RECORD file is probably going to be... challenging. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "Absolute" paths in PEP 376 RECORD files
At 03:31 PM 7/10/2009 +0200, Tarek Ziadé wrote: On Thu, Jul 9, 2009 at 9:09 PM, P.J. Eby wrote: > At 02:46 PM 7/9/2009 -0400, Tres Seaver wrote: >> >> -BEGIN PGP SIGNED MESSAGE- >> Hash: SHA1 >> >> Nick Coghlan wrote: >> > P.J. Eby wrote: >> >>> Also, >> >>> why should the RECORD file be generated at all by bdist* commands? >> >> bdist commands that use "install --root" will get it as a side effect, >> >> so they don't have to do anything special. They needn't even generate >> >> the PREFIXES, if PREFIXES is also generated by "install --root", and >> >> omits the root from the prefixes written. In other words, most bdist >> >> tools wouldn't change. bdist_wininst and bdist_msi might need to >> >> regenerate PREFIXES at actual installation time, but AFAIK those are >> >> the >> >> only ones. (And the spec would allow for the possibility that a >> >> PREFIXES file can be incorrect.) >> > >> > I haven't seen any suggestions for handling non-relative paths that are >> > cleaner than this. The distutils based list that Tarek posted also gives >> > a good "starter set" of supported entries in the PREFIXES file: >> > >> > PURELIB= >> > PLATLIB= >> > HEADERS= >> > SCRIPTS= >> > DATA= >> > >> > The question then would be whether relative paths should be handled >> > directly in the RECORD file (as in the current PEP) or by writing a "." >> > into PREFIXES to indicate relative paths. I like the idea of a separated PREFIXES file, then using variables in the RECORD file. Please note that my proposal intentionally omitted "variables", but instead used a separate column to designate the prefix. This allows some guarantees regarding path sanity that could otherwise be abused, like combining prefixes or inserting them in the middle of paths. In this way, the formal conversion of a path would be os.path.join(PREFIX, *relative_path.split('/')), rather than a variable expansion. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
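As code, the formal conversion described above looks roughly like this (the prefixes mapping is assumed to come from a PREFIXES-style file):

    import os

    def resolve(prefix_name, relative_path, prefixes):
        # The prefix is looked up as a whole -- never expanded inside a
        # path -- and the '/'-separated segments are joined with the
        # local separator:  os.path.join(PREFIX, *relative_path.split('/')).
        try:
            base = prefixes[prefix_name]
        except KeyError:
            # An unrecognized prefix has to be resolved by asking the
            # user (or the installing tool) where it points.
            raise ValueError('unknown prefix: %r' % (prefix_name,))
        return os.path.join(base, *relative_path.split('/'))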