Re: [Python-Dev] Reviving restricted mode?
At 08:45 AM 2/22/2009 -0800, Guido van Rossum wrote: I've received some enthusiastic emails from someone who wants to revive restricted mode. He started out with a bunch of patches to the CPython runtime using ctypes, which he attached to an App Engine bug: http://code.google.com/p/googleappengine/issues/detail?id=671 Based on his code (the file secure.py is all you need, included in secure.tar.gz) it seems he believes the only security leaks are __subclasses__, gi_frame and gi_code. (I have since convinced him that if we add "restricted" guards to these attributes, he doesn't need the functions added to sys.) I don't recall the exploits that Samuele once posted that caused the death of rexec.py -- does anyone recall, or have a pointer to the threads? Just a question, but, if you just need a pure-python restricted environment for App Engine, why not just use the RestrictedPython package (i.e., http://pypi.python.org/pypi/RestrictedPython )? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Reviving restricted mode?
At 07:56 PM 2/22/2009 -0800, Guido van Rossum wrote: On Sun, Feb 22, 2009 at 7:39 PM, P.J. Eby wrote: > Just a question, but, if you just need a pure-python restricted environment > for App Engine, why not just use the RestrictedPython package (i.e., > http://pypi.python.org/pypi/RestrictedPython )? How does that work? Remember, app engine doesn't support certain things, and bytecode manipulations (if that's what RestrictedPython does) are one of the unsupported things. It doesn't modify bytecode, it modifies an AST. It basically replaces prints, and attribute/item read/writes with function calls. Unfortunately, it does this AST modification by running as a traversal against the stdlib compiler package's AST, not a modern AST. So, I suppose it might not be usable as-is on app engine. It does, however, have the advantage of having been used in Zope for oh, six or seven years now? ISTM that it first came out around the same time as Python 2.3, and the latest version just dropped support for Python 2.1 and 2.2. So, if you want something that wasn't thrown together in an afternoon, it might be a good thing to take a look at. ;-) The other reason I can think of is that Tav is a capabilities purist. :-) You can implement capabilities on top of RestrictedPython; it's simply a policy-neutral enforcement framework. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
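To make the kind of rewrite concrete: a guarded-attribute-access transformation, in the spirit of what RestrictedPython's AST pass does, looks roughly like this (the _getattr_ guard name and its policy here are illustrative assumptions, not RestrictedPython's exact API):

    def _getattr_(ob, name):
        # sample guard: the hosting framework supplies the real policy
        if name.startswith('_'):
            raise AttributeError("restricted attribute: %r" % name)
        return getattr(ob, name)

    # untrusted code as written:         x = obj.attr
    # what the rewritten code executes:  x = _getattr_(obj, 'attr')

Item reads/writes and print are handled the same way, by routing them through replaceable guard functions.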
Re: [Python-Dev] Challenge: Please break this! [Now with blog post]
At 03:52 PM 2/24/2009 +0100, Victor Stinner wrote: On Tuesday 24 February 2009 15:46:04, Andrew Dalke wrote: > A goal is to use this in App Engine, yes? Which uses cgitb to report > errors? Which needs these restricted frame attributes to report the > values of variables when the error occurred? We should be able to restore the original environment. Example:

    ...
    jail(evil_func)  # called in the jail
    # unsafe environment with __subclasses__, f_code, etc.
    ...

Of course, you'll have to ensure that anything you do with data from the jail is also jailed... that callbacks run in the jail, etc. (This is one advantage of the RestrictedPython approach -- the jailing of the restricted code isn't dependent on some global state; it's wired right into the restricted code.)
Re: [Python-Dev] ABCs and MRO
At 10:41 AM 3/2/2009, Paul Moore wrote: 2009/3/2 Benjamin Peterson : > 2009/3/1 Paul Moore : >> >> Is it worth getting simplegeneric exposed in 3.1 >> (http://bugs.python.org/issue5135)? If it's going to be in 2.7, I'd >> like to see it hit 3.1. The patch is against trunk (for 2.7) at the >> moment, I'm not sure what the process would be for forward-porting it >> (do I generate a new patch against the py3k branch, or should it be >> applied to trunk and merged in?) By the way guys, are you aware of: http://pypi.python.org/pypi/simplegeneric There might be a bit of name confusion by exposing pkgutils' internal simplegeneric there. Perhaps it should be called "trivialgeneric", as it's even tinier than simplegeneric. ;-) The key problem with the patch is that ABCs do not play well with the type of introspection required to implement a generic function - namely enumeration of the superclasses of a class. The MRO of the class is fine for normal inheritance, but for ABCs it is possible to register classes which don't inherit from the ABC, so that you have a situation where issubclass(C, MyABC) can be true without MyABC being in C.__mro__:

    >>> import abc
    >>> class MyABC(object):
    ...     __metaclass__ = abc.ABCMeta
    ...
    >>> class C(object):
    ...     pass
    ...
    >>> MyABC.register(C)
    >>> issubclass(C, MyABC)
    True
    >>> C.__mro__
    (<class '__main__.C'>, <type 'object'>)
    >>>

More generally, there is NO WAY to determine the list of classes for which issubclass(C, x) is true. This could be considered a limitation of, or a bug in, ABCs, I don't have a particular opinion on that, but it does mean that no code which relies on being able to traverse the class inheritance graph will see ABCs. One particular case of this is (any implementation I can think of, of) generic functions. In my view, this implies one of the following:

1) It should be a documented limitation of such code that it doesn't work with ABCs (and conversely, this limitation of ABCs should be documented in the ABC documentation)
2) Generic functions, and any other code requiring this type of introspection, is essentially useless unless it can support ABCs, and should not be used in the light of this limitation.
3) This is a bug in ABCs and should be fixed.
4) Something else I didn't think of :-)

In my view, (2) is an unreasonable position to take, given the fact that (as I understand it) ABCs are supposed to be largely optional and shouldn't affect code that doesn't care about them... It's not clear to me how (3) should be addressed. Adding a slot to all classes to hold a list of ABCs they are registered against seems to be a large overhead for a relatively rarely used feature. I guess having a global registry of ABC registrations could work, but it seems clumsy. Any other suggestions? This isn't really a new problem; if you base your generic function methods off of interfaces implemented by a type or instance, you have the same basic issues. For systems that use a cache based on object type (like Guido's tuple-dispatch prototype, and my enhanced version in PEAK-Rules), the actual lookup is not a big deal. You have a type-based test and you cache the result for the type. PEAK-Rules' predicate dispatching is a bit more complex, because you need a rather more complex type test; the tree generator has to look at whether a type test is an ABC, and effectively translate it to "oldstyleisinstance(arg, ABC) or not oldstyleisinstance(arg, ABC) and ABC.__instancecheck__(arg)". (Where oldstyleisinstance represents an __instancecheck__-free version of isinstance.)
This isn't a major problem either, just a bit of a bore/pain to implement. The hairier issue for these types of systems is method precedence, though. Since __mro__'s have to be consistently ordered, you can straightforwardly determine whether one class is "more specific" than another in a static way. But with dynamic registration, the question could be more complex. Personally, I'd like to see some way to subscribe to changes in ABC registration, so that generic functions or other tools can update their caches. With that feature, you might even be able to implement full ABC support for simplegeneric, by treating ABC registrations as equivalent to mass registration of the ABC's registrants. That is, if "AnABC.register(X)" and "afunc.register(AnABC, meth)" then "afunc.register(X, meth)". So each time AnABC gets a new registrant, you automatically register the ABC method for the new registrant, as long as there's not already a method registered for that specific type. That would probably be sufficient for what simplegeneric is doing. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
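A rough sketch of that subscription idea, since ABCMeta currently offers no notification hook (everything below is assumed, not existing API):

    import abc

    _callbacks = []                          # parties interested in new registrations
    _orig_register = abc.ABCMeta.register

    def _notifying_register(cls, subclass):
        result = _orig_register(cls, subclass)
        for cb in _callbacks:
            cb(cls, subclass)                # tell subscribers about AnABC.register(X)
        return result

    abc.ABCMeta.register = _notifying_register

    def mirror_abc_methods(gf):
        # assumes the generic function exposes its type->method mapping as gf.registry
        def on_register(an_abc, new_subclass):
            method = gf.registry.get(an_abc)
            if method is not None and new_subclass not in gf.registry:
                gf.register(new_subclass, method)    # i.e. afunc.register(X, meth)
        _callbacks.append(on_register)

That is, each "AnABC.register(X)" would trigger "afunc.register(X, meth)" for any generic function that had "afunc.register(AnABC, meth)", unless X already had a method of its own.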
Re: [Python-Dev] ABCs and MRO
At 09:14 PM 3/2/2009, Paul Moore wrote: 2009/3/2 Jeffrey Yasskin : > I tend to think it's a bug in ABCs. You seem to have thought of > several possible ways to fix it, and I don't have strong preferences > between them. I've discussed ways of fixing simplegeneric, but not of fixing the issue with ABCs. I'm not sure the ABC "issue" is fixable - after all, it's by design that ABCs can implement __issubclass__ and "magically" become superclasses of arbitrary classes as a result. I'm not happy about fixing simplegeneric, though, as the whole point was just to expose an existing implementation, because it might be generally useful. If we start expanding and enhancing it, there are better implementations already available on PyPI (better in the sense of having seen real-world use). And if they don't handle ABCs, then that might indicate that needing to handle ABCs isn't as vital as this discussion would seem to imply (but I have no real-world data myself to make such a claim). I would say it's mainly an annoyance, as long as you have a way to access the method registered for a given ABC, because then you can explicitly register it for types that are being dynamically detected. Now, how widespread an annoyance/FAQ issue it'll be, will depend on:

1. whether the stdlib itself is using GF's and
2. whether those GF's are meaningfully used against ABCs, and
3. how common it is for classes to be "implicit subclasses" of one of those ABCs *and* used with one of those GFs.

This sounds like a lot of conditions that have to line up, for it to become an (easily worked-around) annoyance. My inclination is simply to document that it works on explicit type relationships, and note that ABC's and other exotic users of __*check__ methods may require explicit method registrations to work correctly.
Re: [Python-Dev] PEP 377 - allow __enter__() methods to skip the statement body
At 06:28 AM 3/16/2009 +1000, Nick Coghlan wrote: There are some practical hurdles to that idea (specifically, creating a callable which uses its parent's namespace rather than having its own), but the basic concept seems sound. Actually, that bit's pretty simple -- they're just "nonlocal" variables. But the cost of creating that function, and the cost of having cell variables in the surrounding function is potentially high. (In CPython at least, function calls are optimized in certain ways when the function only has "fast" locals, and no "cell" locals.) The cost of creating the function (though not the code object) could be held till runtime, since the interpreter could wait until it's sure there's a __with__ method before doing a MAKE_CLOSURE on the code object. Of course, at that point, what's the difference between:

    with foo() as bar:
        baz

and...

    @foo
    def bar():
        baz

except for being slightly less verbose? (due to missing nonlocal statements, etc.)
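To make the decorator spelling concrete, here is a minimal sketch of what a 'foo' written for that style could look like; the file name and the "should we run the body?" test are placeholders, not anything taken from the PEP:

    import os

    def run_if_present(body):
        # 'body' holds what would have been the with-statement's suite;
        # simply not calling it is the effect PEP 377 wants __enter__()
        # to be able to request.
        if os.path.exists('config.txt'):        # placeholder skip test
            f = open('config.txt')
            try:
                return body(f)                  # passed in, instead of an 'as' binding
            finally:
                f.close()
        # otherwise the "body" is silently skipped

    @run_if_present
    def config_text(f):
        return f.read()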
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 04:45 PM 3/21/2009 +1000, Nick Coghlan wrote: I really like the PEP - it's a solid extension of the ideas introduced by PEP 342. (Replying to you since I haven't seen any other thread on this) My concern is that allowing 'return value' in generators is going to be confusing, since it effectively causes the return value to "disappear" if you're not using it in this special way with some framework that takes advantage of it. However, if you *do* have some framework that takes advantage of generators to do microthreads, then it is most likely already written so as to have things like 'yield Return(value)' to signal a return, and to handle 'yield subgenerator()' without the use of additional syntax. So, I don't really see the point of the PEP. 'yield from' seems marginally useful, but I really dislike making it an expression, rather than a statement. The difference seems just a little too subtle, considering how radically different the behavior is. Overall, it has the feel of jamming a framework into the language, when doing the same thing in a library is pretty trivial. I'd almost rather see a standard or "reference" trampoline added to the stdlib (preferably with a way to register handling for specialized yielded types, e.g. IO/scheduling hooks), than try to cram half a trampoline into the language itself.
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 10:21 AM 3/22/2009 +1200, Greg Ewing wrote: P.J. Eby wrote: My concern is that allowing 'return value' in generators is going to be confusing, since it effectively causes the return value to "disappear" if you're not using it in this special way with some framework that takes advantage. But part of all this is that you *don't* need a special framework to get the return value -- all you need is a caller that uses a yield-from statement. There are uses for that besides threading systems. Such as? I've been wracking my brain trying to come up with any *other* occasion where I'd need -- or even find it useful -- to have one generator yield the contents of another generator to its caller, and then use a separate return value in itself. (I'm thus finding it hard to believe there's a non-contrived example that's not doing I/O, scheduling, or some other form of co-operative multitasking.) In any case, you didn't address the confusion issue: the inability of generators to return a value is there for a good reason, and adding a return value that doesn't actually return anywhere unless you use it in a yield-from expression -- an expression that both looks like a statement and has control-flow side-effects -- seems both over-complex and an invitation to confusion. This is different from plain yield expressions, in that plain yield expressions are *symmetric*: the value returned from the yield expression comes from the place where control flow is passed by the yield. That is, 'x = yield y' takes value y, passes control flow to the caller, and then returns a result from the caller. It's like an inverse function call. 'x = yield from y', on the other hand, first passes control to y, then the caller, then y, then the caller, an arbitrary number of times, and then finally returns a value from y, not the caller. This is an awful lot of difference in control flow for only a slight change in syntax -- much more of a difference than the difference between yield statements and yield expressions. So at present (for whatever those opinions are worth), I'd say -0 on a yield-from *statement* (somewhat useful but maybe not worth bothering with), +0 on a reference trampoline in the stdlib (slightly better than doing nothing at all, but not by much), and -1 on yield-from expressions and return values (confusing complication with very narrowly focused benefit, reasonably doable with library code). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
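For reference, the symmetry of a plain yield expression looks like this in practice (a standalone example, not taken from the thread):

    def averager():
        total = 0.0
        count = 0
        average = None
        while True:
            value = yield average    # 'average' goes out to the caller...
            total += value           # ...and the caller's send() argument comes back in
            count += 1
            average = total / count

    avg = averager()
    avg.next()           # prime it: runs up to the first yield
    print avg.send(10)   # 10.0
    print avg.send(20)   # 15.0

Each send() is a single, symmetric exchange with the generator, which is exactly the "inverse function call" behaviour described above.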
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 08:11 PM 3/22/2009 +1200, Greg Ewing wrote: P.J. Eby wrote: (I'm thus finding it hard to believe there's a non-contrived example that's not doing I/O, scheduling, or some other form of co-operative multitasking.) Have you seen my xml parser example? http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/ Whether you'll consider it contrived or not I don't know (contrivedness being such a subjective property) but it illustrates the style of programming I'm trying to support with the return-value feature. I find the parser *without* yield-from to be much easier to follow what's going on, actually... and don't see what benefit was obtained by the additional complication of using send(). In any case, you didn't address the confusion issue: the inability of generators to return a value is there for a good reason, It's there because formerly there was nowhere for the return value to go. If there is somewhere for it to go, the restriction will no longer be needed. But that's begging the question (in the original meaning of the phrase) of why we *want* to have two ways to return data from a generator. As for confusion, we ignore the return values of function calls all the time, without worrying that someone might be confused by the fact that their return value doesn't go anywhere. And that's the right way to think of a yield-from expression -- as a kind of function call, not a kind of yield. But it's not a function call -- it's multiple *inverted* function calls, followed by special handling of the last iteration of the iterator it takes. The control flow is also hard to explain, as is the implementation. If there's anything confusing, it's the presence of the word 'yield'. Its only virtue is that it gives a clue that the construct has something to do with generators, but you'll have to RTM to find out exactly what. Nobody has thus far suggested any better name, however. Perhaps this is because it's not that interesting of a feature. As I said, I wouldn't fight a yield-from statement without all this return-value stuff, although it still seems like too much trouble to me. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 10:22 PM 3/24/2009 -0400, Steve Holden wrote: There is one non-trivial extension that I've been chewing over for a while. What if you want to yield not the values from the generator but some function of those values? The present proposal appears to have no way to specify that. What about extending the syntax somewhat to

    yield expr for x from X

The idea is that x should be a bound variable in expr, but the "expr for x" could be optional to yield the existing proposal as a degenerate case. That would be spelled:

    yield from (expr for x in X)

And the compiler could optionally optimize away the genexpr. Assuming, of course, that this is considered valuable enough to implement in the first place, which I don't think it is... especially not with the return bit factored in. Now, if somebody came up with a different way to spell the extra value return, I wouldn't object as much to that part. I can just see people inadvertently writing 'return x' as a shortcut for 'yield x; return', and then having what seem like mysterious off-by-one errors, or being confused by receiving a generator object instead of their desired non-generator return value. It also seems weird that the only syntactically-supported way to get the generator's "return value" is to access it inside *another* generator... which *also* can't return the return value to anyone! But if it were spelled 'raise Return(value)' or 'raise StopIteration(value)' or something similar (or even had its own syntax!), I wouldn't object, as it would then be obvious how to get the value, and there could be no possible confusion with a regular return value. The unusual spelling would also signal that something unusual (i.e., multitasking) is taking place, similar to the way some frameworks use things like 'yield Return(value)' to signal the end of a task and its return value, in place of a value in the stream.
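For comparison, the degenerate "function of the values" case is already spellable today, setting aside the send()/throw() forwarding and the return value that 'yield from' would add; with a concrete stand-in for expr:

    def doubled(X):
        for x in X:
            yield x * 2      # 'yield expr' for each x drawn from X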
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 06:03 PM 3/25/2009 +1200, Greg Ewing wrote: I wanted a way of writing suspendable functions that can call each other easily. (You may remember I originally wanted to call it "call".) Then I noticed that it would also happen to provide the functionality of earlier "yield from" suggestions, so I adopted that name. I still don't see what you gain from making this syntax, vs. putting something like this in the stdlib (rough sketch):

    import sys

    class Task(object):

        def __init__(self, geniter):
            self.stack = [geniter]

        def __iter__(self):
            return self

        def send(self, value=None):
            if not self.stack:
                raise RuntimeError("Can't resume completed task")
            return self._step(value)

        next = send

        def _step(self, value=None, exc_info=()):
            while self.stack:
                try:
                    it = self.stack[-1]
                    if exc_info:
                        try:
                            rv = it.throw(*exc_info)
                        finally:
                            exc_info = ()
                    elif value is not None:
                        rv = it.send(value)
                    else:
                        rv = it.next()
                except:
                    value = None
                    exc_info = sys.exc_info()
                    if exc_info[0] is StopIteration:
                        exc_info = ()   # not really an error
                    self.pop()
                else:
                    value, exc_info = yield_to(rv, self)
            else:
                if exc_info:
                    raise exc_info[0], exc_info[1], exc_info[2]
                else:
                    return value

        def throw(self, *exc_info):
            if not self.stack:
                raise RuntimeError("Can't resume completed task")
            return self._step(None, exc_info)

        def push(self, geniter):
            self.stack.append(geniter)
            return None, ()

        def pop(self, value=None):
            if self.stack:
                it = self.stack.pop()
                if hasattr(it, 'close'):
                    try:
                        it.close()
                    except:
                        return None, sys.exc_info()
            return value, ()

        @classmethod
        def factory(cls, func):
            def decorated(*args, **kw):
                return cls(func(*args, **kw))
            return decorated

    def yield_to(rv, task):
        # This could/should be a generic function, to allow yielding to
        # deferreds, sockets, timers, and other custom objects
        if hasattr(rv, 'next'):
            return task.push(rv)
        elif isinstance(rv, Return):
            return task.pop(rv.value)
        else:
            return rv, ()

    class Return(object):
        def __init__(self, value=None):
            self.value = value

    @Task.factory
    def sample_task(arg1, another_arg):
        # blah blah
        something = (yield subtask(...))
        yield Return(result)

    def subtask(...):
        ...
        yield Return(myvalue)

The trampoline (the _step() method) handles the co-operative aspects, and modifying the yield_to() function allows you to define how yielded values are processed. By default, they're sent back into the generator that yields them, but you can pass a Return() to terminate the generator and pass the value up to the calling generator. Yielding another generator, on the other hand, "calls" that generator within the current task, and the same rules apply. Is there some reason why this won't do what you want, and can't be modified to do so? If so, that should be part of the PEP, as IMO it otherwise lacks motivation for a language feature vs. say, a stdlib module. If 'yield_to' is a generic function or at least supports registration of some kind, a feature like this would be interoperable with a wide variety of frameworks -- you could register deferreds and delayed calls and IO objects from Twisted, for example. So it's not like the feature would be creating an entire new framework of its own. Rather, it'd be a front-end to whatever framework (or no framework) you're using.
Re: [Python-Dev] "setuptools has divided the Python community"
At 12:25 PM 3/25/2009, Antoine Pitrou wrote: Paul Moore writes: > > 3. Setuptools, unfortunately, has divided the Python distribution > community quite badly. Wait a little bit, and it's gonna be even worse, now that buildout and pip seem to become popular. For example, the TurboGears people are considering switching from setuptools to pip... You mean from *easy_install* to pip. pip still uses and supports setuptools.
Re: [Python-Dev] "setuptools has divided the Python community"
At 08:32 AM 3/25/2009 -0500, Olemis Lang wrote: Sometimes it also happens that, once one such build/packaging systems is adopted, it is difficult to switch to using another one, since apps (... and plugins systems ...) are frequently hard-coupled to the pkg sys «exotic features» and support (... what comes to my mind right now is Trac plugins which are a little bit dependant on setuptools entry points and so on ...). Please note that entry points are not coupled to easy_install. They have a documented file format and API that's *distributed* with setuptools, but is not dependent on it and does not require .egg files, either. There's nothing stopping an independent implementation in the stdlib, ala PEP 376's metadata API, which is designed to be backward compatible with setuptools. It could be extended to support entry points as well. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
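For reference, the file in question is the INI-style entry_points.txt installed with the rest of a project's metadata, and reading it back needs only pkg_resources; the group and entry names below are examples in Trac's style, not anything prescribed:

    # entry_points.txt, e.g. inside an .egg-info directory:
    #
    #     [trac.plugins]
    #     myplugin = myplugin.module:MyPluginClass

    import pkg_resources

    for ep in pkg_resources.iter_entry_points('trac.plugins'):
        plugin_class = ep.load()     # imports myplugin.module and returns MyPluginClass
        print plugin_class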
Re: [Python-Dev] "setuptools has divided the Python community"
At 07:40 AM 3/25/2009 -0700, Guido van Rossum wrote: Well, sorry, but this complex layered interdependent architecture is one of the *causes* of confusion -- apart from you nobody knows what is what exactly, I'll pick a minor nit here... buildout, pip, and a wide variety of other tools and frameworks out there wouldn't exist if I were really the only person who knew "what's what". And I believe most of those people will be at the summit. That having been said, the setuptools documentation definitely sucks for the casual reader or curious observer, as opposed to those who have serious itches to scratch in the area of dependencies or plugins. Combine that with the practical-but-impure choices I made in easy_install to get *something* working, and you have a recipe for the current situation. pkg_resources, for example, is only bundled with setuptools because it couldn't go in the stdlib when it was written. easy_install, OTOH, is bundled with setuptools because *setuptools* isn't in the stdlib! (And of course, both use pkg_resources.) So ironically, setuptools is bundled in the way that it is, precisely *because* there's no support for dependencies in the stdlib... and nicely illustrates why smaller libraries (and less bundling) is a *good* thing. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 10:11 AM 3/25/2009 -0500, Olemis Lang wrote: ... but Trac plugins *do require* egg files ... (AFAIK after reading Trac docs and implementation of plugin upload from /admin/plugins, egg cache for plugins ... and so on ...) and this is what I was talking about ... apps (e.g. Trac) depending *today* on setuptools (e.g. egg files ...) Trac uses entry points to access plugins. Those plugins do not require .egg files, unless Trac specifically wrote code of their own that requires .egg files. If you install a plugin from source using pip or a system package manager, no .egg file is involved -- you'll have an .egg-info directory instead, alongside the installed code. The pkg_resources API for accessing entry points works with either .egg files or .egg-info directories. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 01:29 PM 3/25/2009 -0400, Terry Reedy wrote: Paul Moore wrote: 2009/3/25 Tarek Ziadé : Since setuptools came on the scene, I can state with some certainty that many packages which would otherwise have been distributed as bdist_wininst installers, now aren't. In some cases, only source packages are provided (on the basis that easy_install will build what you need). In those cases, I can accept that maybe the developer would not have built Windows installers even before setuptools arrived. But in a significant number of cases - including setuptools itself - binary, version-specific eggs for Windows are provided, but no bdist_wininst installers. If the developer is willing to build an egg, he could just as easily have built an installer - but he now has to choose - build one or the other, or both. And not everyone chooses the same way. Is it possible to write an egg to bdist converter (or vice versa)? easy_install can convert bdist_wininst installers to .egg files; I'm not sure if the reverse is possible in practice, although it certainly is in principle. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 06:08 PM 3/25/2009 -0500, Barry Warsaw wrote: I've found setuptools entry points difficult to work with for plugins, I'd be interested in hearing more about your specific difficulties, although it's probably off-topic for Python-Dev. Perhaps via the distutils-sig, since we don't have a plugins sig? ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 11:35 AM 3/25/2009 -0500, Olemis Lang wrote: Yes you're right, Trac requires .egg files for local plugins installs (... in /plugins folder ;) so that not all environments but only one be able to use the plugin ... but that's not exactly what I'm saying, since setuptools AFAIK *MUST* be already there ... so Trac depends on setuptools. It depends on pkg_resources, which you can manually extract from the setuptools package if you like. I believe there may be some distros that actually do this, with separate system packages for "setuptools" (really pkg_resources) and setuptools-devel (the setuptools.* code) -- in which case you can have pkg_resources installed without the rest of setuptools being present. > If you install a plugin from source using pip Like you said setuptools is already there ... pip depends on setuptools ... isn't it? Yes, but that doesn't make *Trac* dependent on setuptools, particularly in the system package manager case. > or a system package manager, > no .egg file is involved -- you'll have an .egg-info directory instead, > alongside the installed code. The pkg_resources API for accessing entry > points works with either .egg files or .egg-info directories. > Like I just said setuptools AFAIK *MUST* be there ... isnt it ? Am I still wrong ? Not on the system using the package manager, no. (By 'setuptools' I'm refering to the setuptools.* modules, not the setuptools distribution... which is just another byproduct of our globally confusing terminology for these things.) BTW ... if you (or anyone else ;) suggests me another pkg «like» setuptools (supporting entry points and so on ...) so that I can try how to install and use Trac plugins without setuptools + pkg_* in the middle ... I'll be open to try it out ... and learn a little ;) You'd need pkg_resources, and if you're not using setuptools to initially generate the files to install, you'd have to create the entry point files by hand, following the EggFormats documentation I linked to previously. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 09:24 PM 3/25/2009 -0700, Guido van Rossum wrote: ISTR that the motivation for adding new syntax is that the best you can do using a trampoline library is still pretty cumbersome to use when you have to write a lot of tasks and subtasks, and when using tasks is just a tool for getting things done rather than an end goal in itself. I agree that the motivation and the comparison should be added to the PEP (perhaps moving the trampoline sample *implementation* to a reference or an appendix, since it is only the appearance of the trampoline-*using* code that matters). In particular, it should explain why these choices are so costly as to justify new syntax and a complex implementation:

* decorator clearly identifying intent to create a task vs. no indication of task-ness
* "yield Return(value)" vs. "return value"
* "result = yield subgenerator()" vs. "result = yield from subgenerator()"

Especially since the first two differences arguably make the non-"yield from" code clearer, and the third makes it more compact.

> with judicious use of decorators and helper classes you can get a reasonable approximation, and I think Twisted uses something like this, so it's not just theory. I think the best you can do without new syntax though is still pretty cumbersome and brittle, which is why I have encouraged your PEP.

On the "cumbersome" side, there are only three differences, as I've shown above... and one of them uses less syntax than the PEP. I'm not sure what you mean by brittle. Twisted and PEAK have both had generator-based tasks for ages, and have been used in production for years, even before generators had send()/throw() -- the addition of yield expressions made it possible to get rid of the one previous brittleness, where you needed to do things like:

    yield subgenerator(); result = resume()

in order to pass values or exceptions in. Since send()/throw() was added, the need to call a function after each yield was eliminated, at least from PEAK and my newer Trellis library; haven't looked at Twisted's tasks in a while. I believe there are other generator-based task libraries available on PyPI, but can't comment on their robustness. (The existence of throw(), by the way, makes it possible to produce tracebacks that look *exactly* as if you had called a series of functions, rather than iterating over a bunch of generators. The sample code I gave should do this correctly, as it was created by yanking out working, tested code from my Trellis library, and cutting out all the Trellis-specific bits.) If someone can find any semantic differences between the code I posted and the yield-from proposal (apart from the absence of the "for x in y: yield x" part of the functionality), I'd like to know about it... and it should go in the PEP.
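In code, the three differences line up roughly like this, using the Task and Return helpers from the rough sketch posted earlier in this thread (the PEP 380 spellings are shown as comments, since they aren't valid syntax today):

    def child(arg):
        yield Return(arg + 1)              # PEP 380: return arg + 1

    @Task.factory                          # PEP 380: no marker of task-ness
    def parent(arg):
        result = yield child(arg)          # PEP 380: result = yield from child(arg)
        yield Return(result * 2)           # PEP 380: return result * 2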
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 08:43 PM 3/26/2009 +1200, Greg Ewing wrote: Trying to think of a better usage example that combines send() with returning values, I've realized that part of the problem is that I don't actually know of any realistic uses for send() in the first place. Can anyone point me to any? Maybe it will help to inspire a better example. Er, well, I don't know what anybody *else* wanted them for, but I wanted them to implement improved trampoline functions, vs. earlier Python versions. ;-) The trampoline example I gave uses send() in order to pass the return values from one generator back into another. Of course, the task object also has a send(), so if you do find another use case for send() in a co-operative context, it should be equally doable with the trampoline. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 10:56 AM 3/26/2009, Antoine Pitrou wrote: Guido van Rossum writes: > > That's stating it a little too strongly. Phillip has shown how with > judicious use of decorators and helper classes you can get a > reasonable approximation, and I think Twisted uses something like > this, so it's not just theory. I think the best you can do without new > syntax though is still pretty cumbersome and brittle, which is why I > have encouraged your PEP. It remains to be seen whether Twisted and other libraries (Kamaelia?) can benefit from this PEP. They don't get any new features, and would require (possibly significant) changes in order to be able to take advantage of the syntax. And they *still* wouldn't be able to do away with their trampolines -- the new trampolines would just be able to avoid the need for a generator stack, if they previously had one to begin with. From your description, it sounds like Twisted's version of this doesn't even use a stack. (Note: by "trampoline" I mean, "thing that processes yielded values and manages the resumption of the generator", which need not be global. The example trampoline I posted earlier is also implemented as a decorator, and could be trivially extended via a lookup table to handle deferreds, delayed calls, or whatever else you wanted it to support as yield targets.)
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 12:20 PM 3/26/2009 -0700, Guido van Rossum wrote: By brittle I meant again having to be aware of those details of the mechanism that exist because of syntactic limitations, e.g. accidentally writing "return X" instead of "yield Return(X)". In that case, you'd either have a syntax error under the current rules (because you're using yield in the same function), or else you'd have a normal function (no yields) that worked in the way you expect it to. IIRC, the example I gave would treat a non-Return(), non-generator value as a value to be passed back into the current generator, such that if you defined a function f, and did: g = yield f() and f was not a generator or did not return one, then the above is equivalent to: g = f() and is not an error. Granted, this can fail if f() can return some other sort of iterator, but arguably that's an error in the *caller*, which should not use yield to call such a function -- and the need to distinguish the type of function you're calling is still present in PEP 380, i.e., you still need to know whether the function you're calling is a generator with a special return mechanism. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 12:27 PM 3/26/2009 -0700, Guido van Rossum wrote: There is some clear low-hanging fruit for Greg's proposal where no trampoline or helpers are needed -- but where currently refactoring complex code containing many yield statements is cumbersome due to the need to write each "subroutine" call as "for x in subroutine(): yield x" -- being able to replace this with "yield from subroutine()" is a conceptual advantage to me that is not proportional to the number of characters saved. Right - I don't object to the concept of "yield from" -- I'm -0 on that. What I don't like is the confusion of adding "return values" to generators, at least using the 'return' statement.
Re: [Python-Dev] "setuptools has divided the Python community"
At 03:28 PM 3/26/2009 -0500, Guido van Rossum wrote: 2009/3/26 Barry Warsaw : > BTW, under a better name, I would support putting pkg_resources in the > stdlib. Last time I looked it was an incredibly complicated piece of code that would have to be refactored considerably before it would be maintainable by the core developers. I never did manage to get a good understanding of the code, but I expect that a lot of the complexity exists so that it works for all Python versions. The stdlib version shouldn't need this -- it should only care about providing a stable API that works with the current version. As someone else suggested, moving some of the functionality to PEP 302 interfaces would also help. Most of the code, though, deals with locating/inspecting installed distributions, resolving version requirements, and managing sys.path. And most of the nastiest complexity comes from trying to support true filename access to resources -- if that were dropped from the stdlib, there'd be no need for egg caches and the like, along with all the complexity entailed. Application environments such as Chandler, Trac, Zope, etc. that want their plugins to live in .egg files wouldn't necessarily be able to use such an API, but the independent pkg_resources wouldn't be disappearing. (Of course, they could also implement application-specific file extraction, if the stdlib API included the ability to inspect and open zipped resources.) The other significant source of complexity is dynamic management of namespace packages; specifically, trying to handle the situation where new sys.path entries (e.g. .egg files added as plugins) need to have their contents added to existing sys.modules __path__ entries. This is perhaps another feature that could be dropped from the stdlib version, given a way to interop with pkg_resources or a replacement. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 11:27 PM 3/26/2009, Paul Moore wrote: What I'd really like is essentially some form of "virtual filesystem" access to stuff addressed relative to a Python package name, Note that relative to a *Python package name* isn't quite as useful, due to namespace packages. To be unambiguous as to the targeted resource, one needs to be able to reference a specific project, and that requires you to go off the name of a module *within* a package. For example, 'zope.somemodule' rather than just 'zope'. To put it another way, you don't want to use a module that comes from an __init__.py as your origin point, but rather, some other .py file within the package. Then you can talk to that module's loader without fear of ambiguity. rather than to a filesystem path. How much filesystem functionality is required is somewhat up for debate, but the basic listdir, get (text/binary) contents, open (text/binary) stream would be a good start (as Guido pointed out). "isdir" and "exists" would also be nice; pkg_resources provides those, plus listdir and string/stream.
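For reference, the corresponding pkg_resources calls are these (the module and resource names are made up):

    import pkg_resources

    pkg = 'zope.somemodule'    # hypothetical module used as the origin point

    pkg_resources.resource_exists(pkg, 'templates')              # "exists"
    pkg_resources.resource_isdir(pkg, 'templates')               # "isdir"
    pkg_resources.resource_listdir(pkg, 'templates')             # listdir
    pkg_resources.resource_string(pkg, 'templates/base.html')    # contents as a string
    pkg_resources.resource_stream(pkg, 'templates/base.html')    # file-like stream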
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 10:39 PM 3/26/2009 -0500, Guido van Rossum wrote: That +0 could turn into a +1 if there was a way to flag this as an error (at runtime), at least if the return is actually executed:

    def g():
        yield 42
        return 43

    for x in g():
        print x   # probably expected to print 42 and then 43

Perhaps the exception used in this case could be a different exception than StopIteration? Regular iteration could either just pass this exception through or explicitly check for it (a single pointer comparison could usually suffice), depending on whether it would be a subclass of StopIteration. Could we at least have some syntax like 'return from yield with 43', to distinguish it from a regular return, clarify that it's returning a value to a yield-from statement, and emphasize that you need a yield-from to call it? If it doesn't have some sort of variant syntax, the error message for the return exception is going to need to be rather verbose in order to be clear. However, if there is a variant syntax, then an error message like "'return from yield' without 'yield from'" might be clear enough, and we can keep the current error for returning values in generators. That way, the paired special syntax is clearly identifiable as coroutine/microthread control flow, in a way that's both TOOOWTDI and EIBTI. One remaining quirk or missing piece: ISTM there needs to be a way to extract the return value without using a yield-from statement. I mean, you could write a utility function like:

    def unyield(geniter):
        try:
            while 1:
                geniter.next()
        except GeneratorReturn as v:
            return v.value

OTOH, I suppose this function is still a trampoline, just one that doesn't actually do anything except return an eventual exit value. I suppose you could do a slightly improved one thus:

    def unyield(geniter, value=None, func=lambda v: v):
        try:
            while 1:
                value = func(geniter.send(value))
        except GeneratorReturn as v:
            return v.value

And drop it into itertools or some such. It's sort of like an all-purpose map/reduce for generators, so that all you need to do is pass in a function to do whatever processing you need (e.g. I/O waiting) on the values yielded. You could also use another generator's send() method as the function passed in, in which case you'd basically have a pair of coroutines... and whichever returned a value first would end up as the return value of the overall function. That'd probably be pretty useful for the sort of simple (non I/O) coroutines Greg seems to have in mind. Or, these could just be examples in the PEP, I suppose. They're not terribly difficult to write... but then I might be biased since I've written a ridiculous number of coroutine trampolines for Python generators over the last how-many-ever years Python has had generators.
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 04:08 PM 3/27/2009 +1300, Greg Ewing wrote: You can't expect to improve something like that by stuffing yield-from into the existing framework, because the point of yield-from is to render the framework itself unnecessary. But it doesn't. You still need *something* that processes the yielded values, since practical frameworks have various things to yield "to" - i/o, time, mouse clicks, whatever. Correctly dealing with the call stack part is tedious to implement, sure, but it's not really the focal point of a microthreading framework. Usually, you need to have some way to control which microthreads are actually to be executing, vs. the ones that are waiting for a particular time, an I/O operation, or some other sort of event. None of that stuff goes away just by taking care of the call stack. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 01:49 PM 3/27/2009 +0100, M.-A. Lemburg wrote: (*) I've had a go at this a few months ago and then found out that the egg format itself is not documented anywhere. It's been documented for just under three years now. Here's where you quoted the email where I announced that documentation, provided links to it, and asked you to let me know if there's anything else you'd need in it: http://mail.python.org/pipermail/python-dev/2006-April/064496.html ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] version compare function into main lib
At 05:08 PM 3/27/2009 +0100, M.-A. Lemburg wrote: On 2009-03-27 17:01, Eric Smith wrote: > Martin v. Löwis wrote: >>> Correct me if I wrong, but shouldn't Python include function for >>> version comparisons? >> >> On the packaging summit yesterday, people agreed that yes, we should >> have something like that in the standard library, and it should be more >> powerful than what distutils currently offers. > > Yes. > >> There was no conclusion of how specifically that functionality should >> be offered; several people agreed that Python should mandate a standard >> format, which it is then able to compare. So you might not be able to >> spell it "10.3.40-beta", but perhaps "10.3.40b1" or "10.3.40~beta". > > I got the impression that people are generally happy with what > setuptools provides for version parsing and comparison. > > Does anyone think that's not a good model? Instead of trying to parse some version string, distutils should require defining the version as tuple with well-defined entries - much like what we have in sys.version_info for Python. The developer can then still use whatever string format s/he wants. The version compare function would then work on this version tuple and probably be called cmp() (at least in Python 2.x ;-). By the way, pkg_resources.parse_version of course returns a tuple that can be compared with cmp(). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
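For example, with the setuptools of that era:

    from pkg_resources import parse_version

    assert parse_version('1.0a1') < parse_version('1.0')      # pre-releases sort first
    assert parse_version('1.0')   < parse_version('1.0.1')
    assert cmp(parse_version('1.0'), parse_version('1.0')) == 0
    assert parse_version('10.3.40b1') < parse_version('10.3.40')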
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 03:28 AM 3/27/2009 -0400, Scott Dial wrote: P.J. Eby wrote: > One remaining quirk or missing piece: ISTM there needs to be a way to > extract the return value without using a yield-from statement. I mean, > you could write a utility function like: > >def unyield(geniter): >try: >while 1: geniter.next() >except GeneratorReturn as v: >return v.value My first thought was to ask why it was not equivalent to say: x = yield g x = yield from g This would seem like a more obvious lack of parallelism to pick on wrt. return values. Because yield-from means you're "inlining" the generator, such that sends go into that generator, rather than into the current generator. This unyield() operation seems contrived. Never before have you been able to write a generator that returns a value, why would these suddenly become common practice? The only place a return value seems useful is when refactoring a generator and you need to mend having loss of a shared scope. What other use is there for a return value? The use case which these things are being proposed for is to replace most of the stack-management code that's currently needed for coroutine trampolines. In such a case, you're likely using generators to perform long-running asynchronous operations, or else coroutines where two functions are co-operating to produce a result, each with its own control flow. For example, you might have a generator that yields socket objects to wait for them to be ready to read or write, then returns a line of text read from the socket. You would unyield this if you wanted to write top-level code that was *not* also such a task. Similarly, you might write coroutines where one reads data from a file and sends it to a parser, and then the parser sends data back to a main program. In either case, an unyield would either be the synchronous top-level loop of the program, or part of the top-level code. Either you need to get the finished top-level object from your parser at the end of its operation, or you are waiting for all your asynchronous I/O tasks to complete. It would seem unfortunate for it to be considered a runtime error since this would prevent sharing a generator amongst "yield from" and non-"yield from" use cases. Has anyone shown a use case for doing so? I might be biased due to previous experience with these things, but I don't see how you write a function where both the yielded values *and* the return value are useful... and if you did, you'd still need some sort of unyield operation. Notice that in both the I/O and coroutine use cases, the point of yielding is primarily *to allow other code to execute*, and possibly pass a value back IN to the generator. The values passed *out* by the generator are usually either ignored, an indicator of what the generator wants to be passed back in, or what sort of event it is waiting for before it's to be resumed. In other words, they're usually not data -- they're just something that gets looped over as the task progresses. As Greg has said a number of times, we allow functions to return values with them silently being ignored all the time. Sure. But right now, the return value of a generator function *is the generator*. And you're free to ignore that, sure. But this is a "second" return value that only goes to a special place with special syntax -- without that syntax, you can't access it. 
But in the use cases where you'd actually want to make such a function return a value to begin with, it's because that value is the value you *really* want from the function -- the only reason it's a generator is because it needs to be paused and resumed along the way to getting that return value. If you're writing a function that yields values for other than control flow reasons, it's probably a bad idea for it to also have a "return" value because then you'd need an unyield operation to get at the data. And it seems to me that people are saying, "but that's no problem, I'll just use yield-from to get the value". But that doesn't *work*, because it turns the function where you use it into another generator! The generators have to *stop* somewhere, in order for you to *use* their return values -- which makes the return feature ONLY relevant to co-routine use cases -- i.e., places where you have trampolines or a top-level loop to handle the yields... And conversely, if you *have* such a generator, its real return value is the special return value, so you're not going to be able to use it outside the coroutine structure... so "ignoring its return value" doesn't make any sense. You'd have to write a loop over the generator, *just* to ignore the value.
Re: [Python-Dev] "setuptools has divided the Python community"
At 11:37 PM 3/26/2009 -0500, Eric Smith wrote: P.J. Eby wrote: > As someone else suggested, moving some of the functionality to PEP 302 > interfaces would also help. Most of the code, though, deals with > locating/inspecting installed distributions, resolving version > requirements, and managing sys.path. And most of the nastiest > complexity comes from trying to support true filename access to > resources -- if that were dropped from the stdlib, there'd be no need > for egg caches and the like, along with all the complexity entailed. > > Application environments such as Chandler, Trac, Zope, etc. that want > their plugins to live in .egg files wouldn't necessarily be able to use > such an API, but the independent pkg_resources wouldn't be > disappearing. (Of course, they could also implement > application-specific file extraction, if the stdlib API included the > ability to inspect and open zipped resources.) Could you comment on why they couldn't use such an API? If a plugin includes C code (.so/.dll), or uses a library that operates on filenames rather than bytes in memory (e.g. gettext), then the resources would need to be extracted from the .egg. pkg_resources transparently extracts such resources to a cache directory when you ask for a resource's filename, rather than asking for a stream or string of its contents. This feature represents a significant chunk of the complexity and code size of pkg_resources -- and I was proposing ways to cut down on that complexity and code size, for a (limited) stdlib version of the functionality. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 08:12 PM 3/27/2009 +0100, M.-A. Lemburg wrote: On 2009-03-27 17:19, P.J. Eby wrote: > At 01:49 PM 3/27/2009 +0100, M.-A. Lemburg wrote: >> (*) I've had a go at this a few months ago and then found out >> that the egg format itself is not documented anywhere. > > It's been documented for just under three years now. Here's where you > quoted the email where I announced that documentation, provided links to > it, and asked you to let me know if there's anything else you'd need in it: > > http://mail.python.org/pipermail/python-dev/2006-April/064496.html Thanks for reminding me. I must have forgotten about that wiki page and instead looked on the setuptools page. For future reference, you can also Google "Python egg formats" or "Python egg file format" or other such phrases, and it always comes up as the first or second hit. Ironically, that's how I found it the other day myself, as I didn't remember right off what the wiki page name was. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "setuptools has divided the Python community"
At 10:22 PM 3/27/2009 +0100, M.-A. Lemburg wrote: Perhaps someone should start working on a tool called "FryingPan" to create "Omelettes", ie. all eggs squashed into a single ZIP file... ;-) They're called baskets actually. ;-) There's no tool to do it, but pkg_resources does support multiple eggs being contained in the same zipfile. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] My summit notes (packaging)
At 03:06 PM 3/27/2009 -0500, Tarek Ziadé wrote: They both aim at the same goal besides a few differences, and they both rely on a new metadata field introduced by setuptools, which is "install_requires". This new metadata extends the metadata described in PEP 314 but is slightly different from what is described in the draft PEP 345 ("Requires"). ... PEP 345 introduces "Requires" and "Provides", which are implemented in Distutils and PyPI, but are not widely used. 40 out of 4000+ if I remember correctly. Martin will correct me here if I am wrong. FYI, the reason setuptools uses a different way of specifying requirements is that the PEP-proposed way could not be used without some kind of indexed repository of packages -- and PyPI did not index "provides" at the time. Also, the PEP-proposed versioning scheme was not compatible with the versioning schemes actually used in the field at the time. These conditions could be considered to have changed now, or be changeable given enough will and volunteer effort. Since setuptools was only a 1.5-person effort back in the day (i.e., me plus occasional contribs from Ian Bicking, Bob Ippolito, and Jim Fulton), and backward compatibility was a critical requirement to get adoption, using RPM-style provides/requires was not an option at that time. - there should be an API to get metadata for a package without actually executing any of the package's installation script. Please note that where dependencies are concerned, practical considerations require that, at least for currently-released versions of Python, a package's dependencies may well be Python-version specific. A library that targets Python 2.4 and 2.5 and uses wsgiref, sqlite, ctypes, or ElementTree, for example, may have different dependencies depending on the version it is being installed in. Of course, if future Python versions declare metadata for what they include, this could be resolved. But for existing versions it's a bit of a sticky issue. Also, it's quite likely that platform-specific dependencies may exist as well. It might be possible to accommodate these things with a sufficiently flexible format, but currently, the only way to handle them with distutils/setuptools is in the setup script. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
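To make the "Python-version-specific dependencies" point concrete, here is an illustrative (entirely hypothetical) setup script of the kind described above; the requirement list can only be computed by running the script, because it depends on which interpreter is doing the installing:

    import sys
    from setuptools import setup

    install_requires = []
    if sys.version_info < (2, 5):
        # wsgiref and ctypes joined the stdlib in 2.5; on older Pythons they
        # would have to come from PyPI instead (names here are illustrative).
        install_requires += ['wsgiref', 'ctypes']

    setup(
        name='ExampleLibrary',          # hypothetical project
        version='1.0',
        py_modules=['examplelib'],
        install_requires=install_requires,
    )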
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 12:53 PM 3/28/2009 +1200, Greg Ewing wrote: Guido van Rossum wrote: Perhaps the crux is that *if* you accidentally use "return " in a vanilla generator expecting the value to show up somewhere, you are probably enough of a newbie that debugging this will be quite hard. I'd like not to have such a newbie trap lying around. Okay, so would you be happy if the for-loop were to raise an exception if it gets a StopIteration whose value is not None? Wouldn't it have to be more than just for-loops? What about list(), map(), ...? It seems a lot simpler to just make it use a different exception, as nothing else has to change for that to work correctly; the new construct can just catch it, and everywhere else it's an error. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
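The "value silently disappears" concern can be seen with a tiny example under the eventually-adopted PEP 380 semantics (Python 3.3+); the function name is invented:

    def averager(data):
        yield                              # makes this a generator
        return sum(data) / len(data)       # becomes StopIteration(value)

    # Ordinary consumers never see the return value -- no error, it just vanishes:
    print(list(averager([1, 2, 3])))       # [None]; the 2.0 is discarded

    # A check added only to for-loops would not cover list(), map(), etc.,
    # which is the objection raised above.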
Re: [Python-Dev] PEP 380 (yield from a subgenerator) comments
At 06:52 AM 3/28/2009 -0500, Guido van Rossum wrote: > 2. In addition to the "yield from" syntax for delegating to a > subgenerator, also add new syntax for returning values from > subgenerators so that the basic "return X" can continue to trigger > SyntaxError. > > Since option 2 would most likely lead to a bikeshed discussion of epic > proportions, I'm currently a fan of option 1 ;) Me too. It also seems option 2 doesn't help us decide what it should do: I still think that raising StopIteration(value) would be misleading to vanilla users of the generators. Agreed. (I still think new syntax is a plus, though, as it helps to clearly signal both the intent to make the generator a "subtask" and the need to call it with yield-from. Readability counts.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:32 AM 4/2/2009 -0500, Martin v. Löwis wrote: I propose the following PEP for inclusion to Python 3.1. Please comment. An excellent idea. One thing I am not 100% clear on, is how to get additions to sys.path to work correctly with this. Currently, when pkg_resources adds a new egg to sys.path, it uses its existing registry of namespace packages in order to locate which packages need __path__ fixups. It seems under this proposal that it would have to scan sys.modules for objects with __path__ attributes that are lists that begin with a '*', instead... which is a bit troubling because sys.modules doesn't always only contain module objects. Many major frameworks place lazy module objects, and module proxies or wrappers of various sorts in there, so scanning through it arbitrarily is not really a good idea. Perhaps we could add something like a sys.namespace_packages that would be updated by this mechanism? Then, pkg_resources could check both that and its internal registry to be both backward and forward compatible. Apart from that, this mechanism sounds great! I only wish there was a way to backport it all the way to 2.3 so I could drop the messy bits from setuptools. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
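A hypothetical sketch of the two lookup strategies being contrasted: scanning sys.modules for the '*' path marker described in the draft, versus consulting an explicit registry such as the proposed sys.namespace_packages. Neither helper below is an existing API:

    import sys, types

    def namespace_packages_by_scanning():
        # Fragile: sys.modules may contain lazy modules, proxies, wrappers...
        for name, module in list(sys.modules.items()):
            if not isinstance(module, types.ModuleType):
                continue
            path = getattr(module, '__path__', None)
            if isinstance(path, list) and path[:1] == ['*']:
                yield name

    def namespace_packages_by_registry():
        # What an explicit sys.namespace_packages would reduce the scan to:
        return iter(getattr(sys, 'namespace_packages', ()))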
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:33 PM 4/2/2009 +0200, M.-A. Lemburg wrote: That's going to slow down Python package detection a lot - you'd replace an O(1) test with an O(n) scan. I thought about this too, but it's pretty trivial considering that the only time it takes effect is when you have a directory name that matches the name you're importing, and that it will only happen once for that directory, unless there is no package on sys.path with that name, and the program tries to import the package multiple times. In other words, the overhead isn't likely to be much, compared to the time needed to say, open and marshal even a trivial __init__.py file. Alternative Approach: - Wouldn't it be better to stick with a simpler approach and look for "__pkg__.py" files to detect namespace packages using that O(1) check ? I thought the same thing (or more precisely, a single .pkg file), but when I got lower in the PEP I saw the reason was to support system packages not having overlapping filenames. The PEP could probably be a little clearer about the connection between needing *.pkg and the system-package use case. One of the namespace packages, the defining namespace package, will have to include a __init__.py file. Note that there is no such thing as a "defining namespace package" -- namespace package contents are symmetrical peers. The above mechanism allows the same kind of flexibility we already have with the existing normal __init__.py mechanism. * It doesn't add yet another .pth-style sys.path extension (which are difficult to manage in installations). * It always uses the same naive sys.path search strategy. The strategy is not determined by some file contents. The above are also true for using only a '*' in .pkg files -- in that event there are no sys.path changes. (Frankly, I'm doubtful that anybody is using extend_path and .pkg files to begin with, so I'd be fine with a proposal that instead used something like '.nsp' files that didn't even need to be opened and read -- which would let the directory scan stop at the first .nsp file found. * The search is only done once - on the first import of the package. I believe the PEP does this as well, IIUC. * It's possible to have a defining package dir and add-one package dirs. Also possible in the PEP, although the __init__.py must be in the first such directory on sys.path. (However, such "defining" packages are not that common now, due to tool limitations.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 03:21 AM 4/3/2009 +0200, Matthias Klose wrote: +1. Speaking as a downstream packaging Python for Debian/Ubuntu, I welcome this approach. The current practice of shipping the very same file (__init__.py) in different packages leads to conflicts for the installation of these packages (this is not specific to dpkg, but is true for rpm packaging as well). Current practice of packaging (for downstreams) so-called "name space packages" is: - either to split out the namespace __init__.py into a separate (Linux distribution) package (needing manual packaging effort for each name space package) - using downstream-specific packaging techniques to handle conflicting files (diversions) - replicating the current behaviour of setuptools, simply overwriting the file conflicts. Following this proposal, (downstream) packaging of namespace packages is made possible independent of any manual downstream packaging decisions or any downstream-specific packaging decisions. A clarification: setuptools does not currently install the __init__.py file when installing in --single-version-externally-managed or --root mode. Instead, it uses a project-version-nspkg.pth file that essentially simulates a variation of Martin's .pkg proposal, by abusing .pth file support. If this PEP is adopted, setuptools would replace its nspkg.pth file with a .pkg file on Python versions that provide native support for .pkg imports, keeping the .pth file only for older Pythons. (.egg files and directories will not be affected by the change, unless the zipimport module also supports .pkg files... and again, only for Python versions that support the new approach.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:15 PM 4/3/2009 +0200, Martin v. Löwis wrote: I should make it clear that this is not the case. I envision it to work this way: import zope - searches sys.path, until finding either a directory zope, or a file zope.{py,pyc,pyd,...} - if it is a directory, it checks for .pkg files. If it finds any, it processes them, extending __path__. - it *then* checks for __init__.py, taking the first hit anywhere on __path__ (just like any module import would) - if no .pkg was found, nor an __init__.py, it proceeds with the next sys.path item (skipping the directory entirely) Ah, I missed that. Maybe the above should be added to the PEP to clarify. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
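Restated as very rough pseudocode (not an implementation, and the .pkg processing itself is elided), assuming the lookup order Martin lists above:

    import os

    def find_pep382_package(name, sys_path):
        for entry in sys_path:
            pkgdir = os.path.join(entry, name)
            if not os.path.isdir(pkgdir):
                continue   # a name.py/name.pyd here would be handled as a plain module
            pkg_files = [f for f in os.listdir(pkgdir) if f.endswith('.pkg')]
            path = [pkgdir]
            # 1) process any .pkg files, extending `path` (details elided)
            # 2) then take __init__.py from the first hit anywhere on the extended path
            has_init = any(os.path.exists(os.path.join(p, '__init__.py')) for p in path)
            if pkg_files or has_init:
                return path            # package found
            # 3) neither .pkg nor __init__.py: skip this directory entirely
        return None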
Re: [Python-Dev] PEP 382: Namespace Packages
At 02:00 PM 4/6/2009 +0100, Chris Withers wrote: Martin v. Löwis wrote: Chris Withers wrote: Would this support the following case: I have a package called mortar, which defines useful stuff: from mortar import content, ... I now want to distribute large optional chunks separately, but ideally so that the following will will work: from mortar.rbd import ... from mortar.zodb import ... from mortar.wsgi import ... Does the PEP support this? That's the primary purpose of the PEP. Are you sure? Does the pep really allow for: from mortar import content from mortar.rdb import something ...where 'content' is a function defined in mortar/__init__.py and 'something' is a function defined in mortar/rdb/__init__.py *and* the following are separate distributions on PyPI: - mortar - mortar.rdb ...where 'mortar' does not contain 'mortar.rdb'. See the third paragraph of http://www.python.org/dev/peps/pep-0382/#discussion ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 02:30 PM 4/7/2009 +0200, M.-A. Lemburg wrote: >> Wouldn't it be better to stick with a simpler approach and look for >> "__pkg__.py" files to detect namespace packages using that O(1) check ? > > Again - this wouldn't be O(1). More importantly, it breaks system > packages, which now again have to deal with the conflicting file names > if they want to install all portions into a single location. True, but since that means changing the package infrastructure, I think it's fair to ask distributors who want to use that approach to also take care of looking into the __pkg__.py files and merging them if necessary. Most of the time the __pkg__.py files will be empty, so that's not really much to ask for. This means your proposal actually doesn't add any benefit over the status quo, where you can have an __init__.py that does nothing but declare the package a namespace. We already have that now, and it doesn't need a new filename. Why would we expect OS vendors to start supporting it, just because we name it __pkg__.py instead of __init__.py? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
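For reference, the status quo referred to here -- an __init__.py that does nothing but declare the namespace -- looks like this today; both spellings are long-standing, documented idioms:

    # pkgutil flavour (stdlib since Python 2.3):
    from pkgutil import extend_path
    __path__ = extend_path(__path__, __name__)

    # setuptools flavour (adds zipped-egg awareness) -- used *instead of* the above:
    # __import__('pkg_resources').declare_namespace(__name__)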
Re: [Python-Dev] PEP 382: Namespace Packages
At 04:58 PM 4/7/2009 +0200, M.-A. Lemburg wrote: On 2009-04-07 16:05, P.J. Eby wrote: > At 02:30 PM 4/7/2009 +0200, M.-A. Lemburg wrote: >> >> Wouldn't it be better to stick with a simpler approach and look for >> >> "__pkg__.py" files to detect namespace packages using that O(1) >> check ? >> > >> > Again - this wouldn't be O(1). More importantly, it breaks system >> > packages, which now again have to deal with the conflicting file names >> > if they want to install all portions into a single location. >> >> True, but since that means changing the package infrastructure, I think >> it's fair to ask distributors who want to use that approach to also take >> care of looking into the __pkg__.py files and merging them if >> necessary. >> >> Most of the time the __pkg__.py files will be empty, so that's not >> really much to ask for. > > This means your proposal actually doesn't add any benefit over the > status quo, where you can have an __init__.py that does nothing but > declare the package a namespace. We already have that now, and it > doesn't need a new filename. Why would we expect OS vendors to start > supporting it, just because we name it __pkg__.py instead of __init__.py? I lost you there. Since when do we support namespace packages in core Python without the need to add some form of magic support code to __init__.py ? My suggestion basically builds on the same idea as Martin's PEP, but uses a single __pkg__.py file as opposed to some non-Python file yaddayadda.pkg. Right... which completely obliterates the primary benefit of the original proposal compared to the status quo. That is, that the PEP 382 way is more compatible with system packaging tools. Without that benefit, there's zero gain in your proposal over having __init__.py files just call pkgutil.extend_path() (in the stdlib since 2.3, btw) or pkg_resources.declare_namespace() (similar functionality, but with zipfile support and some other niceties). IOW, your proposal doesn't actually improve the status quo in any way that I am able to determine, except that it calls for loading all the __pkg__.py modules, rather than just the first one. (And the setuptools implementation of namespace packages actually *does* load multiple __init__.py's, so that's still no change over the status quo for setuptools-using packages.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] decorator module in stdlib?
At 10:51 AM 4/8/2009 -0700, Guido van Rossum wrote: I would like it even less if an API cared about the *actual* signature of a function I pass into it. One notable use of callable argument inspection is Bobo, the 12-years-ago predecessor to Zope, which used argument information to determine form or query string parameter names. (Were Bobo being written for the first time today for Python 3, I imagine it would use argument annotations to specify types, instead of requiring them to be in the client-side field names.) Bobo, of course, is just a single case of the general pattern of tools that expose a callable to some other (possibly explicitly-typed) system. E.g., wrapping Python functions for exposure to C, Java, .NET, CORBA, SOAP, etc. Anyway, it's nice for decorators to be transparent to inspection when the decorator doesn't actually modify the calling signature, so that you can then use your decorated functions with tools like the above. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
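An example of the kind of decorator transparency being discussed, using only the Python 2-era stdlib; the decorated function and its caller are invented. functools.wraps preserves the metadata, but not the actual signature, which is exactly what Bobo-style argument inspection needs:

    import inspect
    from functools import wraps

    def logged(func):
        @wraps(func)                      # copies __name__, __doc__, __module__, ...
        def wrapper(*args, **kwargs):
            print('calling %s' % func.__name__)
            return func(*args, **kwargs)
        return wrapper

    @logged
    def search(query, limit=10):
        "A hypothetical handler exposed to a form-mapping tool."
        return [query] * limit

    print(search.__name__)                # 'search' -- metadata survives
    print(inspect.getargspec(search))     # but the signature is now (*args, **kwargs),
                                          # which breaks signature-based tools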
Re: [Python-Dev] Rethinking intern() and its data structure
At 06:52 PM 4/10/2009 +1000, Nick Coghlan wrote: This problem (slow application startup times due to too many imports at startup, which can in turn can be due to top level imports for library or framework functionality that a given application doesn't actually use) is actually the main reason I sometimes wish for a nice, solid lazy module import mechanism that manages to avoid the potential deadlock problems created by using import statements inside functions. Have you tried http://pypi.python.org/pypi/Importing ? Or more specifically, http://peak.telecommunity.com/DevCenter/Importing#lazy-imports ? It does of course use the import lock, but as long as your top-level module code doesn't acquire locks (directly or indirectly), it shouldn't be possible to deadlock. (Or more precisely, to add any *new* deadlocks that you didn't already have.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
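Usage of the Importing package's lazy imports looks roughly like this (recalled from its documentation, so treat the exact API as unverified); the real import happens on first attribute access, under the import lock:

    from peak.util.imports import lazyModule

    sqlite3 = lazyModule('sqlite3')   # nothing imported yet

    def first_use():
        # Triggers the actual import the first time it runs.
        return sqlite3.connect(':memory:')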
Re: [Python-Dev] PEP 382: Namespace Packages
At 05:02 PM 4/14/2009 +0200, M.-A. Lemburg wrote: I don't see the emphasis in the PEP on Linux distribution support and the remote possibility of them wanting to combine separate packages back into one package as good argument for adding yet another separate hierarchy of special files which Python scans during imports. That said, note that most distributions actually take the other route: they try to split up larger packages into smaller ones, so the argument becomes even weaker. I think you've misunderstood something about the use case. System packaging tools don't like separate packages to contain the *same file*. That means that they *can't* split a larger package up with your proposal, because every one of those packages would have to contain a __pkg__.py -- and thus be in conflict with each other. Either that, or they would have to make a separate system package containing *only* the __pkg__.py, and then make all packages using the namespace depend on it -- which is more work and requires greater co-ordination among packagers. Allowing each system package to contain its own .pkg or .nsp or whatever files, on the other hand, allows each system package to be built independently, without conflict between contents (i.e., having the same file), and without requiring a special pseudo-package to contain the additional file. Also, executing multiple __pkg__.py files means that when multiple system packages are installed to site-packages, only one of them could possibly be executed. (Note that, even though the system packages themselves are not "combined", in practice they will all be installed to the same directory, i.e., site-packages or the platform equivalent thereof.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote: You are missing the point: When breaking up a large package that lives in site-packages into smaller distribution bundles, you don't need namespace packages at all, so the PEP doesn't apply. The way this works is by having a base distribution bundle that includes the needed __init__.py file and a set of extension bundles the add other files to the same directory (without including another copy of __init__.py). The extension bundles include a dependency on the base package to make sure that it always gets installed first. If we're going to keep that practice, there's no point to having the PEP: all three methods (base+extensions, pkgutil, setuptools) all work just fine as they are, with no changes to importing or the stdlib. In particular, without the feature of being able to drop that practice, there would be no reason for setuptools to adopt the PEP. That's why I'm -1 on your proposal: it's actually inferior to the methods we already have today. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 09:51 AM 4/15/2009 +0200, M.-A. Lemburg wrote: On 2009-04-15 02:32, P.J. Eby wrote: > At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote: >> You are missing the point: When breaking up a large package that lives in >> site-packages into smaller distribution bundles, you don't need namespace >> packages at all, so the PEP doesn't apply. >> >> The way this works is by having a base distribution bundle that includes >> the needed __init__.py file and a set of extension bundles the add >> other files to the same directory (without including another copy of >> __init__.py). The extension bundles include a dependency on the base >> package to make sure that it always gets installed first. > > If we're going to keep that practice, there's no point to having the > PEP: all three methods (base+extensions, pkgutil, setuptools) all work > just fine as they are, with no changes to importing or the stdlib. Again: the PEP is about creating a standard for namespace packages. It's not about making namespace packages easy to use for Linux distribution maintainers. Instead, it's targeting *developers* that want to enable shipping a single package in multiple, separate pieces, giving the user the freedom to the select the ones she needs. Of course, this is possible today using various other techniques. The point is that there is no standard for namespace packages and that's what the PEP is trying to solve. > In particular, without the feature of being able to drop that practice, > there would be no reason for setuptools to adopt the PEP. That's why > I'm -1 on your proposal: it's actually inferior to the methods we > already have today. It's simpler and more in line with the Python Zen, not inferior. You are free not to support it in setuptools - the methods implemented in setuptools will continue to work as they are, but continue to require support code and, over time, no longer be compatible with other tools building upon the standard defined in the PEP. In the end, it's the user that decides: whether to go with a standard or not. Up until this point, I've been trying to help you understand the use cases, but it's clear now that you already understand them, you just don't care. That wouldn't be a problem if you just stayed on the sidelines, instead of actively working to make those use cases more difficult for everyone else than they already are. Anyway, since you clearly understand precisely what you're doing, I'm now going to stop trying to explain things, as my responses are apparently just encouraging you, and possibly convincing bystanders that there's some genuine controversy here as well. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 09:10 AM 4/15/2009 -0700, Aahz wrote: For the benefit of us bystanders, could you summarize your vote at this point? Given the PEP's intended goals, if you do not oppose the PEP, are there any changes you think should be made? I'm +1 on Martin's original version of the PEP, subject to the point brought up by someone that .pkg should be changed to a different extension. I'm -1 on all of MAL's proposed revisions, as IMO they are a step backwards: they "standardize" an approach that will create problems that don't need to exist, and don't exist now. Martin's proposal is an improvement on the status quo, Marc's proposal is a dis-improvement. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 06:15 PM 4/15/2009 +0200, M.-A. Lemburg wrote: The much more common use case is that of wanting to have a base package installation which optional add-ons that live in the same logical package namespace. Please see the large number of Zope and PEAK distributions on PyPI as minimal examples that disprove this being the common use case. I expect you will find a fair number of others, as well. In these cases, there is NO "base package"... the entire point of using namespace packages for these distributions is that a "base package" is neither necessary nor desirable. In other words, the "base package" scenario is the exception these days, not the rule. I actually know specifically of only one other such package besides your mx.* case, the logilab ll.* package. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 02:52 PM 4/15/2009 -0400, A.M. Kuchling wrote: On Wed, Apr 15, 2009 at 01:59:34PM -0400, P.J. Eby wrote: > Please see the large number of Zope and PEAK distributions on PyPI as > minimal examples that disprove this being the common use case. I expect > you will find a fair number of others, as well. ... > In other words, the "base package" scenario is the exception these days, > not the rule. I actually know specifically of only one other such > package besides your mx.* case, the logilab ll.* package. Isn't that pretty even, then? zope.* and PEAK are two examples of one approach; and mx.* and ll.* are two examples that use the base package approach. Neither approach seems to be the more common one, and both are pretty rare. If you view the package listings on PyPI, you'll see that the "pure" namespaces currently in use include: alchemist.* amplecode.* atomisator.* bda.* benri.* beyondskins.* bliptv.* bopen.* borg.* bud.* ... This is just going down to the 'b's, looking only at packages whose PyPI project name reflects a nested package name, and only including those with entries that: 1. use setuptools, 2. declare one or more namespace packages, and 3. do not depend on some sort of "base" or "core" package. Technically, setuptools doesn't support base packages anyway, but if the organization appeared to be based on a "core+plugins/addons" model (as opposed to "collection of packages grouped in a namespace") I didn't include it in the list above -- i.e., I'm bending over backwards to be fair in the count. If somebody wants to do a formal count of base vs. pure, it might provide interesting stats. I initially only mentioned Zope and PEAK because I have direct knowledge of the developers' intent regarding their namespace packages. However, now that I've actually looked at a tiny sample of PyPI, it's clear that the actual field use of pure namespace packages has positively exploded since setuptools made it practical to use them. It's unclear, however, who is using base packages besides mx.* and ll.*, although I'd guess from the PyPI listings that perhaps Django is. (It seems that "base" packages are more likely to use a 'base-extension' naming pattern, vs. the 'namespace.project' pattern used by "pure" packages.) Of course, I am certainly not opposed to supporting base packages, and Martin's version of PEP 382 is a plus for setuptools because it would allow setuptools to better support the "base" scenario. But pure packages are definitely not a minority; in fact, a superficial observation of the full PyPI list suggests that there may be almost as many projects using pure-namespace packages, as there are non-namespaced projects! ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:20 PM 4/15/2009 +0200, M.-A. Lemburg wrote: Whether base packages are useful or not is really a side aspect of the PEP and my proposal. It's not whether they're useful, it's whether they're required. Your proposal *requires* base packages, and for people who intend to use pure packages, this is NOT a feature: it's a bug. Specifically, it introduces a large number of unnecessary, boilerplate dependencies to their package distribution strategy. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 10:00 PM 4/15/2009 +0200, Tarek Ziadé wrote: Now for the "base" or "core" package, what people that use setuptools do most of the time: 1 - they use zc.buildout, so they don't need a base package: they list in a configuration file all the packages needed to build the application, and one of these packages happens to have the scripts to launch the application. 2 - they have a "main" package that doesn't use the same namespace, but uses setuptools' install_requires metadata to include namespaced packages. It acts like zc.buildout in some ways. For example, you mentioned atomisator.* in your example; this app has a main package called "Atomisator" (notice the upper A) that uses strategy #2. I think that there is some confusion here. A "main" package or buildout that assembles a larger project from components is not the same thing as having a "base" package for a namespace package. A base or core package is one that is depended upon by most or all of the related projects. In other words, the dependencies are in the *opposite direction* from what you described above. To have a base package in setuptools, you would move the target code from the namespace package __init__.py to another module or subpackage within your namespace, then make all your other projects depend on the project containing that module or subpackage. And I explicitly excluded from my survey any packages that were following this strategy, on the assumption that they might consider switching to an __init__.py or __pkg__.py strategy if some version of PEP 382 were supported by setuptools, since they already have a "base" or "core" project -- in that case, they are only changing ONE of their packages' distribution metadata to adopt the new strategy, because the dependencies already exist. So: - having namespaces natively in Python is a big win (Namespaces are one honking great idea -- let's do more of those!) - being able to still write some code under the primary namespace is something I (and lots of people) wish we could do with setuptools, so it's a big win too. Yes, that's why I support Martin's proposal: it would allow setuptools to support this case in the future, and it would also allow improved startup times for installations with many setuptools-based namespace packages installed in flat form. (Contra MAL's claims of decreased performance: adopting Martin's proposal allows there to be *fewer* .pth files read at startup, because only .pkg files for an actually-imported package need to be read.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
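For contrast with the "main package" pattern described above, a pure namespace-package distribution under today's setuptools is declared roughly like this (project and module names invented); every distribution in the namespace carries the same boilerplate declaration and depends on no "base" project:

    # setup.py of one distribution living in the 'acme' namespace
    from setuptools import setup, find_packages

    setup(
        name='acme.widgets',                 # hypothetical project
        version='0.1',
        packages=find_packages(),
        namespace_packages=['acme'],         # acme/__init__.py holds only the
                                             # declare_namespace() boilerplate
    )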
Re: [Python-Dev] PEP 382: Namespace Packages
At 09:59 AM 4/16/2009 +0900, Stephen J. Turnbull wrote: I think that for this PEP it's more important to unify the various use cases for namespace packages than it is to get rid of the .pth files. Actually, Martin's proposal *does* get rid of the .pth files in site-packages, and replaces them with other files inside the individual packages. (Thereby speeding startup times when many namespace packages are present but only a few are used.) So Martin's proposal is a win for performance and even for decreasing clutter. (The same number of special files will be present, but they will be moved inside the namespace package directories instead of being in the parent directory.) AFAICS the only things PJE has said about base packages is that (a) they aren't a universal use case for namespace packages, and (b) he'd like to be able to support them in setuptools, but admits that at present they aren't. ...and that Martin's proposal would actually permit me to do so, whereas MAL's proposal would not. Replacing __init__.py with a __pkg__.py wouldn't change any of the tradeoffs for how setuptools handles namespace packages, except to add an extra variable to consider (i.e., two filenames to keep track of). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 03:46 AM 4/16/2009 +, gl...@divmod.com wrote: On 15 Apr, 09:11 pm, p...@telecommunity.com wrote: I think that there is some confusion here. A "main" package or buildout that assembles a larger project from components is not the same thing as having a "base" package for a namespace package. I'm certainly confused. Twisted has its own system for "namespace" packages, and I'm not really sure where we fall in this discussion. I haven't been able to follow the whole thread, but my original understanding was that the PEP supports "defining packages", which we now seem to be calling "base packages", just fine. Yes, it does. The discussion since the original proposal, however, has been dominated by MAL's counterproposal, which *requires* a defining package. There is a slight distinction between "base package" and "defining package", although I suppose I've been using them a bit interchangeably. Base package describes a use case: you have a base package which is extended in the same namespace. In that use case, you may want to place your base package in the defining package. In contrast, setuptools does not support a defining package, so if you have a base package, you must place it in a submodule or subpackage of the namespace. Does that all make sense now? MAL's proposal requires a defining package, which is counterproductive if you have a pure package with no base, since it now requires you to create an additional project on PyPI just to hold your defining package. I'd appreciate it if the PEP could also be extended cover Twisted's very similar mechanism for namespace packages, "twisted.plugin.pluginPackagePaths". I know this is not quite as widely used as setuptools' namespace package support, but its existence belies a need for standardization. The PEP also seems a bit vague with regard to the treatment of other directories containing __init__.py and *.pkg files. Do you have a clarification to suggest? My understanding (probably a projection) is that to be a nested namespace package, you have to have a parent namespace package. The concept of a "defining package" seems important to avoid conflicts like this one: http://twistedmatrix.com/trac/ticket/2339 More specifically I don't quite understand the PEP's intentions towards hierarchical packages. It says that all of sys.path will be searched, but what about this case? In Twisted, the suggested idiom to structure a project which wants to provide Twisted plugins is to have a directory structure like this: MyProject/ myproject/ __init__.py twisted/ plugins/ myproject_plugin.py If you then put MyProject on PYTHONPATH, MyProject/twisted/plugins will be picked up automatically by the plugin machinery. Namespaces are not plugins and vice versa. The purpose of a namespace package is to allow projects managed by the same entity to share a namespace (ala Java "package" names) and avoid naming conflicts with other authors. A plugin system, by contrast, is explicitly intended for use by multiple authors, so the use case is rather different... and using namespace packages for plugins actually *increases* the possibility of naming conflicts, unless you add back in another level of hierarchy. (As apparently you are recommending via "myproject_plugin".) However, as "twisted" is *not* a "namespace" package in the same way, .py files in MyProject/twisted/ would not be picked up - this is very much intentional, since the "twisted" namespace is intended to be reserved for packages that we actually produce. 
If either MyProject/twisted or MyProject/twisted/plugins/ had an __init__.py, then no modules in MyProject/twisted/plugins/ would be picked up, because it would be considered a conflicting package. Precisely. Note, however, that neither is twisted.plugins a namespace package, and it should not contain any .pkg files. I don't think it's reasonable to abuse PEP 382 namespace packages as a plugin system. In setuptools' case, a different mechanism is provided for locating plugin code, and of course Twisted already has its own system for the same thing. It would be nice to have a standardized way of locating plugins in the stdlib, but that will need to be a different PEP. I hope this all makes sense. As I understand it, both setuptools and the proposed standard would either still have the bug described by ticket 2339 above, or would ignore twisted/plugins/ as a namespace package because its parent isn't a namespace package. If twisted/ lacked an __init__.py, then setuptools would ignore it. Under PEP 382, the same, unless it had .pkg files. (Again, setuptools explicitly does not support using namespace packages as a plugin mechanism.) P.S.: vendor packaging systems *ARE* a major use case for just about any aspect of Python's package structure. I really liked MvL's coverage of "vendor packages", in the PEP, since this cou
Re: [Python-Dev] PEP 382: Namespace Packages
At 03:58 AM 4/17/2009 +, gl...@divmod.com wrote: Just as a use-case: would the Java "com.*" namespace be an example of a "pure package with no base"? i.e. lots of projects are in it, but no project owns it? Er, I suppose. I was thinking more of the various 'com.foo' and 'org.bar' packages as being the pure namespaces in question. For Python, a "flat is better than nested" approach seems fine at the moment. Just to clarify things on my end: "namespace package" to *me* means "package with modules provided from multiple distributions (the distutils term)". The definition provided by the PEP, that a package is spread over multiple directories on disk, seems like an implementation detail. Agreed. Entries on __path__ slow down import, so my understanding of the platonic ideal of a system python installation is one which has a single directory where all packages reside, and a set of metadata off to the side explaining which files belong to which distributions so they can be uninstalled by a package manager. True... except that part of the function of the PEP is to ensure that if you install those separately-distributed modules to the same directory, it still needs to work as a package and not have any inter-package file conflicts. Of course, for a development installation, easy uninstallation and quick swapping between different versions of relevant dependencies is more important than good import performance. So in that case, you would want to optimize differently by having all of your distributions installed into separate directories, with a long PYTHONPATH or lots of .pth files to point at them. And of course you may want a hybrid of the two. Yep. So another clarification I'd like in the PEP is an explanation of motivation. For example, it comes as a complete surprise to me that the expectation of namespace packages was to provide only single-source namespaces like zope.*, peak.*, twisted.*. As I mentioned above, I implicitly thought this was more for com.*, twisted.plugins.*. Well, aside from twisted.plugins, I wasn't aware of anybody in Python doing that... and as I described, I never really interpreted that through the lens of "namespace package" vs. "plugin finding". Right now it just says that it's a package which resides in multiple directories, and it's not made clear why that's a desirable feature. Good point; perhaps you can suggest some wording on these matters to Martin? Okay. So what I'm hearing is that Twisted should happily continue using our own wacky __path__-calculation logic for twisted.plugins, but that *twisted* should be a namespace package so that our separate distributions (TwistedCore, TwistedWeb, TwistedConch, et. al.) can be installed into separate directories. Yes. Thanks for taking the time to participate in this and add another viewpoint to the mix, not to mention clarifying some areas where the PEP could be clearer. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 05:35 PM 5/1/2009 +0100, Chris Withers wrote: P.J. Eby wrote: It's unclear, however, who is using base packages besides mx.* and ll.*, although I'd guess from the PyPI listings that perhaps Django is. (It seems that "base" packages are more likely to use a 'base-extension' naming pattern, vs. the 'namespace.project' pattern used by "pure" packages.) I'll stress it again in case you missed it the first time: I think the main reason people use "pure namespace" versus "base namespace" packages is because hardly anyone know how to do the latter, not because there is no desire to do so! I didn't say there's *no* desire, however IIRC the only person who *ever* asked on distutils-sig how to do a base package with setuptools was the author of the ll.* packages. And in the case of at least the zope.* peak.* and osaf.* namespace packages it was specifically *not* the intention to have a base __init__. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382: Namespace Packages
At 07:41 PM 5/1/2009 +0200, Martin v. Löwis wrote: >> It's unclear, however, who is using base packages besides mx.* and >> ll.*, although I'd guess from the PyPI listings that perhaps Django >> is. (It seems that "base" packages are more likely to use a >> 'base-extension' naming pattern, vs. the 'namespace.project' pattern >> used by "pure" packages.) > > I'll stress it again in case you missed it the first time: I think the > main reason people use "pure namespace" versus "base namespace" packages > is because hardly anyone know how to do the latter, not because there is > no desire to do so! > > I, for one, have been trying to figure out how to do "base namespace" > packages for years... You mean, without PEP 382? That won't be possible, unless you can coordinate all addon packages. Base packages are a feature solely of PEP 382. Actually, if you are using only the distutils, you can do this by listing only modules in the addon projects; this is how the ll.* tools are doing it. That only works if the packages are all being installed in the same directory, though, not as eggs. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] .pth files are evil
At 04:18 PM 5/9/2009 +0200, Martin v. Löwis wrote: Zooko O'Whielacronx wrote: > .pth files are why I can't easily use GNU stow with easy_install. > If installing a Python package involved writing new files into the > filesystem, but did not require reading, updating, and re-writing any > extant files such as .pth files, then GNU stow would Just Work with > easy_install the way it Just Works with most things. Please understand that this is the fault of easy_install, not of .pth files. There is no technical need for easy_install to rewrite .pth files on installation. It could just as well have created new .pth files, rather than modifying existing ones. If you always use --single-version-externally-managed with easy_install, it will stop editing .pth files on installation. It's --multi-version (-m) that does that. --single-version-externally-managed is a "setup.py install" option. Both have the effect of not editing .pth files, but they do so in different ways. The "setup.py install" option causes it to install in a distutils-compatible layout, whereas --multi-version simply drops .egg files or directories in the target location and leaves it to the user (or the generated script wrappers) to add them to sys.path. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] .pth files are evil
At 04:42 PM 5/9/2009 +0200, Martin v. Löwis wrote: >> If you always use --single-version-externally-managed with easy_install, >> it will stop editing .pth files on installation. > > It's --multi-version (-m) that does that. > --single-version-externally-managed is a "setup.py install" option. > > Both have the effect of not editing .pth files, but they do so in > different ways. The "setup.py install" option causes it to install in a > distutils-compatible layout, whereas --multi-version simply drops .egg > files or directories in the target location and leaves it to the user > (or the generated script wrappers) to add them to sys.path. Ah, ok. Is there also an easy_install invocation that unpacks the zip file into some location of sys.path (which then wouldn't require editing sys.path)? Not as yet. I'm sort of waiting to see what comes out of PEP 376 discussions re: an installation manifest... but then, if I actually had time to work on it right now, I'd probably just implement something. Currently, you can use pip to do that, though, as long as the packages you want are in source form. pip doesn't unzip eggs as yet. It would be really straightforward, though, for someone to implement an easy_install variant that does this. Just invoke "easy_install -Zmaxd /some/tmpdir packagelist" to get a full set of unpacked .egg directories in /some/tmpdir, and then move the contents of the resulting .egg subdirs to the target location, renaming EGG-INFO subdirs to projectname-version.egg-info subdirs. (Of course, this ignores the issue of uninstalling previous versions, or overwriting of conflicting files in the target -- does pip handle these?) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] how GNU stow is complementary rather than alternative to distutils
At 12:04 PM 5/10/2009 -0600, Zooko Wilcox-O'Hearn wrote: The thing that prevents this from working with setuptools is that setuptools creates a file named easy_install.pth during "python ./setup.py install --prefix=foo". If you build two different Python packages this way, they will each create an easy_install.pth file, and then when you ask GNU stow to link the two resulting packages into your system, it will say "You are asking me to install two different packages which both claim that they need to write a file named '/usr/local/lib/python2.5/site-packages/easy_install.pth'." Adding --record and --single-version-externally-managed to that command line will prevent the .pth file from being used or needed, although I believe you already know this. (What that mode won't do is install dependencies automatically.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] .pth files are evil
At 04:42 PM 5/9/2009 +0200, Martin v. Löwis wrote: >> If you always use --single-version-externally-managed with easy_install, >> it will stop editing .pth files on installation. > > It's --multi-version (-m) that does that. > --single-version-externally-managed is a "setup.py install" option. > > Both have the effect of not editing .pth files, but they do so in > different ways. The "setup.py install" option causes it to install in a > distutils-compatible layout, whereas --multi-version simply drops .egg > files or directories in the target location and leaves it to the user > (or the generated script wrappers) to add them to sys.path. Ah, ok. Is there also an easy_install invocation that unpacks the zip file into some location of sys.path (which then wouldn't require editing sys.path)? No; you'd have to use the -e option to easy_install to download and extract a source version of the package, then run that package's setup.py, e.g.:

    easy_install -eb /some/tmpdir SomeProject
    cd /some/tmpdir/someproject   # subdir is always lowercased/normalized
    setup.py install --single-version-externally-managed --record=...

I suspect that this is basically what pip is doing under the hood, as that would explain why it doesn't support .egg files. I previously posted code to the distutils-sig that was an .egg unpacker with appropriate renaming, though. It was untested, and assumes you already checked for collisions in the target directory, and that you're handling any uninstall manifest yourself. It could probably be modified to take a filter function, though, something like:

    import os
    from setuptools.archive_util import unpack_archive

    def flatten_egg(egg_filename, extract_dir, filter=lambda s, d: d):
        # e.g. 'Foo-1.0-py2.5.egg' -> 'Foo-1.0-py2.5.egg-info'
        eggbase = os.path.basename(egg_filename) + '-info'
        def file_filter(src, dst):
            if src.startswith('EGG-INFO/'):
                # rename the EGG-INFO/ subdir to a .egg-info sibling
                src = eggbase + src[8:]
                dst = os.path.join(extract_dir, *src.split('/'))
            return filter(src, dst)
        return unpack_archive(egg_filename, extract_dir, file_filter)

Then you could pass in a None-returning filter function to check and accumulate collisions and generate a manifest. A second run with the default filter would do the unpacking. (This function should work with either .egg files or .egg directories as input, btw, since unpack_archive treats a directory input as if it were an archive.) Anyway, if you used "easy_install -mxd /some/tmpdir [specs]" to get your target eggs found/built, you could then run this flattening function (with appropriate filter functions) over the *.egg contents of /some/tmpdir to do the actual installation. (The reason for using -mxd instead of -Zmaxd or -zmaxd is that we don't care whether the eggs are zipped or not, and we leave out the -a so that dependencies already present on sys.path aren't copied or re-downloaded to the target; only dependencies we don't already have will get dropped in /some/tmpdir.) Of course, the devil of this is in the details; to handle conflicts and uninstalls properly you would need to know what namespace packages were in the eggs you are installing. But if you don't care about blindly overwriting things (as the distutils does not), then it's actually pretty easy to make such an unpacker. I mainly haven't made one myself because I *do* care about things being blindly overwritten. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
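A hypothetical two-pass use of the sketch above, following the "None-returning filter first, default filter second" recipe it describes; paths and the project name are made up:

    import os

    conflicts = []

    def check_only(src, dst):
        # Returning None tells unpack_archive to skip extraction entirely;
        # this pass just records what *would* collide.
        if os.path.exists(dst):
            conflicts.append(dst)
        return None

    target = '/usr/lib/python2.5/site-packages'      # hypothetical target dir
    egg = '/some/tmpdir/SomeProject-1.0-py2.5.egg'   # hypothetical egg

    flatten_egg(egg, target, check_only)   # pass 1: detect collisions
    if not conflicts:
        flatten_egg(egg, target)           # pass 2: actually unpack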
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 12:21 AM 5/15/2009 +0200, Tarek Ziadé wrote: Hello, I'm proposing this PEP, which has been discussed in Distutils-SIG, for inclusion in Python 2.7 and 3.2 http://www.python.org/dev/peps/pep-0376/ Please comment! I'd like to reiterate my suggestion that the uninstall record include size and checksum information, a la PEP 262's "FILES" section. This would allow the uninstall function to validate whether a file has been modified, and thus prevent uninstalling a locally-modified file, or a file installed in some other way. It may also be that providing an uninstall API that simply yields files to be uninstalled, with data about their existence/modification status, would be more useful than a blind uninstall operation with a filter function. Also, the PEP doesn't document what happens if a single file was installed by more than one package. Ideally, a file with identical size/checksum that belongs to more than one project should be silently left alone, and a file installed by more than one project with *different* size/checksum should be warned about and left alone. Next, the doc for the metadata API functions seems quite sparse. ISTR that I've previously commented on such issues as case- and punctuation-insensitivity of project names, and '/' separation in egg_info subpaths, but these don't seem to have been incorporated into the current version of the PEP. These are important considerations in general, btw, because project name and version canonicalization and escaping are an important part of both generating and parsing .egg-info filenames. At minimum, the relevant setuptools docs that define these standards should be cited. Finally, the "Definitions" section also claims that a project installs one or more packages, but a project may not contain *any* packages; it may have a standalone module, or just a script, data, or metadata. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
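Illustrative only -- not the PEP's format -- but this is the sort of per-file size-and-checksum record (in the spirit of PEP 262's FILES section) that the suggestion above asks for:

    import hashlib, os

    def file_record(path):
        "Return (path, size, md5) for one installed file."
        md5 = hashlib.md5()
        f = open(path, 'rb')
        try:
            for chunk in iter(lambda: f.read(8192), b''):
                md5.update(chunk)
        finally:
            f.close()
        return path, os.path.getsize(path), md5.hexdigest()

    # An uninstaller could then refuse to remove a file whose current
    # size/checksum no longer match the recorded values.
    print(file_record('example.py'))    # hypothetical installed file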
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 08:32 AM 5/15/2009 +0200, Jeroen Ruigrok van der Werven wrote: Agreed. Within FreeBSD's ports the installed package registration gets a MD5 hash per file recorded. Size is less interesting though, since essentially this information is encapsulated within the hash. Remove one byte from the file and your hash is already different. Which also means that in that case you can skip computing the MD5. The size allows you to easily notice an overwrite/corruption without further processing. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 08:32 AM 5/15/2009 +0200, Tarek Ziadé wrote: 2009/5/15 P.J. Eby : > Ideally, a file with identical size/checksum that > belongs to more than one project should be silently left alone, and a file > installed by more than one project with *different* size/checksum should be > warned about and left alone. I think the path is the info that should be looked at. By "a file that belongs to more than one project" I meant a single file on *disk* (i.e., one absolute path). But I don't think you want to leave alone a file with identical size/checksum that belongs to more than one project when it's not the same absolute path. That wouldn't be "a file" then, would it? ;-) Here's an example why : if two different packages includes the "feedparser.py" module (from the FeedParser project) for conveniency, and if you remove one package, you *do* want to remove its "feeparser.py" module even if it exists in the other project. Right, that would be *two files*, though, not one file. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 06:06 PM 5/16/2009 +0200, Tarek Ziadé wrote: Ok I've changed the PEP with all the points you mentioned, if you want to take a look. Some notes: 1. Why ';' separation, instead of tabs as in PEP 262? Aren't semicolons a valid character in filenames? 2. "if the installed file is located in a directory in site-packages" should refer not to site-packages but to the directory containing the .egg-info directory. 3. get_egg_info_file needs to be specified as using '/'-separated paths and converting to OS paths if appropriate. There's also the problem that the mode it opens the file in (binary or text) is unspecified. 4. There should probably be a way to iterate over the projects in a directory, since it's otherwise impossible for an installation tool to find out what project(s) "own" a file that conflicts with something being installed. Alternatively, reshaping the file API to allow querying by path as well as by project might work. 5. If any cache mechanisms are to be used by the API, the API *must* make it possible to bypass or explicitly manage that cache, as otherwise installation tools and tools that manipulate sys.path at runtime may end up using incorrect data. 6. get_files() doesn't document whether the yielded paths are absolute or relative, local or cross-platform, etc. I need to find back your comments for this part, I must have missed them. That's the last part I didn't work out yet on the current PEP revision. Well, if you can't find them, the EggFormats doc explains how these file/dir structures are currently laid out by setuptools, easy_install, pip, etc., and the PEP should probably reference that. Technically, this PEP doesn't so much propose a change to the EggFormats standard, as simply add a RECORD file to it, and propose stdlib support for reading and writing it. So, the PEP really should reference (i.e. link to) the existing standard. The EggFormats doc in turn cites pkg_resources doc for lower-level format issues, such as name and version normalization, filename escaping, file parsing, etc. This PEP should also probably be framed as a replacement for PEP 262, proposing to extend the de-facto standard for an installation database with uninstall support, and blessing selected portions of the de facto standard as an official standard. (Since that's pretty much exactly what it is.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
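On points 3 and 6, a sketch of how a get_egg_info_file-style helper could pin down the two unspecified details (the '/'-separated subpath convention and the open mode); the signature here is an assumption, not the PEP's:

    import os

    def get_egg_info_file(egg_info_dir, subpath, binary=False):
        # The subpath uses '/' regardless of platform; convert it to the
        # local convention and make the text/binary choice explicit.
        local = os.path.join(egg_info_dir, *subpath.split('/'))
        return open(local, 'rb' if binary else 'r')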
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 11:17 AM 5/16/2009 -0700, Glenn Linderman wrote: On approximately 5/16/2009 9:55 AM, came the following characters from the keyboard of P.J. Eby: At 06:06 PM 5/16/2009 +0200, Tarek Ziadé wrote: Ok I've changed the PEP with all the points you mentioned, if you want to take a look. Some notes: 1. Why ';' separation, instead of tabs as in PEP 262? Aren't semicolons a valid character in filenames? Why tabs? Aren't tabs a valid character in filenames? (hint: Both are valid in POSIX filenames, neither are valid in Windows filenames) ";" *is* valid in Windows filenames, actually. Tabs aren't. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 : Changing the .egg-info structure
At 04:04 PM 5/19/2009 +0200, Tarek Ziadé wrote: On Sat, May 16, 2009 at 6:55 PM, P.J. Eby wrote: > > 1. Why ';' separation, instead of tabs as in PEP 262? Aren't semicolons a > valid character in filenames? I am changing this into a . for now. What about Antoine's idea about doing a quote() on the names ? I like the CSV idea better, since the csv module is available in 2.3 and up. We should just pick a dialect with unambiguous quoting rules. From my point of view seems more simple to deal with, if 3rd-party tools want to work on these files without using pkgutil or Python. True, but then CSV files are still pretty common. One other possibility that might work is using a vertical bar as a separator. My preference rank at the moment is probably tabs, CSV, or vertical bar. But I don't really care all that much, so let the people who care decide. Personally, though, I don't see much point to cross-language manipulation of the file. System packaging tools have their own way of keeping track of this stuff. So unless somebody's using it to *build* system packages (e.g. making an RPM builder), they don't need this. Now, about the APIs... > 4. There should probably be a way to iterate over the projects in a > directory, since it's otherwise impossible for an installation tool to find > out what project(s) "own" a file that conflicts with something being > installed. Alternatively, reshaping the file API to allow querying by path > as well as by project might work. I am adding a "get_projects" api: get_projects() -> iterator Provides an iterator that will return (name, path) tuples, where `name` is the name of a registered project and `path` the path to its `egg-info` directory. But for the use case you are mentioning, what about an explicit API: get_owners(paths) -> sequence of project names returns a sequence of tuple. For each path in the "paths" list, a tuple of project names is returned > > 5. If any cache mechanisms are to be used by the API, the API *must* make it > possible to bypass or explicitly manage that cache, as otherwise > installation tools and tools that manipulate sys.path at runtime may end up > using incorrect data. work in progress - (I am afraid I have to write an advanced prototype to be able to know exactly how the cache might work, and so, what API we should have) I think it would be simpler to have explicit object types representing things like a directory, a collection of directories, and individual projects, and these object types should be part of the API. Any function-oriented API should just be exposed as the methods of a default singleton. Other Python modules follow this pattern -- and it's what I copied for the pkg_resources design. It gives a nice tradeoff between keeping the simple things simple, and complex things possible, as well as keeping mechanism and policy separate. Right now, the API design you're trying to do is being burdened by using strings and tuples to represent things that could just as easily be objects with their own methods, instead of things you have to pass back into other APIs. This also makes caching more complex, because you can't just have one main object with stuff hanging off; you've got to have a bunch of dictionaries, tuples, lists, sets, etc. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
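To make "pick a dialect with unambiguous quoting rules" concrete, here is a Python 2-era sketch using the stdlib csv module; the (path, md5, size) column layout is illustrative only, not the PEP's:

    import csv

    def write_record(record_path, rows):
        # rows: an iterable of (path, md5, size) tuples.  Quoting every
        # field means a comma in a filename can never be mistaken for
        # the field separator.
        f = open(record_path, 'wb')  # binary mode, as the 2.x csv module expects
        try:
            writer = csv.writer(f, delimiter=',', quotechar='"',
                                quoting=csv.QUOTE_ALL, lineterminator='\n')
            for row in rows:
                writer.writerow(row)
        finally:
            f.close()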
Re: [Python-Dev] PEP 376
At 05:42 PM 6/22/2009 +0200, Tarek Ziadé wrote: Wouldn't it be better to use the native line terminator on the current platform? (someone might want to edit or at least view the file) Good idea, I'll change that, As long as the file is always *read* with "U" mode, so that you can't mess it up, especially if the install is to a directory shared between platforms. The idea of this API is to find out if a distribution "owns" a file, e.g. is the only distribution that uses it, so it can be safely removed. This could equally well be done by ``owners(path)``, returning a sequence of zero or more items. Any length <> 1 means the file can't be safely removed. Meanwhile, having the data about all the owners of a file would also be useful for tools that just want to inspect a directory's contents, for example, or to detect conflicts and overwrites. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
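A sketch of the ``owners(path)`` idea; get_distributions() and get_files() here stand in for whatever query API the PEP ends up providing:

    import os

    def owners(path, distributions):
        # Return every distribution whose RECORD lists `path`.
        wanted = os.path.normcase(os.path.abspath(path))
        result = []
        for dist in distributions:
            for recorded in dist.get_files():
                if os.path.normcase(os.path.abspath(recorded)) == wanted:
                    result.append(dist)
                    break
        return result

    # A file is only safe to remove when exactly one distribution owns it:
    # safe_to_remove = len(owners(path, get_distributions())) == 1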
Re: [Python-Dev] PEP 376
At 10:38 AM 6/23/2009 +0200, Tarek Ziadé wrote: What about using the same names used in Python's site module: "sitedir" is the name used for a directory we named DistributionDirectory. No, a site dir is a Python-defined directory for site-installed packages, and/or a directory where .pth files are processed. Wrong connotations entirely, since packages may be installed to other directories, and typically are in e.g. shared hosting environments. DistributionDirectory is fine by me. DistributionDirectories sounds like what setuptools calls an Environment. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 07:57 PM 6/29/2009 +0200, Tarek Ziadé wrote: Hello, If no one objects, I'd like to push PEP 376 in the "accepted" status and go ahead with its implementation, with continuous feedback at Distutils-SIG as we did to build it. I do have a question about the current draft... Do zipped distributions use EGG-INFO or a project-version.egg-info? This isn't spelled out in the PEP, although I get the general idea that the EGG-INFO format isn't supported, and thus the PEP and API do not support existing .egg files. This should probably be made clear, if that's the intention. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 01:34 AM 7/1/2009 +0200, Tarek Ziadé wrote: On Wed, Jul 1, 2009 at 12:47 AM, Steven D'Aprano wrote: > I don't see how this proposal will help in the second case. If you > install distribution Spam, containing file spam.py, and then install > distribution Ham, which requires spam.py, what is to prevent you from > removing Spam and breaking Ham? > > If you don't propose a solution for the dependency problem, you should > say so. This problem is solved as described later in the PEP, with the API that allows you to get the list of the distributions that use a given file. (thanks to the RECORD files) If Spam and Ham use spam.py, and if you uninstall Spam, this file will not be removed because the API will tell you it's used in both distributions. That's not the scenario he's talking about. He's talking about the case where Ham has an 'install_requires' of Spam. That is, a runtime dependency, not a shared file. Good question, I have never created such distribution. Aren't they read-only files ? setuptools' bdist_egg command has an option to exclude source from an .egg, but it doesn't do anything special with permissions. I don't think any other current install tools support source-free installation. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 07:11 AM 7/1/2009 +1000, Nick Coghlan wrote: To address PJE's question in the PEP, it may be worth expanding on this in the backwards compatibility section explaining how the new distutils metadata system avoids getting confused by the old pre-standardisation installation formats (e.g. it may be that the directory names and/or filenames all deliberately differ from current approaches precisely so they can coexist without interfering with each other) In particular, it should be explicitly noted that the API does not support .egg files (or directories), and thus cannot substitute for the pkg_resources API, as far as detecting installed packages. That is, users who need to unequivocally detect the presence/version of a project on sys.path, etc. will still have to use pkg_resources. (Because pkg_resources supports the format proposed by the PEP, but the PEP does not support .egg files and directories.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 04:29 PM 7/1/2009 +0200, Tarek Ziadé wrote: - Phase 1 : introduction of the egg-info file in distutils Philipp introduced the creation of a file named xxx.egg-info file in 2006 (see http://bugs.python.org/issue1459476) alongside distutils-installed package, that contains the metadata of the distribution. - Phase 2: two new formats in the setuptools project Then he created two new formats in the setuptools project: This is backwards. The .egg-info file was added to distutils and setuptools *after* support for the other formats. In other words, it was a third format added to allow other software (such as setuptools and friends) to detect the presence of packages installed via the distutils, and thereby prevent other tools from installing duplicates. (.egg-info directories and .egg files/directories already existed.) This means that all the third-party tools out there already conform to that standard, and that packages installed in other formats will not benefit from the new APIs. which means that people that want to work with distributions installed as .egg directories will have to use setuptools APIs. Which makes sense. Yes and no. Not providing uninstall support is reasonable, but the PEP also has features to query packages in general. (There's also no technical reason why comparable manifest and uninstall support can't be provided for .egg files and directories, since they already have an implicit manifest: their contents. However, since I'm not currently possessed of the time to provide a patch myself, I'm not going to lobby for this as a feature.) > If you want, you can then add a cross-reference note, saying something > like "The setuptools structure, as proposed in the EggFormats > documentation for that package [ref], is a subset of this standard. In > order to conform to this PEP, setuptools will have to be amended to > only install .egg-info directories in the format defined by this PEP". I'll work that way. (Technically, it's an extension of a subset of EggFormats, since it adds RECORD and INSTALLER.) > > "However, it will impact the setuptools and pip projects, but given > the fact that..." > > Confusing. Will these tools need to change (I believe so) or not? If > they will need to change, that hardly counts as "no deep consequences" > - there's the whole backward compatibility issue for them to handle. I'll add this in a backward compatibility section, as suggested earlier by someone. They don't need to change, unless they're going to support being uninstalled by the tool (in which case they'll need to add RECORD and INSTALLER). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 09:06 PM 7/2/2009 +0100, Paul Moore wrote: Maybe the answer is that distutils reject *all* uses of absolute paths on Windows. The PEP can then say whatever you want as far as I'm concerned, as it doesn't affect me. If the existing bugs don't get fixed, though, I'd say that the RECORD file should be explicitly disallowed from containing absolute paths on Windows (for safety reasons if nothing else). In non bdist cases (e.g. setup.py with manual options, but also easy_install, pip, etc.) you can specify where data files and scripts go on Windows, and they don't have to be relative locations. So, absolute paths need to be supported, despite the existence of broken bdist targets on Windows. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376
At 02:54 PM 7/3/2009 +0100, Paul Moore wrote: Eggs are fundamentally a PEP 302 zip file format. There are some extra bits of metadata for setuptools/easy_install in there (as I understand things) but essentially they are zip files. When you say "decoupling the egg format", I assume you mean "decoupling the egg metadata" - which is fine, but to properly decouple, you need API level access to the metadata. PEP 376 offers read-only access, but as you rightly point out, it is only for filesystem data (and some form of zip file, which appears to be limited in some way, as it isn't PEP 302 based, and the actual format isn't defined anywhere). The basic point here is that PEP 376 needs to define precisely how pkgutil.get_distributions() scans sys.path looking for ".egg-info directories". What does it do for sys.path entries that don't correspond to filesystem directories? (Note - these may or may not be zip files. Even if they are zip files, an earlier entry on sys.path_hooks could have taken precedence. At the very least, you should only process path entries as zip files if their importer - in sys.path_importer_cache or via an explicit path hook scan - is a zipimporter object.). To be honest, this is a major can of worms. But if PEP 376 is not going to support PEP 302, then it must state that fact explicitly, to avoid giving people false expectations - particularly with Brett's importlib in Python 3.1, which will make it far easier for people to experiment with new packaging formats such as the ones Lukasz mentions above. And it MUST fail gracefully in the face of unsupported importer types. Well, we could always resurrect PEP 365, since pkg_resources already has documented extensible support for arbitrary importers. That solves backward *and* forward compatibility. Then PEP 376's uninstall facilities could be implemented using pkg_resources' existing metadata query features. The primary downside to that, of course, is that it brings in the matter of version specifications and dependencies... which appear to be a contentious topic. (Note that Tarek is proposing to drop the PEP 386 proposal to standardize a much more restrictive scheme than setuptools' version parser, precisely because of the controversy.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
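The guard Paul describes can be expressed directly with stdlib APIs; how a pkgutil.get_distributions() scan would actually use it is an assumption:

    import pkgutil
    import zipimport

    def is_zip_path_entry(path_entry):
        # pkgutil.get_importer() consults sys.path_importer_cache and
        # sys.path_hooks, so an earlier hook that claimed this entry
        # takes precedence over zipimport.
        importer = pkgutil.get_importer(path_entry)
        return isinstance(importer, zipimport.zipimporter)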
Re: [Python-Dev] PEP 376
At 12:28 AM 7/4/2009 +1000, Nick Coghlan wrote: I suspect this limitation of the PEP 302 APIs is the origin of the setuptools format that embeds the metadata inside the distribution - it lets you get at the metadata without having to assume that it exists directly on the filesystem anywhere. I think you have this backwards; it's setuptools that doesn't care where (or whether) the metadata exists on the file system; it delegates metadata operations to a "metadata provider" that's usually an adapter over a PEP 302 "loader". See http://peak.telecommunity.com/DevCenter/PkgResources#supporting-custom-importers for the API details of how to register support for arbitrary PEP 302 importers and loaders. (Which presumably, Lukasz is using. I didn't know that anybody was actually using it, but it's nice to know that the documentation is apparently sufficient for *some* people. ;-) ) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mercurial migration: progress report (PEP 385)
At 12:20 PM 7/4/2009 +0900, Stephen J. Turnbull wrote: IME, Mercurial strongly encourages a non-branching style. Although I can't fully explain in concrete terms what makes me feel that way, it's certainly consistent with your own inclination to advise "subset branches". Part of it comes from the fact that you can't have a single revision on two branches. I would really like the node of a release branch to be on both the branch and the mainline so that the tag appears in the history of both, but that's not possible. I'm a total newbie to Mercurial - I've basically done little more than the tutorial, in fact. So somebody correct me if what I'm about to say is completely stupid. Wouldn't the simple thing to do in Mercurial, just be to use different repositories for long-lived branches? I mean, if you're not merging them that much anyway, what's the point? Isn't it just as easy to pull from another repository, as from another branch within that repository? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - get_egginfo_files
At 05:26 PM 7/5/2009 +0100, Paul Moore wrote: def get_distribution(name): for d in get_distributions(): if d.name == name: return d return None Btw, this is broken code anyway, because it's not handling case-insensitivity or name canonicalization. (I've mentioned these issues previously on the distutils-sig.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
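For comparison, a sketch of the kind of matching being alluded to, modeled loosely on the setuptools convention (runs of non-alphanumeric characters treated as equivalent, case ignored); the authoritative rules are in the pkg_resources documentation:

    import re

    def canonical_name(name):
        # 'Zope.Interface', 'zope-interface' and 'ZOPE_INTERFACE' all
        # compare equal under this normalization.
        return re.sub(r'[^A-Za-z0-9]+', '-', name).lower()

    def get_distribution(name, distributions):
        wanted = canonical_name(name)
        for d in distributions:
            if canonical_name(d.name) == wanted:
                return d
        return None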
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 03:13 PM 7/5/2009 +0200, Tarek Ziadé wrote: The loader would still work with "DIST_NAME.egg-info" directories for compatibility with existing format in the query APIs, but the Distutils install command would rather create "DIST_NAME.metadata" Note that this would then break setuptools without adding any benefit; ".metadata" is less precise and less unique than '.egg-info'. If you want a clearer name, '.pydist' or some such would at least be reasonably specific. (It'd still have a backward compatibility problem, but at least then there'd be some benefit to the name change.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 08:43 PM 7/5/2009 +0200, Tarek Ziadé wrote: But if it's based on PEP 302 protocols and if the pkgutil code works with the sys.meta_path hook, setuptools could then provide its loader, based on its EggFormats and act as a provider without being broken. You misunderstand me. The whole point of putting .egg-info in distutils in the first place was to enable setuptools to detect the presence of distutils-installed packages. That's what's broken by changing the name. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 07:10 AM 7/6/2009 +1000, Nick Coghlan wrote: By using a new name for the directory we *guarantee* that old packaging utilities won't get confused by the new format (they simply won't acknowledge its existence). This is incorrect; they will get confused because they will think that the relevant package is *not* installed, and proceed to install a duplicate. That's why .egg-info was added to the stdlib in the first place. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 10:32 AM 7/6/2009 +0100, Paul Moore wrote: I'm +0 on changing the name, as long as it's the *only* "do it this way because setuptools isn't going to change" issue. Please note that I never said that. I was the one who suggested ".pydist", remember? I just don't want to have to complicate the pkg_resources directory scan to support a stupid name like "metadata", no offense to Tarek. ;-) I'd sooner not make the change at all, just because I doubt I'm the only person who has code that depends on the old name. (Consider the Linux OS packagers who treat .egg-info files specially at the moment.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 04:14 PM 7/6/2009 +0100, Paul Moore wrote: 2009/7/6 Nick Coghlan : > P.J. Eby wrote: >> At 08:43 PM 7/5/2009 +0200, Tarek Ziadé wrote: >>> But if it's based on PEP 302 protocols and if the pkgutil code works >>> with the sys.meta_path hook, >>> setuptools could then provide its loader, based on its EggFormats and >>> act as a provider without being broken. >> >> You misunderstand me. The whole point of putting .egg-info in distutils >> in the first place was to enable setuptools to detect the presence of >> disutils-installed packages. That's what's broken by changing the name. This is getting confusing. Is Phillip saying that setuptools will cope with the file changing to a directory without modification, but it can't handle a change in the name? The existing versions of setuptools will read a file or a directory with no problem; it's the name change that will require a code change, and it's a rather more complex issue than just one name change, because it'll need to support both names. What's more, on the build/install side, it'll have to figure out whether to use the new name or the old name when creating a project's metadata for installation in single-version mode. In other words, this will likely affect pip as well, or at least the parts of setuptools that pip uses. My site-packages has a confusing mix of egginfo directories and files. Note that I NEVER use setuptools other than where an existing package's setup.py requires it. In that case, I still only do python setup.py bdist_wininst and install the generated installer. So is PEP 376 going to be able to cope with the stuff I have installed at the moment? If not, what's the point??? If I understand Tarek's proposal correctly, then no, it will not cope. If setuptools is not going to change to conform to PEP 376, then any tools built using PEP 376 will fail to recognise my coverage install. I'm all in favor of adding RECORD support to setuptools; it was in fact my idea to have the file there in the first place. Adding a RECORD file doesn't introduce any new and weird name migration requirements, which is why I'd rather not change the extension if we can avoid it. Reading both names is painful, writing both is more so, and I'm not sure how many tools/users *besides* setuptools will be affected by a name change. > How much information does setuptools actually need in order to tell that > a distribution is already present? Presumably the existing .egginfo > files generated by distutils are sufficient for that task? It appears so, but setuptools doesn't use (or create!!!) those files in its own installer formats. Setuptools treats an .egg-info file as if it were a PKG-INFO file contained in an .egg-info directory. This allows it to treat distutils-supplied .egg-info files as if they were setuptools-supplied .egg-info directories containing exactly one file. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
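The file-as-directory trick described above amounts to something like the following sketch (names and layout are illustrative, not setuptools' actual code):

    import os

    def read_pkg_info(egg_info_path):
        # A .egg-info directory holds a PKG-INFO file; a bare .egg-info
        # file *is* the PKG-INFO.  Either way, return the metadata text.
        if os.path.isdir(egg_info_path):
            egg_info_path = os.path.join(egg_info_path, 'PKG-INFO')
        f = open(egg_info_path)
        try:
            return f.read()
        finally:
            f.close()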
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 07:18 PM 7/6/2009 +0200, Tarek Ziadé wrote: >> So is PEP 376 going to be able to cope with the stuff I have installed >> at the moment? If not, what's the point??? > > If I understand Tarek's proposal correctly, then no, it will not cope. Why that ? Can you detail ? On a system that uses only plain distutils distributions, it'll work. Paul's question, as I understood it, was whether pkgutil would handle the existing .egg-info's. Your proposal, as I understood it, was that pkgutil would not detect .egg-info's without a plugin of some sort... and you've not proposed a way of loading such a thing, which means that to use pkgutil with that plugin, an application would have to know that the plugin exists and import it. Unless you're planning to have a plugin registration facility (ala entry points) so that plugins can be used without each application needing to discover them on its own? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 and PEP 302 - allowing import hooks to provide distribution metadata
At 07:15 PM 7/6/2009 +0100, Paul Moore wrote: My point was that distributions which use setuptools in their setup.py, *even if there's no runtime dependency on setuptools* end up with non-standard .egg-info's. There is no good reason for this, from my POV as a package user. So if setuptools is breaking otherwise entirely standard packages, so that they won't work with core distutils mechanisms, that's wrong. Setuptools is refusing to play nicely with other packaging systems. Now at the moment, this is a moot point, because the stdlib doesn't have any package discovery mechanisms. Not so; it has .egg-info. (See footnote below.) The PEP 302 hooks I'm writing provide the "plugin" ability for alternative storage formats. But there's no provision for alternative *metadata* formats. What's the point of a standard if people don't follow it? I'm confused. Are you saying pkgutil will recognize .egg-info, or not? The only practical difference between the existing formats and the proposed new one (AFAICR) is the extension of the directory name, and the presence or absence of the RECORD file. My POV is that it would be a shame to discard compatibility with existing installed software, just for a bikeshedding of the directory name. Especially if it means that any tools that currently generate .egg-info's (and I'm not 100% positive setuptools is the only case) will need to grow options for whether to generate the old name, possibly based on what Python version... except you can install new distutils into old Python... etc. If this were strictly a matter of, "from Python 3.x on, generate .pydist", it's a lot less of a problem for me and everybody else, because Python 3 is effectively going to force a fork (or at least a branch) of setuptools anyway. For Python 2.x, having *two* names for .egg-info (which is what I understand the proposal to effectively be) is just asking for trouble, IMO. And my concern is that currently these compatibility issues are being handwaved rather than considered up front. To reiterate, summarize, and attempt to make as concrete as I can: I'm fine with a rename, *if and only if* it can be done in a way that never requires a tool to generate BOTH files for compatibility reasons. AFAICT, this can only be done by making the extension dependent upon the Python version, and this version distinction must be made *even if* the distutils themselves are backported to an earlier Python version. (Alternatively, this could be resolved by *not* backporting the PEP 376-compatible distutils/pkgutil to 2.x.) Currently, there is no official Python 3 release of setuptools, so compatibility can (IMO) be considered moot there. What's more, it is easy to conditionally determine whether to use '.egg-info' or '.pydist' based on Python version. What is not easy and likely to be confusing for everyone, is having the possibility for *both* to exist in the same system, *without* also having bidirectional interoperability. So, I'm +1 for no change (obviously), and +0 for "always .pydist in 3.x and always .egg-info in 2.x", whether the latter part is achieved by making distutils/pkgutil use a version-dependent extension, or by refusing to backport distutils/pkgutil to 2.x. I'm -1 for having .pydist ever appear on a 2.x Python, either via stdlib or backporting. (These comments also apply for any name other than '.pydist'.) Footnote: there has been some discussion that implies that somehow .egg-info was hijacked into the stdlib, and that it should therefore not be supported. 
This is incorrect: at Pycon the year this happened (2005 IIRC), Guido *blessed setuptools for inclusion in the stdlib* in his keynote. Yes, *all* of setuptools, including easy_install, eggs, the whole nine yards. A few months later, when I actually checked in the code, there were protests from a few people on Python-Dev, some of which I agreed were valid. I reverted the checkin, floating instead a compromise proposal to add .egg-info support to distutils (i.e. the install_egg_info command) to allow setuptools and other 3rd party tools to detect the presence of distutils-installed packages. At the time, the assumption was that eggs and setuptools would still, eventually, make it into the stdlib -- at the time, Guido had not yet changed his mind about that. So the ".egg-info" name was not controversial, to the best of my recollection, and the inclusion did not violate any of the expressed concerns of the three persons who objected to including setuptools whole-hog. So, from my POV at least, dropping .egg-info is dropping backward compatibility on a feature that was *intentionally* added to the stdlib -- with approval from Guido and non-protest from Python-Dev -- specifically to provide that compatibility. That doesn't obligate it to stay... but I suggest that it *does* mean Python 3, not 2, is the appropriate place for dropping it (if it is indeed to be dropped).
Re: [Python-Dev] PEP 376 - Open questions
At 07:38 PM 7/6/2009 +0100, Paul Moore wrote: As promised, here are some open questions on PEP 376. - Will the public API names be changed from *egginfo* to *metadata*? +1 (FWIW, 'metadata' is what pkg_resources API refers to this kind of stuff as.) - What precisely are the use cases for absolute path names? Concrete examples are needed. With the current spec, some things can go wrong (e.g., see below), so we need real use cases to know how to address this. The purpose is to be able to remove scripts, data files, documentation, etc. that's stored in some place that's not package-relative, and survive a move of the package directory itself. For example, suppose I install with --prefix=~, and have files in ~/bin and ~/lib. I then move ~/lib/python2.7 to ~/pylib. If the scripts were installed with a relative path, this would break the link to the script still living in ~/bin. Now that I'm thinking about this, it occurs to me that there is another way to address this besides mixing relative and absolute paths in RECORD, and that is to make *all* the paths relative, and include an original installation location in INSTALLER. Then, it would be possible to detect whether the installation location had been moved and handle it accordingly... for some definition of "accordingly". I agree that further discussion is probably needed to resolve this. - How will bdist_wininst/bdist_msi/bdist_rpm be updated? bdist_wininst, bdist_dumb, and various others use the 'install --root' pattern to generate files for installation, which means that they would transparently end up writing a correct RECORD file, except for the inclusion of incorrect absolute paths for non-libdir-relative files. However, if we used the "all relative in RECORD, with a base in INSTALLER" approach, these cases could transparently be treated as another instance of install directory relocation. I don't know if bdist_msi does a --root install before generating the .msi; if it does, then it should work the same way. - Can distutils be made to install files in places the current RECORD file spec can't handle? (I think the answer is "yes"). What happens then? Surely an absolute path is sufficient? - Should distribution names be case insensitive on case insensitive filesystems? For comparison, module/package names are always case sensitive even on case insensitive systems. IMO, they should be case-insensitive *regardless* of filesystem case, and I believe PyPI has already begun disallowing packages with only case-distinct names (e.g. 'Zope' vs. 'ZoPe') from being registered. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
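Under the "all paths relative, plus a recorded base" idea floated above, relocation detection could be as simple as the following sketch (where the base recorded at install time is assumed to be stored in INSTALLER or a similar file):

    import os

    def installation_moved(recorded_base, egg_info_dir):
        # True when the directory that actually contains the .egg-info
        # no longer matches the base path recorded at install time,
        # i.e. the install tree has been relocated or copied since.
        current_base = os.path.dirname(os.path.abspath(egg_info_dir))
        return (os.path.normcase(os.path.normpath(recorded_base))
                != os.path.normcase(current_base))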
[Python-Dev] "Absolute" paths in PEP 376 RECORD files
Just an idea... suppose that instead of using "real" absolute paths in the RECORD file for non-local files (scripts, data, etc) we changed the format to include a "prefix" field, containing something like LIBDIR, SHARE, SCRIPTS, etc., ala bdist_wininst internals? Also, we could include a separate (optional) PREFIXES file defining what those locations translated to at install time. Dumb bdists would still have their build paths there (or delete the file before packaging, or use real paths instead of the --root prefixed versions). Upside: relocation can be detected and handled, RECORD remains cross-platform, and bdists are ok. Downside: more complex API required to read/manipulate paths and delete files, since you need to be able to check that you have the right prefixes, and may have to ask the user for a prefix you don't recognize, if the default in PREFIXES doesn't match. Thoughts? (Also, as a special case, any file that's actually installed to LIBDIR or a subdirectory thereof (even if it's technically a "data" file or script) will probably need to be designated under the LIBDIR prefix to prevent runtime breakage in the event any system package maintainers are tempted to use RECORD files as a way of forcing FHS conformance. These are installation prefixes, *not* content types.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
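One illustrative rendering of the idea; the 'NAME = path' syntax for PREFIXES is an assumption, since no exact layout is specified above:

    # Example PREFIXES contents (assumed layout):
    #
    #   PURELIB = /usr/lib/python2.6/site-packages
    #   SCRIPTS = /usr/local/bin
    #   DATA    = /usr/local/share
    #
    # A RECORD row would then carry a prefix name plus a relative path
    # instead of an absolute path, e.g.:  SCRIPTS, myproject-admin, <md5>, <size>

    def load_prefixes(path):
        prefixes = {}
        for line in open(path):
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            name, sep, value = line.partition('=')
            if sep:
                prefixes[name.strip()] = value.strip()
        return prefixes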
Re: [Python-Dev] PEP 376 - Open questions
At 03:23 PM 7/7/2009 +0200, Tarek Ziadé wrote: When I started to work on this I didn't realize the gigantic amount of work and coordination it requires No one expects the package inquisition. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 11:30 PM 7/7/2009 +0200, Tarek Ziadé wrote: On Tue, Jul 7, 2009 at 10:31 PM, P.J. Eby wrote: > At 03:23 PM 7/7/2009 +0200, Tarek Ziadé wrote: >> >> When I started to work on this I didn't realize the gigantic amount of >> work and coordination it requires > > No one expects the package inquisition. ;-) > > Sorry, i've looked in the english dictionary but I don't get this one. what do you mean ? It's a Monty Python reference, i.e. "no one expects the Spanish Inquisition". In the Spanish Inquisition sketch, a character is asked several questions he can't answer, and then complains that he wasn't expecting some kind of Spanish inquisition, at which point three scarlet-robed cardinals burst into the room and exclaim, "NO ONE expects the Spanish Inquisition!" So I punned on that in order to imply that nobody who takes on the job of packaging expects to be asked the kind of detailed, unanswerable questions that come with the territory of packaging, or to be subjected to torture by third parties merely for trying to do something useful. And, by further implication, I was thus expressing camaraderie with you on your initiation into the wonderful world of trying to satisfy everyone in Pythonland, and our shared experience in the area of "if I'd really known what I was getting into, I probably wouldn't have". ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 01:13 PM 7/8/2009 +0100, Paul Moore wrote: An alternative solution would be for the bdist_xxx commands to ignore the RECORD file generated in the temp area, and build its own on the target machine when the installer is run. This is conceptually far cleaner, it's in line with the implicit assumptions in the PEP, and it avoids a lot of the problems with path formats (the filenames in RECORD could be absolute local pathnames, and that's the end of the story). But - and it's a big but - this dumps RECORD file handling on the bdist_xxx command, and they all need to implement the same code, resulting in a lot of duplication and opportunity for inconsistency. The PREFIXES approach I suggested doesn't have these problems; the bdist installer only needs to write a PREFIXES file, and doesn't need to touch RECORD (which would be generated by the install --root). In fact, since PREFIXES is strictly advisory, it can be generated by the install --root and left alone at installation time. And, if it's generated without the --root prefix included, RPMs would have a correct PREFIXES file without further manipulation. Plus, the bdist_wininst .exe installer can easily write a PREFIXES file, since it has to know the prefixes in order to do an installation in the first place. (We still need to hash out an initial list of prefixes, of course.) At 08:59 AM 7/8/2009 -0400, Eric Smith wrote: I agree with this. For RPM's, there's a whole other database of what files were installed. Is it really the intent that a file will be managed by multiple different installers? That I can install with RPM but remove with some python-installer (or other) tool? That way lies madness. In fact, I see RECORD as an installer-specific detail that doesn't need to be standardized at all. This is a misunderstanding. The purpose is to let an *installer* (like easy_install) know that these files belong to some other installer, and not allow overwriting them. That's why there's also an INSTALLER file. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 03:57 PM 7/8/2009 +0100, Paul Moore wrote: Who's going to use the APIs based around the RECORD file? Anyone? The distutils uninstall facility, for starters. easy_install and pip also will, eventually. In all three cases, the use will be in order to avoid overwriting files belonging to a system package manager, and in order to uninstall/upgrade their own files. (I can't personally commit that pip will need this; but it's currently the only installer besides the distutils that installs things in this format, and would therefore need this to prevent overwriting things.) Setuptools will also need to generate RECORD, INSTALLER, and PREFIXES appropriately for both development and actual installations. Seriously - who is actually going to USE any of the APIs being proposed in PEP 376? If no-one's interested, we should stop right now. Do any of the people from the summit read this list? If not, can anyone contact them and get some feedback? Package management people aren't likely to be doing anything with this stuff directly; if they need a list of installed files they're probably using --root or --record at package build time, not needing to sniff stuff at installation time, and they already have installation facilities. Their only real need in relation to this will be to set an option to generate an INSTALLER file that says you can't uninstall their packages manually, thereby preventing the package from being uninstalled by the distutils utility, or any other well-behaved PEP 376-based package management tool. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 05:22 PM 7/8/2009 +0100, Paul Moore wrote: 2009/7/8 P.J. Eby : > At 03:57 PM 7/8/2009 +0100, Paul Moore wrote: >> >> Who's going to use the APIs based around the RECORD file? Anyone? > > The distutils uninstall facility, for starters. easy_install and pip also > will, eventually. Is pip getting this via setuptools, or will it be independently changing its code? (Who's the pip contact who could confirm this directly?) It would likely be changing its code directly, with respect to supporting overwrite-avoidance and uninstall-before-upgrade scenarios. (Generation support would likely come from setuptools, dependent on who implemented first.) Ian Bicking is pip's author. In case (1), I'd imagine the user is careful enough (given that he's doing everything manually) to avoid installing something that's already present. Obviously, you've never tried doing this yourself. ;-) My guess would actually be that in Unixy environments these days, people either use virtualenv or buildouts to manage such things. Buildout handles uninstall-before-update (for eggs, anyway, I'm not sure about other things) and virtualenv makes it easy to create throwaway Python installs. So if I follow this, there's a use for this in making sure the new distutils uninstall option doesn't break something it shouldn't, but as the uninstall uses RECORD, it won't be (accidentally) usable on distributions that didn't follow PEP 376 and create a RECORD file. That's an interesting point. But for non-distutils tools, it'd be nice to be able to find out whether another package manager is administering some particular file. For that matter, the distutils themselves should (ultimately) support uninstall-before-upgrade and avoid overwriting another package manager's files. If the only driver for this PEP is setuptools, then I'm -1 on it. If it were being driven by setuptools, I'd have just implemented it myself and presented it as a fait accompli. I can't speak to Tarek's motives, but I assume that, as stated in the PEP, the primary driver is supporting the distutils being able to uninstall things, and secondarily to allow other tools to be built on top of the API. a packaging tool *other* than setuptools (or setuptools-derived projects) Is there really such a thing? ;-) AFAIK, every published tool for managing Python projects is either distutils-based or setuptools-based. (Things like scons and pymake and various other project build tools don't seem to fall under "packaging tool" in this sense.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 01:58 PM 7/8/2009 -0400, Eric Smith wrote: I really don't see this use case. Supporting multiple installers for the same file (or even just trying to prevent two installers from writing the same file)? Wouldn't you be better off just saying an installer can't overwrite any existing file? Likewise, I don't see a use case for installing with one installer and uninstalling with another. Put those two together, and the mechanism that an installer uses to record what files it installed is a private implementation detail. You'll have to ask Tarek about that. As I've said in another email, if I were doing this for setuptools I'd just go and do it; it's already on my backlog of things to do. The point of the PEP is to provide an uninstall facility for the distutils, and to expose those records via an API so that other tools can be built on top of it. Whether anything other than distutils-based tools will actually use the format directly is entirely irrelevant to the PEP's merits, since bdist_* tools and platform-specific distutils installers may need to read or write the format. Hence, a PEP is appropriate to describe a format and API for the distutils family of tools to operate on. The specific proposed format is based on an existing de facto standard for installed package metadata, so that the many existing packaging tools based on that other format and existing APIs can easily read or write this data, and so that the API can also be used as a way to detect the presence of already-installed distributions of software -- even ones installed other ways. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 09:38 AM 7/9/2009 +0900, Stephen J. Turnbull wrote: Eric Smith writes: > But I think we've veered into metadata that describes what has been > installed. I don't think that's so useful. As I've said, this is private > to the installers. If 2 installers want to communicate with each other > about what they've installed, then they can agree on that data. I just > don't find it generally useful for all installers, and therefore not > useful for distutils. ISTM that the problem that it solves is uninstall in the absence of the original installer. Or uninstall where the installer is "setup.py install", actually. Am I to understand that you don't think that use case is important? Or that there's another way to do this? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "Absolute" paths in PEP 376 RECORD files
At 06:51 PM 7/8/2009 -0700, Sridhar Ratnakumar wrote: Is there any reason why RECORD file can't be generated at runtime? Applications can be relocatable, yet require plugins and libraries to be upgraded, installed, uninstalled, etc. (Also, RECORD is cross-platform to allow cross-platform installation/testing of cross-platform packages; that's why I propose keeping the prefix information separate.) Also, why should the RECORD file be generated at all by bdist* commands? bdist commands that use "install --root" will get it as a side effect, so they don't have to do anything special. They needn't even generate the PREFIXES, if PREFIXES is also generated by "install --root", and omits the root from the prefixes written. In other words, most bdist tools wouldn't change. bdist_wininst and bdist_msi might need to regenerate PREFIXES at actual installation time, but AFAIK those are the only ones. (And the spec would allow for the possibility that a PREFIXES file can be incorrect.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 11:20 PM 7/8/2009 -0400, Eric Smith wrote: P.J. Eby wrote: ISTM that the problem that it solves is uninstall in the absence of the original installer. Or uninstall where the installer is "setup.py install", actually. I think we need to move away from "setup.py install". It's the antithesis of static metadata. Please note that that's entirely out of scope for the PEP at hand. That being said, the rest of your proposal is strikingly similar to a proposal I previously floated on the distutils-sig for a concept called BUILDS. The main difference is that I suggested that the spec should include a standard interface for running build operations that would produce the manifest (equivalent to your setup.info), and that distutils and setuptools should provide setup.py commands to generate said manifest, to allow for a seamless transition. There was very little comment on the proposal, perhaps because it involves a lot of work that most people are sane enough not to sign up for. ;-) (Or more to the point, it's the sort of thing that never gets off the ground for design by consensus. The best way to make something like your proposal to happen is to go off and build it, and get it to take over. If it can't win substantial market share on its own merits, it probably doesn't deserve to be blessed as a standard.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 376 - Open questions
At 11:28 PM 7/8/2009 -0400, Eric Smith wrote: Eventually, I'd like PEP 376 to support system packagers too. So for example, if you did "apt-get install python-pyqt4", then running "pip install python-pyqt4" should return without installing anything .. as RECORD will be part of the .deb previously installed. As for generating the RECORD file, I vote for generating it during install time (w/ absolute paths). I think we should explicitly not support this. What if pip and apt-get (or rpm, or others) install the same package in different places because of system conventions (/usr vs. /usr/local vs. /opt, say)? There's no way we're ever going to get this right, and it's not worth complicating our lives over it. Seriously: Is there some real world use case I'm missing? Does any existing install system support this? The use case described above is supported right now by easy_install and pip; it doesn't require a RECORD file, though. An .egg-info file or an .egg-info directory with a PKG-INFO is enough to prevent duplicate installation. I don't know how pip handles file overwrites; easy_install OTOH never overwrites anything that's not in an .egg or a .pth. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
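A sketch of the duplicate-install check described: before installing, a tool scans each directory it would install into for an .egg-info entry (file or directory) belonging to the project. The filename matching below is simplified; the real name-escaping and version rules live in the pkg_resources docs:

    import os

    def appears_installed(project_name, search_dirs):
        # distutils records e.g. 'My-Project 1.0' under a name like
        # 'My_Project-1.0-py2.6.egg-info'; compare case-insensitively.
        prefix = project_name.lower().replace('-', '_') + '-'
        for d in search_dirs:
            if not os.path.isdir(d):
                continue
            for entry in os.listdir(d):
                if entry.endswith('.egg-info') and entry.lower().startswith(prefix):
                    return True
        return False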
Re: [Python-Dev] "Absolute" paths in PEP 376 RECORD files
At 02:46 PM 7/9/2009 -0400, Tres Seaver wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Nick Coghlan wrote: > P.J. Eby wrote: >>> Also, >>> why should the RECORD file be generated at all by bdist* commands? >> bdist commands that use "install --root" will get it as a side effect, >> so they don't have to do anything special. They needn't even generate >> the PREFIXES, if PREFIXES is also generated by "install --root", and >> omits the root from the prefixes written. In other words, most bdist >> tools wouldn't change. bdist_wininst and bdist_msi might need to >> regenerate PREFIXES at actual installation time, but AFAIK those are the >> only ones. (And the spec would allow for the possibility that a >> PREFIXES file can be incorrect.) > > I haven't seen any suggestions for handling non-relative paths that are > cleaner than this. The distutils based list that Tarek posted also gives > a good "starter set" of supported entries in the PREFIXES file: > > PURELIB= > PLATLIB= > HEADERS= > SCRIPTS= > DATA= > > The question then would be whether relative paths should be handled > directly in the RECORD file (as in the current PEP) or by writing a "." > into PREFIXES to indicate relative paths. Per discussion with Matthias Klose, it would be useful to split out the following prefixes for "special" kinds of data: - - docs - - i18n / locales Unfortunately, the distutils don't currently have a way to specify these. I think we'll need to use only the ones that are distutils install targets at first, but of course the format is extensible. OTOH, one supposed that at least the 'data_files' setup argument could be enhanced to allow a data type flag, that would then be used as a prefix name for those files, with a fallback to the DATA prefix if there isn't one specified. But that's something that will require tagging support in the install_data command. As it is, refactoring the distutils to track prefixes for the RECORD file is probably going to be... challenging. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] "Absolute" paths in PEP 376 RECORD files
At 03:31 PM 7/10/2009 +0200, Tarek Ziadé wrote: On Thu, Jul 9, 2009 at 9:09 PM, P.J. Eby wrote: > At 02:46 PM 7/9/2009 -0400, Tres Seaver wrote: >> >> -BEGIN PGP SIGNED MESSAGE- >> Hash: SHA1 >> >> Nick Coghlan wrote: >> > P.J. Eby wrote: >> >>> Also, >> >>> why should the RECORD file be generated at all by bdist* commands? >> >> bdist commands that use "install --root" will get it as a side effect, >> >> so they don't have to do anything special. They needn't even generate >> >> the PREFIXES, if PREFIXES is also generated by "install --root", and >> >> omits the root from the prefixes written. In other words, most bdist >> >> tools wouldn't change. bdist_wininst and bdist_msi might need to >> >> regenerate PREFIXES at actual installation time, but AFAIK those are >> >> the >> >> only ones. (And the spec would allow for the possibility that a >> >> PREFIXES file can be incorrect.) >> > >> > I haven't seen any suggestions for handling non-relative paths that are >> > cleaner than this. The distutils based list that Tarek posted also gives >> > a good "starter set" of supported entries in the PREFIXES file: >> > >> > PURELIB= >> > PLATLIB= >> > HEADERS= >> > SCRIPTS= >> > DATA= >> > >> > The question then would be whether relative paths should be handled >> > directly in the RECORD file (as in the current PEP) or by writing a "." >> > into PREFIXES to indicate relative paths. I like the idea of a separated PREFIXES file, then using variables in the RECORD file. Please note that my proposal intentionally omitted "variables", but instead used a separate column to designate the prefix. This allows some guarantees regarding path sanity that could otherwise be abused, like combining prefixes or inserting them in the middle of paths. In this way, the formal conversion of a path would be os.path.join(PREFIX, *relative_path.split('/')), rather than a variable expansion. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
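As code, the formal conversion described above looks roughly like this (the prefixes mapping is assumed to come from a PREFIXES-style file):

    import os

    def resolve(prefix_name, relative_path, prefixes):
        # The prefix is looked up as a whole -- never expanded inside a
        # path -- and the '/'-separated segments are joined with the
        # local separator:  os.path.join(PREFIX, *relative_path.split('/')).
        try:
            base = prefixes[prefix_name]
        except KeyError:
            # An unrecognized prefix has to be resolved by asking the
            # user (or the installing tool) where it points.
            raise ValueError('unknown prefix: %r' % (prefix_name,))
        return os.path.join(base, *relative_path.split('/'))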