[Python-Dev] PEP 487 vs 422 (dynamic class decoration)
I recently got an inquiry from some of my users about porting to Python 3 some of my libraries that make use of the Python 2 __metaclass__ facility. While checking up on the status of PEP 422 today, I found out about its recently proposed replacement, PEP 487.

While PEP 487 is a generally fine PEP, it actually *rules out* the specific use case that I wanted PEP 422 for in the first place: dynamic addition of callbacks or decorators for use at class creation time without requiring explicit inheritance or metaclass participation. (So that e.g. method decorators can access the enclosing class at class definition time.)

As discussed prior to the creation of PEP 422, it is not possible to port certain features of my libraries to work on Python 3 without some form of that ability, and the only thing that I know of that could even *potentially* provide that ability outside of PEP 422 is monkeypatching __build_class__ (which might not even work). That is, the very thing that PEP 422 was created to avoid the need for. ;-)

One possible alteration would be to replace __init_subclass__ with some sort of __init_class__ invoked on the class that provides it, not just subclasses. That would allow the kind of dynamic decoration that PEP 422 allows. However, this approach was rather specifically ruled out in earlier consideration of PEP 422.

Another alternative would be to have the default __init_subclass__ look at a class-level __decorators__ attribute, as originally discussed for PEP 422. That would solve *my* problem, but feels too much like adding more than One Way To Do It.

So... honestly, I'm not sure where to go from here. Is there any chance that this is going to be changed, or revert to the PEP 422 approach, or... something? If so, what Python version will the "something" be in? Or is this use case just going to be a dead parrot in Python 3, period?
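For concreteness, the __build_class__ fallback alluded to above would look roughly like the following sketch: a wrapper around builtins.__build_class__ that applies a PEP 422-style __decorators__ list once the class body has run. This is purely an illustration written for this summary -- the __decorators__ protocol is hypothetical (taken from the withdrawn PEP 422 drafts), and this is not code from any of the libraries mentioned:

    import builtins

    _original_build_class = builtins.__build_class__

    def _build_class(func, name, *bases, metaclass=None, **kwds):
        # Let the normal machinery create the class first.
        if metaclass is None:
            cls = _original_build_class(func, name, *bases, **kwds)
        else:
            cls = _original_build_class(func, name, *bases,
                                        metaclass=metaclass, **kwds)
        # Hypothetical PEP 422-style hook: apply any decorators collected
        # in the class body (e.g. by method decorators that registered
        # themselves there), innermost first.
        for deco in reversed(cls.__dict__.get('__decorators__', ())):
            cls = deco(cls) or cls
        return cls

    builtins.__build_class__ = _build_class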
Re: [Python-Dev] PEP 487 vs 422 (dynamic class decoration)
On Wed, Apr 1, 2015 at 10:39 PM, Nick Coghlan wrote:
> On 2 April 2015 at 07:35, PJ Eby wrote:
>> I recently got an inquiry from some of my users about porting some of
>> my libraries to Python 3 that make use of the Python 2 __metaclass__
>> facility. While checking up on the status of PEP 422 today, I found
>> out about its recently proposed replacement, PEP 487.
>>
>> While PEP 487 is a generally fine PEP, it actually *rules out* the
>> specific use case that I wanted PEP 422 for in the first place:
>> dynamic addition of callbacks or decorators for use at class creation
>> time without requiring explicit inheritance or metaclass
>> participation. (So that e.g. method decorators can access the
>> enclosing class at class definition time.)
>
> How hard is the requirement against relying on a mixin class or class
> decorator to request the defining class aware method decorator
> support? Is the main concern with the fact that failing to apply the
> right decorator/mixin at the class level becomes a potentially silent
> failure where the class aware method decorators aren't invoked
> properly?

The concern is twofold: it breaks proper information hiding/DRY, *and* it fails silently. It should not be necessary for clients of package A1 (that uses a decorator built using package B2) to mix in a metaclass or decorator from package C3 (because B2 implemented its decorators using C3), just for package A1's decorator to work properly in the *client package's class*. (And then, of course, this all silently breaks if you forget, and the breakage might happen at the A1, B2, or C3 level.)

Without a way to hook into the class creation process, there is no way to verify correctness and prevent the error from passing silently. (OTOH, if there *is* a way to hook into the creation process, the problem is solved: there's no need to mix anything in anyway, because the hook can do whatever the mixin was supposed to do.)

The only way PEP 487 could be a solution is if the default `object.__init_subclass__` supported one of the earlier __decorators__ or __autodecorate__ proposals, or if the PEP were for an `__init_class__` that operated on the defining class, instead of operating only on subclasses. (I need to hook the creation of a class that's *being defined*, not the definition of its future subclasses.)

> My preference at this point would definitely be to introduce a mixin
> class into the affected libraries and frameworks with an appropriate
> PEP 487 style __init_subclass__ that was a noop in Python 2 (which
> would rely on metaclass injection instead), but implemented the
> necessary "defining class aware" method decorator support in Python 3.

If this were suitable for the use case, I'd have done it already. DecoratorTools has had a mixin that provides a __class_init__ feature since 2007, which could be ported to Python 3 in a straightforward manner as a third-party module. (It's just a mixin that provides a metaclass; under 3.x it could probably just be a plain metaclass with no mixin.)

> The question of dynamically injecting additional base classes from the
> class body to allow the use of certain method decorators to imply
> specific class level behaviour could then be addressed as a separate
> proposal (e.g. making the case for an "__append_mixins__" attribute),
> rather than being linked directly to the question of how we go about
> defining inherited creation time behaviour without needing a
> custom metaclass.
Then maybe we should do that first, since PEP 487 doesn't do anything you can't *already* do with a mixin, all the way back to Python 2.2.

IOW, there's no need to modify the core just to have *that* feature, since if you control the base class you can already do what PEP 487 does in essentially every version of Python, ever. If that's all PEP 487 is going to do, it should just be a PyPI package on a stdlib-inclusion track, not a change to core Python. It's not actually adding back any of the dynamicness (dynamicity? hookability?) that PEP 3115 took away.
Re: [Python-Dev] PEP 487 vs 422 (dynamic class decoration)
On Thu, Apr 2, 2015 at 4:46 AM, Nick Coghlan wrote:
> On 2 April 2015 at 16:38, PJ Eby wrote:
>>
>> IOW, there's no need to modify the core just to have *that* feature,
>> since if you control the base class you can already do what PEP 487
>> does in essentially every version of Python, ever. If that's all PEP
>> 487 is going to do, it should just be a PyPI package on a
>> stdlib-inclusion track, not a change to core Python. It's not
>> actually adding back any of the dynamicness (dynamicity?
>> hookability?) that PEP 3115 took away.
>
> The specific feature that PEP 487 is adding is the ability to
> customise creation of subclasses without risking the introduction of a
> metaclass conflict. That allows it to be used in situations where
> adopting any of the existing metaclass based mechanisms would require
> a potential compatibility break

But metaclass conflicts are *also* fixable in end-user code, and have been since 2.2. All you need to do is use a metaclass *function* that automatically merges the metaclasses involved, which essentially amounts to doing `class MergedMeta(base1.__class__, base2.__class__, ...)`. (Indeed, I've had a library for doing just that since 2002, one that originally ran on Python 2.2.)

On Python 3, it's even easier to use that approach, because you can just use something like `class whatever(base1, base2, metaclass=noconflict)` whenever a conflict comes up. (And because the implementation wouldn't have to deal with classic classes or __metaclass__, as my Python 2 implementation has to.)

IOW, *all* of PEP 487 is straightforward to implement in userspace as a metaclass and a function that already exist off-the-shelf in Python 2... and whose implementations would be simplified by porting them to Python 3 and dropping any extraneous features:

* http://svn.eby-sarna.com/PEAK/src/peak/util/Meta.py?view=markup
  (the `makeClass` function does what my hypothetical `noconflict` above
  does, with a slightly different API, and support for classic classes,
  __metaclass__, etc., that could all be stripped out)

* http://svn.eby-sarna.com/DecoratorTools/peak/util/decorators.py?view=markup
  (see the `classy_class` metaclass and `classy` mixin base that implement
  features similar to `__init_subclass__`, plus others that could be
  stripped out)

Basically, you can pull out those functions/classes (and whatever else they use in those modules), port 'em to Python 3, make any API changes deemed suitable, and call it a day. And the resulting code could go to a stdlib metaclass utility module after a reasonable break-in period.

> (as well as being far more
> approachable as a mechanism than the use of custom metaclasses).

Sure, nobody's arguing that it's not a desirable feature. I *implemented* that mechanism for Python 2 (eight years ago) because it's easier to use, even for those of us who are fully versed in the dark metaclass arts. ;-) Here's the documentation:

http://peak.telecommunity.com/DevCenter/DecoratorTools#meta-less-classes

So the feature doesn't even require *stdlib* adoption, let alone changes to Python core. (Heck, I wasn't even the first to implement this feature: Zope had it for Python *1.5.2*, in their ExtensionClass.) It's a totally solved problem in Python 2, although the solution is admittedly not widely known.

If the PEP 487 metaclass library, however, were to just port some bits of my code to Python 3, this could be a done deal already and available in *all* versions of Python 3, not just the next one.
> The gap I agree this approach leaves is a final
> post-namespace-execution step that supports establishing any class
> level invariants implied by decorators and other functions used in the
> class body. Python 2 allowed that to be handled with a dynamically
> generated __metaclass__ and PEP 422 through __autodecorate__, while
> PEP 487 currently has no equivalent mechanism.

Right. And it's *only* having such a mechanism available by *default* that requires a language change. Conversely, if we *are* making a language change, then adding a hook that allows method decorators to access the just-defined class provides roughly the same generality that Python 2 had in this respect.

All I want is the ability for method decorators to find out what class they were added to, at the time the class is built, rather than having to wait for an access or invocation that may never come. This could be as simple as __build_class__ or type.__call__ looking through the new class's dictionary for objects with a `__used_in_class__(cls, name)` method, e.g.:

    for k, v in dict.items():
        if hasattr(v, '__used_in_class__'):
            v.__used_in_class__(cls, k)

This doesn't do what PEP 487 or 422 do, but it's the bare minimum for what I need, and it actually allows this type of decor
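To make the sketched `__used_in_class__` protocol concrete, a method decorator participating in it might look something like the following. This is purely illustrative -- both the protocol and the `register_remote` name are hypothetical, not anything defined by PEP 422 or 487:

    class register_remote:
        """Hypothetical decorator that must know its enclosing class."""

        def __init__(self, func):
            self.func = func

        def __used_in_class__(self, cls, name):
            # Called once, as soon as the class body has finished executing.
            print("registering %s.%s for remote invocation"
                  % (cls.__name__, name))

        def __get__(self, obj, objtype=None):
            # Otherwise behave like the wrapped function (non-data descriptor).
            return self.func.__get__(obj, objtype)

A class body could then simply use @register_remote on its methods, and the loop sketched above would notify each decorated method as soon as the class object exists.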
Re: [Python-Dev] PEP 487 vs 422 (dynamic class decoration)
On Thu, Apr 2, 2015 at 1:42 PM, PJ Eby wrote:
> If the PEP 487 metaclass library,
> however, were to just port some bits of my code to Python 3 this could
> be a done deal already and available in *all* versions of Python 3,
> not just the next one.

Just for the heck of it, here's an actual implementation and demo of PEP 487, that I've tested with 3.1, 3.2, and 3.4 (I didn't have a copy of 3.3 handy):

https://gist.github.com/pjeby/75ca26f8d2a7a0c68e30

The first module is just a demo that shows the features in use. The second module is the implementation. Notice that the actual *functionality* of PEP 487 is just *16 lines* in Python 3... including docstrings and an `__all__` definition. ;-)

The other 90 lines of code are only there to implement the `noconflict` feature for fixing metaclass conflicts... and quite a lot of *those* lines are comments and docstrings. ;-)

Anyway, I think this demo is a knockout argument for why PEP 487 doesn't need a language change: if you're writing an __init_subclass__ method you just include the `pep487.init_subclasses` base in your base classes, and you're done. It'll silently fail if you leave it out (but you'll notice that right away), and it *won't* fail in third-party subclasses because the *third party* didn't include it.

In contrast, PEP 422 provided a way to have both the features contemplated by 487, *and* a way to allow method-level decorators to discover the class at class creation time. If there's going to be a language change, it should include that latter feature from the outset.
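For readers who don't follow the gist link, an emulation of __init_subclass__ as an ordinary metaclass has roughly the following shape. This is a sketch written for this summary, not the gist's code, and it assumes a Python 3 without a native __init_subclass__ (i.e. pre-3.6 semantics) and no support for extra class keywords:

    class InitSubclassMeta(type):
        """Emulate the core of PEP 487's __init_subclass__ with a metaclass."""

        def __init__(cls, name, bases, ns):
            super().__init__(name, bases, ns)
            # Look the hook up starting *after* cls in the MRO, so the class
            # that defines __init_subclass__ is not passed to its own hook;
            # only its subclasses are -- which is PEP 487's behaviour.
            hook = getattr(super(cls, cls), '__init_subclass__', None)
            if hook is not None:
                hook()   # already bound to cls via the classmethod lookup

    class SubclassInit(metaclass=InitSubclassMeta):
        @classmethod
        def __init_subclass__(cls, **kwargs):
            pass         # default hook: do nothing

    # Example use:
    class Registered(SubclassInit):
        registry = []

        @classmethod
        def __init_subclass__(cls, **kwargs):
            super().__init_subclass__(**kwargs)
            Registered.registry.append(cls)

    class Widget(Registered):
        pass

    assert Registered.registry == [Widget]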
Re: [Python-Dev] PEP 487 vs 422 (dynamic class decoration)
On Thu, Apr 2, 2015 at 6:24 PM, Martin Teichmann wrote:
> The whole point of PEP 487 was to reduce PEP 422 so much that
> it can be written in python and back-ported.

As I said earlier, it's a fine feature and should be in the stdlib for Python 3. (But it should have a `noconflict` feature added, and it doesn't need a language change.)

However, since my specific use case was the one PEP 422 was originally written to solve, and PEP 487 does not address that use case, it is not a suitable substitute *for PEP 422*. This is also not your fault; you didn't force Nick to withdraw it, after all. ;-)

My main concern in this thread, however, is ensuring that either the use case behind PEP 422 doesn't get dropped, or that Nick is now okay with me implementing that feature by monkeypatching __build_class__. Since he practically begged me not to do that in 2012, and IIRC *specifically created* PEP 422 to provide an alternative way for me to accomplish this *specific* use case, I wanted to see what his current take was. (That is, did he forget the history of the PEP, or does he no longer care about userspace code hooking __build_class__? Is there some other proposal that would be a viable alternative? etc.)

> Now you want to be able to write decorators whose details
> are filled in at class creation time.

Not "now"; it's been possible to do this in Python 2 for over a decade, and code that does so is in current use by other packages. The package providing this feature (DecoratorTools) was downloaded 145 times today, and 3274 times in the past month, so there is active, current use of it by other Python 2 packages. (Though I don't know how many of them depend directly or indirectly upon this particular feature.)

Currently, however, it is not possible to port this feature of DecoratorTools (or any other package that uses that feature, recursively) to Python 3, due to the removal of __metaclass__ and the lack of any suitable substitute hook.

> Your point is that you want to be able to use your decorators
> without having to ask users to also inherit a specific class.
> I personally don't think that's desirable. Many frameworks out
> there have such kind of decorators and mandatory base classes
> and that works fine.

The intended use case is for generic method decorators that have nothing to do with the base class per se, so inheriting from a specific base class is an anti-feature in this case.

> The only problem remains once you need to
> inherit more than one of those classes, as their metaclasses
> most likely clash. This is what PEP 487 fixes.

No, it addresses the issue for certain *specific* metaclass use cases. It does not solve the problem of metaclass conflict in general; for that you need something like the sample `noconflict` code I posted, which works for Python 3.1+ and doesn't require a language change.

> So my opinion is that it is not too hard a requirement to ask
> a user to inherit a specific mixin class for the sake of using
> a decorator.

If this logic were applied to PEP 487 as it currently stands, the PEP should be rejected, since its use case is even *more* easily accomplished by inheriting from a specific mixin class. (Since the feature only works on subclasses anyway!)

Further, if the claim is that metaclass conflict potential makes PEP 487 worthy of a language change, then by the same logic method decorators are just as worthy of a language change, since any mixin required to use a method decorator would be *just as susceptible* to metaclass conflicts as SubclassInit.
(Notably, the stdlib's ABCMeta is a common cause of metaclass conflicts in Python 2.6+ -- if you mix in anything that implements an ABC by subclassing it, you will get a metaclass conflict.)

Finally, I of course disagree with the conclusion that it's okay to require mixins in order for method decorators to access the containing class, since it is not a requirement in Python 2, due to the availability of the __metaclass__ hook. Further, PEP 422 was previously approved to fix this problem, and has a patch in progress, so I'm understandably upset by its sudden withdrawal and lack of a suitable replacement.

So personally, I think that PEP 422 should be un-withdrawn (or replaced with something else), and PEP 487 should be retargeted towards defining a `metaclass` module for the stdlib, including a `noconflict` implementation to address metaclass conflict issues. (Mine or someone else's, as long as it works.) PEP 487 should not be a proposal to change the language, as the provided features don't require it. (And it definitely shouldn't pre-empt a separately useful feature that *does* require a language change.)

At this point, though, I mostly just want to get some kind of closure. After three years, I'd like to know if this is a yea or nay, so I can port the thing and move on, whether it's through a standardized mechanism or ugly monkeypatching. Honestly, the only reason I'm even discussing this in the first
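The ABCMeta conflict mentioned above is easy to reproduce; a minimal illustration (not from the thread, just a demonstration of the failure mode):

    import abc

    class SomeABC(metaclass=abc.ABCMeta):
        pass

    class Implementation(SomeABC):       # looks ordinary, but its
        pass                             # metaclass is ABCMeta

    class FrameworkMeta(type):
        pass

    class FrameworkBase(metaclass=FrameworkMeta):
        pass

    try:
        class Combined(Implementation, FrameworkBase):
            pass
    except TypeError as exc:
        # "metaclass conflict: the metaclass of a derived class must be a
        #  (non-strict) subclass of the metaclasses of all its bases"
        print(exc)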
Re: [Python-Dev] PEP 487 vs 422 (dynamic class decoration)
On Thu, Apr 2, 2015 at 10:29 PM, Greg Ewing wrote:
> On 04/03/2015 02:31 PM, Nick Coghlan wrote:
>>
>> If I'm understanding PJE's main concern correctly it's that this
>> approach requires explicitly testing that the decorator has been
>> applied correctly in your automated tests every time you use it, as
>> otherwise there's a risk of a silent failure when you use the
>> decorator but omit the mandatory base class that makes the decorator
>> work correctly.
>
> Could the decorator be designed to detect that situation
> somehow? E.g. the first time the decorated method is called,
> check that the required base class is present.

No, because in the most relevant use case, the method will never be called if the base class isn't present.

For more details, see also the previous discussion at
https://mail.python.org/pipermail/python-dev/2012-June/119883.html
Re: [Python-Dev] PEP 487 vs 422 (dynamic class decoration)
On Thu, Apr 2, 2015 at 9:31 PM, Nick Coghlan wrote:
> On 3 April 2015 at 08:24, Martin Teichmann wrote:
> However, I'm also now wondering if it may be possible to reach out to
> the pylint authors (similar to what Brett did for the "pylint --py3k"
> flag) and ask for a way to make it easy to register "base class,
> decorator" pairs where pylint will complain if it sees a particular
> method decorator but can't determine at analysis time if the named
> base class is in the MRO for the class defining the method.

Will it *also* check the calling chain of the decorator, or any other thing that's called or invoked in the class body, to find out if somewhere, somehow, it asks for a class decoration? If not, it's not going to help with this use case.

There are many ways to solve this problem by re-adding a hook -- you and I have proposed several, in 2012 and now. There are none, however, which do not involve putting back the hookability that Python 3 took out, except by using hacks like sys.settrace() or monkeypatching __build_class__.
Re: [Python-Dev] PEP 487 vs 422 (dynamic class decoration)
On Fri, Apr 3, 2015 at 8:44 AM, Martin Teichmann wrote:
> This proposal can actually be seen as an extension to the __class__
> and super() mechanism of normal methods: methods currently have the
> privilege to know which classes they are defined in, while descriptors
> don't. So we could unify all this by giving functions a __post_process__
> method which sets the __class__ in the function body. This is about the
> same as what happened when functions got a __get__ method to turn
> them into object methods.
>
> While this all is in the making, PJ could monkey-patch __build_class__
> to do the steps described above, until it gets accepted into cpython.
> So I pose the question to PJ: would such an approach solve the
> problems you have?

Universal member post-processing actually works *better* for the motivating use case than the metaclass or class-level hooks, so yes.

In practice, there is one potential hiccup, and that's that decorators which aren't aware of __post_process__ will end up masking it. But that's not an insurmountable obstacle.
Re: [Python-Dev] PEP 487 vs 422 (dynamic class decoration)
On Fri, Apr 3, 2015 at 11:04 AM, Nick Coghlan wrote:
> Extending the descriptor protocol to include a per-descriptor hook that's
> called at class definition time sounds like a potentially nice way to go to
> me. While you *could* still use it to arbitrarily mutate the class object,
> it's much clearer that's not the intended purpose, so I don't see it as a
> major problem.

Just to be clear, mutating the class object was never the point for my main use case that needs the PEP 422 feature; it was for method overloads that are called remotely and need to be registered elsewhere.

For some of my other use cases, adding metadata to the class is a convenient way to do things, but classes are generally weak-referenceable, so the add-on data can be (and often is) stored in a weak-key dictionary rather than placed directly on the class.
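The weak-key registry pattern mentioned above is short enough to show inline; a generic sketch (the names are illustrative, not from any of the packages discussed):

    import weakref

    # Maps class -> metadata dict; entries disappear when the class is
    # garbage collected, so nothing is stored on the class itself.
    _class_metadata = weakref.WeakKeyDictionary()

    def annotate(cls, **metadata):
        _class_metadata.setdefault(cls, {}).update(metadata)

    def metadata_for(cls):
        return _class_metadata.get(cls, {})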
Re: [Python-Dev] PEP 487 vs 422 (dynamic class decoration)
On Fri, Apr 3, 2015 at 4:21 AM, Nick Coghlan wrote:
> That means I'm now OK with monkeypatching __build_class__ being the
> only way to get dynamic hooking of the class currently being defined
> from the class body - folks that really want that behaviour can
> monkeypatch it in, while folks that think it's a bad idea don't need
> to worry about.

I'd still prefer to only do that as an emulation of an agreed-upon descriptor notification protocol, such that it's a backport of an approved PEP, so I hope we can work that out. But I guess if not, then whatever works.

I just wish you'd been okay with it in 2012, as there was more than once in the last few years where I had some downtime and thought about trying to do some porting work. :-( And in the meantime, the only alternative Python implementation I know of that's made *any* headway on Python 3 in the last few years (i.e., PyPy 3) *includes* a compatibly monkeypatchable __build_class__. It appears that the *other* obstacles to making a compatible Python 3 implementation are a lot tougher for implementers to get over than compatibility with __build_class__. ;-)

> Neither PEP 422 nor 487 are designed to eliminate metaclass conflicts
> in general, they're primarily designed to let base classes run
> arbitrary code after the namespace has been executed in a subclass
> definition *without* needing a custom metaclass.

And yet the argument was being made that the lack of a custom metaclass was a feature because it avoided conflict. I'm just trying to point out that if avoiding conflict is desirable, building *every possible metaclass feature* into the Python core isn't a scalable solution.

At this point, co-operative inheritance is a well-understood model in Python, so providing an API to automatically mix metaclasses (explicitly, at first) seems like a good step towards solving the metaclass conflict problem in general. When Guido introduced the new MRO scheme in Python 2.2, he noted that the source he'd gotten that scheme from had explained that it could be extended to automatically mixing metaclasses, but he (Guido) didn't want to do that in Python until more experience was had with the new MRO scheme in general. And I think we have enough experience with that *now* to be able to take a step forward, by providing a stdlib-blessed metaclass mixer.

It not only makes the prototype, PyPI-based version of PEP 487 more usable immediately, it will also encourage people to develop metaclasses as *mixins* rather than one-size-fits-all monoliths. For example, there's no reason that both of PEP 487's features need to live in the *same* metaclass, if you could trivially mix metaclasses at the point where you inherit from bases with different metaclasses. (And eventually, a future version of Python could do the mixing automatically, without the `noconflict` function. The theory was well-understood for other languages, after all, before Python 2.2 even came out.)

> No, you can't do it currently without risking a backwards
> incompatibility through the introduction of a custom metaclass.

Right... which is precisely why I'm suggesting the `noconflict()` metaclass factory function as a *general* solution for providing useful metaclasses, and why I think that PEP 487 should break the namespacing and subclass-init features into separate metaclasses, and add that noconflict feature. It will then become a good example for people moving forward writing metaclasses.
Basically, as long as you don't have the pointless conflict errors, you can write co-operative metaclass mixins as easily as you can write regular co-operative mixins. I was missing this point myself because I've been too steeped in Python 2's complexities: writing a usable version of `noconflict()` there is a lot more complex, and its invocation far more obscure. In Python 2, there's classic classes, class- and module-level __metaclass__, ExtensionClass, and all sorts of other headaches for automatic mixing. In Python 3, though, all that stuff goes out the window, and even my 90-line version that's almost half comments is probably still overengineered compared to what's actually needed to do the mixing.

>> Further, if the claim is that metaclass conflict potential makes PEP
>> 487 worthy of a language change, then by the same logic method
>> decorators are just as worthy of a language change, since any mixin
>> required to use a method decorator would be *just as susceptible* to
>> metaclass conflicts as SubclassInit.
>
> There wouldn't be a custom metaclass involved in the native
> implementation of PEP 487, only in the backport.

Right... and if there were a native implementation of PEP 422, that would also be the case for PEP 422. The point is that if PEP 487 can justify a *language* change to avoid needing a metaclass, then arguably PEP 422 has an even *better* justification, because its need to avoid needing a metaclass is at least as strong. Indeed, you said the same yourself as recent
Re: [Python-Dev] PEP 487 vs 422 (dynamic class decoration)
On Sat, Apr 4, 2015 at 9:33 PM, Nick Coghlan wrote:
> So actually reading https://gist.github.com/pjeby/75ca26f8d2a7a0c68e30
> properly, you're starting to convince me that a "noconflict" metaclass
> resolver would be a valuable and viable addition to the Python 3 type
> system machinery.
>
> The future possible language level enhancement would then be to make
> that automatic resolution of metaclass conflicts part of the *default*
> metaclass determination process. I realise you've been trying to
> explain that to me for a few days now, I'm just writing it out
> explicitly to make it clear I finally get it :)

I'm glad you got around to reading it. Sometimes it's really frustrating trying to get things like that across.

What's funny is that once I actually 1) wrote that version, and 2) ended up doing a version of six's with_metaclass() function so I could write 2/3 mixed code in DecoratorTools, I realized that there isn't actually any reason why I can't write a Python 2 version of noconflict. Indeed, with a slight change to eliminate ClassType from the metaclass candidate list, the Python 3 version would also work as the Python 2 version: just use it as the explicit __metaclass__, or use with_metaclass, i.e.:

    class something(base1, base2, ...):
        __metaclass__ = noconflict
        # ...

or:

    class something(with_metaclass(noconflict, base1, base2, ...)):
        # ...

And the latter works syntactically from Python 2.3 on up.

> My apologies for that - while I don't actually recall what I was
> thinking when I said it, I suspect I was all fired up that PEP 422 was
> definitely the right answer, and hence thought I'd have an official
> solution in place for you in fairly short order. I should have let you
> know explicitly when I started having doubts about it, so you could
> reassess your porting options.

Well, at least it's done now. Clearing up the issue allowed me to spend some time on porting some of the relevant libraries this weekend, where I promptly ran into challenges with several of the *other* features removed from Python 3 (like tuple arguments), but fortunately those are issues more of syntactic convenience than irreplaceable functionality. ;-)
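For readers who haven't followed the gist link, a Python 3 `noconflict` along the lines being discussed can be sketched as below. This is not the gist's code (and omits its caching and error-reporting niceties); it just shows the shape of the idea: pick or synthesize a metaclass that derives from every base's metaclass, then create the class with it:

    def noconflict(name, bases, namespace, **kwargs):
        """Metaclass-style callable that resolves metaclass conflicts."""
        # Collect the most-derived metaclasses among the bases.
        metas = [type]
        for base in bases:
            meta = type(base)
            if any(issubclass(m, meta) for m in metas):
                continue                    # already covered
            # Drop anything this metaclass already derives from.
            metas = [m for m in metas if not issubclass(meta, m)]
            metas.append(meta)
        if len(metas) == 1:
            winner = metas[0]
        else:
            # Synthesize a merged metaclass; genuinely incompatible layouts
            # will still raise TypeError here, as they should.
            winner = type('NoConflict_' + '_'.join(m.__name__ for m in metas),
                          tuple(metas), {})
        return winner(name, bases, namespace, **kwargs)

Used at the point of inheritance, exactly as in the invocation shown earlier in the thread:

    class Combined(base1, base2, metaclass=noconflict):
        pass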
Re: [Python-Dev] async/await in Python; v2
On Tue, Apr 21, 2015 at 1:26 PM, Yury Selivanov wrote:
> It is an error to pass a regular context manager without ``__aenter__``
> and ``__aexit__`` methods to ``async with``. It is a ``SyntaxError``
> to use ``async with`` outside of a coroutine.

I find this a little weird. Why not just have `with` and `for` inside a coroutine dynamically check the iterator or context manager, and either behave sync or async accordingly? Why must there be a *syntactic* difference?

Not only would this simplify the syntax, it would also allow dropping the need for `async` to be a true keyword, since functions could be defined via "def async foo():" rather than "async def foo():"

...which, incidentally, highlights one of the things that's been bothering me about all this "async foo" stuff: "async def" looks like it *defines the function* asynchronously (as with "async with" and "async for"), rather than defining an asynchronous function. ISTM it should be "def async bar():" or even "def bar() async:".

Also, even that seems suspect to me: if `await` looks for an __await__ method and simply returns the same object (synchronously) if the object doesn't have an await method, then your code sample that supposedly will fail if a function ceases to be a coroutine *will not actually fail*.

In my experience working with coroutine systems, making a system polymorphic (do something appropriate with what's given) and idempotent (don't do anything if what's wanted is already done) makes it more robust. In particular, it eliminates the issue of mixing coroutines and non-coroutines.

To sum up: I can see the use case for a new `await` distinguished from `yield`, but I don't see the need to create new syntax for everything; ISTM that adding the new asynchronous protocols and using them on demand is sufficient. Marking a function asynchronous so it can use asynchronous iteration and context management seems reasonably useful, but I don't think it's terribly important for the type of function result. Indeed, ISTM that the built-in `object` class could just implement `__await__` as a no-op returning self, and then *all* results are trivially asynchronous results and can be awaited idempotently, so that awaiting something that has already been waited for is a no-op. (Prior art: the Javascript Promise.resolve() method, which takes either a promise or a plain value and returns a promise, so that you can write code which is always-async in the presence of values that may already be known.)

Finally, if the async for and with operations have to be distinguished by syntax at the point of use (vs. just always being used in coroutines), then ISTM that they should be `with async foo:` and `for async x in bar:`, since the asynchronousness is just an aspect of how the main keyword is executed.

tl;dr: I like the overall ideas but hate the syntax and type segregation involved: declaring a function async at the top is OK to enable async with/for semantics and await expressions, but the rest seems unnecessary and bad for writing robust code. (e.g. note that requiring different syntax means a function must either duplicate code or restrict its input types more, and type changes in remote parts of the program will propagate syntax changes throughout.)
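The "awaiting an already-available value is a no-op" idea can be illustrated with a tiny sketch. This shows the suggestion being argued for here, analogous to JavaScript's Promise.resolve(); it is not anything PEP 492 actually adopted:

    class Resolved:
        """An 'already done' awaitable: awaiting it yields it back at once."""
        def __await__(self):
            return self
            yield  # unreachable; makes __await__ a generator function

    async def demo():
        value = Resolved()
        same = await value      # completes immediately, no suspension
        assert same is value

    # Driving the coroutine by hand shows it finishes without yielding:
    coro = demo()
    try:
        coro.send(None)
    except StopIteration:
        pass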
Re: [Python-Dev] Minimal async event loop and async utilities (Was: PEP 492: async/await in Python; version 4)
On Mon, May 11, 2015 at 6:05 PM, Guido van Rossum wrote:
> OTOH you may look at micropython's uasyncio -- IIRC it doesn't have Futures
> and it definitely has I/O waiting.

Here's a sketch of an *extremely* minimal main loop that can do I/O without Futures, and might be suitable as a PEP example. (Certainly, it would be hard to write a *simpler* example than this, since it doesn't even use any *classes* or require any specially named methods, works with present-day generators, and is (I think) both 2.x/3.x compatible.)

    coroutines = []     # round-robin of currently "running" coroutines

    def schedule(coroutine, val=None, err=None):
        coroutines.insert(0, (coroutine, val, err))

    def runLoop():
        while coroutines:
            (coroutine, val, err) = coroutines.pop()
            try:
                if err is not None:
                    suspend = coroutine.throw(err)
                else:
                    suspend = coroutine.send(val)
            except StopIteration:
                # coroutine is finished, so don't reschedule it
                continue
            except Exception:
                # framework-specific detail (i.e., log it, send it to an
                # error-handling coroutine, or just stop the program);
                # here, we just ignore it and stop the coroutine
                continue
            else:
                if hasattr(suspend, '__call__') and suspend(coroutine):
                    continue
                else:
                    # put it back on the round-robin list
                    schedule(coroutine)

To use it, `schedule()` one or more coroutines, then call `runLoop()`, which will run as long as there are things to do.

Each coroutine scheduled must yield *thunks*: callable objects that take a coroutine as a parameter, and return True if the coroutine should be suspended, or False if it should continue to run. If the thunk returns true, that means the thunk has taken responsibility for arranging to `schedule()` the coroutine with a value or error when it's time to send it the result of the suspension.

You might be asking, "wait, but where's the I/O?" Why, in a coroutine, of course...

    import time
    from heapq import heappush, heappop
    from select import select

    readers = {}
    writers = {}
    timers = []

    def readable(fileno):
        """yield readable(fileno) resumes when fileno is readable"""
        def suspend(coroutine):
            readers[fileno] = coroutine
            return True
        return suspend

    def writable(fileno):
        """yield writable(fileno) resumes when fileno is writable"""
        def suspend(coroutine):
            writers[fileno] = coroutine
            return True
        return suspend

    def sleepFor(seconds):
        """yield sleepFor(seconds) resumes after that much time"""
        return suspendUntil(time.time() + seconds)

    def suspendUntil(timestamp):
        """yield suspendUntil(timestamp) resumes when that time is reached"""
        def suspend(coroutine):
            heappush(timers, (timestamp, coroutine))
            return True
        return suspend

    def doIO():
        while coroutines or readers or writers or timers:

            # Resume coroutines whose scheduled time has arrived
            while timers and timers[0][0] <= time.time():
                ts, coroutine = heappop(timers)
                schedule(coroutine)

            if readers or writers:
                if coroutines:
                    # Other tasks are running; use minimal timeout
                    timeout = 0.001
                elif timers:
                    timeout = max(timers[0][0] - time.time(), 0.001)
                else:
                    timeout = None   # take as long as necessary
                r, w, e = select(readers, writers, [], timeout)
                for rr in r: schedule(readers.pop(rr))
                for ww in w: schedule(writers.pop(ww))

            yield   # allow other coroutines to run

    schedule(doIO())    # run the I/O loop as a coroutine

(This is painfully incomplete for a real framework, but it's a rough sketch of how one of peak.events' first drafts worked, circa early 2004.)

Basically, you just need a coroutine whose job is to resume coroutines whose scheduled time has arrived, or whose I/O is ready. And of course, some data structures to keep track of such things, and an API to update the data structures and suspend the coroutines.
The I/O loop exits once there are no more running tasks and nothing waiting on I/O... which will also exit the runLoop. (A bit like a miniature version of NodeJS for Python.)

And, while you need to preferably have only *one* such I/O coroutine (to prevent busy-waiting), the I/O coroutine is completely replaceable. All that's required to implement one is that the core runloop expose the count of active coroutines. (Notice that, apart from checking the length of `coroutines`, the I/O loop shown above uses only the public `schedule()` API and the exposed thunk-suspension protocol to do its thing.)

Also, note that you *can* indeed have mult
Re: [Python-Dev] PEP: Collecting information about git
On Sat, Sep 12, 2015 at 9:54 AM, Oleg Broytman wrote:
> The plan is to extend the PEP in the future collecting information
> about equivalence of Mercurial and git scenarios to help migrating
> Python development from Mercurial to git.

I couldn't find any previous discussion about this, but I figure I should mention: if the motivation here is to get away from the often-awful Bitbucket to the friendlier and more popular GitHub, then it might be useful to know that hg-git works beautifully with GitHub.

I have over a dozen open source projects on GitHub that I manage entirely using hg command lines, without having yet touched git at all. Even the forks and pull requests I've done of others' projects on GitHub worked just fine, so long as I remember to use hg bookmarks instead of hg branches. It's possible there are things you can't do with Mercurial on GitHub, but I haven't encountered one thus far.
Re: [Python-Dev] PEP 451 update
I've not really had time to review this PEP yet, but from skimming the discussion to date, the only thing I'm still worried about is whether this will break lazy import schemes that use a module subclass that hooks __getattribute__ and calls reload() in order to perform what's actually an *initial* load.

IOW, does anything in this proposal rely on a module object having *any* attributes besides __name__ set at reload() time? That is, is there an assumption that a module being reloaded has 1. been loaded, and 2. is being reloaded via the same location, __loader__, etc. as before?

At least through all 2.x, reload() just uses module.__name__ to restart the module find-and-load process, and does not assume that __loader__ is valid in advance. (Also, if this has changed in recent Python versions independent of this PEP, it's a backwards-compatibility break that should be documented somewhere.)

On Thu, Oct 24, 2013 at 2:05 AM, Eric Snow wrote:
> I've had some offline discussion with Brett and Nick about PEP 451
> which has led to some meaningful clarifications in the PEP. In the
> interest of pulling further discussions back onto this
> (archived/public) list, here's an update of what we'd discussed and
> where things are at. :)
>
> * path entry finders indicate that they found part of a possible
> namespace package by returning a spec with no loader set (but with
> submodule_search_locations set). Brett wanted some clarification on
> this.
> * The name/path signature and attributes of file-based finders in
> importlib will no longer be changing. Brett had some suggestions on
> the proposed change and it became clear that the change was
> actually pointless.
> * I've asserted that there shouldn't be much difficulty in adjusting
> pkgutil and other modules to work with ModuleSpec.
> * Brett asked for clarification on whether the "load()" example from
> the PEP would be realized implicitly by the import machinery or
> explicitly as a method on ModuleSpec. This has bearing on the ability
> of finders to return instances of ModuleSpec subclasses or even
> ModuleSpec-like objects (a la duck typing). The answer is that it will
> not be a method on ModuleSpec, so it is effectively just part of the
> general import system implementation. Finders may return any object
> that provides the attributes of ModuleSpec. I will be updating the
> PEP to make these points clear.
>
> * Nick suggested writing a draft patch for the language reference
> changes (the import page). Such a patch will be a pretty good
> indicator of the impact of PEP 451 on the import system and should
> highlight any design flaws in the API. This is on my to-do list
> (hopefully by tomorrow).
> * Nick also suggested moving all ModuleSpec methods to a separate
> class that will simply make use of a separate, existing ModuleSpec
> instance. This will help address several issues, particularly by
> relaxing the constraints on what finders can return, but also by
> avoiding the unnecessary exposure of the methods via every
> module.__spec__. I plan on going with this, but currently am trying
> out the change to see if there are any problems I've missed. Once I
> feel good about it I'll update the PEP.
>
> That about sums up our discussions. I have a couple of outstanding
> updates to the PEP to make when I get a chance, as well as putting up
> a language reference patch for review.
>
> -eric
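For reference, the lazy import scheme described at the top of this message looks roughly like the following sketch. It is a simplified illustration of the technique, not the actual code from the Importing package, and it ignores the premature-trigger and thread-safety issues a real implementation has to handle:

    import sys
    from importlib import reload          # Python 3.4+; older versions differ
    from types import ModuleType

    class LazyModule(ModuleType):
        def __getattribute__(self, attr):
            if attr not in ('__name__', '__class__', '__repr__'):
                # First real use: turn back into a plain module, then perform
                # the *initial* load by asking the import system to "reload"
                # the stub that is already sitting in sys.modules.
                object.__setattr__(self, '__class__', ModuleType)
                reload(self)
            return ModuleType.__getattribute__(self, attr)

    def lazyModule(name):
        """Return a stub for `name`, loaded for real on first attribute use."""
        if name not in sys.modules:
            sys.modules[name] = LazyModule(name)
        return sys.modules[name]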
Re: [Python-Dev] PEP 451 update
On Fri, Oct 25, 2013 at 1:15 PM, Brett Cannon wrote:
>
> On Fri, Oct 25, 2013 at 12:24 PM, PJ Eby wrote:
>> At least through all 2.x, reload() just uses module.__name__ to
>> restart the module find-and-load process, and does not assume that
>> __loader__ is valid in advance.
>
> That doesn't make much sense in a post-importlib world where import makes
> sure that __loader__ is set (which it has since Python 3.3). Otherwise you
> are asking for not just a reload but a re-find as well.

That's a feature, not a bug. A reload() after changing sys.path *should* take into account the change, not to mention any changes to meta_path, path hooks, etc. (And it's how reload() worked before importlib.)

I suppose it's not really documented all that well, but way way back in the 2.3 timeframe I asked for a tweak to PEP 302 to make sure that reload() (in the re-find sense) would work properly with PEP 302 loaders like zipimport -- some of the language still in the PEP is there specifically to support this use case. (Specifically, the bit that says loaders *must* use the existing module object in sys.modules if there's one there already, so that reload() will work. It was actually in part to ensure that reload() would work in the case of a re-find.)

It appears that since then, the PEP has been changed in a way that invalidates part of the purpose of the prior change; I guess I missed the discussion of that change last year. :-( ISTM there should've been such a discussion, since IIRC importlib wasn't supposed to change any Python semantics, and this is a non-trivial change to the semantics of reload() even in cases that aren't doing lazy imports or other such munging.

reload() used to take sys.* and __path__ changes into account, and IMO should continue to do so. If this is an intentional change in reload() semantics, other Python implementations need to know about this too!

(That being said, I'm not saying I shouldn't or couldn't have tested this in 3.3 and found out about it that way. And the existence of issue18698 suggests that nobody's relying yet on even the *fundamental* semantics of PEP 302 reload() working properly in 3.3, since that was an even bigger change that nobody spotted till a couple of months ago.)
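A concrete illustration of the "re-find" semantics being argued for here, under the pre-importlib behaviour described above (the module and path names are made up for the example):

    import sys
    from importlib import reload   # on 2.x, reload() is a builtin

    import foo                     # found via the original sys.path

    # Later, the program decides a different copy should win:
    sys.path.insert(0, '/new/location/with/other/foo')

    # Under the 2.x-style semantics, this repeats the full find-and-load
    # process, so the copy in /new/location wins; under a strict
    # "reuse module.__loader__" interpretation, it would not.
    foo = reload(foo)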
Re: [Python-Dev] PEP 451 update
On Fri, Oct 25, 2013 at 3:15 PM, Brett Cannon wrote:
> On Fri, Oct 25, 2013 at 2:10 PM, PJ Eby wrote:
>> On Fri, Oct 25, 2013 at 1:15 PM, Brett Cannon wrote:
>> > On Fri, Oct 25, 2013 at 12:24 PM, PJ Eby wrote:
>> >> At least through all 2.x, reload() just uses module.__name__ to
>> >> restart the module find-and-load process, and does not assume that
>> >> __loader__ is valid in advance.
>> >
>> > That doesn't make much sense in a post-importlib world where import
>> > makes sure that __loader__ is set (which it has since Python 3.3).
>> > Otherwise you are asking for not just a reload but a re-find as well.
>>
>> That's a feature, not a bug. A reload() after changing sys.path
>> *should* take into account the change, not to mention any changes to
>> meta_path, path hooks, etc. (And it's how reload() worked before
>> importlib.)
>
> Fair enough, but in my mind that obviously doesn't really click for what I
> view as a reload in an importlib world where secret import code no longer
> exists. When I think re-load I think "load again", not "find the module
> again and execute a load with a possibly new loader".

Sure, and the reference manual is rather vague on this point. However, I would guess that at least some web frameworks with automatic reload support are going to barf on this change in at least some edge cases. (OTOH, it's unlikely the bugs will ever be reported, because the problem will mysteriously go away once the process is restarted, probably never to occur again.)

Mostly, this just seems like an ugly wart -- Python should be dynamic by default, and that includes reloading. While the import machinery has lots of ugly caching under the hood, a user-level function like reload() should not require you to do the equivalent of saying, "no, really... I want you to *really* reload, not just pull in whatever exists where you found it last time, while ignoring whether I switched from module to package or vice versa, or just fixed my sys.path so I can load the right version of the module."

It is a really tiny thing in the overall scheme of things, because reload() is not used all that often, but it's still a thing. If this isn't treated as a bug, then the docs for reload() at least need to include a forward-supported workaround so you can say "no, really... *really* reload" in an approved fashion. (ISTM that any production code out there that currently uses reload() would want to perform the "really reload" incantation in order to avoid the edge cases, even if they haven't actually run into any of them yet.)

> And in a PEP 451 world it should be dead-simple to make this work the way
> you want in your own code even if this doesn't go the way you want::
>
>     spec = importlib.find_spec(name)
>     module.__spec__ = spec
>     importlib.reload(module)
>     # which in itself is essentially:
>     #   init_module_attrs(spec, module); spec.loader.exec_module(module)
>
> Heck, you can do this in Python 3.3 right now::
>
>     loader = importlib.find_loader(name)
>     module = sys.modules[name]
>     module.__loader__ = loader
>     importlib.reload(module)

And will that later version still work correctly in a PEP 451 world, or will you have to detect which world you live in before waving this particular dead chicken? ;-)

> Ah, okay. That is not explicit in the PEP beyond coming off as a total
> nuisance in order to support reloading by the loader, not an explicit
> finder + loader use-case.
Yeah, it actually was to ensure that you could reload a module using a different loader than the one that originally loaded it, e.g. due to a change in path hooks, etc.
Re: [Python-Dev] PEP 451 update
On Sun, Oct 27, 2013 at 1:03 AM, Nick Coghlan wrote:
> Now, regarding the signature of exec_module(): I'm back to believing
> that loaders should receive a clear indication that a reload is taking
> place. Legacy loaders have to figure that out for themselves (by
> seeing that the module already exists in sys.modules), but we can do
> better for the new API by making the exec_module signature look like:
>
>     def exec_module(self, module, previous_spec=None):
>         # module is as per the current PEP 451 text
>         # previous_spec would be set *only* in the reload() case
>         # loaders that don't care still need to accept it, but can
>         # just ignore it

Just to be clear, this means that a lazy import implementation that creates a module object without a __spec__ in the first place will look like an initial import? Or will that crash importlib because of a missing __spec__ attribute?

That is, is reload()'s contract adding a new prerequisite for the object passed to it?

(The specific use case is creating a ModuleType subclass instance for lazy importing upon attribute access. Pre-importlib, all that was needed was a working __name__ attribute on the module.)
Re: [Python-Dev] PEP 451 update
On Sun, Oct 27, 2013 at 4:59 PM, Nick Coghlan wrote:
>
> On 28 Oct 2013 02:37, "PJ Eby" wrote:
>>
>> On Sun, Oct 27, 2013 at 1:03 AM, Nick Coghlan wrote:
>> > Now, regarding the signature of exec_module(): I'm back to believing
>> > that loaders should receive a clear indication that a reload is taking
>> > place. Legacy loaders have to figure that out for themselves (by
>> > seeing that the module already exists in sys.modules), but we can do
>> > better for the new API by making the exec_module signature look like:
>> >
>> >     def exec_module(self, module, previous_spec=None):
>> >         # module is as per the current PEP 451 text
>> >         # previous_spec would be set *only* in the reload() case
>> >         # loaders that don't care still need to accept it, but can
>> >         # just ignore it
>>
>> Just to be clear, this means that a lazy import implementation that
>> creates a module object without a __spec__ in the first place will
>> look like an initial import? Or will that crash importlib because of
>> a missing __spec__ attribute?
>>
>> That is, is reload()'s contract adding a new prerequisite for the
>> object passed to it?
>>
>> (The specific use case is creating a ModuleType subclass instance for
>> lazy importing upon attribute access. Pre-importlib, all that was
>> needed was a working __name__ attribute on the module.)
>
> For custom loaders, that's part of the contract for create_module() (since
> you'll get an ordinary module otherwise),

Huh? I don't understand where custom loaders come into it. For that matter, I don't understand what "get an ordinary module object" means here, either. I'm talking about userspace code that implements lazy importing features, like the lazyModule() function in this module:

http://svn.eby-sarna.com/Importing/peak/util/imports.py?view=markup

Specifically, I'm trying to get an idea of how much that code will need to change under the PEP (and apparently under importlib in general).

> and so long as *setting* the
> special module attributes doesn't cause the module to be imported during the
> initial load operation, attribute access based lazy loading will work fine
> (and you don't even have to set __name__, since the import machinery will
> take care of that).

There's no "initial load operation", just creation of a dummy module and stuffing it into sys.modules. The way it works is that in, say, foo/__init__.py, one uses:

    bar = lazyModule('foo.bar')
    baz = lazyModule('foo.baz')

Then anybody importing 'foo.bar' or 'foo.baz' (or using "from foo import bar", etc.) ends up with the lazy module. That is, it's for lazily exposing APIs, not something used as an import hook.

> For module level lazy loading that injects a partially initialised module
> object into sys.modules rather than using a custom loader or setting a
> __spec__ attribute, yes, the exec_module invocation on reloading would
> always look like a fresh load operation (aside from the fact that the custom
> instance would already be in sys.modules from the first load operation).

Right.

> It *will* still work, though (at least, it won't break any worse than such
> code does today, since injecting a replacement into sys.modules really isn't
> reload friendly in the first place).

Wait, what? Who's injecting a replacement into sys.modules? A replacement of what? Or do you mean that loaders aren't supposed to create new modules, but use the one in sys.modules?

Honestly, I'm finding all this stuff *really* confusing, which is kind of worrying.
I mean, I gather I'm one of the handful of people who really understood how importing *used to work*, and I'm having a lot of trouble wrapping my brain around the new world.

(Granted, I think that may be because I understand how a lot of old corner cases work, but what's bugging me is that I no longer understand how those old corners work under the new regime, nor do I feel I understand what the *new* corners will be. This may also just be communication problems, and the fact that it's been months since I really walked through importlib line by line, and have never really walked through it (or PEP 451) quite as thoroughly as I have import.c.

I also seem to be having trouble grokking why the motivating use cases for PEP 451 can't be solved by just providing people with good base classes to use for writing loaders -- i.e., I don't get why the core protocol has to change to address the use case of writing loaders more easily. The new protocol seems way more complex than PEP 302, and ISTM the complexity could just be pushed off to the loader side of the protocol without creating more interdependency between importlib and the loaders.)
Re: [Python-Dev] PEP 451 update
On Thu, Oct 31, 2013 at 5:52 AM, Nick Coghlan wrote:
>
> On 31 Oct 2013 18:52, "Eric Snow" wrote:
>>
>> On Wed, Oct 30, 2013 at 10:24 PM, Nick Coghlan wrote:
>> > There's also the option of implementing the constraint directly in the
>> > finder, which *does* have the necessary info (with the change to pass
>> > the previous spec to find_spec).
>>
>> Yeah, I thought of that. I just prefer the more explicit
>> supports_reload(). That said...
>>
>> > I still think it makes more sense to leave this out for the moment -
>> > it's not at all clear we need the extra method, and adding it later
>> > would be a straightforward protocol update.
>>
>> ...I agree that makes the most sense for now. :)
>>
>> BTW, thanks for pushing these issues. I think the API has gotten
>> pretty solid. I just need to make sure the PEP covers the cases and
>> conclusions we're discussing.
>
> Thanks are also due to PJE for making me realise we were handwaving too much
> when it came to the expected reload semantics :)

You're welcome. ;-)

But speaking of handwaving, I also want to be sure that loader developers know that "reloading" is only really "reloading" if there's a previous existing spec, or the module type is...

Hm. Actually, I think I now know how to state what's bugging me every time I see this "supports_reload()" or "reload=True" or other reloading flags in this process. I think that references to reloading should be replaced with references to what's *actually* at issue, because "reloading" itself is vague and carries too many assumptions for a loader author to understand or get right. (Look how hard it is for *us*!)

That is, I think we should clarify what use cases there are for knowing whether a "reload" is happening, and address those use cases explicitly rather than lumping them under a general heading.

For example, if the reason a loader cares about reloading is because it's a C extension using a custom module type, and the existing module isn't of the right type, then we should just spell out how to handle it (e.g. raise an exception).

If the reason a loader cares about reloading is because of some sort of caching or reuse, then we should just spell out how to handle that, too.

Lumping these cases together under a "reloading" flag or a check for "reloading" support is a nasty code smell, because it requires a loader developer to have the *same* vaguely-defined idea of "reloading" as the PEP authors. ;-)

I also suspect that, if properly spelled out, those use cases are going to boil down to:

1. Throwing errors if you have an existing module object you can't load into, and
2. Passing in a previous spec object, if available

In other words, loaders should not really have any responsibility for or concept of "reloading" -- they always load into a module object (that they may or may not have created), and they may get given a spec from a previous load. They should deal only in "module reuse" and "spec reuse". While a typical reload() might involve both reuses, there are cases where one sort of reuse could occur independently, and not all loaders care about both (or even either) condition.

At any rate, it means a loader author doesn't have to figure out how to handle "reloading"; all they have to figure out is whether they can load into a particular module object, and whether they can do something useful with a spec that was previously used to load a module with the same name -- a spec that may or may not refer to a similar previous loader.
These are rather more well-defined endeavors than trying to determine in the abstract whether one "supports reload". ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Multiple inheritance from builtin (C) types [still] supported in Python3?
On Mon, Apr 28, 2014 at 7:26 PM, Paul Sokolovsky wrote: > Well, sure I did, as I mentioned, but as that's first time I see that > code (that specific piece is in typeobject.c:extra_ivars()), it would > take quite some time be sure I understand all aspects of it. Thanks for > confirming that it's governed essentially by CPython implementation > details and not some language-level semantics like metaclasses (I > mentioned them because error message in Python2 did so, though Python3 > doesn't refer to metaclasses). > > An example would really help me to get a feel of the issue, but I > assume lack of them means that there's no well-known idiom where such > inheritance is used, and that's good enough on its own. I also tried to > figure how it's important to support such multi-base cases, so the code > I write didn't require complete rewrite if it hits one day, but > everything seems to turn out to be pretty extensible. > >From memory of the last time I dealt with this, the rules were that you could mix two classes only if their __slots__ differed from their common __base__ by *at most* __dict__ and/or __weakref__. The dict and weakref slots are special, in that the type structure contains their offsets, which makes them relocatable in subclasses. But any other __slots__ aren't relocatable in subclasses, because the type structure doesn't directly keep track of the offsets. (The slot descriptors do.) But I don't think there's anything in principle that requires this, it's just the implementation. You could in theory relocate __slots__ defined from Python code in order to make a merged subclass. It's just that the effective "__slots__" of C code can't be moved, because C code is expecting to find them at specific offsets. Therefore, if two types define their own struct fields, they can't be inherited from unless one is a subtype of the other. In the C code (again if I recall correctly), this is done using the __base__ attribute of the type, which indicates what struct layout the object will use. A type can have a larger struct than its base type, adding its own fields after the base type's struct fields. (The dict and weakref fields are added -- if they are added -- *after* the base struct fields. If your __base__ already has them, those offsets within the existing layout are used, which is why them being in another base class's __slots__ isn't a problem.) When you create a new type, CPython looks at your bases to find a suitable __base__. If two of your bases inherit from each other, the ancestor can be ignored, keeping the more-derived one as a candidate __base__. If a base adds only __dict__ and/or __weakref__ (or neither) to its __base__, then its __base__ is a candidate (not recursively, though). If at the end there is more than one base left standing, then it's an error, since you have bases with incompatible layouts. That is not a precise description of the algorithm, but that's the gist of how it works. __base__ is a slot on the type object and is tracked at the C level in order to sort out layouts like this. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
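A small illustration of the layout rule described above (not from the original message; this is just the observable CPython behavior, checkable in any 3.x interpreter):

    class A:
        __slots__ = ('x',)      # adds a struct-level field; A has its own "solid" layout

    class B:
        __slots__ = ('y',)      # a different field, so B's layout conflicts with A's

    class HasDict:
        pass                    # adds only __dict__ (and __weakref__), which are relocatable

    class OK(A, HasDict):       # fine: HasDict's layout reduces to object's
        pass

    class Boom(A, B):           # TypeError: multiple bases have instance lay-out conflict
        pass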
Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0
On Wed, Feb 20, 2013 at 5:30 AM, M.-A. Lemburg wrote: > The wording in the PEP alienates the egg format by defining > an incompatible new standard for the location of the metadata > file: This isn't a problem, because there's not really a use case at the moment for eggs to include a PEP 426-format metadata file, and if they ever do, it ought to be called METADATA, so that pkg_resources will know to read it if there are no egg-format dependencies listed. Setuptools also doesn't care what format PKG-INFO is in, as it only ever reads the "Version:" field, and that only in the case of in-development source packages. > It's easy to upgrade distribute and distutils to write > metadata 1.2 format, simply by changing the version in the > PKG-INFO files. As soon as distutils does it, setuptools will do it, because setuptools delegates metadata writing to distutils. So there's no "alienation" here. What will need to happen at some point is for pkg_resources to implement support for PEP 426-style version requirements, which I haven't tried to fully wrap my head around as yet. I'm hoping that there are simple textual substitutions (e.g. regexes) that can be done to convert them to pkg_resources requirements. If need be, I'll swipe whatever's needed from distlib. ;-) In the meantime, eggs aren't actually going anywhere, since wheels aren't actually trying to replace all of their use cases. And since the new metadata and binary formats don't actually add much new functionality over what eggs already do, eggs wouldn't lose much by not being able to use the same metadata, anyway. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Distutils] PEP 426 is now the draft spec for distribution metadata 2.0
On Tue, Feb 19, 2013 at 6:42 AM, Nick Coghlan wrote: > Nothing in the PEP is particularly original - almost all of it is > either stolen from other build and packaging systems, or is designed > to provide a *discoverable* alternative to existing > setuptools/distribute/pip practices (specifically, the new extension > mechanism means things like entry points can be defined in the > standard metadata files without python-dev needing to get involved). FWIW, I actually think this is a step in the wrong direction relative to eggs; the ability to add new metadata files is a useful feature for application frameworks. For example, the EggTranslations project uses egg metadata to implement resource localization plugins. It lets you have an application with plugins that either contain their own translations, contain multiple translations for another plugin, a single language translation for an assortment of plugins, etc. These kinds of runtime-discovery use cases haven't seen much attention in the metadata standard discussion. On one level, that's fine, because it makes sense that distribution-provided metadata should be parseable by all tools, and that at build/download/install time the performance and ease-of-use favor a single file approach. That does not mean, however, that the presence of other files is bad or should be deprecated. IMO, metadata that see significant runtime use independent of the core metadata *should* appear in their own files, even if redundant. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Planning on removing cache invalidation for file finders
On Sun, Mar 3, 2013 at 12:31 PM, Brett Cannon wrote: > But how about this as a compromise over introducing write_module(): > invalidate_caches() can take a path for something to specifically > invalidate. The path can then be passed to the invalidate_caches() on > sys.meta_path. In the case of PathFinder it would take that path, try to > find the directory in sys.path_importer_cache, and then invalidate the most > specific finder for that path (if there is one that has any directory prefix > match). > > Lots of little details to specify (e.g. absolute path forced anywhere in > case a relative path is passed in by sys.path is all absolute paths? How do > we know something is a file if it has not been written yet?), but this would > prevent importlib from subsuming file writing specifically for source files > and minimize performance overhead of invalidating all caches for a single > file. ISTR that when we were first discussing caching, I'd proposed a TTL-based workaround for the timestamp granularity problem, and it was mooted because somebody already proposed and implemented a similar idea. But my approach -- or at least the one I have in mind now -- would provide an "eventual consistency" guarantee, while still allowing fast startup times. However I think the experience with this heuristic so far shows that the real problem isn't that the heuristic doesn't work for the normal case; it works fine for that. Instead, what happens is that *it doesn't work when you generate modules*. And *that* problem can be fixed without even invalidating the caches: it can be fixed by doing some extra work when writing a module - e.g. by making sure the directory mtime changes again after the module is written. For example, create the module under a temporary name, verify that the directory mtime is different than it was before, then keep renaming it to different temporary names until the mtime changes again, then rename it to the final name. (This would be very fast on some platforms, much slower on others, but the OS itself would tell you when it had worked.) A utility routine to "write_module()" or "write_package()" would be easier to find than advice that says to invalidate the cache under thus-and-such conditions, as it would show up in searches for writing modules dynamically or creating modules dynamically, where you could only search for info about the cache if you knew the cache existed. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
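A rough sketch of the write_module() idea described above -- the name and details are illustrative, not an existing API, and it assumes POSIX-ish rename semantics:

    import os

    def write_module(path, source):
        dirname = os.path.dirname(path) or '.'
        before = os.stat(dirname).st_mtime
        tmp = os.path.join(dirname, '.writing-' + os.path.basename(path))
        with open(tmp, 'w') as f:
            f.write(source)
        # Keep renaming within the directory until the OS reports a new
        # directory mtime; once it changes, a timestamp-based directory cache
        # can no longer mistake the listing for being unchanged.
        n = 0
        while os.stat(dirname).st_mtime == before:
            newtmp = os.path.join(dirname, '.writing-%d-%s' % (n, os.path.basename(path)))
            os.rename(tmp, newtmp)
            tmp = newtmp
            n += 1
        os.rename(tmp, path)

As noted above, on filesystems with fine-grained timestamps the loop body never runs; on coarse-grained ones it spins until the clock ticks, and the OS itself says when it has worked.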
Re: [Python-Dev] wsgi validator with asynchronous handlers/servers
On Sat, Mar 23, 2013 at 3:05 PM, Luca Sbardella wrote: > The pseudocode above does yields bytes before start_response, but they are > not *body* bytes, they are empty bytes so that the asynchronous wsgi server > releases the eventloop and call back at the next eventloop iteration. > > I'm I misinterpreting the pep, or the wsgi validator should be fixed > accordingly? The validator is correct for the spec. You *must* call start_response() before yielding any strings at all. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
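For reference, a minimal conforming shape for such an app (illustrative only, not the code under discussion) looks like this -- the empty-bytestring trick itself is fine, it just has to come after the start_response() call:

    def application(environ, start_response):
        # start_response() must be called before yielding *any* string,
        # even an empty one.
        start_response('200 OK', [('Content-Type', 'text/plain')])
        yield b''                # gives an async server a chance to run its event loop
        yield b'Hello, world!'   # the actual body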
Re: [Python-Dev] Safely importing zip files with C extensions
On Wed, Mar 27, 2013 at 5:19 PM, Bradley M. Froehle wrote: > I implemented just such a path hook zipimporter plus the magic required > for C extensions --- as a challenge to myself to learn more about the Python > import mechanisms. > > See https://github.com/bfroehle/pydzipimport. FYI, there appears to be a bug for Windows with packages: you're using '/__init__' in a couple places that should actually be os.sep+'__init__'. This does seem like a good way to address the issue, for those rare situations where this would be a good idea. The zipped .egg approach was originally intended for user-managed plugin directories for certain types of extensible platforms, where "download a file and stick it in the plugins directory" is a low-effort way to install plugins, without having to build a lot of specialized install capability. As Jim has pointed out, though, this doesn't generalize well to a full-blown packaging system. Technically, you can blame Bob Ippolito for this, since he's the one who talked me into using eggs to install Python libraries in general, not just as a plugin packaging mechanism. ;-) That being said, *unpacked* egg, er, wheels, are still a great way to meet all of the "different apps needing different versions" use cases (without needing one venv per app), and nowadays the existence of automated installer tools means that using one to install a plugin for a low-tech plugin system is not a big deal, as long as that tool supports the simple unpacked wheel scenario. So I wholeheartedly support some kind of mount/unmount or "require"-type mechanism for finding plugins. pkg_resources even has an API for handling simple dynamic plugin dependency resolution scenarios: http://peak.telecommunity.com/DevCenter/PkgResources#locating-plugins It'd be a good idea if distlib provides a similar feature, or at least the APIs upon which apps or frameworks can implement such features. (Historical note for those who weren't around back then: easy_install wasn't even an *idea* until well after eggs were created; the original idea was just that people would build plugins and libraries as eggs and manually drop them in directories, where a plugin support library would discover them and add them to sys.path as needed. And Bob and I also considered a sort of "update site" mechanism ala Eclipse, with a library to let apps fetch plugins. But as soon as eggs existed and PyPI allowed uploads, it was kind of an obvious follow-up to make an installation tool as a kind of "technology demonstration" which promptly became a monster. The full story with all its twists and turns can also be found here: http://mail.python.org/pipermail/python-dev/2006-April/064145.html ) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Can I introspect/reflect to get arguments exec()?
On Tue, Mar 26, 2013 at 11:00 PM, Rocky Bernstein wrote: > Okay. But is the string is still somewhere in the CPython VM stack? (The > result of LOAD_CONST 4 above). Is there a way to pick it up from there? Maybe using C you could peek into the frame's value stack, but that's not exposed to any Python API I know of. But that still doesn't help you, because the value will be removed from the stack before exec() is actually called, which means if you go looking for it in code called from the exec (e.g. the call event itself), you aren't going to see the data. > At the point that we are stopped the exec action hasn't taken place yet. That doesn't help if you're using line-level tracing events. At the beginning of the line, the data's not on the call stack yet, and by the time you enter the frame of the code being exec'd, it'll be off the stack again. Basically, there is no way to do what you're asking for, short of replacing the built-in exec function with your own version. And it still won't help you with stepping through the source of functions that are *compiled* using an exec() or compile(), or any other way of ending up with dynamically-generated code you want to debug. (Unless you use something like DecoratorTools to generate it, that is -- DecoratorTools has some facilities for caching dynamically-generated code so that it works properly with debuggers. But that has to be done by the code doing the generation, not the debugger. If the code generator uses DecoratorTools' caching support, then any debugger that uses the linecache module will Just Work. It might be nice if the stdlib had something like this, but you could also potentially fake it by replacing the builtin eval, exec, compile, etc. functions w/versions that cache the source.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
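A minimal sketch of the linecache-based caching trick mentioned above (this is not DecoratorTools code, just the general idea; the cache entry format is the stdlib's (size, mtime, lines, fullname) tuple, and mtime=None tells checkcache() never to discard the entry):

    import linecache

    def cache_source(filename, source):
        linecache.cache[filename] = (len(source), None,
                                     source.splitlines(True), filename)

    source = "def generated(x):\n    return x * 2\n"
    filename = "<generated-1>"              # phony but unique filename
    cache_source(filename, source)
    namespace = {}
    exec(compile(source, filename, "exec"), namespace)
    # Debuggers and tracebacks that go through linecache can now display the source.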
Re: [Python-Dev] Can I introspect/reflect to get arguments exec()?
On Thu, Mar 28, 2013 at 6:43 AM, Rocky Bernstein wrote: > Of course the debugger uses sys.settrace too, so the evil-ness of that is > definitely not a concern. But possibly I need to make sure that since the > DecoratorTools and the debugger both hook into trace hooks they play nice > together and fire in the right order. DecoratorTools' trace hooking is unrelated to its linecache functionality. All you need from it is the cache_source() function; you can pretty much ignore everything else for your purposes. You'll just need to give it a phony filename to work with, and the associated string. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] relative import circular problem
On Thu, Apr 4, 2013 at 11:17 AM, Guido van Rossum wrote: > I don't really see what we could change to avoid breaking code in any > particular case Actually, the problem has *nothing* to do with circularity per se; it's that "import a.b" and "from a import b" behave differently in terms of how they obtain the module 'a.b'... And "from a import b" will *always* fail if 'a.b' is part of a cycle with the current module, whereas "import a.b" will *always* succeed. The workaround is to tell people to always use "import a.b" in the case of circular imports; it's practically a FAQ, at least to me. ;-) The problem with "from import" is that it always tries to getattr(a,'b'), even if 'a.b' is in sys.modules. In contrast, a plain import will simply fetch a.b from sys.modules first. In the case of a normal import, this is no problem, because a.b is set to sys.modules['a.b'] at the end of the module loading process. But if the import is circular, then the module is in sys.modules['a.b'], but *not* yet bound to a.b. So the "from import" fails. So, this is actually an implementation quirk that could be fixed in a (somewhat) straightforward manner: by making "from a import b" succeed if 'a.b' is in sys.modules, the same way "import a.b" does. It would require a little bit of discussion to hash out the exact way to do it, but it could be done. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
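A minimal layout showing the difference, starting from "import a.c" (illustrative; behavior as of the Python versions under discussion, before any fallback is added):

    # a/__init__.py  -- empty

    # a/c.py
    import a.b               # kicks off the cycle
    VALUE = 42

    # a/b.py  (imported while a.c is still only partly initialized)
    import a.c               # OK: only sys.modules entries are needed, and the name
                             #   bound here is just 'a'; a.c is looked up later, at use time
    # from a import c        # would fail here: sys.modules['a.c'] exists, but the
                             #   attribute 'c' has not been set on package 'a' yet
    def use():
        return a.c.VALUE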
Re: [Python-Dev] relative import circular problem
On Thu, Apr 4, 2013 at 4:42 PM, Guido van Rossum wrote: > I do think it would be fine if "from a import b" returned the > attribute 'b' of module 'a' if it exists, and otherwise look for > module 'a.b' in sys.modules. Technically, it already does that -- but inside of __import__, not in the IMPORT_FROM opcode. But then *after* doing that check-and-fallback, __import__ doesn't assign a.b, because it assumes the recursive import it called has already done this... which means that when __import__ returns, the IMPORT_FROM opcode tries and fails to do the getattr. This could be fixed in one of two ways. Either: 1. Change importlib._bootstrap._handle_fromlist() to set a.b if it successfully imports 'a.b' (inside its duplicate handling for what IMPORT_FROM does), or 2. Change the IMPORT_FROM opcode to handle the fallback itself While the latter involves a bit of C coding, it has fewer potential side-effects on the import system as a whole, and simply ensures that if "import" would succeed, then so would "from...import" targeting the same module. (There might be other fixes I haven't thought of, but really, changing IMPORT_FROM to fallback to a sys.modules check is probably by far the least-invasive way to handle it.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
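In pure-Python terms, the proposed fallback amounts to something like this (a sketch of the semantics only, not the actual ceval or importlib code):

    import sys

    def import_from(module, name):
        # what IMPORT_FROM effectively does today:
        try:
            return getattr(module, name)
        except AttributeError:
            # the proposed fallback: accept a submodule that has been created
            # but not yet bound as an attribute of its package
            try:
                return sys.modules['{}.{}'.format(module.__name__, name)]
            except KeyError:
                raise ImportError('cannot import name {!r}'.format(name))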
Re: [Python-Dev] class name spaces inside an outer function
On Sat, Apr 27, 2013 at 2:27 PM, Ethan Furman wrote: > I filed bug http://bugs.python.org/issue17853 last night. > > If somebody could point me in the right direction (mainly which files to > look in), I'd be happy to attempt a patch. Wow. I had no idea Python actually did this (overriding class-local references); I'd have expected your code to work. I was even more surprised to find that the same thing happens all the way back to Python 2.3. Guess I'm not nearly the wizard of abusing scope rules that I thought I was. ;-) About the only workaround I can see is to put "Season = Season" at the top of a class that uses this inside a function definition, or else to define a special name 'enum' instead and hope that nobody ever tries to define an enumeration inside a function with a local variable named 'enum'. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] enum discussion: can someone please summarize open issues?
On Sun, Apr 28, 2013 at 7:37 PM, Steven D'Aprano wrote: > I have also suggested that that the enum package provide a decorator > which can be used to explicitly flag values to *not* be turned into > enum values. See here: > > http://mail.python.org/pipermail/python-dev/2013-April/125641.html In that example, 'food = property(lambda self: "skip")' would work in a pinch. (Granted, it wouldn't be a *class* attribute, but you can make a class attribute by assigning it after class creation is completed.) And if you want to make your enum instances callable, ISTM the right (or at least the One Obvious) way to do it is to add a __call__ method to the class. > Even if the Enum class doesn't support this feature, I ask that it be > written in such a way that a subclass could add it (i.e. please expose > a public method for deciding what to exclude). Since you can exclude anything by it having a __get__ method, or include it by making it *not* have a __get__ method, I'm not sure what use case you're actually looking for. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
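For illustration, the descriptor-based exclusion mentioned above looks like this with the enum API as it ended up (a small sketch, not code from the thread):

    from enum import Enum

    class Fruit(Enum):
        apple = 1
        tomato = 2
        # A descriptor, so the machinery leaves it alone rather than
        # turning it into a member:
        food_group = property(lambda self: "vegetable (culinarily)"
                              if self is Fruit.tomato else "fruit")

    Fruit.tomato.food_group    # 'vegetable (culinarily)'
    list(Fruit)                # [<Fruit.apple: 1>, <Fruit.tomato: 2>] -- no food_group member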
Re: [Python-Dev] Fighting the theoretical randomness of "is" on immutables
On Mon, May 6, 2013 at 4:46 AM, Armin Rigo wrote: > This is clearly a language design issue though. I can't really think > of a use case that would break if we relax the requirement, but I > might be wrong. It seems to me that at most some modules like pickle > which use id()-keyed dictionaries will fail to find some > otherwise-identical objects, but would still work (even if tuples are > "relaxed" in this way, you can't have cycles with only tuples). I don't know if I've precisely understood the change you're proposing, but I do know that in PEAK-Rules I use id() as an approximation for "is" in order to build indexes of various "parameter is some_object" conditions, for various "some_objects" and a given parameter. The rule engine takes id(parameter) at call time and then looks it up to obtain a subset of applicable rules. IIUC, this would require that either "x is y" equates to "id(x)==id(y)", or else that there be some way to determine in advance all the possible id(y)s that are now or would ever be "is x", so they can be placed in the index. Otherwise, any use of an "is" condition would require a linear search of the possibilities, as you could not rule out the possibility that a given x was "is" to *some* y already in the index. Of course, rules using "is" tend to be few and far between, outside of some special cases, and their use with simple integers and strings would be downright silly. And on top of that, I'm not even sure whether the "a <= b" notation you used was meant to signify "a implies b" or "b implies a". ;-) But since you mentioned id()-keyed dictionaries and this is another use of them that I know of, I figured I should at least throw it out there for information's sake, regardless of which side of the issue it lands on. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
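A toy version of the id()-keyed index being described (not PEAK-Rules code; it just shows why the lookup depends on "x is y" implying a stable, predictable id()):

    class IsIndex:
        def __init__(self):
            self._rules = {}     # id(target) -> list of rules
            self._keep = []      # keep targets alive so their ids stay valid

        def add(self, target, rule):
            self._keep.append(target)
            self._rules.setdefault(id(target), []).append(rule)

        def lookup(self, arg):
            # Fast path: one dict lookup instead of a linear scan of every
            # "parameter is some_object" condition in the rule set.
            return self._rules.get(id(arg), [])

If "x is y" could be true for an object whose id() was never placed in the index, lookup() would have to fall back to a linear search of the possibilities.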
Re: [Python-Dev] PEP 443 - Single-dispatch generic functions
On Thu, May 23, 2013 at 11:11 AM, Paul Moore wrote: > Is the debate between 1 and 2, or 1 and 3? Is it even possible to implement > 3 without having 2 different names for "register"? Yes. You could do it as either:

    @func.register
    def doit(foo: int):
        ...

by checking for the first argument to register() being a function, or:

    @func.register()
    def doit(foo: int):
        ...

by using a default None first argument. In either case, you would then raise a TypeError if there wasn't an annotation. As to the ability to do multiple types registration, you could support it only in type annotations, e.g.:

    @func.register
    def doit(foo: [int, float]):
        ...

without it being confused with being multiple dispatch. One other thing about the register API that's currently unspecified in the PEP: what does it return, exactly? I generally lean towards returning the undecorated function, so that if you say:

    @func.register
    def do_int(foo: int):
        ...

You still have the option of calling it explicitly. OTOH, some may prefer to treat it like an overload and call it 'func' every time, in which case register should return the generic function. Some guidance as to what should be the official One Obvious Way would be helpful here. (Personally, I usually name my methods explicitly because in debugging it's a fast clue as to which piece of code I should be looking at.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
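A rough sketch of a register() supporting both spellings plus annotation-only registration -- this is not the PEP's reference implementation, and `registry` here is just a stand-in for the generic function's dispatch table:

    import inspect

    registry = {}   # stand-in for the generic function's dispatch table

    def register(arg):
        if isinstance(arg, type):
            # classic form: @func.register(int)
            def decorator(impl):
                registry[arg] = impl
                return impl                   # return the undecorated function
            return decorator
        # bare form: @func.register -- take the type(s) from the annotation
        impl = arg
        params = list(inspect.signature(impl).parameters.values())
        if not params or params[0].annotation is inspect.Parameter.empty:
            raise TypeError("%s needs a type annotation on its first argument"
                            % impl.__name__)
        annotation = params[0].annotation
        types = annotation if isinstance(annotation, list) else [annotation]
        for typ in types:                     # allows 'foo: [int, float]'
            registry[typ] = impl
        return impl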
Re: [Python-Dev] PEP 443 - Single-dispatch generic functions
On Thu, May 23, 2013 at 2:59 PM, PJ Eby wrote: > I generally lean towards returning the undecorated function, so that if you > say: > > @func.register > def do_int(foo: int): > ... Oops, forgot to mention: one other advantage to returning the undecorated function is that you can do this:

    @func.register(int)
    @func.register(float)
    def do_num(foo):
        ...

Which neatly solves the multiple registration problem, even without argument annotations. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 443 - Single-dispatch generic functions
On Thu, May 23, 2013 at 6:58 PM, Ben Hoyt wrote: > It seems no one has provided > decent use-case examples (apart from contrived ones) Um, copy.copy(), pprint.pprint(), a bunch of functions in pkgutil which are actually *based on this implementation already* and have been since Python 2.5... I don't see how any of those are contrived examples. If we'd had this in already, all the registration-based functions for copying, pickling, etc. would likely have been implemented this way, and the motivating example for the PEP is the coming refactoring of pprint.pprint. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 443 - Single-dispatch generic functions
On Thu, May 23, 2013 at 11:57 PM, Nick Coghlan wrote: > We should be able to use it to help deal with the "every growing > importer API" problem, too. I know that's technically what pkgutil > already uses it for, but elevating this from "pkgutil implementation > detail" to "official stdlib functionality" should make it easier to > document properly :) Oh, that reminds me. pprint() is actually an instance of a general pattern that single dispatch GF's are good for: "visitor pattern" algorithms. There's a pretty good write-up on the general issues with doing visitor pattern stuff in Python, and how single-dispatch GF's can solve that problem, here: http://peak.telecommunity.com/DevCenter/VisitorRevisited The code samples use a somewhat different API from the PEP, but it's pretty close. The main issues solved are eliminating monkeypatching and fixing the inheritance problems that occur when you use 'visit_foo' methods. One of the samples actually comes from the old 'compiler' package in the stdlib... which tells you how long ago I did the write-up. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
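For a concrete flavor of the visitor-pattern point, here is a small sketch using the PEP 443 API as it ended up in functools (illustrative only; the write-up linked above uses a slightly different API):

    import ast
    from functools import singledispatch

    @singledispatch
    def visit(node):
        # default implementation: just recurse into the child nodes
        for child in ast.iter_child_nodes(node):
            visit(child)

    @visit.register(ast.FunctionDef)
    def visit_functiondef(node):
        print("function:", node.name)
        for child in ast.iter_child_nodes(node):
            visit(child)

    @visit.register(ast.Name)
    def visit_name(node):
        print("name:", node.id)

    visit(ast.parse("def f(x):\n    return x + y\n"))

No monkeypatching, no visit_* name matching, and new node handlers can be registered from any module.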
Re: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
On Sat, May 25, 2013 at 8:08 AM, Łukasz Langa wrote: > The most important > change in this version is that I introduced ABC support and completed > a reference implementation. Excellent! A couple of thoughts on the implementation... While the dispatch() method allows you to look up what implementation would be *selected* for a target type, it does not let you figure out whether a particular method has been *registered* for a type. That is, if I have a class MyInt that subclasses int, I can't use dispatch() to check whether a MyInt implementation has been registered, because I might get back an implementation registered for int or object. ISTM there should be some way to get at the raw registration info, perhaps by exposing a dictproxy for the registry. Second, it should be possible to memoize dispatch() using a weak key dictionary that is cleared if new ABC implementations have been registered or when a call to register() is made. The way to detect ABC registrations is via the ABCMeta._abc_invalidation_counter attribute: if its value is different than the previous value saved with the cache, the cache must be cleared, and the new value stored. (Unfortunately, this is a private attribute at the moment; it might be a good idea to make it public, however, because it's needed for any sort of type dispatching mechanism, not just this one particular generic function implementation.) Anyway, doing the memoizing in the wrapper function should bring the overall performance very close to a hand-written type dispatch. Code might look something like:

    # imported inside closure so that functools module
    # doesn't force import of these other modules:
    from weakref import ref, WeakKeyDictionary
    from abc import ABCMeta

    cache = WeakKeyDictionary()
    valid_as_of = ABCMeta._abc_invalidation_counter

    def wrapper(*args, **kw):
        nonlocal valid_as_of
        if valid_as_of != ABCMeta._abc_invalidation_counter:
            cache.clear()
            valid_as_of = ABCMeta._abc_invalidation_counter
        cls = args[0].__class__
        try:
            impl = cache.data[ref(cls)]
        except KeyError:
            impl = cache[cls] = dispatch(cls)
        return impl(*args, **kw)

    def register(typ, func=None):
        ...
        cache.clear()
        ...

This would basically eliminate doing any extra (Python) function calls in the common case, and might actually be faster than my current simplegeneric implementation on PyPI (which doesn't even do ABCs at the moment). ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
On Sat, May 25, 2013 at 10:59 AM, Nick Coghlan wrote: > Given the global nature of the cache invalidation, it may be better as > a module level abc.get_cache_token() function. Well, since the only reason to ever use it is to improve performance, it'd be better to expose it as an attribute than as a function. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] __subclasses__() return order
On Sat, May 25, 2013 at 9:18 AM, Antoine Pitrou wrote: > In http://bugs.python.org/issue17936, I proposed making tp_subclasses > (the internal container implementing object.__subclasses__) a dict. > This would make the return order of __subclasses__ completely > undefined, while it is right now slightly predictable. I have never seen > __subclasses__ actually used in production code, so I'm wondering > whether someone might be affected by such a change. FWIW, when I've used __subclasses__, I've never depended on it having a stable or predictable order. (I find it somewhat difficult to imagine *why* one would do that, but of course that doesn't mean nobody has done it.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
On Sat, May 25, 2013 at 4:16 PM, Łukasz Langa wrote: > So, the latest document is live: > http://www.python.org/dev/peps/pep-0443/ > > The code is here: > http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l363 > > The documentation here: > http://hg.python.org/features/pep-443/file/tip/Doc/library/functools.rst#l189 Code and tests look great! Nitpick on the docs and PEP, though: generic functions are not composed of functions sharing the same name; it would probably be more correct to say they're composed of functions that perform the same operations on different types. (I think the "names" language might be left over from discussion of *overloaded* functions in PEP 3124 et al; in any case we're actually recommending people *not* use the same names now, so it's confusing.) We should probably also standardize on the term used for the registered functions. The standard terminology is "method", but that would be confusing in Python, where methods usually have a self argument. The PEP uses the term "implementation", and I think that actually makes a lot of sense: a generic function is composed of functions that implement the same operation for different types. So I suggest changing this:

    """ Transforms a function into a single-dispatch generic function. A **generic function** is composed of multiple functions sharing the same name. Which form should be used during a call is determined by the dispatch algorithm. When the implementation is chosen based on the type of a single argument, this is known as **single dispatch**. Adding an overload to a generic function is achieved by using the :func:`register` attribute of the generic function. The :func:`register` attribute is a decorator, taking a type parameter and decorating a function implementing the overload for that type."""

to:

    """ Transforms a function into a single-dispatch generic function. A **generic function** is composed of multiple functions implementing the same operation for different types. Which implementation should be used during a call is determined by the dispatch algorithm. When the implementation is chosen based on the type of a single argument, this is known as **single dispatch**. Adding an implementation to a generic function is achieved by using the :func:`register` attribute of the generic function. The :func:`register` attribute is a decorator, taking a type parameter and decorating a function implementing the operation for that type."""

And replacing "overload" with "implementation" in the remainder of the docs and code. Last, but not least, there should be a stacking example somewhere in the doc, as in the PEP, and perhaps the suggestion to name individual implementations differently from each other and the main function -- perhaps as an adjunct to documenting that register() always returns its argument unchanged. (Currently, it doesn't mention what register()'s return value is.) (It may also be useful to note somewhere that, due to caching, changing the base classes of an existing class may not change what implementation is selected the next time the generic function is invoked with an argument of that type or a subclass thereof.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
On Tue, May 28, 2013 at 3:41 PM, Russell E. Owen wrote: > Is it true that this cannot be used for instance and class methods? It > dispatches based on the first argument, which is "self" for instance > methods, whereas the second argument would almost certainly be the > argument one would want to use for conditional dispatch. You can use a staticmethod and then delegate to it, of course. But it probably wouldn't be too difficult to allow specifying which argument to dispatch on, e.g.:

    @singledispatch.on('someArg')
    def my_method(self, someArg, ...):
        ...

The code would look something like this:

    def singledispatch(func, argPosn=0):
        ...
        # existing code here...
        ...
        def wrapper(*args, **kw):
            return dispatch(args[argPosn].__class__)(*args, **kw)  # instead of args[0]

    def _dispatch_on(argname):
        def decorate(func):
            argPosn = ...  # code to find argument position of argname for func
            return singledispatch(func, argPosn)
        return decorate

    singledispatch.on = _dispatch_on

So, it's just a few lines added, but of course additional doc, tests, etc. would have to be added as well. (It also might be a good idea for there to be some error checking in wrapper() to raise an appropriate TypeError if len(args) <= argPosn.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] doctest and pickle
On Fri, Jun 7, 2013 at 1:54 PM, Mark Janssen wrote: > On Fri, Jun 7, 2013 at 10:50 AM, Mark Janssen > wrote: >>> >>> from pickle import dumps, loads >>> >>> Fruit.tomato is loads(dumps(Fruit.tomato)) >>> True >> >> Why are you using is here instead of ==? You're making a circular >> loop using "is" > > I should add that when you're serializing with pickle and then > reloading, the objects should be seen as "essentially equivalent". > This means that they are either byte-by-byte equivalent (not sure > actually if Python actually guarantees this), or every element would > still compare equal and that is what matters. For global objects such as functions and classes -- and singletons such as None, Ellipsis, True, and False -- pickling and unpickling is actually supposed to retain the "is" relationship as well. I don't know if enums *actually* preserve this invariant, but my default expectation of the One Obvious Way would be that enums, being uniquely-named objects that know their name and container, should be considered global objects in the same fashion as classes and functions, *and* that as singletons, they'd also be treated in the same way as None, Ellipsis, etc. That is, there are two independent precedents for objects like that preserving "is" upon pickling and unpickling. (As another precedent, my own SymbolType library (available on PyPI) preserves the "is"-ness of its named symbol objects upon pickling and unpickling.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
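For reference, the existing identity-preserving cases are easy to check (nothing enum-specific here, just standard pickle behavior):

    import pickle

    # Singletons and "global" objects are pickled by reference, so identity
    # survives the round trip:
    assert pickle.loads(pickle.dumps(None)) is None
    assert pickle.loads(pickle.dumps(True)) is True
    assert pickle.loads(pickle.dumps(ValueError)) is ValueError   # a class
    assert pickle.loads(pickle.dumps(len)) is len                 # a function

The suggestion above is simply that enum members should land in the same bucket.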
Re: [Python-Dev] [Python-checkins] cpython: Add reference implementation for PEP 443
On Fri, Jun 7, 2013 at 10:27 AM, Thomas Wouters wrote: > This isn't a new bug, but it's exposed by always importing weakref and > atexit during interpreter startup. I'm wondering if that's really necessary > :) Importing it during startup isn't necessary per se; imports needed only by the generic function implementation can and should be imported late, rather than at the time functools is imported. However, if pkgutil was/is migrated to using this implementation of generics, then it's likely going to end up imported during startup anyway, because at least the -m startup path involves pkgutil. In short, the overall answer right now is, "maybe", and the answer later is "rather likely". ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: Add reference implementation for PEP 443
On Fri, Jun 7, 2013 at 5:16 PM, Łukasz Langa wrote: > On 7 cze 2013, at 22:50, PJ Eby wrote: > >> On Fri, Jun 7, 2013 at 10:27 AM, Thomas Wouters wrote: >>> This isn't a new bug, but it's exposed by always importing weakref and >>> atexit during interpreter startup. I'm wondering if that's really necessary >>> :) >> >> In short, the overall answer right now is, "maybe", and the answer >> later is "rather likely". ;-) > > I would rather say that it's "rather certain". > > functools is necessary for setup.py to work while bootstrapping, whereas > pkgutil is used in runpy.py which is always imported in Modules/main.c. > > So we're left with having to fix atexit to support subinterpreters. I wonder > how difficult that will be. If the problem really has to do with interpreter startup, then there actually is a workaround possible, at the cost of slightly hairier code. If dispatch() looked in the registry *first* and avoided the cache in that case, and lazily created the cache (including the weakref import), then interpreter startup would not trigger an import of weakref in the default case. (Of course, depending on whether site/sitecustomize results in the use of importer subclasses and such, this might not help. It might be necessary to take the even more complex tack of avoiding the use of a cache entirely until an ABC is registered, and walking mro's.) Anyway, it remains to be seen whether these workarounds are easier or more difficult than fixing the atexit problem. ;-) Hmm... actually, there are a couple other ways around this. singledispatch doesn't use finalize(), so it doesn't really need atexit. It doesn't even do much with WeakKeyDictionary, so it could actually just "from _weakref import ref", and inline the relevant operations. Or, WeakKeyDictionary could be pulled out into a separate module, where singledispatch could pull it from without importing finalize. Or, weakref.finalize could be fixed so that the atexit import and register() are deferred until actual use. (Of all of these, that last one actually sounds like the least invasive workaround, with fewest lines of code likely to be changed.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] eval and triple quoted strings
On Fri, Jun 14, 2013 at 2:11 PM, Ron Adam wrote: > > > On 06/14/2013 10:36 AM, Guido van Rossum wrote: >> >> Not a bug. The same is done for file input -- CRLF is changed to LF before >> tokenizing. > > > > Should this be the same? > > > python3 -c 'print(bytes("""\r\n""", "utf8"))' > b'\r\n' > > eval('print(bytes("""\r\n""", "utf8"))') > b'\n' No, but: eval(r'print(bytes("""\r\n""", "utf8"))') should be. (And is.) What I believe you and Walter are missing is that the \r\n in the eval strings are converted early if you don't make the enclosing string raw. So what you're eval-ing is not what you think you are eval-ing, hence the confusion. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Classes with ordered namespaces
On Thu, Jun 27, 2013 at 4:48 AM, Nick Coghlan wrote: > I'd be tempted to kill PEP 422 as not worth the hassle if we did this. I assume you mean the "namespace" keyword part of PEP 422, since PEP 422's class initialization feature is entirely orthogonal to definition order or namespace customization. (Indeed, I cannot recall a single instance of class initialization in my code that actually *cared* about definition order. Certainly I haven't had any situations where a pre-existing definition order would've eliminated the need for a class-level initialization.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Pre-PEP: Redesigning extension modules
On Fri, Aug 23, 2013 at 4:50 AM, Stefan Behnel wrote: > Reloading and Sub-Interpreters > == > > To "reload" an extension module, the module create function is executed > again and returns a new module type. This type is then instantiated as by > the original module loader and replaces the previous entry in sys.modules. > Once the last references to the previous module and its type are gone, both > will be subject to normal garbage collection. I haven't had a chance to address this on the import-sig discussion yet about ModuleSpec, but I would like to just mention that one property of the existing module system that I'm not sure either this proposal or the ModuleSpec proposal preserves is that it's possible to implement lazy importing of modules using standard reload() semantics. My "Importing" package offers lazy imports by creating module objects in sys.modules that are a subtype of ModuleType, and use a __getattribute__ hook so that trying to use them fires off a reload() of the module. Because the dummy module doesn't have __file__ or anything else initialized, the import system searches for the module and then loads it, reusing the existing module object, even though it's actually only executing the module code for the first time. That the existing object be reused is important, because once the dummy is in sys.modules, it can also be imported by other modules, so references to it can abound everywhere, and we wish only for it to be loaded lazily, without needing to trace down and replace all instances of it. This also preserves other invariants of the module system. Anyway, the reason I was asking why reloading is being handled as a special case in the ModuleSpec proposal -- and the reason I'm curious about certain provisions of this proposal -- is that making the assumption you can only reload something with the same spec/location/etc. it was originally loaded with, and/or that if you are reloading a module then you previously had a chance to do things to it, doesn't jibe with the way things work currently. That is to say, in the pure PEP 302 world, there is no special status for "reload" that is different from "load" -- the *only* thing that's different is that there is already a module object to use, and there is *no guarantee that it's a module object that was initialized by the loader now being invoked*. AFAICT both this proposal and the ModuleSpec one are making an invalid assumption per PEP 302, and aren't explicitly proposing to change the status quo: they just assume things that aren't actually assured by the prior specs or implementations. So, for example, this extension module proposal needs to cover what happens if an extension module is reloaded and the module object is not of the type or instance it's expecting. Must it do its own checking? Error handling? Will some other portion of the import system be expected to handle it? For that matter, what happens (in either proposal) if you reload() a module which only has a __name__, and no other attributes? I haven't tested with importlib, but with earlier Pythons this results in a standard module search being done by reload(). But the ModuleSpec proposal and this one seem to assume that a reload()-ed module must already be associated with a loader, location, and/or spec. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
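A toy sketch of the lazy-import trick being described (the real code is in the "Importing" package and is considerably more careful; this only shows why "reload into an existing, nearly-bare module object" semantics matter):

    import importlib
    import sys
    from types import ModuleType

    class LazyModule(ModuleType):
        def __getattribute__(self, name):
            d = ModuleType.__getattribute__(self, '__dict__')
            # Before the first real load there's essentially nothing here but
            # __name__; the first interesting attribute access triggers a
            # reload() that searches for and executes the real module code
            # *into this same object*.
            if '__file__' not in d and not name.startswith('__'):
                importlib.reload(self)
            return ModuleType.__getattribute__(self, name)

    def lazy_import(name):
        if name not in sys.modules:
            sys.modules[name] = LazyModule(name)
        return sys.modules[name]

If reload() required a pre-existing spec/loader, or refused to execute into a module object it didn't create, this pattern would stop working.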
Re: [Python-Dev] sys.intern should work on bytes
On Fri, Sep 20, 2013 at 9:54 AM, Jesus Cea wrote: > Why str/bytes doesn't support weakrefs, beside memory use? The typical use case for weakrefs is to break reference cycles, but str and bytes can't *be* part of a reference cycle, so outside of interning-like use cases, there's no need for weakref support there. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
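The current behavior, for reference (a plain subclass opts in automatically because it grows __dict__ and __weakref__ slots):

    import weakref

    class MyStr(str):
        pass

    weakref.ref(MyStr("spam"))     # works
    try:
        weakref.ref("spam")        # the built-in itself does not support it
    except TypeError as exc:
        print(exc)                 # cannot create weak reference to 'str' object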
Re: [Python-Dev] cpython: Rename contextlib.ignored() to contextlib.ignore().
On Sun, Oct 13, 2013 at 10:05 AM, Antoine Pitrou wrote: > And for the record, it's not *my* objection; several other core > developers have said -1 too: Ezio, Serhiy, Giampaolo, etc. FWIW, I'm -1 also; the thread quickly convinced me that this is a horrible idea, at least with the current name. The feature itself I consider +0, maybe +0.5 if a good but short name can be found. I kind of like "abort_on()" as an accurate description of what it actually does, but it most certainly does not *ignore* exceptions, and it's going to create problems as soon as anybody adds more than one statement to the block, and then reads their code back without *really* thinking about it. Not to mention how it's going to bite people who copy and modify code snippets containing it. On Sun, Oct 13, 2013 at 11:11 AM, Nick Coghlan wrote: > It's just as broken as the try/except equivalent. I consider that a > feature, not a bug. (Note: the following rant is about the *name*, not the context manager itself.) Misleadingness and lack of readability is not a feature, it's a bug. For example, even though I've been coding in Python since 1997, and even participated in the original design process for "with", I *still* misread the "with ignore:" block as ignoring the exceptions until I *really* thought about it. Wait, no, I misread it *entirely*, until somebody *else* pointed it out. ;-) And this is *despite* knowing on a gut level that *actually* ignoring all the errors in a block *isn't possible in Python*. I would not give most people much chance of noticing they made this mistake, and even less chance of finding the problem afterwards. This is like the infamous Stroop test, where you have a word like "brown" only it's printed in blue ink and you have to say "blue" to get the answer right. If you've never taken a Stroop test, by the way, it's *really* hard. It almost literally makes your brain *hurt* to disregard the text and say the ink color instead, because your brain automatically reads the word before you can stop it, so you are straining to stop yourself from saying it so you can then try to *think* what color you're supposed to say, and then your brain reads the word *again*, and... well, it's really quite unpleasant is what it is. Anyway, this feature, with its current name, is just the same: you have to override your instinctive response to understand what it *really* does, in any but the one-liner case. And you have to do it *every time you read it in code*. Only, because it'll mostly be used in the one-line case, you'll get used to it being correct, until one day you make a change without thinking, and create a bug that lies dormant for an extended period. Plus, as soon as people see it being used, they'll think, "oh cool", and use it in their code, not even knowing or thinking that it does something they don't want, because they will never read the docs in the first place. (As Guido says, people learn languages by example.) So call it "catching". Call it "catch_and_exit_on". Even "passing" or "skipping" would be better. And maybe "abort_on" or "abort_without_raising" would be better still, as they describe what will *really* happen. But calling it "ignore" isn't "fits your brain", it's "abuses your brain in a cruelly misleading manner". ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] cpython: Rename contextlib.ignored() to contextlib.ignore().
On Sun, Oct 13, 2013 at 1:58 PM, Alexander Belopolsky wrote: > People who write code using contextlib > are expected to know People who *read* that code while learning Python cannot be expected to know that it is not really possible to ignore errors in Python. If this feature is used under any name that implies such, it will within a few years become a FAQ and well-known wart, not to mention a meme that "contextlib.ignore() is buggy, it only works if the error is thrown from a single operation performed in C". I say this latter phrasing because now that I've had time to think about it, it is not at all merely a question of whether you wrap a single line or single operation. Quick, is this a safe use, or not: with ignore(OSError): delete_widget_files(spam) It sort of depends on the implementation of delete_widget_files, doesn't it? In contrast: with abort_on(OSError): delete_widget_files(spam) it's immediately clear that the error isn't going to be ignored; the operation will be aborted. Very different concept. > that it is not a good idea to keep resources > multiple unrelated statements within the > with block will raise a mental red flag. How will someone know this when they are reading code they found on the internet? It's one thing to have an operation whose name implies, "you need to do more research to understand this". But it's an entirely different (and dangerous) thing to have an operation whose name implies you already know everything you need to know, no need to think or study further... especially if what you know is actually wrong! > It is also easy for > lint-like tools to warn about abuse of ignore(). Since it's not sufficient to require a single operation, how will a lint-like tool check this? For example: with ignore(AnError, OtherError): ping.pongy(foo, bar.spam(), fizzy()) Is this valid code, or not? If you can't tell, how will a non-human lint tool tell? > Let's not try to improve readability of bad code Actually, I want the good code to be "readable" in the sense of understanding what the operation does, so that people copying it don't end up with a serious misunderstanding of how the context manager actually works. There is no way that naive users aren't going to read it as ignoring errors, and use it with something like: with ignore(OSError): for f in myfiles: os.unlink(f) But this version is obviously *wrong*: with abort_on(OSError): for f in myfiles: os.unlink(f) Upon looking at this code, you will quickly realize that you don't intend to abort the loop, only the unlink, and will therefore rewrite it to put the loop on the outside. So, I am only trying to "improve readability of bad code" in the sense of making it *obvious* that the code is in fact bad. ;-) (To put it another way, "ignore()" improves the readability of bad code in the above example, because it makes the bad code look like it's good.) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] cpython: Rename contextlib.ignored() to contextlib.ignore().
On Tue, Oct 15, 2013 at 11:52 AM, R. David Murray wrote: > I think 'trap' would be much clearer. +1. Short and sweet, and just ambiguous enough that you don't leap to the conclusion that the error is ignored. I agree that "suppress" is basically a synonym for "ignore"; trap at least *implies* some kind of control flow change, which is what's needed to prevent misconceptions. Personally, I would rate "catch" higher than "trap" because it further implies that it is catching a thrown exception, but would compromise to "trap" if that'll end the thread sooner. ;-) > What about making the context > manager provide the trapped exception, in a fashion similar to > what assertRaises does? Sadly, that won't work, since context managers provide a value *before* the block is executed, not after. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] cpython: Rename contextlib.ignored() to contextlib.ignore().
On Tue, Oct 15, 2013 at 8:57 AM, Nick Coghlan wrote: > So, having been convinced that "ignore" was the wrong choice of name, > reviewing the docs made it clear to me what the name *should* be. >From the point of view of code *outside* a block, the error is indeed suppressed. But, as one of those examples actually points out, what's happening from the POV *inside* the block is that the exception is "trapped". So using "suppress" creates an ambiguity: are we suppressing these errors *inside* the block, or *outside* the block? The way it actually works is errors are suppressed from the code *surrounding* the block, but the word can equally be interpreted as suppressing errors *inside* the block, in exactly the same way that "ignore" can be misread. So, if we're going with words that have precedent in the doc, the term "trap", as used here: > "If an exception is trapped merely in order to log it or to perform > some action (rather than to suppress it entirely), the generator must > reraise that exception." is the only one used to describe the POV from inside the block, where the error is... well, being trapped. ;-) It is a more apt description of what actually happens, even if it's only usable for the specific use case where an exception is trapped in order to suppress it. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Status of the built-in virtualenv functionality in 3.3
On Thu, Oct 6, 2011 at 12:02 PM, Barry Warsaw wrote: > Well, I have to be honest, I've *always* thought "nest" would be a good > choice > for a feature like this, but years ago (IIRC) PJE wanted to reserve that > term > for something else, which I'm not sure ever happened. > Actually, it was pretty much for this exact purpose -- i.e. it was the idea of a virtual environment. Ian just implemented it first, with some different ideas about configuration and activation. Since this is basically the replacement for that, I don't have any objection to using the term here. (In my vision, "nest" was also the name of a package management tool for creating such nests and manipulating their contents, though.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Packaging and binary distributions for Python 3.3
On Sun, Oct 9, 2011 at 3:15 AM, Éric Araujo wrote: > After all, if setuptools and then pkg_resources were turned > down for inclusion in Python 2.5, it’s not now that we have packaging that we’ll change our mind and just bless eggs. Actually, that's not what happened. I withdrew the approved-by-Guido, announced-at-PyCon, and already-in-progress implementation, both because of the lack of package management features, and because of support concerns raised by Fredrik Lundh. (At that time, the EggFormats doc didn't exist, and there were not as many people familiar with the design or code as there are now.) For the full statement, see: http://mail.python.org/pipermail/python-dev/2006-April/064145.html (The withdrawal is after a lot of background on the history of setuptools and what it was designed for.) In any case, it definitely wasn't the case that eggs or setuptools were rejected for 2.5; they were withdrawn for reasons that didn't have anything to do with the format itself. (And, ironically enough, AFAIK the new packaging module uses code that's actually based on the bits of setuptools Fredrik was worried about supporting... but at least there now are more people providing that support.) What we can do however > is to see what bdist_egg does and define a new bdist command inspired by > it, but without zipping, pkg_resource calls, etc. > Why? If you just want a dumb bdist format, there's already bdist_dumb. Conversely, if you want a smarter format, why reinvent wheels? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Packaging and binary distributions for Python 3.3
On Sun, Oct 9, 2011 at 4:14 PM, Paul Moore wrote: > As regards the format, bdist_dumb is about the right level - but > having just checked it has some problems (which if I recall, have been > known for some time, and are why bdist_dumb doesn't get used). > Specifically, bdist_dumb puts the location of site-packages ON THE > BUILD SYSTEM into the archive, making it useless for direct unzipping > on a target system which has Python installed somewhere else. > I don't know about the case for packaging/distutils2, but I know that in original distutils, you can work around this by making bdist_dumb call the install commands with different arguments. That is, it's a relatively shallow flaw in bdist_dumb. bdist_wininst, for example, is basically a zipped bdist_dumb with altered install arguments and an .exe header tacked on the front. (Along with a little extra data crammed in between the two.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bring new features to older python versions
On Tue, Oct 11, 2011 at 12:14 PM, Toshio Kuratomi wrote:
> This may not be the preferred manner to write decorators but it's fairly
> straightforward and easy to remember compared to, say, porting away from
> the with statement.

You can emulate 'with' using decorators, actually, if you don't mind a nested function. Some code from my Contextual library (minus the tests):

def call_with(ctxmgr):
    """Emulate the PEP 343 "with" statement for Python versions <2.5

    The following examples do the same thing at runtime::

        Python 2.5+          Python 2.4
        -----------          -------------
        with x as y:         @call_with(x)
            print y          def do_it(y):
                                 print y

    ``call_with(foo)`` returns a decorator that immediately invokes the
    function it decorates, passing in the same value that would be bound
    by the ``as`` clause of the ``with`` statement.  Thus, by decorating
    a nested function, you can get most of the benefits of "with", at a
    cost of being slightly slower and perhaps a bit more obscure than the
    2.5 syntax.

    Note: because of the way decorators work, the return value (if any)
    of the ``do_it()`` function above will be bound to the name
    ``do_it``.  So, this example prints "42"::

        @call_with(x)
        def do_it(y):
            return 42

        print do_it

    This is rather ugly, so you may prefer to do it this way instead,
    which more explicitly calls the function and gets back a value::

        def do_it(y):
            return 42

        print with_(x, do_it)
    """
    return with_.__get__(ctxmgr, type(ctxmgr))


def with_(ctx, func):
    """Perform PEP 343 "with" logic for Python versions <2.5

    The following examples do the same thing at runtime::

        Python 2.5+          Python 2.3/2.4
        -----------          --------------
        with x as y:         z = with_(x, f)
            z = f(y)

    This function is used to implement the ``call_with()`` decorator,
    but can also be used directly.  It's faster and more compact in the
    case where the function ``f`` already exists.
    """
    inp = ctx.__enter__()
    try:
        retval = func(inp)
    except:
        if not ctx.__exit__(*sys.exc_info()):
            raise
    else:
        ctx.__exit__(None, None, None)
        return retval

This version doesn't handle the multi-context syntax of newer pythons, but could probably be extended readily enough.

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
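For instance, a rough sketch of such an extension (untested, and not part of Contextual itself) could simply nest the managers recursively around the final call:

    def with_all(ctxs, func):
        """Sketch: nest several context managers around one call.

        with_all((a, b), f) is meant to behave roughly like the 2.7+ form::

            with a as x, b as y:
                f(x, y)
        """
        if not ctxs:
            return func()
        first, rest = ctxs[0], ctxs[1:]
        return with_(first,
                     lambda val: with_all(rest, lambda *more: func(val, *more)))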
Re: [Python-Dev] PEP397 no command line options to python?
On Mon, Oct 17, 2011 at 8:55 AM, Sam Partington wrote: > Yes it is a bit annoying to have to treat those specially, but other > than -c/-m it does not need to understand pythons args, just check > that the arg is not an explicit version specifier. -q/-Q etc have no > impact on how to treat the file. > > In fact there's no real need to treat -c differently as it's extremely > unlikely that there is a file that might match. But for -m you can > come up with a situation where if you it gets it wrong. e.g. 'module' > and 'module.py' in the cwd. > > I would suggest that it is also unlikely that there will be any future > options would need any special consideration. > What about -S (no site.py) and -E (no environment)? These are needed for secure setuid scripts on *nix; I don't know how often they'd be used in practice on Windows. (Basically, they let you isolate a script's effective sys.path; there may be some use case overlap with virtual envs.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP397 no command line options to python?
On Mon, Oct 17, 2011 at 8:00 PM, Mark Hammond wrote: > On 18/10/2011 3:24 AM, PJ Eby wrote: > >> What about -S (no site.py) and -E (no environment)? These are needed >> for secure setuid scripts on *nix; I don't know how often they'd be used >> in practice on Windows. (Basically, they let you isolate a script's >> effective sys.path; there may be some use case overlap with virtual envs. >> > > It is worth pointing out that options can be specified directly in the > shebang line - eg, a line like "#! /usr/bin/python -S" in a foo.py works as > expected. Ah, ok. Never mind, then. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Packaging and binary distributions
On Sun, Oct 30, 2011 at 6:52 PM, Paul Moore wrote: > On 30 October 2011 18:04, Ned Deily wrote: > > Has anyone analyzed the current packages on PyPI to see how many provide > > binary distributions and in what format? > > A very quick and dirty check: > > dmg: 5 > rpm: 12 > msi: 23 > dumb: 132 > wininst: 364 > egg: 2570 > > That's number of packages with binary distributions in that format. > It's hard to be sure about egg distributions, as many of these could > be pure-python (there's no way I know, from the PyPI metadata, to > check this). > FYI, the egg filename will contain a distutils platform identifier (e.g. 'win32', 'macosx', 'linux', etc.) after the 'py2.x' tag if the egg is platform-specific. Otherwise, it's pure Python. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
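For example (the suffix check below is a deliberately crude sketch; the authoritative parsing lives in pkg_resources):

    # Platform-specific egg -- note the platform tag after the pyX.Y tag:
    #     lxml-2.3-py2.7-win32.egg
    # Pure-Python egg -- nothing after the pyX.Y tag:
    #     simplejson-2.2.1-py2.7.egg

    def looks_platform_specific(egg_filename):
        """Crude check: is there anything after the name-version-pyX.Y parts?"""
        parts = egg_filename[:-len('.egg')].split('-')
        return len(parts) > 3

    print(looks_platform_specific('lxml-2.3-py2.7-win32.egg'))       # True
    print(looks_platform_specific('simplejson-2.2.1-py2.7.egg'))     # False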
Re: [Python-Dev] Packaging and binary distributions
Urgh. I guess that was already answered. Guess this'll teach me not to reply to a thread before waiting for ALL the messages to download over a low-bandwidth connection... (am on the road at the moment and catching up on stuff in spare cycles - sorry for the noise) On Fri, Nov 4, 2011 at 10:24 PM, PJ Eby wrote: > On Sun, Oct 30, 2011 at 6:52 PM, Paul Moore wrote: > >> On 30 October 2011 18:04, Ned Deily wrote: >> > Has anyone analyzed the current packages on PyPI to see how many provide >> > binary distributions and in what format? >> >> A very quick and dirty check: >> >> dmg: 5 >> rpm: 12 >> msi: 23 >> dumb: 132 >> wininst: 364 >> egg: 2570 >> >> That's number of packages with binary distributions in that format. >> It's hard to be sure about egg distributions, as many of these could >> be pure-python (there's no way I know, from the PyPI metadata, to >> check this). >> > > FYI, the egg filename will contain a distutils platform identifier (e.g. > 'win32', 'macosx', 'linux', etc.) after the 'py2.x' tag if the egg is > platform-specific. Otherwise, it's pure Python. > > ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 382 specification and implementation complete
On Sun, Nov 6, 2011 at 7:29 AM, Nick Coghlan wrote: > I think this was based on the assumption that *existing* namespace > package approaches would break under the new scheme. Since that is not > the case, I suspect those previous objections were overstated (and all > packaging related code manages to cope well enough with modules where > the file name doesn't match the package name) > I was actually referring to all the code that does things like split package names on '.' and then use os.path.join, or that makes assumptions which are the moral equivalent of that. PEP 402's version of namespace packages should break less of that sort of code than adding a directory name extension. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
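Concretely, the sort of code I mean looks like this (a made-up but representative helper):

    import os

    def package_dir(base, dotted_name):
        # Fragile assumption: one package == one directory whose path
        # exactly mirrors the dotted name under a single base directory.
        return os.path.join(base, *dotted_name.split('.'))

    print(package_dir('/usr/lib/python2.7/site-packages', 'zc.buildout'))
    # .../site-packages/zc/buildout

Under a directory-name-extension scheme the join above simply produces the wrong name (the directory on disk would be 'zc.pyp' or similar), whereas with PEP 402 the per-directory layout is unchanged -- such code only goes wrong in the less common case where the namespace is actually split across several sys.path entries.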
Re: [Python-Dev] PEP 402: Simplified Package Layout and Partitioning
On Sat, Nov 26, 2011 at 11:53 AM, Éric Araujo wrote: > > Le 11/08/2011 20:30, P.J. Eby a écrit : > >> At 04:39 PM 8/11/2011 +0200, Éric Araujo wrote: > >>> I’ll just regret that it's not possible to provide a module docstring > >>> to inform that this is a namespace package used for X and Y. > >> It *is* possible - you'd just have to put it in a "zc.py" file. IOW, > >> this PEP still allows "namespace-defining packages" to exist, as was > >> requested by early commenters on PEP 382. It just doesn't *require* > >> them to exist in order for the namespace contents to be importable. > > That’s quite cool. I guess such a namespace-defining module (zc.py > here) would be importable, right? Yes. > Also, would it cause worse > performance for other zc.* packages than if there were no zc.py? > No. The first import of a subpackage sets up the __path__, and all subsequent imports use it. > >>> A pure virtual package having no source file, I think it should have no >>> __file__ at all. > > Antoine and someone else thought likewise (I can find the link if you > want); do you consider it consensus enough to update the PEP? > Sure. At this point, though, before doing any more work on the PEP I'd like to have some idea of whether there's any chance of it being accepted. At this point, there seems to be a lot of passive, "Usenet nod syndrome" type support for it, but little active support. It doesn't help at all that I'm not really in a position to provide an implementation, and the persons most likely to implement have been leaning somewhat towards 382, or wanting to modify 402 such that it uses .pyp directory extensions so that PEP 395 can be supported... And while 402 is an extension of an idea that Guido proposed a few years ago, he hasn't weighed in lately on whether he still likes that idea, let alone whether he likes where I've taken it. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] readd u'' literal support in 3.3?
On Fri, Dec 9, 2011 at 10:11 AM, Barry Warsaw wrote: > As Chris points out, this seems to be a use case tied to WSGI and PEP > . I > guess it's an unfortunate choice for so recent a PEP, but maybe there was > no > way to do better. For the record, "native strings" are defined the way they are because of IronPython and Jython, which had unicode strings long before CPython. At the time WSGI was developed, the approach for Python 3 (then called "3000") was expected to be similar, and the new I/O system was not (AFAIR) designed yet. All that changed in PEP was introducing *byte* strings (to accommodate the I/O changes), not native strings. In fact, I'm not sure why people are bringing it into this discussion at all: PEP was designed to work well with 2to3, which does the right thing for WSGI code: it converts 2.x "str" to 3.x "str", as it should. If you're writing 2.x WSGI code with 'u' literals, *your code is broken*. WSGI doesn't need 'u' literals and never has. It *does* need b'' literals for stuff that refers to request and response bodies, but everything else should be plain old string literals for the appropriate Python version. It can certainly be useful in many contexts outside of WSGI. > And *only* there, pretty much. ;-) PEP was designed to work with the official upgrade path (2to3), which is why it has a concept of native strings. Thing is, if you mark them with a 'u', you're writing incorrect code for 2.x. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
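For the record, a minimal example of what "plain old string literals" means in 2.x WSGI code (this is exactly the kind of code 2to3 converts correctly):

    def app(environ, start_response):
        # Status, headers, and environ keys/values are "native" strings:
        # plain str on 2.x and on 3.x alike -- no u'' anywhere.
        status = '200 OK'
        headers = [('Content-Type', 'text/plain')]
        start_response(status, headers)
        # Only request/response *bodies* are bytes, hence the b'' literal:
        return [b'Hello, world!\n']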
Re: [Python-Dev] Tag trackbacks with version (was Re: readd u'' literal support in 3.3?)
On Fri, Dec 9, 2011 at 11:11 PM, Terry Reedy wrote: > This just gave me the idea of tagging tracebacks with the Python version > number. Something like > > Traceback (Py3.2.2, most recent call last): > > and perhaps with the platform also > > Traceback (most recent call last) [Py3.2.2 on win23]: > > Since computation has stopped, the few extra milliseconds is trivial. This > would certainly help on Python list and the tracker when people do post the > traceback (which they do not always) without version and system (which they > often do not, especially on Python list). It might suggest to people that > this is important info to include. I wonder if this would also help with > tracebacks sent to library/app developers. > Yes, but doctest will need to take this into account, both for its native traceback matcher, and for traceback matches using ellipses. Otherwise you introduce more Python version hell for doctest users. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Tag trackbacks with version (was Re: readd u'' literal support in 3.3?)
On Sat, Dec 10, 2011 at 5:30 PM, Terry Reedy wrote: > Is doctest really insisting that the whole line > Traceback (most recent call last): > exactly match, with nothing added? It really should not, as that is not > part of the language spec. This seems like the tail wagging the dog. > It's a regular expression match, actually. The standard matcher ignores everything between the Traceback line (matched by a regex) and the first unindented line that follows in the doctest. However, if you explicitly try to match a traceback with the ellipsis matcher, intending to observe whether certain specific lines are printed, then you wouldn't be using doctest's built-in matcher, and that was the case I was concerned about. However, as it turns out, I was confused about when this latter case occurs: in order to do it, you have to actually intentionally print a traceback (e.g. via traceback.format_exception() and friends), rather than allowing the exception to propagate normally. This doesn't happen nearly as often in my doctests as I thought it did, but if format_exception() changes it'll still affect some people. The other piece I was pointing out was that if you change the message without changing the doctest regex, then pasting an interpreter transcript into a doctest will no longer work, because doctest will think it's trying to match non-error output. So that has to be changed when the exception format changes. So, no actual objection here; just saying that if you don't change that regex, people who create *new* doctests with tracebacks won't be able to get them to work without deleting the version info from their copy-pasted tracebacks. I was also concerned about a situation that, while it exists, does not occur anywhere near as frequently as I thought it would in my own tests, even for things that seriously abuse Python internals and likely can't be ported to Python 3 anyway. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
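For reference, this is the built-in matching I mean; the "..." line and the real stack frames are both ignored by doctest, and only the header line and the final exception line have to match:

    >>> int('not a number')
    Traceback (most recent call last):
      ...
    ValueError: invalid literal for int() with base 10: 'not a number'

If the header line grew a version tag, examples pasted from the interpreter would stop being recognized as tracebacks until doctest's header regex was updated to match -- which is all I'm asking be done in the same change.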
Re: [Python-Dev] readd u'' literal support in 3.3?
On Mon, Dec 12, 2011 at 3:40 AM, Chris McDonough wrote: > Truth be told, in the vast majority of WSGI apps only high-level WSGI > libraries (like WebOb and Werkzeug) and standalone middleware really > needs to work with native strings. And the middleware really should be > using the high-level libraries to parse WSGI anyway. So there are a > finite number of places where it's actually a real issue. > And those only if they're using "six" or a similar joint-codebase strategy, *and* using unicode_literals in a 2.x module that also does WSGI. If they're using 2to3 and stick with explicit u'', they'll be fine. Unfortunately, AFAIR, nobody in the PEP discussions brought up either the unicode_literals import OR the strategy of using a common codebase, so 2to3 on plain code and writing new Python3 code were the only porting scenarios discussed. (Not that I'm sure it would've made a difference, as I'm not sure what we could have done differently that would still support simple Python3 code and easy 2to3 porting.) As someone who ported WebOb and other stuff built on top of it to Python > 3 without using "from __future__ import unicode_literals", I'm kinda sad > that to be using best practice I'll have to go back and flip the > polarity on everything. Eh? If you don't need unicode_literals, what's the problem? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] readd u'' literal support in 3.3?
On Tue, Dec 13, 2011 at 11:24 AM, Antoine Pitrou wrote: > On Tue, 13 Dec 2011 15:28:31 +0100 > "Laurence Rowe" wrote: > > > > The approach that most people seem to have settled on for porting > > libraries to Python 3 is to make a single codebase that is compatible > with > > both Python 2 and Python 3, perhaps making use of the six library. > > Do you have evidence that "most" people have settled on that approach? > (besides the couple of library writers who have commented on this > thread) > I've seen more projects doing it that way than maintaining dual code bases. In retrospect, it seems way more attractive than having to run a converter all the time, especially if I could run a "2to6" tool *once* and then simply write new code using six-isms Among other things, it means that: * There's only one codebase * If the conversion isn't perfect, you only have to fix it once * Line numbers are the same * There's no conversion step slowing down development So, I expect that if the approach is at all viable, it'll quickly become the One Obvious Way to do it. In effect, 2to3 is a "purity" solution, but six is more like a "practicality" solution. And if there's official support for it, so much the better. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
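To be concrete about what "six-isms" buy you, here is a trivial sketch of the single-codebase style (not taken from any particular project):

    import sys

    PY3 = sys.version_info[0] >= 3

    if PY3:
        text_type = str
    else:
        text_type = unicode      # only evaluated on 2.x

    def ensure_text(value, encoding='utf-8'):
        """Return `value` as a text string on both 2.x and 3.x."""
        if isinstance(value, bytes):
            return value.decode(encoding)
        return value

    print(ensure_text(b'hello'))   # same source runs, unconverted, on 2.6+ and 3.x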
Re: [Python-Dev] readd u'' literal support in 3.3?
On Tue, Dec 13, 2011 at 7:30 PM, Antoine Pitrou wrote: > On Tue, 13 Dec 2011 14:02:45 -0500 > PJ Eby wrote: > > > > Among other things, it means that: > > > > * There's only one codebase > > * If the conversion isn't perfect, you only have to fix it once > > * Line numbers are the same > > * There's no conversion step slowing down development > > > > So, I expect that if the approach is at all viable, it'll quickly become > > the One Obvious Way to do it. > > Well, with all due respect, this is hand-waving. Sure, if it's > viable, then fine. The question is if it's "viable", precisely. That > depends on which project we're talking about. > What I'm saying is that it has many characteristics that are desirable for people who need to support Python 2 and 3 - which is likely the most common use case for library developers. > In effect, 2to3 is a "purity" solution, but > > six is more like a "practicality" solution. > > This sounds like your personal interpretation. I see nothing "pure" in > 2to3. > It's "pure" in being optimized for a world where you just stop using Python 2 one day, and start using 3 the next, without any crossover support. As someone else pointed out, this is a more common case for application developers than for library developers. However, until the libraries are ported, it's harder for the app developers to port their apps. Anyway, if you're supporting both 2 and 3, a common code base offers many attractions, so if it can be done, it will. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hash collision security issue (now public)
On Thu, Dec 29, 2011 at 8:32 AM, Christian Heimes wrote: > IMHO we don't have to alter the outcome of hash("some string"), hash(1) > and all other related types. We just need to reduce the change the an > attacker can produce collisions in the dict (and set?) code that looks > up the slot (PyDictEntry). How about adding the random value in > Object/dictobject.c:lookdict() and lookdict_str() (Python 2.x) / > lookdict_unicode() (Python 3.x)? With this approach the hash of all our > objects stay the same and just the dict code needs to be altered. I don't understand how that helps a collision attack. If you can still generate two strings with the same (pre-randomized) hash, what difference does it make that the dict adds a random number? The post-randomized number will still be the same, no? Or does this attack just rely on the hash *remainders* being the same? If so, I can see how hashing the hash would help. But since the attacker doesn't know the modulus, and it can change as the dictionary grows, I would expect the attack to require matching hashes, not just matching hash remainders... unless I'm just completely off base here. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
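To spell out the worry with toy numbers (the hash values below are made up; the point isn't):

    # Suppose an attacker has found two keys whose *hashes* are equal:
    h_a = 0x7ABC1234          # hypothetical hash of key A
    h_b = 0x7ABC1234          # hypothetical hash of key B (a true collision)

    r = 0x5F3759DF            # the dict's per-process random value

    # Whatever post-processing the lookup applies to the hash, it applies
    # to both identically, so the keys still land in the same slot:
    assert (h_a ^ r) == (h_b ^ r)
    assert ((h_a * 103) ^ r) == ((h_b * 103) ^ r)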
Re: [Python-Dev] Hash collision security issue (now public)
On Sat, Dec 31, 2011 at 7:03 AM, Stephen J. Turnbull wrote: > While the dictionary probe has to start with a hash for backward > compatibility reasons, is there a reason the overflow strategy for > insertion has to be buckets containing lists? How about > double-hashing, etc? > This won't help, because the keys still have the same hash value. ANYTHING you do to them after they're generated will result in them still colliding. The *only* thing that works is to change the hash function in such a way that the strings end up with different hashes in the first place. Otherwise, you'll still end up with (deliberate) collisions. (Well, technically, you could use trees or some other O log n data structure as a fallback once you have too many collisions, for some value of "too many". Seems a bit wasteful for the purpose, though.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hash collision security issue (now public)
On Sat, Dec 31, 2011 at 4:04 PM, Jeffrey Yasskin wrote: > Hash functions are already unstable across Python versions. Making > them unstable across interpreter processes (multiprocessing doesn't > share dicts, right?) doesn't sound like a big additional problem. > Users who want a distributed hash table will need to pull their own > hash function out of hashlib or re-implement a non-cryptographic hash > instead of using the built-in one, but they probably need to do that > already to allow themselves to upgrade Python. > Here's an idea. Suppose we add a sys.hash_seed or some such, that's settable to an int, and defaults to whatever we're using now. Then programs that want a fix can just set it to a random number, and on Python versions that support it, it takes effect. Everywhere else it's a silent no-op. Downside: sys has to have slots for this to work; does sys actually have slots? My memory's hazy on that. I guess actually it'd have to be sys.set_hash_seed(). But same basic idea. Anyway, this would make fixing the problem *possible*, while still pushing off the hard decisions to the app/framework developers. ;-) (Downside: every hash operation includes one extra memory access, but strings only compute their hash once anyway.) Given that changing dict won't help, and changing the default hash is a non-starter, an option to set the seed is probably the way to go. (Maybe with an environment variable and/or command line option so users can work around old code.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
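For clarity, "seed" here just means folding a per-process random value into the string hash itself, along these lines (a toy FNV-style hash for illustration only -- not CPython's actual algorithm):

    def seeded_hash(s, seed):
        h = 2166136261 ^ seed            # fold the seed into the start state
        for ch in s:
            h = ((h ^ ord(ch)) * 16777619) & 0xFFFFFFFF
        return h

    # Different seeds send the same strings to different hashes, so a
    # precomputed set of colliding keys stops colliding:
    print(seeded_hash('abc', 0))
    print(seeded_hash('abc', 12345))

(An option along these lines is essentially what later shipped as the PYTHONHASHSEED environment variable and the -R switch.)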
Re: [Python-Dev] http://mail.python.org/pipermail/python-dev/2011-December/115172.html
On Sun, Jan 1, 2012 at 7:37 PM, Jim Jewett wrote: > Well, there is nothing wrong with switching to a different hash function > after N > collisions, rather than "in the first place". The perturbation > effectively does by > shoving the high-order bits through the part of the hash that survives the > mask. > Since these are true hash collisions, they will all have the same high order bits. So, the usefulness of the perturbation is limited mainly to the common case where true collisions are rare. > (Well, technically, you could use trees or some other O log n data > structure as a fallback once you have too many collisions, for some value > of "too many". Seems a bit wasteful for the purpose, though.) > Your WSGI specification < http://www.python.org/dev/peps/pep-0333/ > > requires > using a real dictionary for compatibility; storing some of the values > outside the > values array would violate that. When I said "use some other data structure", I was referring to the internal implementation of the dict type, not to user code. The only user-visible difference (even at C API level) would be the order of keys() et al. (In any case, I still assume this is too costly an implementation change compared to changing the hash function or seeding it.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] That depends on what the meaning of "is" is (was Re: http://mail.python.org/pipermail/python-dev/2011-December/115172.html)
On Sun, Jan 1, 2012 at 10:28 PM, Jim Jewett wrote: > Given the wording requiring a real dictionary, I would have assumed > that it was OK (if perhaps not sensible) to do pointer arithmetic and > access the keys/values/hashes directly. (Though if the breakage was > between python versions, I would feel guilty about griping too > loudly.) > If you're going to be a language lawyer about it, I would simply point out that all the spec requires is that "type(env) is dict" -- it says nothing about how Python defines "type" or "is" or "dict". So, you're on your own with that one. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] That depends on what the meaning of "is" is (was Re: http://mail.python.org/pipermail/python-dev/2011-December/115172.html)
On Mon, Jan 2, 2012 at 4:07 PM, Jim Jewett wrote: > On Mon, Jan 2, 2012 at 1:16 AM, PJ Eby wrote: > > On Sun, Jan 1, 2012 at 10:28 PM, Jim Jewett > wrote: > >> > >> Given the wording requiring a real dictionary, I would have assumed > >> that it was OK (if perhaps not sensible) to do pointer arithmetic and > >> access the keys/values/hashes directly. (Though if the breakage was > >> between python versions, I would feel guilty about griping too > >> loudly.) > > > If you're going to be a language lawyer about it, I would simply point > out > > that all the spec requires is that "type(env) is dict" -- it says nothing > > about how Python defines "type" or "is" or "dict". So, you're on your > own > > with that one. ;-) > > But the public header file < > http://hg.python.org/cpython/file/3ed5a6030c9b/Include/dictobject.h > > defines the typedef structs for PyDictEntry and _dictobject. > > What is the purpose of the requiring a "real dict" without also > promising what the header file promises? > > Er, just because it's in the .h doesn't mean it's in the public API. But in any event, if you're actually serious about this, I'd just point out that: 1. The struct layout doesn't guarantee anything about insertion or lookup algorithms, 2. If the data structure were changed, the header file would obviously change as well, and 3. ISTM that Python does not even promise inter-version ABI compatibility for internals like the dict object layout. Are you seriously writing code that relies on the C structure layout of dicts? Because really, that was SO not the point of the dict type requirement. It was so that you could use Python's low-level *API* calls, not muck about with the data structure directly. I'm occasionally considered notorious for abusing Python internals, but even I have to draw the line somewhere. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposed PEP on concurrent programming support
On Tue, Jan 3, 2012 at 7:40 PM, Mike Meyer wrote: > STM is a relatively new technology being experimented with in newer > languages, and in a number of 3rd party libraries (both Peak [#Peak]_ > and Kamaelia [#Kamaelia]_ provide STM facilities). I don't know about Kamaelia, but PEAK's STM (part of the Trellis event-driven library) is *not* an inter-thread concurrency solution: it's actually used to sort out the order of events in a co-operative multitasking scenario. So, it should not be considered evidence for the practicality of doing inter-thread co-ordination that way in pure Python. A suite is marked > as a `transaction`, and then when an unlocked object is modified, > instead of indicating an error, a locked copy of it is created to be > used through the rest of the transaction. If any of the originals are > modified during the execution of the suite, the suite is rerun from > the beginning. If it completes, the locked copies are copied back to > the originals in an atomic manner. > I'm not sure if "locked" is really the right word here. A private copy isn't "locked" because it's not shared. The disadvantage is that any code in a transaction must be safe to run > multiple times. This forbids any kind of I/O. > More precisely, code in a transaction must be *reversible*, so it doesn't forbid any I/O that can be undone. If you can seek backward in an input file, for example, or delete queued output data, then it can still be done. Even I/O like re-drawing a screen can be made STM safe by making the redraw occur after a transaction that reads and empties a buffer written by other transactions. For > instance, combining STM with explicit locking would allow explicit > locking when IO was required, I don't think this idea makes any sense, since STM's don't really "lock", and to control I/O in an STM system you just STM-ize the queues. (Generally speaking.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Proposed PEP on concurrent programming support
On Wed, Jan 11, 2012 at 7:01 PM, Mike Meyer wrote: > On Wed, 4 Jan 2012 00:07:27 -0500 > PJ Eby wrote: > > On Tue, Jan 3, 2012 at 7:40 PM, Mike Meyer wrote: > > > For > > > instance, combining STM with explicit locking would allow explicit > > > locking when IO was required, > > I don't think this idea makes any sense, since STM's don't really > > "lock", and to control I/O in an STM system you just STM-ize the > > queues. (Generally speaking.) > > I thought about that. I couldn't convince myself that STM by itself > sufficient. If you need to make irreversible changes to the state of > an object, you can't use STM, so what do you use? Can every such > situation be handled by creating "safe" values then using an STM to > update them? > If you need to do something irreversible, you just need to use an STM-controlled queue, with something that reads from it to do the irreversible things. The catch is that your queue design has to support guaranteed-successful item removal, since if the dequeue transaction fails, it's too late. Alternately, the queue reader can commit removal first, then perform the irreversible operation... but leave open a short window for failure. It depends on the precise semantics you're looking for. In either case, though, the STM is pretty much sufficient, given a good enough queue data structure. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
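Using a plain Queue as a stand-in for an STM-controlled queue, the two orderings look roughly like this (a sketch of the pattern only -- there is no actual STM here):

    import queue                      # 'Queue' on Python 2

    q = queue.Queue()
    q.put('launch the batch job')

    # Ordering 1: commit the removal first, then do the irreversible thing.
    # If the action then fails, the item is already gone -- the "short
    # window for failure" mentioned above.
    item = q.get()                    # removal is final once this returns
    print('performing irreversible action:', item)

    # Ordering 2 (not shown): do the irreversible thing first, then remove
    # the item -- which only works if the dequeue is guaranteed to succeed,
    # since by then it's too late to undo the action.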
Re: [Python-Dev] Hashing proposal: change only string-only dicts
On Tue, Jan 17, 2012 at 7:58 PM, "Martin v. Löwis" wrote: > Am 17.01.2012 22:26, schrieb Antoine Pitrou: > > Only 2 bits are used in ob_sstate, meaning 30 are left. These 30 bits > > could cache a "hash perturbation" computed from the string and the > > random bits: > > > > - hash() would use ob_shash > > - dict_lookup() would use ((ob_shash * 103) ^ (ob_sstate & ~3)) > > > > This way, you cache almost all computations, adding only a computation > > and a couple logical ops when looking up a string in a dict. > > That's a good idea. For Unicode, it might be best to add another slot > into the object, even though this increases the object size. > Wouldn't that break the ABI in 2.x? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Hashing proposal: change only string-only dicts
On Jan 18, 2012 12:55 PM, Martin v. Löwis wrote: > > Am 18.01.2012 17:01, schrieb PJ Eby: > > On Tue, Jan 17, 2012 at 7:58 PM, "Martin v. Löwis" > <mailto:mar...@v.loewis.de>> wrote: > > > > Am 17.01.2012 22:26, schrieb Antoine Pitrou: > > > Only 2 bits are used in ob_sstate, meaning 30 are left. These 30 bits > > > could cache a "hash perturbation" computed from the string and the > > > random bits: > > > > > > - hash() would use ob_shash > > > - dict_lookup() would use ((ob_shash * 103) ^ (ob_sstate & ~3)) > > > > > > This way, you cache almost all computations, adding only a computation > > > and a couple logical ops when looking up a string in a dict. > > > > That's a good idea. For Unicode, it might be best to add another slot > > into the object, even though this increases the object size. > > > > > > Wouldn't that break the ABI in 2.x? > > I was thinking about adding the field at the end, so I thought it > shouldn't. However, if somebody inherits from PyUnicodeObject, it still > might - so my new proposal is to add the extra hash into the str block, > either at str[-1], or after the terminating 0. This would cause an > average increase of four bytes of the storage (0 bytes in 50% of the > cases, 8 bytes because of padding in the other 50%). > > What do you think? So far it sounds like the very best solution of all, as far as backward compatibility is concerned. If the extra bits are only used when two strings have a matching hash value, the only doctests that could be affected are ones testing for this issue. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Packaging and setuptools compatibility
2012/1/24 Alexis Métaireau > Entrypoints basically are a plugin system. They are storing information in > the metadata and then retrieving them when needing them. The problem with > this, as everything when trying to get information from metadata is that we > need to parse all the metadata for all the installed distributions. (say > O(N)). > Note that this is why setuptools doesn't put entry points into PKG-INFO, but instead uses separate metadata files. Thus there is a lower "N" as well as smaller files to parse. ;-) Entrypoints are also only one type of extension metadata supported by setuptools; there is for example the EggTranslations system built on setuptools metadata system: it allows plugins to provide translations and localized resources for applications, and for other plugins in the same application. And it does this by using a different metadata file, again stored in the installed project's metadata. Since the new packaging metadata format is still a directory (replacing setuptools' EGG-INFO or .egg-info directories), it seems a reasonable migration path to simply install entry_points.txt and other metadata extensions to that same directory, and provide API to iterate over all the packages that offer a particular metadata file name. Entry points work this way now in setuptools, i.e. they iterate over all eggs containing entry_points metadata, then parse and cache the contents. An API for doing the same sort of thing here seems appropriate. This is still "meta" as Glyph suggests, and allows both setuptools-style entry point plugins, EggTranslations-style plugins, and whatever other sorts of plugin systems people would like. (I believe some other systems exist with this sort of metadata scheme; ISTM that Paster has a metadata format, but I don't know if it's exposed in egg-info metadata like this currently.) Anyway, if you offer an API for finding packages by metadata file (or even just a per-installed-package object API to query the existence of a metadata file), and for process-level caching of extended metadata for installed packages, that is sufficient for the above systems to work, without needing to bless any particular plugin API per se. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
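For reference, the setuptools version of that query API looks like this today (real pkg_resources calls; packaging would presumably grow something equivalent):

    import pkg_resources

    # Iterate over installed distributions that ship a given metadata file:
    for dist in pkg_resources.working_set:
        if dist.has_metadata('entry_points.txt'):
            print(dist.project_name)

    # Or go straight to the parsed registrations for one plugin group:
    for ep in pkg_resources.iter_entry_points('console_scripts'):
        print(ep.name, ep.module_name)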
Re: [Python-Dev] Store timestamps as decimal.Decimal objects
On Tue, Jan 31, 2012 at 7:35 PM, Nick Coghlan wrote: > Such a protocol can easily be extended to any other type - the time > module could provide conversion functions for integers and float > objects (meaning results may have lower precision than the underlying > system calls), while the existing "fromtimestamp" APIs in datetime can > be updated to accept the new optional arguments (and perhaps an > appropriate class method added to timedelta, too). A class method > could also be added to the decimal module to construct instances from > integer components (as shown above), since that method of construction > isn't actually specific to timestamps. > Why not just make it something like __fromfixed__() and make it a standard protocol, implemented on floats, ints, decimals, etc. Then the API is just "time.time(type)", where type is any object providing a __fromfixed__ method. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
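To be clear, __fromfixed__ is entirely hypothetical -- the sketch below just shows the shape of the idea, assuming the protocol hands the type an integer count of seconds plus a fraction expressed as numerator/denominator:

    import time
    from decimal import Decimal

    def get_time(result_type):
        """Sketch of a time.time(type) that defers construction to the type."""
        # A real implementation would take these straight from the OS clock;
        # going through float here just keeps the sketch short.
        seconds, nanos = divmod(int(time.time() * 10**9), 10**9)
        return result_type.__fromfixed__(seconds, nanos, 10**9)

    class DecimalTime(Decimal):
        @classmethod
        def __fromfixed__(cls, seconds, num, den):
            return Decimal(seconds) + Decimal(num) / Decimal(den)

    print(get_time(DecimalTime))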
Re: [Python-Dev] Store timestamps as decimal.Decimal objects
On Jan 31, 2012 11:08 PM, "Nick Coghlan" wrote: > PJE is quite right that using a new named protocol rather than a > callback with a particular signature could also work, but I don't see > a lot of advantages in doing so. The advantage is that it fits your brain better. That is, you don't have to remember another symbol besides the type you wanted. (There's probably fewer keystrokes involved, too.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon wrote: > So, if there is going to be some baseline performance target I need to hit > to make people happy I would prefer to know what that (real-world) > benchmark is and what the performance target is going to be on a non-debug > build. And if people are not worried about the performance then I'm happy > with that as well. =) > One thing I'm a bit worried about is repeated imports, especially ones that are inside frequently-called functions. In today's versions of Python, this is a performance win for "command-line tool platform" systems like Mercurial and PEAK, where you want to delay importing as long as possible, in case the code that needs the import is never called at all... but, if it *is* used, you may still need to use it a lot of times. When writing that kind of code, I usually just unconditionally import inside the function, because the C code check for an already-imported module is faster than the Python "if" statement I'd have to clutter up my otherwise-clean function with. So, in addition to the things other people have mentioned as performance targets, I'd like to keep the slowdown factor low for this type of scenario as well. Specifically, the slowdown shouldn't be so much as to motivate lazy importers like Mercurial and PEAK to need to rewrite in-function imports to do the already-imported check ourselves. ;-) (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import code, so I can't say for 100% sure if they'd be affected the same way.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
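Concretely, the pattern I'm talking about is just this -- the second form being the clutter I'd rather not be pushed into writing:

    import sys

    def show_help(text):
        # Today: rely on the C-level "already imported?" check being cheap.
        import textwrap
        return textwrap.fill(text)

    def show_help_paranoid(text):
        # What lazy-importing tools would have to do if __import__ got slow:
        textwrap = sys.modules.get('textwrap')
        if textwrap is None:
            import textwrap
        return textwrap.fill(text)

    print(show_help('some long help text ' * 10))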
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 5:24 PM, Brett Cannon wrote: > > On Tue, Feb 7, 2012 at 16:51, PJ Eby wrote: > >> On Tue, Feb 7, 2012 at 3:07 PM, Brett Cannon wrote: >> >>> So, if there is going to be some baseline performance target I need to >>> hit to make people happy I would prefer to know what that (real-world) >>> benchmark is and what the performance target is going to be on a non-debug >>> build. And if people are not worried about the performance then I'm happy >>> with that as well. =) >>> >> >> One thing I'm a bit worried about is repeated imports, especially ones >> that are inside frequently-called functions. In today's versions of >> Python, this is a performance win for "command-line tool platform" systems >> like Mercurial and PEAK, where you want to delay importing as long as >> possible, in case the code that needs the import is never called at all... >> but, if it *is* used, you may still need to use it a lot of times. >> >> When writing that kind of code, I usually just unconditionally import >> inside the function, because the C code check for an already-imported >> module is faster than the Python "if" statement I'd have to clutter up my >> otherwise-clean function with. >> >> So, in addition to the things other people have mentioned as performance >> targets, I'd like to keep the slowdown factor low for this type of scenario >> as well. Specifically, the slowdown shouldn't be so much as to motivate >> lazy importers like Mercurial and PEAK to need to rewrite in-function >> imports to do the already-imported check ourselves. ;-) >> >> (Disclaimer: I haven't actually seen Mercurial's delayed/dynamic import >> code, so I can't say for 100% sure if they'd be affected the same way.) >> > > IOW you want the sys.modules case fast, which I will never be able to > match compared to C code since that is pure execution with no I/O. > Couldn't you just prefix the __import__ function with something like this: ... try: module = sys.modules[name] except KeyError: # slow code path (Admittedly, the import lock is still a problem; initially I thought you could just skip it for this case, but the problem is that another thread could be in the middle of executing the module.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Tue, Feb 7, 2012 at 6:40 PM, Terry Reedy wrote:
> importlib could provide a parameterized decorator for functions that are
> the only consumers of an import. It could operate much like this:
>
> def imps(mod):
>     def makewrap(f):
>         def wrapped(*args, **kwds):
>             print('first/only call to wrapper')
>             g = globals()
>             g[mod] = __import__(mod)
>             g[f.__name__] = f
>             f(*args, **kwds)
>         wrapped.__name__ = f.__name__
>         return wrapped
>     return makewrap
>
> @imps('itertools')
> def ic():
>     print(itertools.count)
>
> ic()
> ic()
> #
> first/only call to wrapper

If I were going to rewrite code, I'd just use lazy imports (see http://pypi.python.org/pypi/Importing ). They're even faster than this approach (or using plain import statements), as they have zero per-call function call overhead. It's just that not everything I write can depend on Importing. Throw an equivalent into the stdlib, though, and I guess I wouldn't have to worry about dependencies...

(To be clearer: I'm talking about the http://peak.telecommunity.com/DevCenter/Importing#lazy-imports feature, which sticks a dummy module subclass instance into sys.modules, whose __getattribute__ does a reload() of the module, forcing the normal import process to run, after first changing the dummy object's type to something that doesn't have the __getattribute__ any more. This ensures that all accesses after the first one are at normal module attribute access speed. That, and the "whenImported" decorator from Importing would probably be of general stdlib usefulness too.)

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
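For the curious, that technique boils down to something like this Python 2 sketch (a much-simplified version of what Importing actually does -- the real thing also deals with re-entrancy, threading, and postponed callbacks):

    import sys
    from types import ModuleType

    class _LoadedModule(ModuleType):
        """Hook-free subclass we switch to once the module is really loaded."""

    class _LazyModule(ModuleType):
        def __getattribute__(self, name):
            # Switch to a class without this hook first, so the reload (and
            # every later attribute access) runs at normal module speed...
            self.__class__ = _LoadedModule
            # ...then force the normal import machinery to populate us.
            reload(self)                  # Python 2 builtin
            return getattr(self, name)

    def lazy_module(modname):
        if modname not in sys.modules:
            sys.modules[modname] = _LazyModule(modname)
        return sys.modules[modname]

    json = lazy_module('json')      # nothing has actually been imported yet
    print(json.dumps([1, 2, 3]))    # first attribute access triggers the import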
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Wed, Feb 8, 2012 at 4:08 PM, Brett Cannon wrote: > > On Wed, Feb 8, 2012 at 15:31, Terry Reedy wrote: > >> For top-level imports, unless *all* are made lazy, then there *must* be >> some indication in the code of whether to make it lazy or not. >> > > Not true; importlib would make it dead-simple to whitelist what modules to > make lazy (e.g. your app code lazy but all stdlib stuff not, etc.). > There's actually only a few things stopping all imports from being lazy. "from x import y" immediately de-lazies them, after all. ;-) The main two reasons you wouldn't want imports to *always* be lazy are: 1. Changing sys.path or other parameters between the import statement and the actual import 2. ImportErrors are likewise deferred until point-of-use, so conditional importing with try/except would break. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Feb 9, 2012 9:58 AM, "Brett Cannon" wrote: > This actually depends on the type of ImportError. My current solution actually would trigger an ImportError at the import statement if no finder could locate the module. But if some ImportError was raised because of some other issue during load then that would come up at first use. That's not really a lazy import then, or at least not as lazy as what Mercurial or PEAK use for general lazy importing. If you have a lot of them, that module-finding time really adds up. Again, the goal is fast startup of command-line tools that only use a small subset of the overall framework; doing disk access for lazy imports goes against that goal. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Thu, Feb 9, 2012 at 2:53 PM, Mike Meyer wrote: > For those of you not watching -ideas, or ignoring the "Python TIOBE > -3%" discussion, this would seem to be relevant to any discussion of > reworking the import mechanism: > > http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html > > Interesting. This gives me an idea for a way to cut stat calls per sys.path entry per import by roughly 4x, at the cost of a one-time directory read per sys.path entry. That is, an importer created for a particular directory could, upon first use, cache a frozenset(listdir()), and the stat().st_mtime of the directory. All the filename checks could then be performed against the frozenset, and the st_mtime of the directory only checked once per import, to verify whether the frozenset() needed refreshing. Since a failed module lookup takes at least 5 stat checks (pyc, pyo, py, directory, and compiled extension (pyd/so)), this cuts it down to only 1, at the price of a listdir(). The big question is how long does a listdir() take, compared to a stat() or failed open()? That would tell us whether the tradeoff is worth making. I did some crude timeit tests on frozenset(listdir()) and trapping failed stat calls. It looks like, for a Windows directory the size of the 2.7 stdlib, you need about four *failed* import attempts to overcome the initial caching cost, or about 8 successful bytecode imports. (For Linux, you might need to double these numbers; my tests showed a different ratio there, perhaps due to the Linux stdib I tested having nearly twice as many directory entries as the directory I tested on Windows!) However, the numbers are much better for application directories than for the stdlib, since they are located earlier on sys.path. Every successful stdlib import in an application is equal to one failed import attempt for every preceding directory on sys.path, so as long as the average directory on sys.path isn't vastly larger than the stdlib, and the average application imports at least four modules from the stdlib (on Windows, or 8 on Linux), there would be a net performance gain for the application as a whole. (That is, there'd be an improved per-sys.path entry import time for stdlib modules, even if not for any application modules.) For smaller directories, the tradeoff actually gets better. A directory one seventh the size of the 2.7 Windows stdlib has a listdir() that's proportionately faster, but failed stats() in that directory are *not* proportionately faster; they're only somewhat faster. This means that it takes fewer failed module lookups to make caching a win - about 2 in this case, vs. 4 for the stdlib. Now, these numbers are with actual disk or network access abstracted away, because the data's in the operating system cache when I run the tests. It's possible that this strategy could backfire if you used, say, an NFS directory with ten thousand files in it as your first sys.path entry. Without knowing the timings for listdir/stat/failed stat in that setup, it's hard to say how many stdlib imports you need before you come out ahead. When I tried a directory about 7 times larger than the stdlib, creating the frozenset took 10 times as long, but the cost of a failed stat didn't go up by very much. This suggests that there's probably an optimal directory size cutoff for this trick; if only there were some way to check the size of a directory without reading it, we could turn off the caching for oversize directories, and get a major speed boost for everything else. 
On most platforms, the stat().st_size of the directory itself will give you some idea, but on Windows that's always zero. On Windows, we could work around that by using a lower-level API than listdir() and simply stop reading the directory if we hit the maximum number of entries we're willing to build a cache for, and then call it off. (Another possibility would be to explicitly enable caching by putting a flag file in the directory, or perhaps by putting a special prefix on the sys.path entry, setting the cutoff in an environment variable, etc.) In any case, this seems really worth a closer look: in non-pathological cases, it could make directory-based importing as fast as zip imports are. I'd be especially interested in knowing how the listdir/stat/failed stat ratios work on NFS - ISTM that they might be even *more* conducive to this approach, if setup latency dominates the cost of individual system calls. If this works out, it'd be a good example of why importlib is a good idea; i.e., allowing us to play with ideas like this. Brett, wouldn't you love to be able to say importlib is *faster* than the old C-based importing? ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
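A rough sketch of the caching helper described above (just the cache itself; wiring it into a real finder is the part that would live in importlib):

    import os

    class DirCache(object):
        """One listdir() per directory, refreshed when the directory mtime changes."""

        SUFFIXES = ('.py', '.pyc', '.pyo', '.pyd', '.so')

        def __init__(self, path):
            self.path = path
            self._mtime = -1.0
            self._names = frozenset()

        def _refresh(self):
            mtime = os.stat(self.path).st_mtime
            if mtime != self._mtime:
                self._names = frozenset(os.listdir(self.path))
                self._mtime = mtime

        def may_contain(self, modname):
            """True if `modname` could live here; False means this directory
            can be skipped with no per-file stat() or open() calls at all."""
            self._refresh()
            name = modname.rsplit('.', 1)[-1]
            if name in self._names:                  # possible package dir
                return True
            return any(name + sfx in self._names for sfx in self.SUFFIXES)

    cache = DirCache('.')
    print(cache.may_contain('setup'))         # True if ./setup.py (etc.) exists
    print(cache.may_contain('no_such_mod'))   # False -- one cached set lookup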
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Thu, Feb 9, 2012 at 5:34 PM, Robert Kern wrote: > On 2/9/12 10:15 PM, Antoine Pitrou wrote: > >> On Thu, 9 Feb 2012 17:00:04 -0500 >> PJ Eby wrote: >> >>> On Thu, Feb 9, 2012 at 2:53 PM, Mike Meyer wrote: >>> >>> For those of you not watching -ideas, or ignoring the "Python TIOBE >>>> -3%" discussion, this would seem to be relevant to any discussion of >>>> reworking the import mechanism: >>>> >>>> http://mail.scipy.org/**pipermail/numpy-discussion/** >>>> 2012-January/059801.html<http://mail.scipy.org/pipermail/numpy-discussion/2012-January/059801.html> >>>> >>>> Interesting. This gives me an idea for a way to cut stat calls per >>>> >>> sys.path entry per import by roughly 4x, at the cost of a one-time >>> directory read per sys.path entry. >>> >> >> Why do you even think this is a problem with "stat calls"? >> > > All he said is that reading about that problem and its solution gave him > an idea about dealing with stat call overhead. The cost of stat calls has > demonstrated itself to be a significant problem in other, more typical > contexts. Right. It was the part of the post that mentioned that all they sped up was knowing which directory the files were in, not the actual loading of bytecode. The thought then occurred to me that this could perhaps be applied to normal importing, as a zipimport-style speedup. (The zipimport module caches each zipfile directory it finds on sys.path, so failed import lookups are extremely fast.) It occurs to me, too, that applying the caching trick to *only* the stdlib directories would still be a win as soon as you have between four and eight site-packages (or user specific site-packages) imports in an application, so it might be worth applying unconditionally to system-defined stdlib (non-site) directories. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Fri, Feb 10, 2012 at 1:05 PM, Brett Cannon wrote: > > > On Thu, Feb 9, 2012 at 17:00, PJ Eby wrote: > >> I did some crude timeit tests on frozenset(listdir()) and trapping failed >> stat calls. It looks like, for a Windows directory the size of the 2.7 >> stdlib, you need about four *failed* import attempts to overcome the >> initial caching cost, or about 8 successful bytecode imports. (For Linux, >> you might need to double these numbers; my tests showed a different ratio >> there, perhaps due to the Linux stdib I tested having nearly twice as many >> directory entries as the directory I tested on Windows!) >> > >> However, the numbers are much better for application directories than for >> the stdlib, since they are located earlier on sys.path. Every successful >> stdlib import in an application is equal to one failed import attempt for >> every preceding directory on sys.path, so as long as the average directory >> on sys.path isn't vastly larger than the stdlib, and the average >> application imports at least four modules from the stdlib (on Windows, or 8 >> on Linux), there would be a net performance gain for the application as a >> whole. (That is, there'd be an improved per-sys.path entry import time for >> stdlib modules, even if not for any application modules.) >> > > Does this comment take into account the number of modules required to load > the interpreter to begin with? That's already like 48 modules loaded by > Python 3.2 as it is. > I didn't count those, no. So, if they're loaded from disk *after* importlib is initialized, then they should pay off the cost of caching even fairly large directories that appear earlier on sys.path than the stdlib. We still need to know about NFS and other ratios, though... I still worry that people with more extreme directory sizes or slow-access situations will run into even worse trouble than they have now. > First is that if this were used on Windows or OS X (i.e. the OSs we > support that typically have case-insensitive filesystems), then this > approach would be a massive gain as we already call os.listdir() when > PYTHONCASEOK isn't defined to check case-sensitivity; take your 5 stat > calls and add in 5 listdir() calls and that's what you get on Windows and > OS X right now. Linux doesn't have this check so you would still be > potentially paying a penalty there. > Wow. That means it'd always be a win for pre-stdlib sys.path entries, because any successful stdlib import equals a failed pre-stdlib lookup. (Of course, that's just saving some of the overhead that's been *added* by importlib, not a new gain, but still...) Second is variance in filesystems. Are we guaranteed that the stat of a > directory is updated before a file change is made? > Not quite sure what you mean here. The directory stat is used to ensure that new files haven't been added, old ones removed, or existing ones renamed. Changes to the files themselves shouldn't factor in, should they? > Else there is a small race condition there which would suck. We also have > the issue of granularity; Antoine has already had to add the source file > size to .pyc files in Python 3.3 to combat crappy mtime granularity when > generating bytecode. If we get file mod -> import -> file mod -> import, > are we guaranteed that the second import will know there was a modification > if the first three steps occur fast enough to fit within the granularity of > an mtime value? > Again, I'm not sure how this relates. 
Automatic code reloaders monitor individual files that have been previously imported, so the directory timestamps aren't relevant. Of course, I could be confused here. Are you saying that if somebody makes a new .py file and saves it, that it'll be possible to import it before it's finished being written? If so, that could happen already, and again caching the directory doesn't make any difference. Alternately, you could have a situation where the file is deleted after we load the listdir(), but in that case the open will fail and we can fall back... heck, we can even force resetting the cache in that event. I was going to say something about __pycache__, but it actually doesn't > affect this. Since you would have to stat the directory anyway, you might > as well just stat directory for the file you want to keep it simple. Only > if you consider __pycache__ to be immutable except for what the interpreter > puts in that directory during execution could you optimize that step (in > which case you can stat the directory once and never care again as the set > would be just updated by import whenever a new .pyc file was written). > > Having said all
Re: [Python-Dev] requirements for moving __import__ over to importlib?
On Feb 10, 2012 3:38 PM, "Brett Cannon" wrote: > On Fri, Feb 10, 2012 at 15:07, PJ Eby wrote: >> On Fri, Feb 10, 2012 at 1:05 PM, Brett Cannon wrote: >>> First is that if this were used on Windows or OS X (i.e. the OSs we support that typically have case-insensitive filesystems), then this approach would be a massive gain as we already call os.listdir() when PYTHONCASEOK isn't defined to check case-sensitivity; take your 5 stat calls and add in 5 listdir() calls and that's what you get on Windows and OS X right now. Linux doesn't have this check so you would still be potentially paying a penalty there. >> >> >> Wow. That means it'd always be a win for pre-stdlib sys.path entries, because any successful stdlib import equals a failed pre-stdlib lookup. (Of course, that's just saving some of the overhead that's been *added* by importlib, not a new gain, but still...) > > > How so? import.c does a listdir() as well (this is not special to importlib). IIRC, it does a FindFirstFile on Windows, which is not the same thing. That's one system call into a preallocated buffer, not a series of system calls and creation of Python string objects. > Don't care about automatic reloaders. I'm just asking about the case where the mtime granularity is coarse enough to allow for a directory change, an import to execute, and then another directory change to occur all within a single mtime increment. That would lead to the set cache to be out of date. Ah. Good point. Well, if there's any way to know what the mtime granularity is, we can avoid the race condition by never performing the listdir when the current clock time is too close to the stat(). In effect, we can bypass the optimization if the directory was just modified. Something like:

    mtime = stat(dir).st_mtime
    if abs(time.time()-mtime)>unsafe_window:
        old_mtime, files = cache.get(dir, (-1, ()))
        if mtime!=old_mtime:
            files = frozenset(listdir(dir))
            cache[dir] = mtime, files
        # code to check for possibility of importing
        # and shortcut if found, or
        # exit with failure if no matching files
    # fallthrough to direct filesystem checking

The "unsafe window" is presumably filesystem and platform dependent, but ISTR that even FAT filesystems have 2-second accuracy. The other catch is the relationship between st_mtime and time.time(); I assume they'd be the same in any sane system, but what if you're working across a network and there's clock skew? Ugh. Worst case example would be say, accessing a FAT device that's been shared over a Windows network from a machine whose clock is several hours off. So it always looks safe to read, even if it's just been changed. What's the downside in that case? You're trying to import something that just changed in the last fraction of a second... why? I mean, sure, the directory listing will be wrong, no question. But it only matters that it was wrong if you added, removed, or renamed importable files. Why are you trying to import one of them? Ah, here's a use case: you're starting up IDLE, and while it's loading, you save some .py files you plan to import later. Your editor saves them all at once, but IDLE does the listdir() midway through. You then do an import from the IDLE prompt, and it fails because the listdir() didn't catch everything. Okay, now I know how to fix this. The problem isn't that there's a race condition per se, the problem is that the race results in a broken cache later. After all, it could just as easily have been the case that the import failed due to timing.
The problem is that all *future* imports would fail in this circumstance. So the fix is a time-to-live recheck: if TTL seconds have passed since the last use of the cached frozenset, reload it, and reset the TTL to infinity. In other words:

    mtime = stat(dir).st_mtime
    now = time.time()
    if abs(now-mtime)>unsafe_window:
        old_mtime, then, files = cache.get(dir, (-1, now, ()))
        if mtime!=old_mtime or then is not None and now-then>TTL:
            files = frozenset(listdir(dir))
            cache[dir] = mtime, now if mtime!=old_mtime else None, files
        # code to check for possibility of importing
        # and shortcut if found, or
        # exit with failure if no matching files
    # fallthrough to direct filesystem checking

What this does (or should do) is handle stale caches caused by the clock-skew race condition, by reloading the listdir even if mtime hasn't changed, as soon as TTL seconds have passed since the last snapshot was taken. However, if the mtime stays the same, no subsequent listdirs will occur. As long as the TTL is set high enough that a full startup of Python can occur, but low enough that it resets by the time a hum
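For reference, a self-contained version of that sketch might look like the following; the cache layout, the constants, and the return-None-to-fall-back convention are illustrative assumptions rather than code from importlib:

    import os, time

    UNSAFE_WINDOW = 2.0   # assumed worst-case mtime granularity / clock-skew margin
    TTL = 5.0             # how long to keep trusting a snapshot taken near a change
    _cache = {}           # dir -> (mtime, snapshot_time_or_None, frozenset of names)

    def cached_names(dir):
        """Return a frozenset of names in `dir`, or None to force direct checks."""
        mtime = os.stat(dir).st_mtime
        now = time.time()
        if abs(now - mtime) <= UNSAFE_WINDOW:
            return None   # directory was just modified; fall through to the filesystem
        old_mtime, then, files = _cache.get(dir, (-1, now, frozenset()))
        if mtime != old_mtime or (then is not None and now - then > TTL):
            files = frozenset(os.listdir(dir))
            # restart the TTL countdown only if the directory actually changed
            _cache[dir] = (mtime, now if mtime != old_mtime else None, files)
        return files

A finder would then test candidate filenames against the returned set, and only fall back to individual stat()/open() calls when it gets None.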
Re: [Python-Dev] [Python-checkins] cpython: Issue #14043: Speed up importlib's _FileFinder by at least 8x, and add a new
On Mon, Feb 20, 2012 at 1:20 PM, Brett Cannon wrote: > On Sun, Feb 19, 2012 at 22:15, Nick Coghlan wrote: > >> However, "very cool" on adding the caching in the default importers :) > > > Thanks to PJE for bringing the idea up again and Antoine discovering the > approach *independently* from PJE and myself and actually writing the code. > Where is the code, btw? (I looked at your sandbox and didn't see it.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 414
On Sat, Mar 3, 2012 at 5:02 AM, Lennart Regebro wrote: > I'm not sure that's true at all. In most cases where you support both > Python 2 and Python 3, most strings will be "native", ie, without > prefix in either Python 2 or Python 3. The native case is the most > common case. > Exactly. The reason "native strings" even exist as a concept in WSGI was to make it so that the idiomatic manipulation of header data in both Python 2 and 3 would use plain old string constants with no special wrappers or markings. What's thrown the monkey wrench in here for the WSGI case is the use of unicode_literals. If you simply skip using unicode_literals for WSGI code, you should be fine with a single 2/3 codebase. But then you need some way to mark some things as unicode... which is how we end up back at this PEP. I suppose WSGI could have gone the route of using byte strings for headers instead, but I'm not sure it would have helped. The design goals for the PEP were to sanely support both 2to3 and 2+3 single codebases, and WSGI does actually do that... for the code that's actually doing WSGI stuff. Ironically enough, the effect of the WSGI API is that it's all the *non* WSGI-specific code in the same module that ends up needing to mark its strings as unicode... or else it has to use unicode_literals and mark all the WSGI code with str(). There's really no good way to deal with a *mixed* WSGI/non-WSGI module, except to use explicit markers on one side or the other. Perhaps the simplest solution of all might be to just isolate direct WSGI code in modules that don't import unicode_literals. Web frameworks usually hide WSGI stuff away from the user anyway, and many are already natively unicode in their app-facing APIs. So, if a framework or library encapsulates WSGI in a str-safe/unicode-friendly API, this really shouldn't be an issue for the library's users. But I suppose somebody's got to port the libraries first. ;-) If anyone's updating porting strategy stuff, a mention of this in the tips regarding unicode_literals would be a good idea. i.e., something like: "If you have 2.x modules which work with WSGI and also contain explicit u'' strings, you should not use unicode_literals unless you are willing to explicitly mark all WSGI environment and header strings as native strings using 'str()'. This is necessary because WSGI headers and environment keys/values are defined as byte strings in Python 2.x, and unicode strings in 3.x. Alternatively, you may continue to use u'' strings if you are targeting Python 3.3+ only, or can use the import or install hooks provided for Python 3.2, or if you are using 2to3... but in this case you should not use unicode_literals." That could probably be written a lot more clearly. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
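To make the "native string" point concrete, here is a sketch of the idiom (not code from any particular framework):

    # Without unicode_literals, these bare literals are "native strings":
    # byte strings on Python 2, unicode strings on Python 3 -- which is
    # exactly what WSGI wants for the status line and the header names/values.

    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'Hello, world!']   # the response body, by contrast, is always bytes

    # With `from __future__ import unicode_literals` in effect, the header
    # constants above would become unicode on Python 2, so each one would need
    # an explicit str() wrapper to stay native, e.g.:
    #     start_response(str('200 OK'), [(str('Content-Type'), str('text/plain'))])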
Re: [Python-Dev] Non-string keys in type dict
On Wed, Mar 7, 2012 at 8:39 PM, Victor Stinner wrote: > So my question is: what is the use case of such dict? Well, I use them for this: http://pypi.python.org/pypi/AddOns (And I have various other libraries that depend on that library.) Short version: AddOns are things you can use to dynamically extend instances -- a bit like the "decorator" in "decorator pattern" (not to be confused with Python decorators). Rather than synthesize a unique string as a dictionary key, I just used the AddOn classes themselves as keys. This works fine for object instances, but gets hairy once classes come into play. ( http://pypi.python.org/pypi/AddOns#class-add-ons - an orthogonal alternative to writing hairy metaclasses with registries for special methods, persisted attributes, and all other sorts of things one would ordinarily use metaclasses for.) In principle, I could refactor AddOns to use synthetic (i.e. made-up) strings as keys, but it honestly seemed unpythonic to me to make up a key when the One Obvious key to use is the AddOn type itself. (Or in some cases, a tuple comprised of an AddOn type plus additional values - which would mean string manipulation for every access.) Another possible solution would be to not store addons directly in a class' dictionary, but instead throw in an __addons__ key with a subdictionary; again this seemed like pointless indirection, wasted memory and access time when there's already a perfectly good dictionary lying about. IOW, it's one of those places where Python's simple orthogonality seems like a feature rather than a bug that needs fixing. I mean, next thing you know, people will be saying that *instance* dictionaries need to have only string keys or something. ;-) Of course, if my library has to change to be able to work on 3.3, then I guess it'll have to change. IIRC, this is *probably* the only place I'm using non-string keys in type or instance dictionaries, so in the big scheme of porting costs, it's not that much. But, since you asked, that's the main use case I know of for non-string keys in type dictionaries, and I wouldn't be terribly surprised if I'm the only person with public code that does this. ;-) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
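A stripped-down sketch of the pattern in question (the names here are invented for illustration; the real AddOns API is more involved than this):

    class AddOnExample(object):
        """Attach at most one instance of this add-on to any given object."""

        @classmethod
        def for_object(cls, obj):
            d = obj.__dict__
            try:
                return d[cls]                 # the add-on class itself is the key
            except KeyError:
                addon = d[cls] = cls(obj)     # no synthetic string key needed
                return addon

        def __init__(self, subject):
            self.subject = subject

    class Thing(object):
        pass

    t = Thing()
    assert AddOnExample.for_object(t) is AddOnExample.for_object(t)

Doing the same thing with a class __dict__ as the target is the part that a string-keys-only restriction would break.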
Re: [Python-Dev] Non-string keys in type dict
On Thu, Mar 8, 2012 at 2:43 AM, Ethan Furman wrote: > > PJ Eby wrote: >> >> Short version: AddOns are things you can use to dynamically extend instances -- a bit like the "decorator" in "decorator pattern" (not to be confused with Python decorators). Rather than synthesize a unique string as a dictionary key, I just used the AddOn classes themselves as keys. This works fine for object instances, but gets hairy once classes come into play. > > > Are you able to modify classes after class creation in Python 3? Without using a metaclass? For ClassAddOns, it really doesn't matter; you can't remove them from the class they attach to. AddOns created after the class is finalized use a weakref dictionary to attach to their classes. Now that I've gone back and looked at the code, the only reason that ClassAddOns even use the class __dict__ in the first place is because it's a convenient place to put them while the class is being built. With only slightly hairier code, I could use an __addons__ dict in the class namespace while it's being built, but there'll then be a performance hit at lookup time to do cls.__dict__['__addons__'][key] instead of cls.__dict__[key]. Actually, now that I'm thinking about it, the non-modifiability of class dictionaries is actually a feature for this use case: if I make an __addons__ dict, that dict is mutable. That means I'll have to move to string keys or have some sort of immutable dict type available... ;-) (Either that, or do some other, more complex refactoring.) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Zipping the standard library.
On Sat, Mar 10, 2012 at 5:49 PM, Thomas Wouters wrote: > (And, yes, I'm zipping up the stdlib for Python 2.7 at Google, to reduce > the impact on the aforementioned million of machines :) > You might want to consider instead backporting the importlib caching facility, since it provides some of the zipimport benefits for plain old, non-zipped modules. Actually, a caching-only import hook that operated that way wouldn't even need the whole of importlib, just a wrapper over the standard C import that skips the unnecessary filesystem accesses. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
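A rough sketch of what such a backported, caching-only hook could look like under Python 2 (the class name and refresh policy are invented for illustration; a real version would need the mtime/TTL safeguards discussed earlier in this thread):

    import imp, os, sys

    class CachedPathFinder(object):
        """Skip the per-candidate stat()/open() calls for obvious misses."""

        def __init__(self, path):
            if not os.path.isdir(path):
                raise ImportError('only plain directories are handled here')
            self.path = path
            self.suffixes = tuple(s[0] for s in imp.get_suffixes())  # '.py', '.pyc', '.so', ...
            self.names = frozenset(os.listdir(path))

        def find_module(self, fullname, path=None):
            name = fullname.rsplit('.', 1)[-1]
            if name not in self.names and not any(
                    name + sfx in self.names for sfx in self.suffixes):
                return None   # definite miss: no filesystem access at all
            return self       # plausible hit: let imp do the real work in load_module()

        def load_module(self, fullname):
            if fullname in sys.modules:
                return sys.modules[fullname]
            name = fullname.rsplit('.', 1)[-1]
            f, pathname, desc = imp.find_module(name, [self.path])
            try:
                return imp.load_module(fullname, f, pathname, desc)
            finally:
                if f:
                    f.close()

    sys.path_hooks.append(CachedPathFinder)
    sys.path_importer_cache.clear()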
[Python-Dev] Fwd: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)?
Ugh; this was supposed to be sent to the list, not just Guido. (I wish Gmail defaulted to reply-all in the edit box.) -- Forwarded message -- From: PJ Eby Date: Mon, Mar 12, 2012 at 12:16 AM Subject: Re: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)? To: Guido van Rossum On Sun, Mar 11, 2012 at 10:39 PM, Guido van Rossum wrote: > I'm leaning towards PEP 402 or some variant. Let's have a pow-wow at > the sprint tomorrow (I'll arrive in Santa Clara between 10 and 10:30). > I do want to understand Nick's argument better; I haven't studied PEP > 395 yet. > Note that PEP 395 can stay compatible with PEP 402 by a fairly straightforward change: instead of implicitly and automagically guessing the needed sys.path[0] change, it could be made explicit by adding something like this to the top of script/modules that are inside a package:

    import pkgutil
    pkgutil.script_module(__name__, 'mypackage.thismodule')

Assuming __name__=='__main__', the API would set __main__.__qualname__, set sys.modules[qualname] = __main__, and fix up sys.path[0] if and only if it still is the parent directory of __main__.__file__. (If __name__!='__main__' and it's not equal to the second argument either, it'd be an error.) Then, in the event of broken relative imports or module aliasing, the error message can suggest adding a script_module() declaration to explicitly make the file a "dual citizen" -- i.e., script/module. (It's already possible for PEP 395 to be confused by stray __init__.py files or __path__ manipulation; using error messages and explicit declaration instead of guessing seems like a better route for 395 to take.) Of course, it's also possible to fix the 395/402 incompatibility by reintroducing some sort of marker, such as .pyp directory extensions or by including *.pyp marker files within package directories. The problem is that these markers work against the intuitive nature of PEP 402 if they are required, and they do not help 395 if nobody uses them due to their optionality. ;-) (Last, but not least, the compromise approach: allow explicit script/module declaration as a workaround for virtual packages, AND support automagic __qualname__ recognition for self-contained packages... but still give error messages for broken relative imports and aliasing that suggest the explicit declaration.) Anyway, the other open issues for 402 are:

* Dealing with updates to sys.path
* Iterating available virtual packages

There was a Python-Dev discussion about the first, in which I realized that sys.path updates can actually be handled transparently by making virtual __path__ objects be special iterables rather than lists; but the PEP hasn't been updated to reflect that. (I was actually waiting for some sign of BDFL interest before adding a potential complication like that to the PEP.) The relevant proposal was: > This seems to lean in favor of making a simple reiterable wrapper > type for the __path__, that only allows you to take the length and > iterate over it. With an appropriate design, it could actually > update itself automatically, given a subname and a parent > __path__/sys.path. That is, it could keep a tuple copy of the > last-seen parent path, and before iteration, compare > tuple(self.parent_path) to self.last_seen_path. If they're > different, it rebuilds the value to be iterated over. 
> Voila: transparent updating of all virtual __path__ values from > sys.path changes (or modifications to self-contained __path__ > parents, btw), and trying to change it (or read an item from it > positionally) will not create any silent failures. > Alright... *if* we support automatic updates to virtual __paths__, > this is probably how we should do it. (It will require, though, that > imp.find_module be changed to use a different iteration method than > PyList_GetItem, as it's quite possible a virtual __path__ will get > passed into it.) I actually drafted an implementation of this to work with importlib, so it seems pretty feasible to support automatically-updated virtual paths that change on the next import attempt if sys.path (or any parent __path__) has changed since the last time. Iterating virtual packages is a somewhat harder problem, since it's not really practical to do an unbounded subdirectory search for importable files. Probably, the pkgutil module-walking APIs just need to grow some extra flags for virtual package searching, with some reasonable defaults. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
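A minimal sketch of the "reiterable" __path__ wrapper described in that quote (the class name and the way entries are computed for a virtual package are my guesses at the intended semantics):

    import os

    class VirtualPath(object):
        """A __path__ replacement that rebuilds itself when its parent path changes."""

        def __init__(self, subname, parent_path):
            self.subname = subname           # e.g. 'zc' for a virtual 'zc' package
            self.parent_path = parent_path   # sys.path, or the parent package __path__
            self.last_seen = None
            self.entries = []

        def _refresh(self):
            snapshot = tuple(self.parent_path)
            if snapshot != self.last_seen:
                self.last_seen = snapshot
                self.entries = [os.path.join(entry, self.subname)
                                for entry in snapshot
                                if os.path.isdir(os.path.join(entry, self.subname))]

        def __iter__(self):
            self._refresh()
            return iter(self.entries)

        def __len__(self):
            self._refresh()
            return len(self.entries)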
Re: [Python-Dev] SocketServer issues
On Wed, Mar 14, 2012 at 5:02 AM, Antoine Pitrou wrote: > On Wed, 14 Mar 2012 04:26:16 + > Kristján Valur Jónsson wrote: > > Hi there. > > I want to mention some issues I've had with the socketserver module, and > discuss if there's a way to make it nicer. > > So, for a long time we were able to create magic stackless mixin classes > for > > it, like ThreadingMixIn, and assuming we had the appropriate socket > > replacement library, be able to use it nicely using tasklets. > > I don't really think the ability to "create magic stackless mixin > classes" should be a driving principle for the stdlib. > But not needlessly duplicating functionality already elsewhere in the stdlib probably ought to be. ;-) > So, my first question is: Why not simply rely on the already built-in > timeout > > support in the socket module? > > In case you didn't notice, the built-in timeout support *also* uses > select(). > That's not really the point; the frameworks that implement nonblocking I/O by replacing the socket module (and Stackless is only one of many) won't be using that code. If SocketServer uses only the socket module's API, then those frameworks will be told about the timeout via the socket API, and can then implement it their own way. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
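As a sketch of what "relying on the socket module's timeout support" could look like (the mixin and attribute names are invented, and this is not what the stdlib currently does):

    import SocketServer   # 'socketserver' on Python 3

    class SocketTimeoutMixIn(object):
        """Express per-connection timeouts via the socket API instead of select()."""
        request_timeout = None   # seconds, or None for no timeout

        def finish_request(self, request, client_address):
            if self.request_timeout is not None:
                request.settimeout(self.request_timeout)   # socket-level, replaceable
            self.RequestHandlerClass(request, client_address, self)

    class MyServer(SocketTimeoutMixIn, SocketServer.ThreadingTCPServer):
        request_timeout = 30

A socket-replacement framework would then see the timeout through its own settimeout() implementation, without SocketServer ever having to call select() itself.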
Re: [Python-Dev] SocketServer issues
On Wed, Mar 14, 2012 at 12:29 PM, Antoine Pitrou wrote: > On Wed, 14 Mar 2012 12:17:06 -0400 > PJ Eby wrote: > > That's not really the point; the frameworks that implement nonblocking > I/O > > by replacing the socket module (and Stackless is only one of many) won't > be > > using that code. > > Then they should also replace the select module. > That actually sounds like a good point. ;-) I'm not the maintainer of any of those frameworks, but IIRC some of them *do* replace it. Perhaps this would solve Stackless's problem here too? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] svn.python.org and buildbots down
On Mar 19, 2012 1:20 PM, "Ned Deily" wrote: > > In article <20120319142539.7e83c...@pitrou.net>, > Antoine Pitrou wrote: > > [...] As for svn.python.org, is anyone > > using it? > > The repo for the website (www.python.org) is maintained there. It's also still setuptools' official home, though I've been doing some work recently on migrating it to hg. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com