Re: [Python-Dev] int(string)

2005-10-22 Thread Michael Hudson
Tim Peters <[EMAIL PROTECTED]> writes:

> Turns out it's _not_ input speed that's the problem here, and not even
> mainly the speed of integer mod:  the bulk of the time is spent in
> int(string) (and, yes, that's also far more important to the problem
> Neal was looking at than list.append time). If you can even track all
> the levels of C function calls that ends up invoking <wink>, you find
> yourself in PyOS_strtoul(), which is a nifty all-purpose routine that
> accepts inputs in bases 2 thru 36, can auto-detect base, and does
> platform-independent overflow checking at the cost of a division per
> digit.  All those are features, but it makes for slow conversion.

> I assume it's the overflow-checking that's the major time sink, and
> it's not correct anyway:  it does the check slightly differently for
> base 10 than for any other base, explained only in the checkin comment
> for rev 2.13, 8 years ago:
>
> For base 10, cast unsigned long to long before testing overflow.
> This prevents 4294967296 from being an acceptable way to spell zero!
>
> So what are the odds that base 10 was the _only_ base that had a "bad
> input" case for the overflow-check method used?  If you thought
> "slim", you were right ;-)  Here are other bad cases, under all Python
> versions to date (on a 32-bit box; if sizeof(long) == 8, there are
> different bad cases):
>
> int('102002022201221111211', 3) = 0
[...]

Eek!

> Now fixing that is easy:  the problem comes from being too clever,

Surprise!

> doing both a multiply and an addition before checking for overflow. 
> Check each operation on its own and it would be bulletproof, without
> special-casing.  But that might be even slower (it would remove the
> branch special-casing 10, but add a cheap integer addition overflow
> check with its own branch).
>
> The challenge (should you decide to accept it <wink>) is to replace
> the overflow-checking with something both correct _and_ much faster
> than doing n integer divisions for an n-character input.  For example,
> 36**6 < 2**32-1, so whenever the input has no more than 6 digits
> overflow is impossible regardless of base and regardless of platform. 
> That's simple and exploitable.  For extra credit, make int(string) go
> faster than preparing your taxes ;-)

So, you're suggesting dividing the input up into known non-overflowing
chunks and using the normal Python operations to combine those chunks,
relying on them overflowing to longs as needed?  All of the examples
you posted should have returned longs anyway, right?

I guess the change to automatically overflowing to longs has led to
some code that shows its history more than one would like.

Cheers,
mwh

-- 
  I think if we have the choice, I'd rather we didn't explicitly put
  flaws in the reST syntax for the sole purpose of not insulting the
  almighty.-- /will on the doc-sig
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] DRAFT: python-dev Summary for 2005-09-01 through 2005-09-16

2005-10-22 Thread Nick Coghlan
Guido van Rossum wrote:
> On 10/21/05, Tony Meyer <[EMAIL PROTECTED]> wrote:
>> This is over a month late, sorry, but here it is (Steve did his
>> threads ages ago; I've fallen really behind).
> 
> Better late than never! These summaries are awesome.

I certainly find them to be a very useful reminder of list threads that got 
overwhelmed by other discussions.

I'm still trying to close out the naming issues for PEP 343, but I hope to get 
back to the "Template.format" method idea eventually (along with an idea 
inspired by the discussion of the module level functions in the 're' module - 
how about providing similar module level functions in the string module that 
correspond to the methods of Template objects?).

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com


[Python-Dev] Comparing date+time w/ just time

2005-10-22 Thread skip
With significant input from Fred I made some changes to xmlrpclib a couple
months ago to better integrate datetime objects into xmlrpclib.  That raised
some problems because I neglected to add support for comparing datetime
objects with xmlrpclib.DateTime objects.  (The problem showed up in
MoinMoin.)  I've been working on that recently (adding rich comparison
methods to DateTime while retaining __cmp__ for backward compatibility), and
have second thoughts about one of the original changes.

I tried to support datetime, date and time objects.  My problems are with
support for time objects.  Marshalling datetimes as xmlrpclib.DateTime
objects is no problem (though you lose fractions of a second).  Marshalling
dates is reasonable if you treat the time as 00:00:00.  I decided to marshal
datetime.time objects by fixing the day portion of the xmlrpclib.DateTime
object as today's date.  That's the suspect part.

When I went back recently to add better comparison support, I decided to
compare xmlrpclib.DateTime objects with time objects by simply comparing the
HH:MM:SS part of the DateTime with the time object.  That's making me a bit
queazy now.  datetime.time(hour=23) would compare equal to any DateTime with
its time equal to 11PM.  Under the rule, "in the face of ambiguity, refuse
the temptation to guess", I'm inclined to dump support for marshalling and
comparison of time objects altogether.  Do others agree that was a bad idea?
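
The ambiguity can be illustrated with the standard datetime module alone. This is a minimal sketch in modern Python 3; `same_clock_time` is a hypothetical helper standing in for the HH:MM:SS comparison described above, not the xmlrpclib code itself.

```python
from datetime import datetime, time

eleven_pm = time(hour=23)
this_year = datetime(2005, 10, 22, 23, 0, 0)
long_ago = datetime(1999, 1, 1, 23, 0, 0)

def same_clock_time(stamp, t):
    """Compare only the HH:MM:SS part, ignoring the date entirely."""
    return stamp.time().replace(microsecond=0) == t.replace(microsecond=0)

# Two different moments both "equal" the same bare time object.
print(same_clock_time(this_year, eleven_pm))
print(same_clock_time(long_ago, eleven_pm))
print(this_year == long_ago)
```

Both timestamps match the bare time even though they are years apart, which is the guess the comparison would be making.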

Thx,

Skip


Re: [Python-Dev] Comparing date+time w/ just time

2005-10-22 Thread Guido van Rossum
On 10/22/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> With significant input from Fred I made some changes to xmlrpclib a couple
> months ago to better integrate datetime objects into xmlrpclib.  That raised
> some problems because I neglected to add support for comparing datetime
> objects with xmlrpclib.DateTime objects.  (The problem showed up in
> MoinMoin.)  I've been working on that recently (adding rich comparison
> methods to DateTime while retaining __cmp__ for backward compatibility), and
> have second thoughts about one of the original changes.
>
> I tried to support datetime, date and time objects.  My problems are with
> support for time objects.  Marshalling datetimes as xmlrpclib.DateTime
> objects is no problem (though you lose fractions of a second).  Marshalling
> dates is reasonable if you treat the time as 00:00:00.  I decided to marshal
> datetime.time objects by fixing the day portion of the xmlrpclib.DateTime
> object as today's date.  That's the suspect part.
>
> When I went back recently to add better comparison support, I decided to
> compare xmlrpclib.DateTime objects with time objects by simply comparing the
> HH:MM:SS part of the DateTime with the time object.  That's making me a bit
> queazy now.  datetime.time(hour=23) would compare equal to any DateTime with
> its time equal to 11PM.  Under the rule, "in the face of ambiguity, refuse
> the temptation to guess", I'm inclined to dump support for marshalling and
> comparison of time objects altogether.  Do others agree that was a bad idea?

Agreed.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


[Python-Dev] Proposed resolutions for open PEP 343 issues

2005-10-22 Thread Nick Coghlan
I'm still looking for more feedback on the issues raised in the last update of 
PEP 343. There hasn't been much direct feedback so far, but I've rephrased and 
suggested resolutions for the outstanding issues based on what feedback I have 
received, and my own thoughts over the last week or so.

For those simply skimming, my proposed issue resolutions are:

   1. Use the slot name "__context__" instead of "__with__"
   2. Reserve the builtin name "context" for future use as described below
   3a. Give generator-iterators a native context that invokes self.close()
   3b. Use "contextmanager" as a builtin decorator to get generator-contexts
   4. Special case the __context__ slot to avoid the need to decorate it

For those that disagree with the proposed resolutions above, or simply would 
like more details, here's the reasoning:

1. Should the slot be named "__with__" or "__context__"?

 Guido raised this as a side comment during the discussion of
   PJE's task variables pre-PEP, and it's a fair question.
 The closest analogous slot method ("__iter__") is named after the protocol
   it relates to, rather than the associated statement/expression keyword (that
   is, the method isn't called "__for__").
 The next closest analogous slot is one that doesn't actually exist yet -
   the proposed "boolean" protocol. This again uses the name of a protocol
   rather than the associated keyword (that is, the name "__bool__" was
   suggested rather than "__if__").
 At the moment, PEP 343 makes the opposite choice - it uses the keyword,
   rather than the protocol name (that is, it uses "__with__" instead of using
   "__context__").
 That inconsistency would be a bad thing, in my opinion, and I propose that
   the slot should instead be named "__context__".

2. If the slot is called "__context__" what should a "context" builtin do?

 Again, considering existing slot names, a slot with a given name is
   generally invoked by the builtin type or function with the same name.
 This is true of the builtin types, and also true of iter, cmp and pow.
   getattr, setattr and delattr get in on the act as well.
 So, to be consistent, one would expect a "context" builtin to be able to
   be used such that "context(x)" invoked "x.__context__()".
 Such a method might special-case certain types, or have a two-argument
   form that accepted an "enter" function and an "exit" function, but using it
   to mark a generator that is to be used as a context manager (as currently
   suggested in PEP 343) would not be appropriate.
 I don't mind either way whether or not a "context" builtin is actually
   included for Python 2.5. However, even if it isn't included, the name should
   be reserved for that purpose (that is, we shouldn't be using it to name a
   generator decorator or a module).

3. How should generators behave when used as contexts?

 With PEP 342 accepted, generators pose a problem, because they have two
   possible uses as contexts. The first is for a generator that is intended to
   be used as an actual iterator. This case is a case of resource management -
   ensuring the close method is invoked on the generator-iterator when done
   with it (i.e., similar to the proposed native context for files).
 PEP 343 proposes a second use case for generators - to write custom
   context managers. In this case, the __enter__ method steps the generator
   once, and the __exit__ method finishes the job.
 I propose that we give generator-iterator objects resource management
   behaviour by default (i.e., __context__ and __enter__ methods that just
   "return self", and an __exit__ method that invokes "self.close()").
 The "contextmanager" builtin decorator from previous drafts of the PEP
   (called simply "context" in the current draft) can then be used to get the
   custom context manager behaviour.
 I previously thought giving generators a native context caused problems
   with getting silent failures when the "contextmanager" decorator was
   inadvertently omitted. This is still technically true - the "with" statement
   itself won't raise a TypeError because the generator is a legal context.
 However, with this bug, the context manager won't be getting entered *at
   all* (it gets closed without its next() method ever being called). Even the
   most cursory testing of the generator-context function should be able to
   tell whether the generator-context is being entered or not.
 The main alternative (having yet-another-decorator to give generators
   "auto-close" functionality) would be possible, but the additional builtin
   clutter would be getting to the point where it started to concern me. And
   given that "yield" inside "try/finally" is now always legal, I consider it
   reasonable that using a generator in a "with" statement would also always be
   legal.
 Further, if type_new special cases __context__ as suggested below, then
   the context behaviour of gen
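
The resource-management use of a generator described in item 3 can be approximated today with contextlib.closing from the standard library. This is only a sketch of the effect (close() is invoked on exit from the block); it is not the native generator context proposed here, and `numbers` and `log` are illustrative names.

```python
from contextlib import closing

log = []

def numbers():
    try:
        yield 1
        yield 2
    finally:
        log.append("closed")   # runs when close() is called on the generator

# closing() calls gen.close() when the block exits, even though the
# generator has not been exhausted.
with closing(numbers()) as gen:
    first = next(gen)

print(first, log)
```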

Re: [Python-Dev] Comparing date+time w/ just time

2005-10-22 Thread Jim Fulton
[EMAIL PROTECTED] wrote:
> With significant input from Fred I made some changes to xmlrpclib a couple
> months ago to better integrate datetime objects into xmlrpclib.  That raised
> some problems because I neglected to add support for comparing datetime
> objects with xmlrpclib.DateTime objects.  (The problem showed up in
> MoinMoin.)  I've been working on that recently (adding rich comparison
> methods to DateTime while retaining __cmp__ for backward compatibility), and
> have second thoughts about one of the original changes.
> 
> I tried to support datetime, date and time objects.  My problems are with
> support for time objects.  Marshalling datetimes as xmlrpclib.DateTime
> objects is no problem (though you lose fractions of a second).  Marshalling
> dates is reasonable if you treat the time as 00:00:00. 

I don't think that is reasonable at all.  I would normally expect
a date to represent the whole day, not a particular, unspecified time.
Other people may have other expectations, but xmlrpclib should not
assume a particular interpretation.

 > I decided to marshal
> datetime.time objects by fixing the day portion of the xmlrpclib.DateTime
> object as today's date.  That's the suspect part.

Very very suspect. :)

> When I went back recently to add better comparison support, I decided to
> compare xmlrpclib.DateTime objects with time objects by simply comparing the
> HH:MM:SS part of the DateTime with the time object.  That's making me a bit
> queazy now.  datetime.time(hour=23) would compare equal to any DateTime with
> its time equal to 11PM.  Under the rule, "in the face of ambiguity, refuse
> the temptation to guess", I'm inclined to dump support for marshalling and
> comparison of time objects altogether.  Do others agree that was a bad idea?

I agree that it was a bad idea and that you should not try to marshal
time objects or compare time objects with DateTime objects.
Similarly, I strongly recommend that you also stop trying to marshal date
objects or compare date objects to DateTime objects.  After all,
if the datetime module doesn't allow comparison of date and datetime,
why should you try to compare date and DateTime?
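
For reference, this is how the datetime module treats the mixed case in modern Python 3 (a small sketch; the variable names are illustrative): equality between a date and a datetime is simply false, and ordering is refused.

```python
from datetime import date, datetime

day = date(2005, 10, 22)
moment = datetime(2005, 10, 22, 0, 0, 0)

equal = (day == moment)        # allowed, but a date never equals a datetime
try:
    day < moment
    ordering_refused = False
except TypeError:
    ordering_refused = True    # the module will not guess an ordering

print(equal, ordering_refused)
```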

Jim

-- 
Jim Fulton   mailto:[EMAIL PROTECTED]   Python Powered!
CTO  (540) 361-1714http://www.python.org
Zope Corporation http://www.zope.com   http://www.zope.org


Re: [Python-Dev] int(string)

2005-10-22 Thread Tim Peters
[Tim]
...
>> int('102002022201221111211', 3) = 0

I should have added that all those examples simply used 2**32 as
input, expressed as a string in the input base.  They're not the only
failing cases; e.g., this is also obviously wrong:

>>> int('102002022201221111212', 3)
1
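
To make the failure mode concrete, here is a minimal Python 3 sketch that simulates a 32-bit unsigned long and applies a combined multiply-and-add followed by a single divide-back check, as described in this thread. The names `flawed_strtoul`, `to_base` and `MASK` are illustrative assumptions; this is not the actual C source of PyOS_strtoul().

```python
MASK = 2**32 - 1  # assumption: simulate a 32-bit unsigned long
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base(n, base):
    """Spell the non-negative integer n in the given base (2..36)."""
    out = ""
    while True:
        n, r = divmod(n, base)
        out = DIGITS[r] + out
        if n == 0:
            return out

def flawed_strtoul(s, base):
    """Multiply and add first, then check for overflow by dividing back."""
    result = 0
    for ch in s:
        c = int(ch, 36)
        temp = result
        result = (result * base + c) & MASK      # wraps silently, like C
        if ((result - c) & MASK) // base != temp:
            raise OverflowError(s)
    return result

two_to_32 = to_base(2**32, 3)
print(two_to_32, flawed_strtoul(two_to_32, 3))
print(flawed_strtoul(to_base(2**32 + 1, 3), 3))
```

Under these assumptions the wrapped value slips past the check: subtracting the digit wraps back to 2**32 - 1, which divides down to the old value, so 2**32 spelled in base 3 comes back as 0 and 2**32 + 1 as 1.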

...

>> The challenge (should you decide to accept it <wink>) is to replace
>> the overflow-checking with something both correct _and_ much faster
>> than doing n integer divisions for an n-character input.  For example,
>> 36**6 < 2**32-1, so whenever the input has no more than 6 digits
>> overflow is impossible regardless of base and regardless of platform.
>> That's simple and exploitable.  For extra credit, make int(string) go
>> faster than preparing your taxes ;-)

[Michael Hudson]
> So, you're suggesting dividing the input up into known non-overflowing
> chunks and using the normal Python operations to combine those chunks,
> relying on them overflowing to longs as needed?

Possibly.  I want int(str), for the comparatively short decimal
strings most apps convert most of the time, to be much faster too. 
The _simplest_ thing one could do with the observation is add a
number-of-digits counter to PyOS_strtoul's loop, skip the overflow
check entirely for the first six digits converted, and for every digit
(if any) after the sixth do "obviously correct" overflow checking.
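
A rough Python 3 model of that "simplest thing" might look like the following. It assumes a 32-bit unsigned long, checks the multiply and the add separately once past the sixth digit, and raises OverflowError where real code would fall back to a long; `counted_strtoul` is a hypothetical name, not a proposed patch.

```python
ULONG_MAX = 2**32 - 1  # assumption: 32-bit unsigned long

def counted_strtoul(s, base):
    """Skip overflow checks for the first six digits, then check each step."""
    limit = ULONG_MAX // base        # one division per call, not per digit
    result = 0
    for ndigits, ch in enumerate(s, start=1):
        c = int(ch, 36)
        if c >= base:
            raise ValueError("invalid digit %r for base %d" % (ch, base))
        if ndigits > 6:
            # 36**6 < 2**32 - 1, so only digits past the sixth can overflow.
            if result > limit:                   # the multiply would overflow
                raise OverflowError(s)
            if result * base > ULONG_MAX - c:    # the add would overflow
                raise OverflowError(s)
        result = result * base + c
    return result

print(counted_strtoul("4294967295", 10))
```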

That would save min(len(s), 6) integer divisions per call, and would
probably be a real speed win for most apps that do a lot of
int(string).  Slightly more ambitious would be to use a different
constant per base; e.g., for base 10 overflow is impossible if there
are no more than 9 digits, and exploiting that would buy that
int(decimal_str) would almost never need to do an integer division in
most apps.
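
The per-base constants are easy to derive. The sketch below (Python 3, assuming a 32-bit unsigned long; `safe_digits` and `SAFE_DIGITS` are illustrative names) computes, for each base, the largest digit count for which overflow is impossible.

```python
ULONG_MAX = 2**32 - 1  # assumption: 32-bit unsigned long

def safe_digits(base, limit=ULONG_MAX):
    """Largest n such that every n-digit string in this base fits in limit."""
    n = 0
    power = 1                       # invariant: power == base**n
    while power * base - 1 <= limit:
        power *= base
        n += 1
    return n

SAFE_DIGITS = {base: safe_digits(base) for base in range(2, 37)}
print(SAFE_DIGITS[10], SAFE_DIGITS[36])
```

This reproduces the two figures quoted above: 9 digits for base 10 and 6 digits for base 36.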

The strategy you suggest could, if implemented carefully, speed all
int(string) and long(string) operations, except for long(string, base)
where base is a power of 2 (the latter case is highly optimized
already, in longobject.c's long_from_binary_base).

Speeding long(string) for non-power-of-2 bases is tricky.  It benefits
already from the internal muladd1() routine, which does the "multiply
by the base and add in the next digit" step in one gulp, mutating the
C representation of a long directly.  That's a very efficient loop in
part because it _knows_ the base fits in a single "Python long digit".

Combining larger chunks _could_ be faster, but the multiplication
problem gets harder if base**chunk_size exceeds a single Python long
digit.
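
The chunking idea can be modelled at the Python level as follows. This is a sketch under stated assumptions: the input is a plain string of digits, base**chunk fits in a machine word, and int(piece, base) merely stands in for a fast native loop that needs no overflow check; `chunked_int` is a hypothetical name.

```python
def chunked_int(s, base=10, chunk=9):
    """Convert a string of plain digits chunk by chunk."""
    result = 0
    for start in range(0, len(s), chunk):
        piece = s[start:start + chunk]
        # Each piece is short enough that a native conversion cannot overflow;
        # combining pieces relies on Python integers growing as needed.
        result = result * base ** len(piece) + int(piece, base)
    return result

print(chunked_int("123456789012345678901234567890"))
```

The multiplication by base ** len(piece) is the step that gets harder in C once that factor no longer fits in a single Python long digit.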

So there are a world of possible complications here.  I'd be delighted
to see "just" correct overflow checking plus a major speed boost for
int(decimal_string) where the result does fit in a 32-bit unsigned int
(which I'm sure accounts for the vast bulk of dynamic real-life
int(string) invocations).

> All of the examples you posted should have returned longs anyway, right?

On a 32-bit box, yes.  Regardless of box, all of the original examples
should return 2**32.  The one at the top of this message should return
2**32+1.

> I guess the change to automatically overflowing to longs has led to
> some code that shows its history more than one would like.

Well, these particular cases were always broken -- they always
returned 0.  The difference is that in modern Pythons they should
return the right answer, while in older Pythons they should have
raised OverflowError.


Re: [Python-Dev] int(string)

2005-10-22 Thread Adam Olsen
> Tim Peters <[EMAIL PROTECTED]> writes:
>
> > Turns out it's _not_ input speed that's the problem here, and not even
> > mainly the speed of integer mod:  the bulk of the time is spent in
> > int(string) (and, yes, that's also far more important to the problem
> > Neal was looking at than list.append time). If you can even track all
> > the levels of C function calls that ends up invoking <wink>, you find
> > yourself in PyOS_strtoul(), which is a nifty all-purpose routine that
> > accepts inputs in bases 2 thru 36, can auto-detect base, and does
> > platform-independent overflow checking at the cost of a division per
> > digit.  All those are features, but it makes for slow conversion.
>
> > I assume it's the overflow-checking that's the major time sink,

Are you sure?

https://sourceforge.net/tracker/index.php?func=detail&aid=1334979&group_id=5470&atid=305470

That patch removes the division from the loop (and fixes the bugs),
but gives only a small increase in speed.

--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Proposed resolutions for open PEP 343 issues

2005-10-22 Thread Guido van Rossum
On 10/22/05, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> I'm still looking for more feedback on the issues raised in the last update of
> PEP 343. There hasn't been much direct feedback so far, but I've rephrased and
> suggested resolutions for the outstanding issues based on what feedback I have
> received, and my own thoughts over the last week or so.

Thanks for bringing this up again. It's been at the back of my mind,
but hasn't had much of a chance to come to the front lately...

> For those simply skimming, my proposed issue resolutions are:
>
>1. Use the slot name "__context__" instead of "__with__"

+1

>2. Reserve the builtin name "context" for future use as described below

+0.5. I don't think we'll need that built-in, but I do think that the
term "context" is too overloaded to start using it for anything in
particular.

>3a. Give generator-iterators a native context that invokes self.close()

I'll have to think about this one more, and I don't have time for that
right now.

>3b. Use "contextmanager" as a builtin decorator to get generator-contexts

+1

>4. Special case the __context__ slot to avoid the need to decorate it

-1. I expect that we'll also see generator *functions* (not methods)
as context managers. The functions need the decorator. For consistency
the methods should also be decorated explicitly.

For example, while I'm now okay (at the +0.5 level) with having files
automatically behave like context managers, one could still write an
explicit context manager 'opening':

    @contextmanager
    def opening(filename):
        f = open(filename)
        try:
            yield f
        finally:
            f.close()

Compare to

    class FileLike:

        def __init__(self, ...): ...

        def close(self): ...

        @contextmanager
        def __context__(self):
            try:
                yield self
            finally:
                self.close()
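
For readers who want to run the first example, here is a self-contained Python 3 version using contextlib.contextmanager from today's standard library. The temporary file is only scaffolding for the demonstration, and the `__context__` variant is not shown because it depends on the slot being discussed in this thread.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def opening(filename):
    f = open(filename)
    try:
        yield f          # the body of the with statement runs here
    finally:
        f.close()        # always runs, even if the body raises

# Usage with a throwaway file.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as out:
    out.write("hello\n")

with opening(path) as f:
    first_line = f.readline()

os.remove(path)
print(repr(first_line), f.closed)
```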

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Comparing date+time w/ just time

2005-10-22 Thread Fred L. Drake, Jr.
On Saturday 22 October 2005 07:48, [EMAIL PROTECTED] wrote:
 > ..., I'm inclined to dump support
 > for marshalling and comparison of time objects altogether.  Do others
 > agree that was a bad idea?

Very much.  As Jim notes, supporting date objects is more than a little 
questionable as well.  Dates and times, separate from a date-time, are 
completely unsupported by the bare XML-RPC protocol.  Applications must 
determine what they mean and how to encode them in XML-RPC separately if they 
need to do so.


  -Fred

-- 
Fred L. Drake, Jr.   


Re: [Python-Dev] int(string)

2005-10-22 Thread Tim Peters
[Tim]
>> I assume it's the overflow-checking that's the major time sink,

[Adam Olsen]
> Are you sure?

No -- that's what "assume" means <0.7 wink>.  For example, there's a
long chain of function calls involved in int(string) too.

> https://sourceforge.net/tracker/index.php?func=detail&aid=1334979&group_id=5470&atid=305470
>
> That patch removes the division from the loop (and fixes the bugs),
> but gives only a small increase in speed.

As measured how?  Platform, compiler, input, etc?  Is the "ULONG_MAX /
base" part compiled to inline code or to a call to a library routine
(e.g., if the latter, it could be that a dividend with "the sign bit
set" is extraordinarily expensive for unsigned division -- depends on
the  pair in use)?  If so, a small static table could
avoid all runtime division.  If not, note that the number of divisions
hasn't actually changed for 1-character input.  Etc.

In any case, I agree it _should_ fix the bugs (although it also needs
new tests to verify that).


Re: [Python-Dev] int(string)

2005-10-22 Thread Adam Olsen
On 10/22/05, Tim Peters <[EMAIL PROTECTED]> wrote:
> [Tim]
> >> I assume it's the overflow-checking that's the major time sink,
>
> [Adam Olsen]
> > Are you sure?
>
> No -- that's what "assume" means <0.7 wink>.  For example, there's a
> long chain of function calls involved in int(string) too.
>
> > https://sourceforge.net/tracker/index.php?func=detail&aid=1334979&group_id=5470&atid=305470
> >
> > That patch removes the division from the loop (and fixes the bugs),
> > but gives only a small increase in speed.
>
> As measured how?  Platform, compiler, input, etc?  Is the "ULONG_MAX /
> base" part compiled to inline code or to a call to a library routine
> (e.g., if the latter, it could be that a dividend with "the sign bit
> set" is extraordinarily expensive for unsigned division -- depends on
> the  pair in use)?  If so, a small static table could
> avoid all runtime division.  If not, note that the number of divisions
> hasn't actually changed for 1-character input.  Etc.

AMD Athlon 2500+, Linux 2.6.13, GCC 4.0.2

[EMAIL PROTECTED]:~/src/Python-2.4.1$ python2.4 -m timeit 'int("9")'
100 loops, best of 3: 0.834 usec per loop
[EMAIL PROTECTED]:~/src/Python-2.4.1$ ./python -m timeit 'int("9")'
100 loops, best of 3: 0.801 usec per loop
[EMAIL PROTECTED]:~/src/Python-2.4.1$ python2.4 -m timeit 'int("9")'
100 loops, best of 3: 0.709 usec per loop
[EMAIL PROTECTED]:~/src/Python-2.4.1$ ./python -m timeit 'int("9")'
100 loops, best of 3: 0.717 usec per loop

Originally I just tried the longer string so I hadn't noticed that the
smaller string was slightly slower.  Oh well, caveat emptor.
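
The same kind of measurement can be scripted with the timeit module rather than the command line. This is a sketch only: the nine-digit literal is an assumed stand-in for "the longer string", which is not shown in this message, and `best_usec` is a hypothetical helper.

```python
import timeit

def best_usec(stmt, number=100000, repeat=3):
    """Best-of-N time per execution of stmt, in microseconds."""
    return min(timeit.repeat(stmt, repeat=repeat, number=number)) / number * 1e6

# Compare a one-character input with a longer one (assumed example input).
for literal in ('"9"', '"999999999"'):
    print("int(%s): %.3f usec per loop" % (literal, best_usec("int(%s)" % literal)))
```

Timing both a short and a long input matters here because a change that removes per-digit work can still add fixed overhead per call.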

--
Adam Olsen, aka Rhamphoryncus


Re: [Python-Dev] Definining properties - a use case for class decorators?

2005-10-22 Thread Reinhold Birkenfeld
Michele Simionato wrote:
> As others explained, the syntax would not work for functions (and it is
> not intended to).
> A possible use case I had in mind is to define inlined modules to be
> used as bunches of attributes. For instance, I could define a module as
> 
> module m():
>     a = 1
>     b = 2
> 
> where 'module' would be the following function:
> 
> def module(name, args, dic):
>     mod = types.ModuleType(name, dic.get('__doc__'))
>     for k in dic: setattr(mod, k, dic[k])
>     return mod
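
The `module m():` statement is hypothetical syntax, but the helper itself runs as written if it is called directly. A minimal Python 3 sketch, passing the name, an empty argument tuple and the namespace dict by hand (the way the proposed statement presumably would):

```python
import types

def module(name, args, dic):
    # Build a real module object and copy the namespace into it.
    mod = types.ModuleType(name, dic.get("__doc__"))
    for k in dic:
        setattr(mod, k, dic[k])
    return mod

# Equivalent of the hypothetical "module m(): a = 1; b = 2".
m = module("m", (), {"a": 1, "b": 2})
print(m.__name__, m.a, m.b)
```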

Wow. This looks like an almighty tool. We can have modules, interfaces,
classes and properties all the like with this.

Guess a PEP would be nice.

Reinhold



Re: [Python-Dev] Proposed resolutions for open PEP 343 issues

2005-10-22 Thread Guido van Rossum
Here's another argument against automatically decorating __context__.

What if I want to have a class with a __context__ method that returns
a custom context manager that *doesn't* involve applying
@contextmanager to a generator?

While technically this is possible with your proposal (since such a
method wouldn't be a generator), it's exceedingly subtle for the human
reader. I'd much rather see the @contextmanager decorator to emphasize
the difference.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Comparing date+time w/ just time

2005-10-22 Thread skip
Based on feedback from Jim and Fred, I took out date and time object
marshalling and comparison.  (Actually, you can still compare an
xmlrpclib.DateTime object with a datetime.date object, because DateTime
objects can be compared with anything that has a timetuple method.)  There's
a patch at

http://python.org/sf/1330538

I went ahead and assigned it to Fred since he's worked with that code fairly
recently.

Skip


[Python-Dev] Divorcing str and unicode (no more implicit conversions).

2005-10-22 Thread Bengt Richter
Please bear with me for a few paragraphs ;-)

One aspect of str-type strings is the efficiency afforded when all the
encoding really is ascii. If the internal encoding were e.g. fixed utf-16le
for strings, maybe with today's computers it would still be efficient enough
for most actual string purposes (excluding the current use of str-strings as
byte sequences).

I.e., you'd still have to identify what was "strings" (of characters) and
what was really byte sequences with no implied or explicit encoding or
character semantics.

Ok, let's make that distinction explicit: Call one kind of string a byte
sequence and the other a character sequence (representation being a separate
issue).

A unicode object is of course the prime _general_ representation of a
character sequence in Python, but all the names in python source code (that
become NAME tokens) are UIAM also character sequences, and representable by
a byte sequence interpreted according to ascii encoding.

For the sake of discussion, suppose we had another _character_ sequence type
that was the moral equivalent of unicode except for internal representation,
namely a str subclass with an encoding attribute specifying the encoding that
you _could_ use to decode the str bytes part to get unicode (which you
wouldn't do except when necessary). We could call it class charstr(str): ...
and have charstr().bytes be the str part and charstr().encoding specify the
encoding part.

In all the contexts where we have obvious encoding information, we can then
generate a charstr instead of a str. E.g., if the source of module_a has

# -*- coding: latin1 -*-
cs = 'über-cool'
then
type(cs)  # => 
cs.bytes  # => '\xfcber-cool'
cs.encoding # => 'latin-1'

and print cs would act like print cs.bytes.decode(cs.encoding) -- or I guess
sys.stdout.write(cs.bytes.decode(cs.encoding).encode(sys.stdout.encoding))
followed by
sys.stdout.write('\n'.decode('ascii').encode(sys.stdout.encoding))
for the newline of the print.

Now if module_b has

# -*- coding: utf8 -*-
cs = 'über-cool'

and we interactively
import module_a, module_b
and then
print module_a.cs + ' =?= ' + module_b.cs

what could happen ideally vs. what we have currently?
UIAM, currently we would just get the concatenation of
the three str byte sequences concatenated to make
'\xfcber-cool =?= \xc3\xbcber-cool'
and that would be printed as whatever that comes out as
without conversion when seen by the output according to
sys.stdout.encoding.

But if those cs instances had been charstr instances, the coding cookie
encoding information would have been preserved, and the interactive print could
have evaluated the string expression -- given cs.decode() as sugar for
(cs.bytes.decode(cs.encoding or globals().get('__encoding__') or
 __import__('sys').getdefaultencoding()))
-- as

module_a.cs.decode() + ' =?= '.decode() + module_b.cs.decode()

if pairwise terms differ in encoding as they might all here. If the interactive
session source were e.g. latin-1, like module_a, then
module_a.cs + ' =?= '
would not require an encoding change, because the ' =?= ' would be a charstr
instance with encoding == 'latin-1', and so the result would still be latin-1
that far. But with module_b.cs being utf8, the next addition would cause the
.decode() promotions to unicode. In a console window, the ' =?= '.encoding
might be 'cp437' or such, and the first addition would then cause promotion
(since module_a.cs.encoding != 'cp437').
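
The promotion rule can be prototyped in a few lines. The following is a sketch in modern Python 3, where bytes and str are already separate types; `CharStr` is a hypothetical stand-in for the proposed charstr (a plain wrapper class rather than a str subclass), and the encodings are passed in by hand rather than taken from a coding cookie.

```python
class CharStr:
    """Hypothetical stand-in for the proposed charstr: bytes plus an encoding."""

    def __init__(self, data, encoding):
        self.bytes = data
        self.encoding = encoding

    def decode(self):
        return self.bytes.decode(self.encoding)

    def __add__(self, other):
        if not isinstance(other, CharStr):
            return NotImplemented
        if other.encoding == self.encoding:
            # Same encoding: stay in bytes, no conversion needed.
            return CharStr(self.bytes + other.bytes, self.encoding)
        # Encodings differ: promote both sides to unicode text.
        return self.decode() + other.decode()

    def __radd__(self, other):
        # Reached for "text + CharStr" once an earlier addition was promoted.
        if isinstance(other, str):
            return other + self.decode()
        return NotImplemented

a = CharStr("\xfcber-cool".encode("latin-1"), "latin-1")   # like module_a.cs
b = CharStr("\xfcber-cool".encode("utf-8"), "utf-8")       # like module_b.cs
sep = CharStr(b" =?= ", "latin-1")

same = a + sep                        # still a CharStr in latin-1
mixed = a + sep + b                   # promoted to text: encodings differ
raw = a.bytes + sep.bytes + b.bytes   # what plain byte concatenation gives
print(same.encoding, mixed, raw)
```

The first addition stays in latin-1 bytes and only the second one promotes, while the raw byte concatenation mixes two encodings in a single sequence.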

I have sneaked in run-time access to individual modules' encodings by
assuming that the encoding cookie could be compiled in as an explicit global
__encoding__ variable for any given module (what to have as __encoding__ for
built-in modules could vary for various purposes).

ISTM this could have use in situations where an encoding assumption is
necessary and currently 'ascii' is not as good a guess as one could make,
though I suspect if string literals became charstr strings instead of str
strings, many if not most of those situations would disappear (I'm saying
this because ATM I can't think of an 'ascii'-guess situation that wouldn't
go away ;-) If there were a charchr() version of chr() that would result in
a charstr instead of a str, IWT one would want an easy-sugar default encoding
assumption, probably based on the same as one would assume for '%c' % num in
a given module source -- which presumably would be '%c'.encoding, where '%c'
assumes the encoding of the module source, normally recorded in __encoding__.
So charchr(n) would act like chr(n).decode().encode(''.encoding) -- or more
reasonably charstr(chr(n)), which would be short for
    charstr(chr(n), globals().get('__encoding__') or
            __import__('sys').getdefaultencoding())
Or some efficient equivalent ;-)

Using strings in dicts requires hashing to find key comparison candidates
and comparison to check for key equivalence. This would seem to point to
some kind of normalized

[Python-Dev] AST reverts PEP 342 implementation and IDLE starts working again

2005-10-22 Thread Raymond Hettinger
FWIW, a few months ago, I reported that File New or File Open in IDLE
would crash Python as a result of the check-in implementing PEP 342.
Now, with AST checked-in, IDLE has started working again.  Given the
reconfirmation, I recommend that the 342 patch be regarded as suspect
and not be restored until the fault is found and repaired.


Raymond



Re: [Python-Dev] AST reverts PEP 342 implementation and IDLE starts working again

2005-10-22 Thread Phillip J. Eby
At 01:30 AM 10/23/2005 -0400, Raymond Hettinger wrote:
>FWIW, a few months ago, I reported that File New or File Open in IDLE
>would crash Python as a result of the check-in implementing PEP 342.
>Now, with AST checked-in, IDLE has started working again.  Given the
>reconfirmation, I recommend that the 342 patch be regarded as suspect
>and not be restored until the fault is found and repaired.

PEP 342 is actually implemented in the HEAD.  See:

http://mail.python.org/pipermail/python-dev/2005-October/057477.html

So, your observation actually means that the bug, if any, was somewhere 
else, or was inadvertently fixed or hidden by the AST branch merge.
