[Python-Dev] Should collections.Counter check for int?
I just noticed that while the docs say that "Counts are allowed to be any integer value including zero or negative counts", collections.Counter doesn't perform any check on the types of count values. Instead, non-numerical values will lead to strange behaviour or exceptions later on:

>>> c = collections.Counter({'a':'3', 'b':'20', 'c':'100'})
>>> c.most_common(2)
[('a', '3'), ('b', '20')]
>>> c+c
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/local/hagenf/lib/python3.1/collections.py", line 467, in __add__
    if newcount > 0:
TypeError: unorderable types: str() > int()

I'd prefer Counter to refuse non-numerical values right away, as the present behaviour may hide bugs (e.g. a forgotten string->int conversion).

Any opinions? (And what about negative values or floats?)

- Hagen
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
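The early check asked for above can be approximated outside of Counter itself. The following is only a minimal sketch: `checked_counter` is a hypothetical helper name, not part of the collections module, and it assumes that "numerical" means an instance of numbers.Integral.

```python
from collections import Counter
from numbers import Integral


def checked_counter(mapping):
    """Build a Counter, refusing non-integer counts right away.

    Hypothetical helper for illustration; Counter itself performs no such check.
    """
    for key, count in mapping.items():
        if not isinstance(count, Integral):
            raise TypeError(
                f"count for {key!r} must be an integer, got {type(count).__name__}"
            )
    return Counter(mapping)


good = checked_counter({'a': 3, 'b': 20, 'c': 100})
top_two = good.most_common(2)  # ordered by numeric count, not by string comparison
```

With this helper a forgotten string->int conversion fails at construction time instead of surfacing later in `c + c`.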
Re: [Python-Dev] Should collections.Counter check for int?
>> I'd prefer Counter to refuse non-numerical values right away as the
>> present behaviour may hide bugs (e.g. a forgotten string->int
>> conversion). Any opinions? (And what about negative values or floats?)
>
> Please file a report on bugs.python.org so that there's a record of
> this issue.

Done: http://bugs.python.org/issue6038

- Hagen
[Python-Dev] Iterator version of contextlib.nested
contextlib.nested has recently been deprecated on grounds of being unnecessary now that the with statement accepts multiple context managers. However, as has been mentioned before (http://mail.python.org/pipermail/python-dev/2009-May/089359.html), that doesn't cover the case of a variable number of context managers, i.e.

with contextlib.nested(*list_of_managers) as list_of_results:

or

with contextlib.nested(*iterator_of_managers):

It was suggested that in these use cases a custom context manager should be implemented. However, it seems that such an implementation would be an almost exact copy of the present code for "nested".

I'm proposing to add an iterator version of "nested" to contextlib (possibly called "inested"), which takes an iterable of context managers instead of a variable number of parameters. The implementation could be taken over from the present "nested", only changing "def nested(*managers)" to "def inested(managers)".

This has the advantage that an iterator can be passed to "inested", so that each context manager is created in the context of all previous ones, which was one of the reasons for introducing the multi-with statement in the first place. "contextlib.inested" would therefore be the generalization of the multi-with statement to a variable number of managers (and "contextlib.nested" would stay deprecated).

- Hagen
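To make the proposal concrete, here is a rough sketch of what an "inested" could look like. It is not the implementation proposed in this message (which would reuse the code of "nested"); it assumes contextlib.ExitStack, which was only added to the standard library later (Python 3.3). The helper `tracked` and the list `events` are made up for the demonstration.

```python
from contextlib import ExitStack, contextmanager


@contextmanager
def inested(managers):
    """Enter the managers from an iterable one by one; exit them in reverse order.

    Sketch only: built on ExitStack rather than on the original nested() code.
    """
    with ExitStack() as stack:
        # Each manager is pulled from the iterable only after all previous
        # ones have been entered, so a generator can create them lazily.
        yield [stack.enter_context(manager) for manager in managers]


events = []


@contextmanager
def tracked(name):
    # Hypothetical demo manager that records when it is entered and exited.
    events.append("enter " + name)
    try:
        yield name
    finally:
        events.append("exit " + name)


with inested(tracked(n) for n in ("a", "b")) as results:
    events.append("body")
```

If entering a later manager fails, the ExitStack unwinds the managers that were already entered, which is the behaviour one would expect from the multi-with statement.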
Re: [Python-Dev] Iterator version of contextlib.nested
> The semantic change actually needed to make nested() more equivalent to
> the multi-with statement is for it to accept zero-argument callables
> that create context managers as arguments rather than pre-created
> context managers.

It seems to me that both passing callables which return managers and passing a generator which yields managers achieve about the same thing. Are you proposing the former just to avoid introducing a new interface?

> Rather than changing the name of the function, this could be done by
> inspecting the first argument for an "__enter__" method. If it has one,
> use the old semantics (and issue a DeprecationWarning as in 3.1).
> Otherwise, use the proposed new semantics.

I guess this is much too late for 3.1, but could we then at least un-deprecate "contextlib.nested" for now? As it is, you get a DeprecationWarning for something like

with contextlib.nested(*my_managers):

without any good way to get rid of it.

- Hagen
Re: [Python-Dev] Iterator version of contextlib.nested
> I actually almost asked for that to be changed to a
> PendingDeprecationWarning when it was first added - Benjamin, do you
> mind if I downgrade this warning to a pending one post rc2?

I'm not sure what that would buy us. For the use case I mentioned it would be just as annoying to get a PendingDeprecationWarning. But if the warning was completely removed now, nested could still get deprecated in 3.2 as soon as some better mechanism for a variable number of managers has been provided.

- Hagen
Re: [Python-Dev] Iterator version of contextlib.nested
> Part of the justification for the new with-statement syntax was
> that nested() doesn't have a way to finalize the constructors
> if one of them fails.

I think the problem was a little bit more subtle: nested() gets passed managers, so their __init__()s should all have run when the first context is entered. The only problem comes up when the __exit__() of an outer manager tries to suppress an exception raised by the __enter__() of an inner one. This is a limited defect in that it doesn't affect the common situation where no __exit__() tries to suppress any exceptions. (In a quick glance over the std library I couldn't find a single instance of an exception-suppressing __exit__().)

> And now
> that we have the new with-statement syntax, it mostly just
> represents a second-way-to-do-it (a second way that
> has the stated pitfall).

So the functionalities of nested() and multi-with overlap in the common use cases, and each has its own limitation in an uncommon one. I agree that this situation is unfortunate, but I think introducing support for one uncommon case and removing it for another is not the way to go in 3.1. That's why I think nested() should stay un-deprecated until there is a replacement which handles a superset of its use cases.

> The new statement was not designed to support passing in
> tuples of context-managers. This issue was raised while
> the new with-statement was being designed and it was
> intentionally left-out (in part, because the use cases were
> questionable

FWIW, my use case (which made me notice the DeprecationWarning in the first place) is in a command dispatch function, which looks at the command to be executed and pre-processes its arguments in a uniform way. Part of that pre-processing is entering the contexts of context managers before passing them along (and exiting them when the command finishes or raises an exception).

> and in-part because there were other ways
> to do it such as adding __enter__ and __exit__ to tuple).

Which has not been done for 3.1. Granted, you could subclass tuple and add them yourself, but then you would mostly be copying what's already implemented in nested().

> I suggest a PEP for 2.7 and 3.2 for building-out the
> with-statement to support tuples of context managers

That sounds like a good idea.

> IMO, this represents doing-it-the-right-way instead of preserving a
> construct that is known to be problematic.
> Leaving it in will enshrine it.

I don't see the problem with deprecating it only after a completely suitable replacement is found. Why would it be any harder to deprecate nested() in 3.2?

- Hagen
Re: [Python-Dev] Iterator version of contextlib.nested
> Unlike a full DeprecationWarning, a PendingDeprecationWarning is ignored
> by default. You have to switch them on explicitly via code or a command
> line switch in order to see them.

Sorry, I should have made myself more familiar with the warnings mechanism before writing. In that case I'm fine with a PendingDeprecationWarning. :-)

- Hagen
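The "switch them on explicitly via code" part of the quoted explanation can be illustrated with the warnings module. This is a small sketch; `old_api` is a made-up function standing in for something like contextlib.nested.

```python
import warnings


def old_api():
    # Hypothetical function that announces a possible future deprecation.
    warnings.warn(
        "old_api() may be deprecated in a future release",
        PendingDeprecationWarning,
        stacklevel=2,
    )
    return 42


# Explicitly switch the warning category on and record what is emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", PendingDeprecationWarning)
    result = old_api()
```

Outside such a block (and without a corresponding -W command line switch), the default filters decide whether the warning is shown at all.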
Re: [Python-Dev] Mercurial: tag generation incorrect
> be32850b093f is listed
> as having a child revision, 52b0a279fec6, and ISTM that *this*
> should be the revision that got tagged.

I think the tag is correct. Note that the concept of tagging is different in Mercurial, where a tag can only refer to a revision previous to the one where it is inserted in .hgtags. If I understand correctly, all relevant tagging revisions from SVN are replaced by Mercurial revisions setting tags, which then refer to their immediate predecessors.

- Hagen
Re: [Python-Dev] Very Strange Argument Handling Behavior
> This behavior seems pretty strange to me, indeed PyPy gives the
> TypeError for both attempts. I just wanted to confirm that it was in
> fact intentional.

Oleg already answered why f(**{1:3}) raises a TypeError. But your question seems to be rather why dict(**{1:3}) doesn't. For functions implemented in Python, non-string keywords are always rejected, but C functions (like the dict constructor) don't have to reject them. I don't see any benefit in allowing them, but it's probably not worth breaking code by disallowing them either.

I couldn't find this documented. Perhaps we should just say "don't rely on being able to pass non-string keywords" somewhere?

- Hagen
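For reference, the rejection for functions implemented in Python is easy to reproduce. The sketch below only demonstrates that case plus a way of building the dict that does not go through keywords at all; the function name `f` follows the thread, everything else is illustrative.

```python
def f(**kwargs):
    return kwargs


# A Python-level function rejects non-string keywords.
try:
    f(**{1: 3})
except TypeError as exc:
    error = exc
else:
    error = None

# Passing the mapping positionally avoids keywords entirely, so this does
# not rely on any implementation detail of the dict constructor.
safe = dict({1: 3})
```

Following the "don't rely on it" advice above, the positional form is the one to use when keys may not be strings.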
Re: [Python-Dev] Ask a question for a script about re.findall Modlue
> Your problem is easily explained however: the second argument to
> p.findall() should be an offset, not a flag set. (You are confusing
> re.findall() and p.findall().)

I filed a doc bug for this: http://bugs.python.org/issue8785

Cheers,
Hagen
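The confusion described in the quote can be shown in a few lines. This is an illustrative sketch; the pattern and the sample text are made up.

```python
import re

text = "abcABCabc"

# Module-level function: the third argument is a flag set.
with_flags = re.findall(r"a", text, re.IGNORECASE)

# Compiled pattern: the second argument of findall() is a start offset.
pattern = re.compile(r"a")
from_offset = pattern.findall(text, 2)

# Passing a flag here silently uses its integer value as the offset.
flag_as_offset = pattern.findall(text, int(re.IGNORECASE))

# Flags for a compiled pattern belong in re.compile().
compiled_with_flags = re.compile(r"a", re.IGNORECASE).findall(text)
```

Since re.IGNORECASE has the integer value 2, the mistaken call behaves exactly like a search starting at offset 2, without any error.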
Re: [Python-Dev] (Not) delaying the 3.2 release
> Why not? Since the I/O speed problem is fixed, I have no idea what you
> are referring to. Please do be concrete.

There's still a performance issue with pickling, but if issue 3873 could be resolved, Python 3 would actually be faster there.

- Hagen
Re: [Python-Dev] Prefetching on buffered IO files
> Ow... I've always assumed that seek() is essentially free, because
> that's how a typical OS kernel implements it. If seek() is bad on
> GzipFile, how hard would it be to fix this?

I'd imagine that there's no easy way to make arbitrary seeks on a GzipFile fast. But wouldn't it be enough to optimize small relative (backwards) seeks?

> How common is the use case where you need to read a gzipped pickle
> *and* you need to leave the unzipped stream positioned exactly at the
> end of the pickle?

Not uncommon, I think. You need this for unpickling objects which were dumped one after another into a GzipFile, right?

ISTM that the immediate performance issue can be solved by the present patch, and there's room for future improvement by optimizing GzipFile seeks and/or extending the IO API.

Cheers,
Hagen
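The use case mentioned above (several objects dumped one after another into a GzipFile) looks roughly like this. It is a sketch using an in-memory buffer; the `records` data is made up.

```python
import gzip
import io
import pickle

records = [{"id": 1}, [1, 2, 3], "last"]

# Dump several objects back to back into one gzip stream.
buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as zipped:
    for obj in records:
        pickle.dump(obj, zipped)

# Read them back one by one. Each load() has to leave the decompressed
# stream positioned exactly at the end of the pickle it just read.
buffer.seek(0)
loaded = []
with gzip.GzipFile(fileobj=buffer, mode="rb") as zipped:
    while True:
        try:
            loaded.append(pickle.load(zipped))
        except EOFError:
            break
```

Any prefetching done by the unpickler must therefore not consume bytes that belong to the next pickle in the stream.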
Re: [Python-Dev] Python and the Unicode Character Database
>> During PEP 3003 discussion, it was suggested to handle it on a case by
>> case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP
>> 3003.
>
> It's covered by "As the standard library is not directly tied to the
> language definition it is not covered by this moratorium."

How is this restricted to the stdlib if it defines the set of valid identifiers?

- Hagen
[Python-Dev] set iteration order
Hi,

I just traced a change in behaviour between Python 3.1 and 3.2 to a possibly changed iteration order of sets, caused by the optimization in issue #8685. Of course, this order shouldn't be relied on in the first place, but the side effect of the optimization might be worth mentioning in "What's new", maybe also pointing out that the old behaviour can be simulated with

{x for x in a if x not in b}

in place of "a-b".

Cheers,
Hagen
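As a quick illustration of the suggested replacement: both expressions produce the same set, they only differ in how the result is built (and hence possibly in iteration order). The sample sets are made up.

```python
a = {1, 2, 3, 4, 5}
b = {2, 4}

difference = a - b
# Builds the result by iterating over a, as suggested in the message above.
comprehension = {x for x in a if x not in b}
```

Code that needs a reproducible order should still impose one explicitly, for example with sorted().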
Re: [Python-Dev] set iteration order
> Code with any dependence on the iteration order of unordered collections
> (other than the guarantee that d.keys() and d.values() match at any
> given time as long as d is unchanged) is buggy.

It's not a matter of dependence on iteration order, but of reproducibility (in my case there were minor numerical differences due to different iteration orders). I think we also warn about changes in pseudorandom number sequences, although you could argue that no code should depend on specific pseudorandom numbers.

Cheers,
Hagen
Re: [Python-Dev] set iteration order
>> It's not a matter of dependence on iteration order, but of
>> reproducibility (in my case there were minor numerical differences due
>> to different iteration orders).
>
> Can you give a code example? I don’t understand your case.

It's a bit involved (that's why it took me a while to locate the difference in behavior), but it boils down to a (learning) algorithm that in principle should not care about order of input data, but will in practice show slightly different numerical behavior. I ran into the problem when trying to exactly reproduce previously published experimental results.

Of course, I should have anticipated this and fixed some arbitrary order in the first place. I just thought a note about this change might save someone in a similar situation some confusion.

Cheers,
Hagen
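The actual algorithm is not shown in the thread, but the underlying effect can be demonstrated with plain floating point addition. This is a toy sketch, not the learning algorithm mentioned above; `running_sum` is a made-up helper that adds values strictly left to right.

```python
def running_sum(values):
    """Add the values strictly in the given order."""
    total = 0.0
    for value in values:
        total += value
    return total


forward = running_sum([0.1, 0.2, 0.3])
backward = running_sum([0.3, 0.2, 0.1])

# Fixing some arbitrary but deterministic order makes the result reproducible,
# no matter in which order a set happens to yield its elements.
data = {0.1, 0.2, 0.3}
reproducible = running_sum(sorted(data))
```

The two sums differ only in the last bit, which is harmless for most purposes but enough to break an exact reproduction of published numbers.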
Re: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
> On behalf of the Python development team, I am pleased to announce the
> first release candidate of Python 3.2.1.

Shouldn't there be a tag "v3.2.1rc1" in the hg repo?

Cheers,
Hagen
Re: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
> P.S. "Shouldn't" makes it sound as if there was a mistake.

Well, I thought there was. When do these tags get merged into "cpython" then? "v3.2.1b1" is there, but "v3.2.1rc1" isn't:

http://hg.python.org/cpython/tags

Cheers,
Hagen
Re: [Python-Dev] [RELEASED] Python 3.2.1 rc 1
> 3.2.1b1 was already merged back. (And 3.2.1rc1 will also be merged back
> soon, since there will be a 3.2.1rc2.)

Thanks for the clarification! :-)

Cheers,
Hagen
Re: [Python-Dev] Python 3.x and bytes
> EOH = b'\r'[0]
> CHAR = b'C'[0]
> DATE = b'D'[0]
> FLOAT = b'F'[0]
> INT = b'I'[0]
> LOGICAL = b'L'[0]
> MEMO = b'M'[0]
> NUMBER = b'N'[0]
>
> This is not beautiful code.

You still have the alternative

EOH = ord('\r')
CHAR = ord('C')
...

which looks fine to me.

Cheers,
Hagen
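Both spellings yield the same integers, since indexing a bytes object in Python 3 returns an int. A short check, where the `header` value is a made-up sample and not taken from the original code:

```python
EOH = ord('\r')
CHAR = ord('C')

# Hypothetical two-byte sample: a field type followed by the end-of-header marker.
header = b'C\r'
field_type = header[0]   # indexing bytes gives an int in Python 3
terminator = header[1]
```

The constants can then be compared directly against bytes read from a file.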
Re: [Python-Dev] Python-3.0, unicode, and os.environ
> If the Unicode APIs only have correct unicode, sure. If not you'll
> get errors translating to UTF-8 (and the byte APIs are supposed to
> pass bad names through unaltered.) Kinda ironic, no?

As far as I can see all Python Unicode strings can be encoded to UTF-8, even things like lone surrogates because Python doesn't care about them. So both the Unicode API and the binary API would be fail-safe on Windows.

- Hagen
Re: [Python-Dev] Python-3.0, unicode, and os.environ
>> As far as I can see all Python Unicode strings can be encoded to UTF-8,
>> even things like lone surrogates because Python doesn't care about them.
>> So both the Unicode API and the binary API would be fail-safe on Windows.
>
> Python is broken and needs to be fixed.
>
> http://bugs.python.org/issue3672
> http://bugs.python.org/issue3297

But the question of whether Python should care about lone surrogates or not is at best tangential to the issue at hand. If you have lone surrogates in the Unicode API (and didn't raise an exception on the way getting there), then the sensible thing is to encode them into lone UTF-8 surrogates. Even if you wanted to prevent lone surrogates, encoding to UTF-8 for the binary API would not be the place to enforce it.

- Hagen
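For readers trying this on a current Python 3: the strict UTF-8 codec nowadays refuses lone surrogates, which differs from the behaviour described in this thread, and the pass-through encoding argued for here is available via the "surrogatepass" error handler. A small sketch:

```python
lone = "\ud800"  # a lone high surrogate

# The strict codec rejects it on current CPython 3 releases.
try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The "surrogatepass" handler encodes it as a lone UTF-8 surrogate sequence.
passed = lone.encode("utf-8", "surrogatepass")
round_trip = passed.decode("utf-8", "surrogatepass")
```

The round trip restores the original string, so nothing is lost when such a name is passed through a bytes API this way.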
Re: [Python-Dev] Py3k: magical dir()
> Is there some reason no set tp_hash for rangeobject to
> PyObject_HashNotImplemented ?

http://bugs.python.org/issue4701

- Hagen
Re: [Python-Dev] Fwd: Partial function application 'from the right'
Ludvig Ericson wrote:
> Well, I was trying to be funny and was under the impression that Python
> 3.0 had Unicode identifiers, but apparently it doesn't. (I used …, not ...)

It does, but they may not contain characters of the category "Punctuation, other":

>>> import unicodedata
>>> unicodedata.category("…")
'Po'

- Hagen
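The same check can be combined with str.isidentifier() to see which strings are accepted as identifiers. A short sketch; the sample names are made up.

```python
import unicodedata

ellipsis_char = "\u2026"  # HORIZONTAL ELLIPSIS, the character used in the quoted message

category = unicodedata.category(ellipsis_char)

# Letters outside ASCII are fine in identifiers, punctuation is not.
accented_ok = "caf\u00e9".isidentifier()
ellipsis_ok = ellipsis_char.isidentifier()
```

So Unicode identifiers work in Python 3, just not with punctuation characters such as the ellipsis.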
Re: [Python-Dev] PEP 393 Summer of Code Project
>> [...] some text drawing engines draw decomposed characters ("o"
>> followed by " ̈" -> "ö") differently compared to their composite
>> equivalents ("ö") and this may be perceived as better or worse. I'd
>> like to offer an option to replace some decomposed characters with
>> their composite equivalent before drawing but since other characters
>> may look worse, I don't want to do a full normalization.
>
> Isn't this an issue properly solved by various normal forms?

I think he's rather describing the need for custom "abnormal forms".

- Hagen
Re: [Python-Dev] PEP 393 Summer of Code Project
> Ok, I thought there was also a form normalized (denormalized?) to
> decomposed form. But I'll take your word.

If I understood the example correctly, he needs a mixed form, with some characters decomposed and some composed (depending on which one looks better in the given font). I agree that this sounds more like a font problem, but it's a widespread font problem and it may be necessary to address it in an application.

But this is only one example of why an application-specific concept of graphemes different from the Unicode-defined normalized forms can be useful. I think the very concept of a grapheme is context, language, and culture specific. For example, in Chinese Pinyin it would be very natural to write tone marks with composing diacritics (i.e. in decomposed form). But then you have the vowel "ü" and it would be strange to decompose it into a "u" and a combining diaeresis. So conceptually the most sensible representation of "lǜ" would be neither the composed nor the decomposed normal form, and depending on its needs an application might want to represent it in the mixed form (composing the diaeresis with the "u", but leaving the grave accent separate).

There must be many more examples where the conceptual context determines the right composition, like for "ñ", which in Spanish is certainly a grapheme, but in mathematics might be better represented as n-tilde. The bottom line is that, while an array of Unicode code points is certainly a generally useful data type (and PEP 393 is a great improvement in this regard), an array of graphemes carries many subtleties and may not be nearly as universal. Support in the spirit of unicodedata's normalization function etc. is certainly a good thing, but we shouldn't assume that everyone will want Python to do their graphemes for them.

- Hagen
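The mixed form for "lǜ" described in this message can be built on top of unicodedata. This is just a sketch of one possible approach; `compose_selected` is a hypothetical helper, not an existing API, and it only handles the simple case where the decomposition of each selected character appears contiguously.

```python
import unicodedata


def compose_selected(text, keep_composed):
    """Fully decompose text, then re-compose only the selected characters.

    Hypothetical helper for illustration: keep_composed lists the precomposed
    characters (e.g. "ü") that the application wants to treat as units.
    """
    result = unicodedata.normalize("NFD", text)
    for composed in keep_composed:
        result = result.replace(unicodedata.normalize("NFD", composed), composed)
    return result


# "l" followed by the precomposed u-with-diaeresis-and-grave (U+01DC).
mixed = compose_selected("l\u01dc", ["\u00fc"])
```

The result keeps "ü" (U+00FC) as one code point and leaves the grave accent (U+0300) as a separate combining character, which is neither NFC nor NFD.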
Re: [Python-Dev] [RELEASED] Python 3.2.2
> To download Python 3.2 visit:
>
> http://www.python.org/download/releases/3.2/

It's a bit confusing that the download link is to 3.2 and not 3.2.2.

Cheers,
Hagen