Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Nick Coghlan
On 16 November 2013 21:49, M.-A. Lemburg wrote: > On 16.11.2013 01:47, Victor Stinner wrote: >> Adding transform()/untransform() method to bytes and str is a non >> trivial change and not everybody likes them. Anyway, it's too late for >> Python 3.4. > > Just to clarify: I still like the idea of a

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Nick Coghlan
On 16 November 2013 21:38, Nick Coghlan wrote: > On 16 November 2013 20:45, Victor Stinner wrote: >> Why not using str type for str and str subtypes, and bytes type for bytes >> and bytes-like object (bytearray, memoryview)? I don't think that we need an >> ABC here. > > We'd only need an ABC if

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Antoine Pitrou
On Sat, 16 Nov 2013 19:44:51 +1000 Nick Coghlan wrote: > > Aye, that was my conclusion (hence my proposal on issue 7475 back in April). > > Can I take that observation as a +1 for restoring the aliases as well? I see no harm in restoring the aliases personally, so +1 from me. Regards Antoine.

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread M.-A. Lemburg
On 16.11.2013 01:47, Victor Stinner wrote: > Adding transform()/untransform() method to bytes and str is a non > trivial change and not everybody likes them. Anyway, it's too late for > Python 3.4. Just to clarify: I still like the idea of adding those methods. I just don't see what this addition

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Nick Coghlan
On 16 November 2013 20:45, Victor Stinner wrote: > Why not using str type for str and str subtypes, and bytes type for bytes > and bytes-like object (bytearray, memoryview)? I don't think that we need an > ABC here. We'd only need an ABC if info was added for supported input types. However, that'

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Victor Stinner
Why not using str type for str and str subtypes, and bytes type for bytes and bytes-like object (bytearray, memoryview)? I don't think that we need an ABC here. Victor On 16 Nov 2013 10:44, "Nick Coghlan" wrote: > On 16 Nov 2013 10:47, "Victor Stinner" wrote: > > > > 2013/11/16 Nick Coghlan

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-16 Thread Nick Coghlan
On 16 Nov 2013 10:47, "Victor Stinner" wrote: > > 2013/11/16 Nick Coghlan : > > To address Serhiy's security concerns with the compression codecs (which are > > technically independent of the question of restoring the aliases), I also > > plan to document how to systematically blacklist particular

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Victor Stinner
2013/11/16 Nick Coghlan : > To address Serhiy's security concerns with the compression codecs (which are > technically independent of the question of restoring the aliases), I also > plan to document how to systematically blacklist particular codecs in an > application by setting attributes on the

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Nick Coghlan
On 16 Nov 2013 02:36, "Antoine Pitrou" wrote: > > On Sat, 16 Nov 2013 00:46:15 +1000 > Nick Coghlan wrote: > > On 16 November 2013 00:04, Antoine Pitrou wrote: > > >> Rather than the more useful: > > >> > > >> >>> b"abcdef".decode("hex") > > >> Traceback (most recent call last): > > >> File ""

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Walter Dörwald
On 15.11.2013 at 16:57, "Stephen J. Turnbull" wrote: > > Walter Dörwald writes: >>> On 15.11.2013 at 00:42, Serhiy Storchaka wrote: >>> >>> 15.11.13 00:32, Victor Stinner wrote: And add transform() and untransform() methods to bytes and str types. In practice, it might be same

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Antoine Pitrou
On Sat, 16 Nov 2013 00:46:15 +1000 Nick Coghlan wrote: > On 16 November 2013 00:04, Antoine Pitrou wrote: > >> Rather than the more useful: > >> > >> >>> b"abcdef".decode("hex") > >> Traceback (most recent call last): > >> File "", line 1, in > >> TypeError: 'hex' decoder returned 'bytes' inst

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Ethan Furman
On 11/14/2013 11:13 PM, Nick Coghlan wrote: The proposal I posted to issue 7475 back in April (and, in the absence of any objections to the proposal, finally implemented over the past few weeks) was to take advantage of the fact that the codecs.encode and codecs.decode convenience functions exis

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Stephen J. Turnbull
Walter Dörwald writes: > On 15.11.2013 at 00:42, Serhiy Storchaka wrote: > > > > 15.11.13 00:32, Victor Stinner wrote: > >> And add transform() and untransform() methods to bytes and str types. > >> In practice, it might be same codecs registry for all codecs just with > >> a new att

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Nick Coghlan
On 16 November 2013 00:04, Antoine Pitrou wrote: >> Rather than the more useful: >> >> >>> b"abcdef".decode("hex") >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: 'hex' decoder returned 'bytes' instead of 'str'; use >> codecs.decode() to decode to arbitrary types > >
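A minimal sketch of the behaviour discussed in this message, assuming a Python 3.4 or later interpreter: `codecs.decode()` accepts a bytes-to-bytes codec such as `hex`, while `bytes.decode()` rejects the same codec name. The exact exception type and wording differ from the draft message quoted above, so the sketch catches both `LookupError` and `TypeError` rather than relying on either.

```python
import codecs

data = b"616263"

# The module-level convenience function handles arbitrary codecs,
# including bytes-to-bytes ones such as "hex".
decoded = codecs.decode(data, "hex")

# bytes.decode() is limited to text encodings, so the same codec
# name is refused. The exception type is caught broadly here because
# it has varied between Python versions.
try:
    data.decode("hex")
except (LookupError, TypeError) as exc:
    error = exc
else:
    error = None
```

The error message points users at `codecs.decode()`, which is the improvement this part of the thread is about.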

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Antoine Pitrou
On Fri, 15 Nov 2013 23:50:23 +1000 Nick Coghlan wrote: > > My perspective is that, in current Python, that *is* the right thing > for people to do, and any hypothetical new API proposed for Python 3.5 > would do nothing to change what's right for Python 3.4 code (or Python > 2/3 compatible code).

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Nick Coghlan
On 15 November 2013 22:24, Paul Moore wrote: > On 15 November 2013 12:07, Victor Stinner wrote: >>> A new API for binary transforms is potentially an academically >>> interesting concept, but it solves zero current real world problems. >> >> I would like to reply the same for these codecs: they a

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Facundo Batista
On Thu, Nov 14, 2013 at 7:32 PM, Victor Stinner wrote: > I would prefer to split the registry of codecs to have 3 registries: > > - "encoding" (a better name can found): encode str=>bytes, decode bytes=>str > - bytes: encode bytes=>bytes, decode bytes=>bytes > - str: encode str=>str, decode str=

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread M.-A. Lemburg
On 15.11.2013 12:45, Nick Coghlan wrote: > On 15 November 2013 20:33, Antoine Pitrou wrote: >> On Fri, 15 Nov 2013 21:28:35 +1100 >> Steven D'Aprano wrote: >>> >>> One benefit is: >>> >>> import codecs >>> codec = get_name_of_compression_codec() >>> result = codecs.encode(data, codec) >> >> That'

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Paul Moore
On 15 November 2013 12:07, Victor Stinner wrote: >> A new API for binary transforms is potentially an academically >> interesting concept, but it solves zero current real world problems. > > I would like to reply the same for these codecs: they are not solving > any real world problem :-) As Nick

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Victor Stinner
2013/11/15 Nick Coghlan : > The reason I'm now putting some effort into better documenting the > status quo for codec handling in Python 3 and filing off some of the > rough edges (rather than proposing adding any new APIs to Python 3.x) > is because the users I care about in this matter are web de

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Antoine Pitrou
On Fri, 15 Nov 2013 21:45:31 +1000 Nick Coghlan wrote: > > The reason I'm now putting some effort into better documenting the > status quo for codec handling in Python 3 and filing off some of the > rough edges (rather than proposing adding any new APIs to Python 3.x) > is because the users I car

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Nick Coghlan
On 15 November 2013 20:33, Antoine Pitrou wrote: > On Fri, 15 Nov 2013 21:28:35 +1100 > Steven D'Aprano wrote: >> >> One benefit is: >> >> import codecs >> codec = get_name_of_compression_codec() >> result = codecs.encode(data, codec) > > That's a good point. > >> If encoding/decoding is intended

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Serhiy Storchaka
15.11.13 12:28, Steven D'Aprano wrote: One benefit is: import codecs codec = get_name_of_compression_codec() result = codecs.encode(data, codec) And this is a security hole if you don't check the codec name before calling a codec. See topic about utilizing zip-bombs via codecs machiner
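One way to check a codec name before calling the codec, as this message asks, is an allowlist. This is only a sketch and not something proposed in the thread: `ALLOWED_CODECS` and `safe_decode` are hypothetical names, and the choice of permitted codecs is an assumption. `codecs.lookup()` resolves aliases without decoding any data, so the check runs before untrusted input is processed.

```python
import codecs

# Hypothetical allowlist, stored as canonical codec names so that
# aliases such as "UTF8" or "latin1" compare equal.
ALLOWED_CODECS = frozenset(
    codecs.lookup(name).name for name in ("utf-8", "latin-1", "ascii")
)


def safe_decode(data, codec_name):
    """Decode data only if codec_name resolves to an allowed codec."""
    # lookup() raises LookupError for unknown names and does not
    # touch the data, so nothing is decompressed or decoded yet.
    canonical = codecs.lookup(codec_name).name
    if canonical not in ALLOWED_CODECS:
        raise ValueError("codec %r is not permitted" % codec_name)
    return codecs.decode(data, canonical)
```

With this check in place, a caller-supplied name such as `zlib_codec` is rejected before any decompression can happen.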

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Antoine Pitrou
On Fri, 15 Nov 2013 21:28:35 +1100 Steven D'Aprano wrote: > > One benefit is: > > import codecs > codec = get_name_of_compression_codec() > result = codecs.encode(data, codec) That's a good point. > If encoding/decoding is intended to be completely generic (even if 99% > of the uses will be w

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Steven D'Aprano
On Fri, Nov 15, 2013 at 10:22:28AM +0100, Antoine Pitrou wrote: > On Fri, 15 Nov 2013 09:03:37 +1000 Nick Coghlan wrote: > > > > > And add transform() and untransform() methods to bytes and str types. > > > In practice, it might be same codecs registry for all codecs just with > > > a new attribu

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Serhiy Storchaka
15.11.13 12:02, Steven D'Aprano wrote: It would be really good to be able to query the available codecs. For example, many applications offer an "Encoding" menu, where you can specify the codec used for text. That's hard in Python, since you can't retrieve a list of known codecs. And you
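As a partial workaround for the missing query API, the standard library's alias table can be inspected. This is a hedged sketch: `known_codec_modules` is a hypothetical helper, and `encodings.aliases.aliases` only covers codecs that ship with aliases, so codecs registered through custom search functions will not appear.

```python
import encodings.aliases


def known_codec_modules():
    """Return the sorted codec module names that have a registered alias."""
    # The table maps alias -> codec module name; many aliases share
    # one module, so duplicates are removed with a set.
    return sorted(set(encodings.aliases.aliases.values()))
```

The result mixes text encodings and binary transforms, which is part of why a proper query API would need to report codec types as well.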

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Steven D'Aprano
On Fri, Nov 15, 2013 at 05:13:34PM +1000, Nick Coghlan wrote: > A few things I noticed while implementing the recent updates: > > - as you noted in your other email, while MAL is on record as saying > the codecs module is intended for arbitrary codecs, not just Unicode > encodings, readers of the

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread Antoine Pitrou
On Fri, 15 Nov 2013 09:03:37 +1000 Nick Coghlan wrote: > > > And add transform() and untransform() methods to bytes and str types. > > In practice, it might be same codecs registry for all codecs just with > > a new attribute. > > This is completely the wrong approach. There's zero justification

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-15 Thread M.-A. Lemburg
On 15.11.2013 08:13, Nick Coghlan wrote: > On 15 November 2013 11:10, Terry Reedy wrote: >> On 11/14/2013 5:32 PM, Victor Stinner wrote: >> >>> I don't like the functions codecs.encode() and codecs.decode() because >>> the type of the result depends on the encoding (second parameter). We >>> try t

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 November 2013 11:10, Terry Reedy wrote: > On 11/14/2013 5:32 PM, Victor Stinner wrote: > >> I don't like the functions codecs.encode() and codecs.decode() because >> the type of the result depends on the encoding (second parameter). We >> try to avoid this in Python. > > > Such dependence is

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Walter Dörwald
On 15.11.2013 at 00:42, Serhiy Storchaka wrote: > > 15.11.13 00:32, Victor Stinner wrote: >> And add transform() and untransform() methods to bytes and str types. >> In practice, it might be same codecs registry for all codecs just with >> a new attribute. > > If the transform() method wi

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Terry Reedy
On 11/14/2013 6:03 PM, Nick Coghlan wrote: You have to get it out of your head that codecs are just about text and binary data. 99+% of the current codec module doc leads one to that impression. The fact that codecs are expected to have a file reader and writer and that the default 'stri

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Terry Reedy
On 11/14/2013 5:32 PM, Victor Stinner wrote: I don't like the functions codecs.encode() and codecs.decode() because the type of the result depends on the encoding (second parameter). We try to avoid this in Python. Such dependence is common with arithmetic. >>> 1 + 2 3 >>> 1 + 2.0 3.0 >>> 1 +

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Serhiy Storchaka
15.11.13 00:32, Victor Stinner wrote: And add transform() and untransform() methods to bytes and str types. In practice, it might be same codecs registry for all codecs just with a new attribute. If the transform() method will be added, I prefer to have only one transformation method and

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 Nov 2013 09:11, "Nick Coghlan" wrote: > > > On 15 Nov 2013 08:42, "Victor Stinner" wrote: > > > > Oh, I forgot to mention that I sent this email in reaction to this issue: > > > > http://bugs.python.org/issue19585 > > > > Modifying the critical PyFrameObject because the codecs API raises >

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Serhiy Storchaka
15.11.13 01:03, Nick Coghlan wrote: We already do this check in the existing convenience methods - it raises TypeError. The problem with this check is that it happens *after* encoding/decoding. This opens the door to DoS (see my last message).

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 Nov 2013 08:42, "Victor Stinner" wrote: > > Oh, I forgot to mention that I sent this email in reaction to this issue: > > http://bugs.python.org/issue19585 > > Modifying the critical PyFrameObject because the codecs API raises > surprising errors doesn't sound correct. I prefer to fix how co

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 Nov 2013 08:34, "Victor Stinner" wrote: > > Hi, > > I saw that Nick Coghlan documented codecs.encode() and > codecs.decode(), and changed the exception raised when codecs like > rot_13 are used on bytes.decode() and str.encode(). > > I don't like the functions codecs.encode() and codecs.deco

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Victor Stinner
Oh, I forgot to mention that I sent this email in reaction to this issue: http://bugs.python.org/issue19585 Modifying the critical PyFrameObject because the codecs API raises surprising errors doesn't sound correct. I prefer to fix how codecs are used, than modifying the PyFrameObject. For more

[Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Victor Stinner
Hi, I saw that Nick Coghlan documented codecs.encode() and codecs.decode(), and changed the exception raised when codecs like rot_13 are used on bytes.decode() and str.encode(). I don't like the functions codecs.encode() and codecs.decode() because the type of the result depends on the encoding (
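The objection that the result type depends on the codec name can be illustrated with a short sketch, assuming the `hex_codec` and `rot_13` codecs shipped with Python 3.2 and later. The variable names are illustrative only.

```python
import codecs

# Text encoding: str in, bytes out.
text_to_bytes = codecs.encode("abc", "utf-8")

# Binary transform: bytes in, bytes out.
bytes_to_bytes = codecs.encode(b"abc", "hex_codec")

# Text transform: str in, str out.
text_to_text = codecs.encode("abc", "rot_13")

result_types = (
    type(text_to_bytes),
    type(bytes_to_bytes),
    type(text_to_text),
)
```

The same function call returns `bytes` or `str` depending only on its second argument, which is the property the proposed `transform()` and `untransform()` methods were meant to avoid.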