Re: [Python-Dev] bytes.from_hex()

2006-03-03 Thread Ron Adam
Greg Ewing wrote: > Ron Adam wrote: > >> This would apply to codecs that >> could return either bytes or strings, or strings or unicode, or bytes or >> unicode. > > I'd need to see some concrete examples of such codecs > before being convinced that they exist, or that they > couldn't just as we

Re: [Python-Dev] bytes.from_hex()

2006-03-03 Thread Greg Ewing
Ron Adam wrote: > This would apply to codecs that > could return either bytes or strings, or strings or unicode, or bytes or > unicode. I'd need to see some concrete examples of such codecs before being convinced that they exist, or that they couldn't just as well return a fixed type that you t

Re: [Python-Dev] bytes.from_hex()

2006-03-03 Thread Greg Ewing
Stephen J. Turnbull wrote: > Doesn't that make base64 non-text by analogy to other "look but don't > touch" strings like a .gz or vmlinuz? No, because I can take a piece of base64 encoded data and use a text editor to manually paste it in with some other text (e.g. a plain-text (not MIME) mail me

Re: [Python-Dev] bytes.from_hex()

2006-03-03 Thread Ron Adam
Greg Ewing wrote: > Ron Adam wrote: > >> This uses syntax to determine the direction of encoding. It would be >> easier and clearer to just require two arguments or a tuple. >> >> u = unicode(b, 'encode', 'base64') >> b = bytes(u, 'decode', 'base64') > > The point of the exercise wa

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Stephen J. Turnbull
> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes: Greg> (BTW, doesn't the fact that you *can* load an XML file into Greg> what we call a "text editor" say something?) Why not answer that question for yourself, and then turn that answer into a description of "text semantics"? For me,

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Greg Ewing
Stephen J. Turnbull wrote: > What you presumably meant was "what would you consider the proper type > for (P)CDATA?" No, I mean the whole thing, including all the <...> tags etc. Like you see when you load an XML file into a text editor. (BTW, doesn't the fact that you *can* load an XML file into

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Greg Ewing
Ron Adam wrote: > This uses syntax to determine the direction of encoding. It would be > easier and clearer to just require two arguments or a tuple. > > u = unicode(b, 'encode', 'base64') > b = bytes(u, 'decode', 'base64') The point of the exercise was to avoid using the terms 'en

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Delaney, Timothy (Tim)
Delaney, Timothy (Tim) wrote: > unicode.frombytes(cls, encoding) unicode.frombytes(encoding) ... Tim Delaney

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Delaney, Timothy (Tim)
Just van Rossum wrote: > My preference for bytes -> unicode -> bytes API would be this: > > u = unicode(b, "utf8") # just like we have now > b = u.tobytes("utf8") # like u.encode(), but being explicit > # about the resulting type +1 - I was going to write e
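For comparison, here is a minimal sketch of the same bytes -> unicode -> bytes round trip as it can be spelled in Python 3. This is not the API proposed in this message: the `tobytes()` method discussed here is a proposal from the thread, and the sketch below uses only `str()` and `str.encode()`. The variable names are illustrative.

```python
# Round trip between bytes and text, Python 3 spelling.
b = "caf\u00e9".encode("utf8")   # text -> bytes (the role proposed for u.tobytes("utf8"))
u = str(b, "utf8")               # bytes -> text, analogous to unicode(b, "utf8")
b2 = u.encode("utf8")            # back to bytes; the result type is always bytes

print(type(u).__name__, type(b2).__name__)
```

The point of the proposal survives in this spelling: each direction has exactly one result type, so the reader does not need to know the codec to know what comes back.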

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Stephen J. Turnbull
> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes: Greg> But the base64 string itself *does* have text semantics. What do you mean by that? The strings of abstract "characters" defined by RFC 3548 cannot be concatenated in general, they may only be split at 4-character intervals, they ca
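The claim that base64 strings cannot be concatenated in general, but can be split at 4-character intervals, can be checked with a small sketch using the Python 3 standard `base64` module. The sample inputs are arbitrary and chosen only for illustration.

```python
import base64

def b64(data: bytes) -> str:
    """Encode bytes as base64 and return the result as text."""
    return base64.b64encode(data).decode("ascii")

# Concatenating two encodings is not the encoding of the concatenation.
joined = b64(b"a") + b64(b"b")
direct = b64(b"ab")

# Splitting on a 4-character boundary yields two independently decodable pieces.
whole = b64(b"abcdef")
left, right = whole[:4], whole[4:]
pieces = base64.b64decode(left) + base64.b64decode(right)

print(joined, direct, whole, pieces)
```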

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Josiah Carlson
Just van Rossum <[EMAIL PROTECTED]> wrote: > > Ron Adam wrote: > > > Josiah Carlson wrote: > > > Greg Ewing <[EMAIL PROTECTED]> wrote: > > >>u = unicode(b) > > >>u = unicode(b, 'utf8') > > >>b = bytes['utf8'](u) > > >>u = unicode['base64'](b) # encoding > > >>b = bytes(u, '

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Just van Rossum
Ron Adam wrote: > Josiah Carlson wrote: > > Greg Ewing <[EMAIL PROTECTED]> wrote: > >>u = unicode(b) > >>u = unicode(b, 'utf8') > >>b = bytes['utf8'](u) > >>u = unicode['base64'](b) # encoding > >>b = bytes(u, 'base64') # decoding > >>u2 = unicode['piglatin'](u1) #

Re: [Python-Dev] bytes.from_hex()

2006-03-02 Thread Ron Adam
Josiah Carlson wrote: > Greg Ewing <[EMAIL PROTECTED]> wrote: >>u = unicode(b) >>u = unicode(b, 'utf8') >>b = bytes['utf8'](u) >>u = unicode['base64'](b) # encoding >>b = bytes(u, 'base64') # decoding >>u2 = unicode['piglatin'](u1) # encoding >>u1 = unicode(u2, '

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Josiah Carlson
Greg Ewing <[EMAIL PROTECTED]> wrote: >u = unicode(b) >u = unicode(b, 'utf8') >b = bytes['utf8'](u) >u = unicode['base64'](b) # encoding >b = bytes(u, 'base64') # decoding >u2 = unicode['piglatin'](u1) # encoding >u1 = unicode(u2, 'piglatin') # decoding Your

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Greg Ewing
Ron Adam wrote: > 1. We can specify the operation and not be sure of the resulting type. > >*or* > > 2. We can specify the type and not always be sure of the operation. > > maybe there's a way to specify both so it's unambiguous? Here's another take on the matter. When we're doing Unicode

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Ron Adam
Greg Ewing wrote: > Ron Adam wrote: > >> While playing around with the example bytes class I noticed code reads >> much better when I use methods called tounicode and tostring. >> >> b64ustring = b.tounicode('base64') >> b = bytes(b64ustring, 'base64') > > I don't like that, because it c

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Michael Urman
[My apologies Greg; I meant to send this to the whole list. I really need a list-reply button in GMail. ] On 3/1/06, Greg Ewing <[EMAIL PROTECTED]> wrote: > I don't like that, because it creates a dependency > (conceptually, at least) between the bytes type and > the unicode type. I only find hal

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Greg Ewing
Ron Adam wrote: > While playing around with the example bytes class I noticed code reads > much better when I use methods called tounicode and tostring. > > b64ustring = b.tounicode('base64') > b = bytes(b64ustring, 'base64') I don't like that, because it creates a dependency (conceptua

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Greg Ewing
Bill Janssen wrote: > No, once it's in a particular encoding it's bytes, no longer text. The point at issue is whether the characters produced by base64 are in a particular encoding. According to my reading of the RFC, they're not.

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Greg Ewing
Nick Coghlan wrote: >ascii_bytes = orig_bytes.decode("base64").encode("ascii") > >orig_bytes = ascii_bytes.decode("ascii").encode("base64") > > The only slightly odd aspect is that this inverts the conventional meaning of > base64 encoding and decoding, -1. Whatever we do, we shouldn't

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Michael Chermside
I wrote: > ... I will say that if there were no legacy I'd prefer the tounicode() > and tostring() (but shouldn't it be 'tobytes()' instead?) names for Python 3.0. Scott Daniels replied: > Wouldn't 'tobytes' and 'totext' be better for 3.0 where text == unicode? Um... yes. Sorry, I'm not completely

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Scott David Daniels
Chermside, Michael wrote: > ... I will say that if there were no legacy I'd prefer the tounicode() > and tostring() (but shouldn't it be 'tobytes()' instead?) names for Python 3.0. Wouldn't 'tobytes' and 'totext' be better for 3.0 where text == unicode?

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Bill Janssen
> Huh... just joining here but surely you don't mean a text string that > doesn't use every character available in a particular encoding is > "really bytes"... it's still a text string... No, once it's in a particular encoding it's bytes, no longer text. As you say, > Keep these two concepts sepa

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Chermside, Michael
Ron Adam writes: > While playing around with the example bytes class I noticed code reads > much better when I use methods called tounicode and tostring. [...] > I'm not suggesting we start using to-type everywhere, just where it > might make things clearer over decode and encode. +1 I alwa

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Ron Adam
Nick Coghlan wrote: > All the unicode codecs, on the other hand, use encode to get from characters > to bytes and decode to get from bytes to characters. > > So if bytes objects *did* have an encode method, it should still result in a > unicode object, just the same as a decode method does (beca

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Nick Coghlan
Bill Janssen wrote: > Greg Ewing wrote: >> Bill Janssen wrote: >> >>> bytes -> base64 -> text >>> text -> de-base64 -> bytes >> It's nice to hear I'm not out of step with >> the entire world on this. :-) > > Well, I can certainly understand the bytes->base64->bytes side of > thing too. The "text"

Re: [Python-Dev] bytes.from_hex()

2006-03-01 Thread Donovan Baarda
On Tue, 2006-02-28 at 15:23 -0800, Bill Janssen wrote: > Greg Ewing wrote: > > Bill Janssen wrote: > > > > > bytes -> base64 -> text > > > text -> de-base64 -> bytes > > > > It's nice to hear I'm not out of step with > > the entire world on this. :-) > > Well, I can certainly understand the byte

Re: [Python-Dev] bytes.from_hex()

2006-02-28 Thread Greg Ewing
Bill Janssen wrote: > Well, I can certainly understand the bytes->base64->bytes side of > thing too. The "text" produced is specified as using "a 65-character > subset of US-ASCII", so that's really bytes. But it then goes on to say that these same characters are also a subset of EBCDIC. So it s

Re: [Python-Dev] bytes.from_hex()

2006-02-28 Thread Bill Janssen
Greg Ewing wrote: > Bill Janssen wrote: > > > bytes -> base64 -> text > > text -> de-base64 -> bytes > > It's nice to hear I'm not out of step with > the entire world on this. :-) Well, I can certainly understand the bytes->base64->bytes side of thing too. The "text" produced is specified as us

Re: [Python-Dev] bytes.from_hex()

2006-02-28 Thread Guido van Rossum
On 2/28/06, Greg Ewing <[EMAIL PROTECTED]> wrote: > Bill Janssen wrote: > > > bytes -> base64 -> text > > text -> de-base64 -> bytes > > It's nice to hear I'm not out of step with > the entire world on this. :-) What Bill proposes makes sense to me.

Re: [Python-Dev] bytes.from_hex()

2006-02-28 Thread Greg Ewing
Bill Janssen wrote: > bytes -> base64 -> text > text -> de-base64 -> bytes It's nice to hear I'm not out of step with the entire world on this. :-)
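The two pipelines quoted above can be sketched with the Python 3 standard `base64` module. One assumption to note: the standard library's `b64encode()` returns bytes rather than text, so the "text" end of the pipeline needs an explicit ASCII decode. The sample data is arbitrary.

```python
import base64

raw = b"\x00\xffhello"

# bytes -> base64 -> text
text = base64.b64encode(raw).decode("ascii")

# text -> de-base64 -> bytes (b64decode accepts an ASCII-only str)
back = base64.b64decode(text)

print(text, back == raw)
```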

Re: [Python-Dev] bytes.from_hex()

2006-02-27 Thread Greg Ewing
Bill Janssen wrote: > I use it quite a bit for image processing (converting to and from the > "data:" URL form), and various checksum applications (converting SHA > into a string). Aha! We have a customer! For those cases, would you find it more convenient for the result to be text or bytes in P

Re: [Python-Dev] bytes.from_hex()

2006-02-27 Thread Bill Janssen
> If implementing a mime packer is really the only use case > for base64, then it might as well be removed from the > standard library, since 99.9% of all programmers will > never touch it. Those that do will need to have boned up I use it quite a bit for image processing (converting to and fr

Re: [Python-Dev] bytes.from_hex()

2006-02-27 Thread Greg Ewing
Stephen J. Turnbull wrote: > Greg> I'd be perfectly happy with ascii characters, but in Py3k, > Greg> the most natural place to keep ascii characters will be in > Greg> character strings, not byte arrays. > > Natural != practical. That seems to be another thing we disagree about -- t

Re: [Python-Dev] bytes.from_hex()

2006-02-26 Thread Stephen J. Turnbull
> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes: Greg> Stephen J. Turnbull wrote: >> I gave you one, MIME processing in email Greg> If implementing a mime packer is really the only use case Greg> for base64, then it might as well be removed from the Greg> standard libra

Re: [Python-Dev] bytes.from_hex()

2006-02-26 Thread Greg Ewing
Stephen J. Turnbull wrote: > I gave you one, MIME processing in email If implementing a mime packer is really the only use case for base64, then it might as well be removed from the standard library, since 99.9% of all programmers will never touch it. Those that do will need to have boned up

Re: [Python-Dev] bytes.from_hex()

2006-02-26 Thread Stephen J. Turnbull
> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes: Greg> I think we need some concrete use cases to talk about if Greg> we're to get any further with this. Do you have any such use Greg> cases in mind? I gave you one, MIME processing in email, and a concrete bug that is possible w

Re: [Python-Dev] bytes.from_hex()

2006-02-25 Thread Ron Adam
Greg Ewing wrote: > Stephen J. Turnbull wrote: >> It's "what is the Python compiler/interpreter going > > to think?" AFAICS, it's going to think that base64 is > > a unicode codec. > > Only if it's designed that way, and I specifically > think it shouldn't -- i.e. it should be an error > to att

Re: [Python-Dev] bytes.from_hex()

2006-02-25 Thread Greg Ewing
Stephen J. Turnbull wrote: > The reason that Python source code is text is that the primary > producers/consumers of Python source code are human beings, not > compilers I disagree with "primary" -- I think human and computer use of source code have equal importance. Because of the fact that Pyth

Re: [Python-Dev] bytes.from_hex()

2006-02-25 Thread Stephen J. Turnbull
> "Ron" == Ron Adam <[EMAIL PROTECTED]> writes: Ron> So, lets consider a "codec" and a "coding" as being two Ron> different things where a codec is a character sub set of Ron> unicode characters expressed in a native format. And a Ron> coding is *not* a subset of the unicode c

Re: [Python-Dev] bytes.from_hex()

2006-02-25 Thread Stephen J. Turnbull
> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes: Greg> Stephen J. Turnbull wrote: >> the kind of "text" for which Unicode was designed is normally >> produced and consumed by people, who wll pt up w/ ll knds f >> nnsns. Base64 decoders will not put up with the same kinds of

Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Greg Ewing
Stephen J. Turnbull wrote: > the kind of "text" for which Unicode was designed is normally produced > and consumed by people, who wll pt up w/ ll knds f nnsns. Base64 > decoders will not put up with the same kinds of nonsense that people > will. The Python compiler won't put up with that sort of

Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Ron Adam
* The following reply is a rather longer than I intended explanation of why codings (and how they differ) like 'rot' aren't the same thing as pure unicode codecs and probably should be treated differently. If you already understand that, then I suggest skipping this. But if you like detailed l

Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Stephen J. Turnbull
> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes: Greg> Stephen J. Turnbull wrote: >> No, base64 isn't a wire protocol. It's a family[...]. Greg> Yes, and it's up to the programmer to choose those code Greg> units (i.e. pick an encoding for the characters) that will, Gr

Re: [Python-Dev] bytes.from_hex()

2006-02-24 Thread Stephen J. Turnbull
> "Ron" == Ron Adam <[EMAIL PROTECTED]> writes: Ron> We could call it transform or translate if needed. You're still losing the directionality, which is my primary objection to "recode". The absence of directionality is precisely why "recode" is used in that sense for i18n work. There r

Re: [Python-Dev] bytes.from_hex()

2006-02-23 Thread Greg Ewing
Stephen J. Turnbull wrote: > Please define "character," and explain how its semantics map to > Python's unicode objects. One of the 65 abstract entities referred to in the RFC and represented in that RFC by certain visual glyphs. There is a subset of the Unicode code points that are conventionall

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Ron Adam
Stephen J. Turnbull wrote: >> "Ron" == Ron Adam <[EMAIL PROTECTED]> writes: > > Ron> Terry Reedy wrote: > > >> I prefer the shorter names and using recode, for instance, for > >> bytes to bytes. > > Ron> While I prefer constructors with an explicit encode argument, > Ron>

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Stephen J. Turnbull
> "Ron" == Ron Adam <[EMAIL PROTECTED]> writes: Ron> Terry Reedy wrote: >> I prefer the shorter names and using recode, for instance, for >> bytes to bytes. Ron> While I prefer constructors with an explicit encode argument, Ron> and use a recode() method for 'like to like

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Stephen J. Turnbull
> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes: Greg> Stephen J. Turnbull wrote: >> Base64 is a (family of) wire protocol(s). It's not clear to me >> that it makes sense to say that the alphabets used by "baseNN" >> encodings are composed of characters, Greg> Take a l

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
James Y Knight wrote: > Some MIME sections > might have a base64 Content-Transfer-Encoding, others might be 8bit > encoded, others might be 7bit encoded, others might be quoted- printable > encoded. I stand corrected -- in that situation you would have to encode the characters before combini

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
Ron Adam wrote: > While I prefer constructors with an explicit encode argument, and use a > recode() method for 'like to like' coding. Then the whole encode/decode > confusion goes away. I'd be happy with that, too.

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
Terry Reedy wrote: > "Greg Ewing" <[EMAIL PROTECTED]> wrote in message > >>Efficiency is an implementation concern. > > It is also a user concern, especially if inefficiency overruns memory > limits. Sure, but what I mean is that it's better to find what's conceptually right and then look for an

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Ron Adam
Terry Reedy wrote: > "Greg Ewing" <[EMAIL PROTECTED]> wrote in message > >> Which is why I think that only *unicode* codings should be >> available through the .encode and .decode interface. Or >> alternatively there should be something more explicit like >> .unicode_encode and .unicode_decode th

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Terry Reedy
"Greg Ewing" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Efficiency is an implementation concern. It is also a user concern, especially if inefficiency overruns memory limits. > In Py3k, strings > which contain only ascii or latin-1 might be stored as > 1 byte per character,

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread James Y Knight
On Feb 22, 2006, at 6:35 AM, Greg Ewing wrote: > I'm thinking of convenience, too. Keep in mind that in Py3k, > 'unicode' will be called 'str' (or something equally neutral > like 'text') and you will rarely have to deal explicitly with > unicode codings, this being done mostly for you by the I/O

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
Stephen J. Turnbull wrote: > Base64 is a (family of) wire protocol(s). It's not clear to me that > it makes sense to say that the alphabets used by "baseNN" encodings > are composed of characters, Take a look at http://en.wikipedia.org/wiki/Base64 where it says ...base64 is a binary to

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Stephen J. Turnbull
> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes: Greg> Stephen J. Turnbull wrote: >> What I advocate for Python is to require that the standard >> base64 codec be defined only on bytes, and always produce >> bytes. Greg> I don't understand that. It seems quite clear to

Re: [Python-Dev] bytes.from_hex()

2006-02-21 Thread Greg Ewing
Josiah Carlson wrote: > It doesn't seem strange to you to need to encode data twice to be able > to have a usable sequence of characters which can be embedded in an > effectively 7-bit email; I'm talking about a 3.0 world where all strings are unicode and the unicode <-> external coding is for th

Re: [Python-Dev] bytes.from_hex()

2006-02-21 Thread Josiah Carlson
Greg Ewing <[EMAIL PROTECTED]> wrote: > > Stephen J. Turnbull wrote: > > > What I advocate for Python is to require that the standard base64 > > codec be defined only on bytes, and always produce bytes. > > I don't understand that. It seems quite clear to me that > base64 encoding (in the gener

Re: [Python-Dev] bytes.from_hex()

2006-02-21 Thread Barry Warsaw
On Sun, 2006-02-19 at 23:30 +0900, Stephen J. Turnbull wrote: > > "M" == "M.-A. Lemburg" <[EMAIL PROTECTED]> writes: > M> * for Unicode codecs the original form is Unicode, the derived > M> form is, in most cases, a string > > First of all, that's Martin's point! > > Second, almost a

Re: [Python-Dev] bytes.from_hex()

2006-02-21 Thread Greg Ewing
Stephen J. Turnbull wrote: > What I advocate for Python is to require that the standard base64 > codec be defined only on bytes, and always produce bytes. I don't understand that. It seems quite clear to me that base64 encoding (in the general sense of encoding, not the unicode sense) takes binar

Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Bob Ippolito
On Feb 20, 2006, at 7:25 PM, Stephen J. Turnbull wrote: >> "Martin" == Martin v Löwis <[EMAIL PROTECTED]> writes: > > Martin> Please do take a look. It is the only way: If you were to > Martin> embed base64 *bytes* into character data content of an XML > Martin> element, the resul

Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Stephen J. Turnbull
> "Martin" == Martin v Löwis <[EMAIL PROTECTED]> writes: Martin> Please do take a look. It is the only way: If you were to Martin> embed base64 *bytes* into character data content of an XML Martin> element, the resulting XML file might not be well-formed Martin> anymore (if the

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-20 Thread Ron Adam
Bengt Richter wrote: > On Sat, 18 Feb 2006 23:33:15 +0100, Thomas Wouters <[EMAIL PROTECTED]> wrote: > note what base64 really is for. Its essence is to create a _character_ > sequence > which can succeed in being encoded as ascii. The concept of base64 going > str->str > is really a mental sho

Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Martin v. Löwis
Stephen J. Turnbull wrote: > Martin> For an example where base64 is *not* necessarily > Martin> ASCII-encoded, see the "binary" data type in XML > Martin> Schema. There, base64 is embedded into an XML document, > Martin> and uses the encoding of the entire XML document. As a > M
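The XML case can be illustrated with a rough sketch: base64 characters placed in a document that is then serialized as UTF-16 are no longer stored as ASCII bytes. The `<blob>` element name and the payload are made up for the example, and the string slicing stands in for a real XML parser.

```python
import base64

payload = b"\x01\x02\x03\xff"

# Treat the base64 output as characters and embed it in character data.
b64_text = base64.b64encode(payload).decode("ascii")
document = ("<blob>" + b64_text + "</blob>").encode("utf-16")

# On disk, the base64 characters are interleaved with zero bytes,
# so the ASCII byte sequence does not appear in the file.
ascii_run_present = b64_text.encode("ascii") in document

# Reading back: decode the document to text first, then de-base64.
inner = document.decode("utf-16")[len("<blob>"):-len("</blob>")]
recovered = base64.b64decode(inner)

print(ascii_run_present, recovered == payload)
```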

Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Stephen J. Turnbull
> "Josiah" == Josiah Carlson <[EMAIL PROTECTED]> writes: Josiah> I try to internalize it by not thinking of strings as Josiah> encoded data, but as binary data, and unicode as text. I Josiah> then remind myself that unicode isn't native on-disk or Josiah> cross-network (which

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-20 Thread Bengt Richter
On Sat, 18 Feb 2006 23:33:15 +0100, Thomas Wouters <[EMAIL PROTECTED]> wrote: >On Sat, Feb 18, 2006 at 01:21:18PM +0100, M.-A. Lemburg wrote: > [...] >> > - The return value for the non-unicode encodings depends on the value of >> >the encoding argument. > >> Not really: you'll always get a b

Re: [Python-Dev] bytes.from_hex()

2006-02-20 Thread Stephen J. Turnbull
> "Martin" == Martin v Löwis <[EMAIL PROTECTED]> writes: Martin> Stephen J. Turnbull wrote: Bengt> The characters in b could be encoded in plain ascii, or Bengt> utf16le, you have to know. >> Which base64 are you thinking about? Both RFC 3548 and RFC >> 2045 (MIME) speci

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Josiah Carlson
"Stephen J. Turnbull" <[EMAIL PROTECTED]> wrote: > > > "Josiah" == Josiah Carlson <[EMAIL PROTECTED]> writes: > > Josiah> The question remains: is str.decode() returning a string > Josiah> or unicode depending on the argument passed, when the > Josiah> argument quite literally na

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Bob Ippolito
On Feb 19, 2006, at 10:55 AM, Martin v. Löwis wrote: > Stephen J. Turnbull wrote: >> BTW, what use cases do you have in mind for Unicode -> Unicode >> decoding? > > I think "rot13" falls into that category: it is a transformation > on text, not on bytes. The current implementation is a transforma

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Martin v. Löwis
Stephen J. Turnbull wrote: > Bengt> The characters in b could be encoded in plain ascii, or > Bengt> utf16le, you have to know. > > Which base64 are you thinking about? Both RFC 3548 and RFC 2045 > (MIME) specify subsets of US-ASCII explicitly. Unfortunately, it is ambiguous as to whethe

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Martin v. Löwis
Stephen J. Turnbull wrote: > Do you do any of the user education *about codec use* that you > recommend? The people I try to teach about coding invariably find it > difficult to understand. The problem is that the near-universal > intuition is that for "human-usable text" is pretty much anything

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Martin v. Löwis
Stephen J. Turnbull wrote: > BTW, what use cases do you have in mind for Unicode -> Unicode > decoding? I think "rot13" falls into that category: it is a transformation on text, not on bytes. For other "odd" cases: "base64" goes Unicode->bytes in the *decode* direction, not in the encode directio
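As a small illustration of rot13 as a text-to-text transformation, here is a sketch using the `rot_13` codec through `codecs.encode()` and `codecs.decode()` in Python 3, where it operates on `str` and returns `str`. This reflects the later Python 3 standard library, not the implementation being discussed at the time of this thread.

```python
import codecs

plain = "Hello, Python-Dev"

# str -> str in both directions; no bytes are involved.
scrambled = codecs.encode(plain, "rot_13")
restored = codecs.decode(scrambled, "rot_13")

print(scrambled, restored)
```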

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
> "Bengt" == Bengt Richter <[EMAIL PROTECTED]> writes: Bengt> The characters in b could be encoded in plain ascii, or Bengt> utf16le, you have to know. Which base64 are you thinking about? Both RFC 3548 and RFC 2045 (MIME) specify subsets of US-ASCII explicitly. -- School of System

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
> "Bob" == Bob Ippolito <[EMAIL PROTECTED]> writes: Bob> On Feb 17, 2006, at 8:33 PM, Josiah Carlson wrote: >> But you aren't always getting *unicode* text from the decoding >> of bytes, and you may be encoding bytes *to* bytes: Please note that I presumed that you can indeed ass

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
> "Josiah" == Josiah Carlson <[EMAIL PROTECTED]> writes: Josiah> The question remains: is str.decode() returning a string Josiah> or unicode depending on the argument passed, when the Josiah> argument quite literally names the codec involved, Josiah> difficult to understand? I

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
> "M" == "M.-A. Lemburg" <[EMAIL PROTECTED]> writes: M> The main reason is symmetry and the fact that strings and M> Unicode should be as similar as possible in order to simplify M> the task of moving from one to the other. Those are perfectly compatible with Martin's suggestion.

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
> "M" == "M.-A. Lemburg" <[EMAIL PROTECTED]> writes: M> Martin v. Löwis wrote: >> No. The reason to ban string.decode and bytes.encode is that it >> confuses users. M> Instead of starting to ban everything that can potentially M> confuse a few users, we should educate tho

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Stephen J. Turnbull
> "Ian" == Ian Bicking <[EMAIL PROTECTED]> writes: Ian> Encodings cover up eclectic interfaces, where those Ian> interfaces fit a basic pattern -- data in, data out. Isn't "filter" the word you're looking for? I think you've just made a very strong case that this is a slippery slope

Re: [Python-Dev] bytes.from_hex()

2006-02-19 Thread Michael Hudson
"M.-A. Lemburg" <[EMAIL PROTECTED]> writes: > Martin v. Löwis wrote: >> M.-A. Lemburg wrote: > True. However, note that the .encode()/.decode() methods on > strings and Unicode narrow down the possible return types. > The corresponding .bytes methods should only allow bytes and > U

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Ron Adam
Josiah Carlson wrote: > Ron Adam <[EMAIL PROTECTED]> wrote: > Except that ambiguates it even further. > > Is encodings.tounicode() encoding, or decoding? According to everything > you have said so far, it would be decoding. But if I am decoding binary > data, why should it be spending any time

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Josiah Carlson
Ron Adam <[EMAIL PROTECTED]> wrote: > Josiah Carlson wrote: > > Ron Adam <[EMAIL PROTECTED]> wrote: > >> Josiah Carlson wrote: > > [snip] > >>> Again, the problem is ambiguity; what does bytes.recode(something) mean? > >>> Are we encoding _to_ something, or are we decoding _from_ something? > >>

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Ron Adam
Josiah Carlson wrote: > Ron Adam <[EMAIL PROTECTED]> wrote: >> Josiah Carlson wrote: > [snip] >>> Again, the problem is ambiguity; what does bytes.recode(something) mean? >>> Are we encoding _to_ something, or are we decoding _from_ something? >> This was just an example of one way that might wor

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Terry Reedy
"Josiah Carlson" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Again, the problem is ambiguity; what does bytes.recode(something) mean? > Are we encoding _to_ something, or are we decoding _from_ something? > Are we going to need to embed the direction in the encoding/decoding >

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread Thomas Wouters
On Sat, Feb 18, 2006 at 01:21:18PM +0100, M.-A. Lemburg wrote: > It's by no means a Perl attitude. In your eyes, perhaps. It certainly feels that way to me (or I wouldn't have said it :). Perl happens to be full of general constructs that were added because they were easy to add, or they were use

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Ron Adam
Aahz wrote: > On Sat, Feb 18, 2006, Ron Adam wrote: >> I like the bytes.recode() idea a lot. +1 >> >> It seems to me it's a far more useful idea than encoding and decoding by >> overloading and could do both and more. It has a lot of potential to be >> an intermediate step for encoding as well a

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Josiah Carlson
Ron Adam <[EMAIL PROTECTED]> wrote: > Josiah Carlson wrote: [snip] > > Again, the problem is ambiguity; what does bytes.recode(something) mean? > > Are we encoding _to_ something, or are we decoding _from_ something? > > This was just an example of one way that might work, but here are my > tho

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread M.-A. Lemburg
Martin v. Löwis wrote: > M.-A. Lemburg wrote: True. However, note that the .encode()/.decode() methods on strings and Unicode narrow down the possible return types. The corresponding .bytes methods should only allow bytes and Unicode. >>> I forgot that: what is the rationale for

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread Martin v. Löwis
M.-A. Lemburg wrote: >>>True. However, note that the .encode()/.decode() methods on >>>strings and Unicode narrow down the possible return types. >>>The corresponding .bytes methods should only allow bytes and >>>Unicode. >> >>I forgot that: what is the rationale for that restriction? > > > To as

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread M.-A. Lemburg
Martin v. Löwis wrote: > M.-A. Lemburg wrote: >> I've already explained why we have .encode() and .decode() >> methods on strings and Unicode many times. I've also >> explained the misunderstanding that codecs can only do >> Unicode-string conversions. And I've explained that >> the .encode() and .

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Martin v. Löwis
Michael Hudson wrote: > There's one extremely significant example where the *value* of > something impacts on the type of something else: functions. The types > of everything involved in str([1]) and len([1]) are the same but the > results are different. This shows up in PyPy's type annotation; m

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread Martin v. Löwis
M.-A. Lemburg wrote: > I've already explained why we have .encode() and .decode() > methods on strings and Unicode many times. I've also > explained the misunderstanding that codecs can only do > Unicode-string conversions. And I've explained that > the .encode() and .decode() methods *do* check the

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread M.-A. Lemburg
Aahz wrote: > On Sat, Feb 18, 2006, Ron Adam wrote: >> I like the bytes.recode() idea a lot. +1 >> >> It seems to me it's a far more useful idea than encoding and decoding by >> overloading and could do both and more. It has a lot of potential to be >> an intermediate step for encoding as well a

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Aahz
On Sat, Feb 18, 2006, Ron Adam wrote: > > I like the bytes.recode() idea a lot. +1 > > It seems to me it's a far more useful idea than encoding and decoding by > overloading and could do both and more. It has a lot of potential to be > an intermediate step for encoding as well as being used for

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Adam Olsen
On 2/18/06, Josiah Carlson <[EMAIL PROTECTED]> wrote: > Look at what we've currently got going for data transformations in the > standard library to see what these removals will do: base64 module, > binascii module, binhex module, uu module, ... Do we want or need to > add another top-level module

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread M.-A. Lemburg
Thomas Wouters wrote: > On Sat, Feb 18, 2006 at 12:06:37PM +0100, M.-A. Lemburg wrote: > >> I've already explained why we have .encode() and .decode() >> methods on strings and Unicode many times. I've also >> explained the misunderstanding that codecs can only do >> Unicode-string conversions. An

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Ron Adam
Josiah Carlson wrote: > Ron Adam <[EMAIL PROTECTED]> wrote: >> Josiah Carlson wrote: >>> Bengt Richter had a good idea with bytes.recode() for strictly bytes >>> transformations (and the equivalent for text), though it is ambiguous as >>> to the direction; are we encoding or decoding with bytes.rec

Re: [Python-Dev] bytes.from_hex()

2006-02-18 Thread Michael Hudson
This posting is entirely tangential. Be warned. "Martin v. Löwis" <[EMAIL PROTECTED]> writes: > It's worse than that. The return *type* depends on the *value* of > the argument. I think there is little precedent for that: There's one extremely significant example where the *value* of something

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread M.-A. Lemburg
Martin v. Löwis wrote: > M.-A. Lemburg wrote: >> Just because some codecs don't fit into the string.decode() >> or bytes.encode() scenario doesn't mean that these codecs are >> useless or that the methods should be banned. > > No. The reason to ban string.decode and bytes.encode is that > it confu

Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-18 Thread Thomas Wouters
On Sat, Feb 18, 2006 at 12:06:37PM +0100, M.-A. Lemburg wrote: > I've already explained why we have .encode() and .decode() > methods on strings and Unicode many times. I've also > explained the misunderstanding that codecs can only do > Unicode-string conversions. And I've explained that > the .e
