Re: [Python-Dev] Bytes path related questions for Guido

2014-08-29 Thread Walter Dörwald
On 28 Aug 2014, at 19:54, Glenn Linderman wrote: On 8/28/2014 10:41 AM, R. David Murray wrote: On Thu, 28 Aug 2014 10:15:40 -0700, Glenn Linderman wrote: [...] Also for cases where the data stream is *supposed* to be in a given encoding, but contains undecodable bytes. Showing the stuff tha

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-28 Thread R. David Murray
On Thu, 28 Aug 2014 10:54:44 -0700, Glenn Linderman wrote: > On 8/28/2014 10:41 AM, R. David Murray wrote: > > On Thu, 28 Aug 2014 10:15:40 -0700, Glenn Linderman > > wrote: > >> On 8/28/2014 12:30 AM, MRAB wrote: > >>> There'll be a surrogate escape if a byte couldn't be decoded, but just > >>
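
For readers unfamiliar with the mechanism MRAB refers to, here is a minimal sketch of how the standard `surrogateescape` error handler behaves. The sample bytes are made up for illustration and do not come from the thread.

```python
# Bytes that are not valid UTF-8 (0xE9 is Latin-1 "é", here followed by ".").
raw = b"caf\xe9.txt"

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("utf-8", errors="surrogateescape")

# A strict encode refuses the lone surrogate, which is how escapes usually surface.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

Here `text` is `"caf\udce9.txt"`, `round_tripped` equals `raw`, and `strict_ok` ends up `False`.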

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-28 Thread Glenn Linderman
On 8/28/2014 10:41 AM, R. David Murray wrote: On Thu, 28 Aug 2014 10:15:40 -0700, Glenn Linderman wrote: On 8/28/2014 12:30 AM, MRAB wrote: On 2014-08-28 05:56, Glenn Linderman wrote: On 8/27/2014 6:08 PM, Stephen J. Turnbull wrote: Glenn Linderman writes: > On 8/26/2014 4:31 AM, MRAB wr

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-28 Thread R. David Murray
On Thu, 28 Aug 2014 10:15:40 -0700, Glenn Linderman wrote: > On 8/28/2014 12:30 AM, MRAB wrote: > > On 2014-08-28 05:56, Glenn Linderman wrote: > >> On 8/27/2014 6:08 PM, Stephen J. Turnbull wrote: > >>> Glenn Linderman writes: > >>> > On 8/26/2014 4:31 AM, MRAB wrote: > >>> > > On 2014-08-26

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-28 Thread Glenn Linderman
On 8/28/2014 12:30 AM, MRAB wrote: On 2014-08-28 05:56, Glenn Linderman wrote: On 8/27/2014 6:08 PM, Stephen J. Turnbull wrote: Glenn Linderman writes: > On 8/26/2014 4:31 AM, MRAB wrote: > > On 2014-08-26 03:11, Stephen J. Turnbull wrote: > >> Nick Coghlan writes: > > How about: > >

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-28 Thread MRAB
On 2014-08-28 05:56, Glenn Linderman wrote: On 8/27/2014 6:08 PM, Stephen J. Turnbull wrote: Glenn Linderman writes: > On 8/26/2014 4:31 AM, MRAB wrote: > > On 2014-08-26 03:11, Stephen J. Turnbull wrote: > >> Nick Coghlan writes: > > How about: > > > > replace_surrogate_escapes

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-27 Thread Stephen J. Turnbull
Glenn Linderman writes: > On 8/27/2014 6:08 PM, Stephen J. Turnbull wrote: > > Glenn Linderman writes: > > > And further, replacement could be a vector of 128 characters, to do > > > immediate transcoding, > > > > Using what encoding? > > The vector would contain the transcoding. Each
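
One possible reading of the "vector of 128 characters" idea, sketched under assumptions: the function name `transcode_surrogate_escapes` and the Latin-1 table are hypothetical and chosen only for illustration. The thread does not settle which encoding the vector would represent; the caller would have to supply it.

```python
def transcode_surrogate_escapes(s, table):
    """Replace each escaped byte (U+DC80..U+DCFF) with table[byte - 0x80].

    `table` is a hypothetical 128-item sequence: entry i is the character
    that the undecodable byte 0x80 + i should become.
    """
    if len(table) != 128:
        raise ValueError("table must contain exactly 128 entries")
    return "".join(
        table[ord(ch) - 0xDC80] if "\udc80" <= ch <= "\udcff" else ch
        for ch in s
    )


# Assumption for the example only: treat the stray bytes as Latin-1.
latin1_table = bytes(range(0x80, 0x100)).decode("latin-1")

escaped = b"caf\xe9".decode("ascii", errors="surrogateescape")
fixed = transcode_surrogate_escapes(escaped, latin1_table)  # "café"
```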

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-27 Thread Glenn Linderman
On 8/27/2014 6:08 PM, Stephen J. Turnbull wrote: Glenn Linderman writes: > On 8/26/2014 4:31 AM, MRAB wrote: > > On 2014-08-26 03:11, Stephen J. Turnbull wrote: > >> Nick Coghlan writes: > > How about: > > > > replace_surrogate_escapes(s, replacement='\uFFFD') > > > > If you

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-27 Thread Stephen J. Turnbull
Glenn Linderman writes: > On 8/26/2014 4:31 AM, MRAB wrote: > > On 2014-08-26 03:11, Stephen J. Turnbull wrote: > >> Nick Coghlan writes: > > How about: > > > > replace_surrogate_escapes(s, replacement='\uFFFD') > > > > If you want them removed, just pass an empty string as the > > re
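
A minimal sketch of what the proposed `replace_surrogate_escapes(s, replacement='\uFFFD')` could look like. This is not an existing standard-library function; only the name and signature come from the thread, and the body is an assumption.

```python
import re

# Lone surrogates U+DC80..U+DCFF are what surrogateescape produces
# for undecodable bytes 0x80..0xFF.
_ESCAPED_BYTE = re.compile("[\udc80-\udcff]")


def replace_surrogate_escapes(s, replacement="\ufffd"):
    """Return s with every surrogate-escaped byte replaced.

    Passing an empty string as `replacement` removes the escapes instead.
    """
    # A callable avoids re.sub interpreting backslashes in `replacement`.
    return _ESCAPED_BYTE.sub(lambda match: replacement, s)


name = "caf\udce9.txt"
shown = replace_surrogate_escapes(name)        # "caf\ufffd.txt"
stripped = replace_surrogate_escapes(name, "")  # "caf.txt"
```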

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-27 Thread Glenn Linderman
On 8/26/2014 4:31 AM, MRAB wrote: On 2014-08-26 03:11, Stephen J. Turnbull wrote: Nick Coghlan writes: > "purge_surrogate_escapes" was the other term that occurred to me. "purge" suggests removal, not replacement. That may be useful too. neutralize_surrogate_escapes(s, remove=False, replac

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-26 Thread MRAB
On 2014-08-26 03:11, Stephen J. Turnbull wrote: Nick Coghlan writes: > "purge_surrogate_escapes" was the other term that occurred to me. "purge" suggests removal, not replacement. That may be useful too. neutralize_surrogate_escapes(s, remove=False, replacement='\uFFFD') How about: r

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-25 Thread Stephen J. Turnbull
Nick Coghlan writes: > "purge_surrogate_escapes" was the other term that occurred to me. "purge" suggests removal, not replacement. That may be useful too. neutralize_surrogate_escapes(s, remove=False, replacement='\uFFFD') maybe? (Of course the remove argument is feature creep, so I'm only

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-24 Thread Nick Coghlan
On 25 Aug 2014 03:55, "Guido van Rossum" wrote: > > Yes on #1 -- making the low-level functions more usable for edge cases by supporting bytes seems fine (as long as the support for strings, where it exists, is not compromised). Thanks! > The status of pathlib is a little unclear to me -- is the

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-24 Thread Guido van Rossum
Yes on #1 -- making the low-level functions more usable for edge cases by supporting bytes seems fine (as long as the support for strings, where it exists, is not compromised). The status of pathlib is a little unclear to me -- is there a plan to eventually support bytes or not? For #2 I think yo

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-24 Thread Nick Coghlan
On 25 August 2014 00:23, Antoine Pitrou wrote: > Le 24/08/2014 09:04, Nick Coghlan a écrit : >> Serhiy & Ezio convinced me to scale this one back to a proposal for >> "codecs.clean_surrogate_escapes(s)", which replaces surrogates that >> may be produced by surrogateescape (that's what string.clean

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-24 Thread Antoine Pitrou
Le 24/08/2014 09:04, Nick Coghlan a écrit : On 24 August 2014 14:44, Nick Coghlan wrote: 2. Should we add some additional helpers to the string module for dealing with surrogate escaped bytes and other techniques for smuggling arbitrary binary data as text? My proposal [3] is to add: * string

Re: [Python-Dev] Bytes path related questions for Guido

2014-08-24 Thread Nick Coghlan
On 24 August 2014 14:44, Nick Coghlan wrote: > 2. Should we add some additional helpers to the string module for > dealing with surrogate escaped bytes and other techniques for > smuggling arbitrary binary data as text? > > My proposal [3] is to add: > > * string.escaped_surrogates (constant with

[Python-Dev] Bytes path related questions for Guido

2014-08-23 Thread Nick Coghlan
At Guido's request, splitting out two specific questions from Serhiy's thread where I believe we could do with an explicit "yes or no" from him. 1. Should we accept patches adding support for the direct use of bytes paths in lower level filesystem manipulation APIs? (i.e. everything that isn't pat
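
As background for question 1, several low-level `os` functions already accept bytes paths and return bytes in that case. The sketch below uses a temporary directory and a made-up file name; it shows existing behaviour, not the patches under discussion.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file so there is something to list.
    with open(os.path.join(tmp, "example.txt"), "w"):
        pass

    # A str path yields str names; a bytes path yields bytes names.
    str_names = os.listdir(tmp)
    bytes_names = os.listdir(os.fsencode(tmp))

# os.fsdecode converts a bytes name back to str using the filesystem
# encoding (with surrogateescape on POSIX, so no byte is lost).
decoded = os.fsdecode(bytes_names[0])
```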