date:20160614

Re: [Python-Dev] PEP 520: Ordered Class Definition Namespace

2016-06-14 Thread Nikita Nemkin

Is there any rationale for rejecting alternatives like:

1. Adding a standard metaclass with an ordered namespace.
2. Adding `namespace` or `ordered` args to the default metaclass.
3. Making the compiler fill in __definition_order__ for every class
(just like __qualname__) without touching the runtime.
?

To me, any of the above seems preferable to complicating
the core part of the language forever.
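
For illustration, alternative 1 could look roughly like the sketch below. The name OrderedMeta is hypothetical, and the choice to skip dunder names is only an assumption about what such a metaclass might record; it is not taken from PEP 520.

```python
from collections import OrderedDict


class OrderedMeta(type):
    """Hypothetical metaclass that records the order of class-body names."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body executes inside this mapping, so insertion order
        # is the definition order.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwargs)
        # Assumption for this sketch: dunder names such as __module__ and
        # __qualname__ are left out of the recorded order.
        cls.__definition_order__ = tuple(
            key for key in namespace
            if not (key.startswith('__') and key.endswith('__'))
        )
        return cls


class Point(metaclass=OrderedMeta):
    x = 1
    y = 2

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


print(Point.__definition_order__)
```

Only classes that opt in to the metaclass pay for the ordered namespace, which is the trade-off the alternatives above are pointing at.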

The vast majority of Python classes don't care about their member
order; this is a minority use case receiving majority treatment.

Also, wiring OrderedDict into class creation means elevating it
from a peripheral utility to an indispensable built-in type.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PEP 468

2016-06-14 Thread Franklin? Lee

Compact OrderedDicts can leave gaps, and compactify once in a while. For
example, whenever the entry table is full, it can decide whether to resize
(and only copy non-gaps), or just compactify.

Compact regular dicts can swap from the back and have no gaps.
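
A toy model of the two deletion strategies, as a sketch only. An ordinary dict stands in for the sparse hash index, the class names are hypothetical, and the "more than half gaps" trigger is an arbitrary assumption rather than anything proposed in this thread.

```python
class GappyOrderedTable:
    """Ordered variant: deletion leaves a gap, compaction happens occasionally."""

    def __init__(self):
        self.index = {}     # key -> position in self.entries
        self.entries = []   # (key, value) tuples, or None for a gap
        self.gaps = 0

    def __setitem__(self, key, value):
        pos = self.index.get(key)
        if pos is None:
            self.index[key] = len(self.entries)
            self.entries.append((key, value))
        else:
            self.entries[pos] = (key, value)

    def __delitem__(self, key):
        pos = self.index.pop(key)
        self.entries[pos] = None            # leave a gap, keep the order
        self.gaps += 1
        if self.gaps > len(self.entries) // 2:
            self.compactify()

    def compactify(self):
        # Copy only the non-gaps and rebuild the positions.
        self.entries = [e for e in self.entries if e is not None]
        self.index = {k: i for i, (k, _) in enumerate(self.entries)}
        self.gaps = 0

    def keys(self):
        return [e[0] for e in self.entries if e is not None]


class SwapFromBackTable:
    """Unordered variant: the last entry is moved into the freed slot."""

    def __init__(self):
        self.index = {}
        self.entries = []

    def __setitem__(self, key, value):
        pos = self.index.get(key)
        if pos is None:
            self.index[key] = len(self.entries)
            self.entries.append((key, value))
        else:
            self.entries[pos] = (key, value)

    def __delitem__(self, key):
        pos = self.index.pop(key)
        last = self.entries.pop()
        if pos < len(self.entries):         # the deleted entry was not the last one
            self.entries[pos] = last
            self.index[last[0]] = pos

    def keys(self):
        return [k for k, _ in self.entries]


ordered = GappyOrderedTable()
unordered = SwapFromBackTable()
for name in 'abcd':
    ordered[name] = name.upper()
    unordered[name] = name.upper()
del ordered['b']
del unordered['a']
print(ordered.keys(), unordered.keys())
```

The first table keeps insertion order at the cost of temporary gaps; the second never has gaps but loses ordering on deletion.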

I don't see the point of discussing these details. Isn't it enough to say
that these are solvable problems, which we can worry about if/when someone
actually decides to sit down and implement compact dicts?

P.S.: Sorry about the repeated emails. I think it was the iOS Gmail app.

On Jun 13, 2016 10:23 PM, "Ethan Furman"  wrote:
>
> On 06/13/2016 05:47 PM, Larry Hastings wrote:
>>
>> On 06/13/2016 05:05 PM, MRAB wrote:
>
>
>>> This could be avoided by expanding the items to include the index of
>>> the 'previous' and 'next' item, so that they could be handled like a
>>> doubly-linked list.
>>>
>>> The disadvantage would be that it would use more memory.
>>
>>
>> Another, easier technique: don't fill holes.  Same disadvantage
>> (increased memory use), but easier to write and maintain.
>
>
> I hope this is just an academic discussion: suddenly having Python's
> dicts grow continuously is going to have nasty consequences somewhere.
>
> --
> ~Ethan~

[Python-Dev] mod_python compilation error in VS 2008 for py2.7.1

2016-06-14 Thread asimkon

I would like to ask you a technical question regarding python module
compilation for python 2.7.1.

I want to compile the mod_python library for Apache 2.2 and py2.7 on Win32 in
order to use it for psp - py scripts that I have written. I tried to
compile it using VS 2008 (VC++) and unfortunately I get an error in
pyconfig.h (Py2.7/include): error C2632: int followed by int is illegal.

This problem occurs when I try to run the bat file that exists in the
mod_python/dist folder. Any idea or suggestion as to what I should do in
order to run it in a Win 7 Pro (win 32) environment and produce the final
Apache executable module (.so)?

To help you assist me, I attach the necessary files and the error_log
(the output that I get during the compilation process). I have posted the
same question here, but unfortunately I have had no luck!

Additionally, here are the compilation instructions that I follow (I also
used MinGW-w64 and get the same error) in order to produce the final output!

Compiling

Open a command prompt with VS2008 support. The easiest way to do this is to
use "Start | All Programs | Microsoft Visual Studio 2008 | Visual Studio
Tools | Visual Studio 2008 Command Prompt". (This puts the VS2008 binaries
in the path and sets up the lib/include environmental variables for the
Platform SDK.)

1. cd to the mod_python\dist folder.

2. Tell mod_python where Apache is: set APACHESRC=C:\Apache

3. Run build_installer.bat.

If it succeeds, an installer.exe will be created in a subfolder. Run that
to install the module.

Kind Regards

Kostas Asimakopoulos

#ifndef _UNISTD_H
#define _UNISTD_H 1

/* This file intended to serve as a drop-in replacement for
* unistd.h on Windows
* Please add functionality as neeeded
*/

#include
#include
#include /* getopt at: https://gist.github.com/ashelly/7776712 */
#include /* for getpid() and the exec..() family */
#include /* for _getcwd() and _chdir() */

#define srandom srand
#define random rand

/* Values for the second argument to access.
These may be OR'd together. */
#define R_OK 4 /* Test for read permission. */
#define W_OK 2 /* Test for write permission. */
//#define X_OK 1 /* execute permission - unsupported in windows*/
#define F_OK 0 /* Test for existence. */

#define access _access
#define dup2 _dup2
#define execve _execve
#define ftruncate _chsize
#define unlink _unlink
#define fileno _fileno
#define getcwd _getcwd
#define chdir _chdir
#define isatty _isatty
#define lseek _lseek
/* read, write, and close are NOT being #defined here, because while there are file handle specific versions for Windows, they probably don't work for sockets. You need to look at your app and consider whether to call e.g. closesocket(). */

#define ssize_t int

#define STDIN_FILENO 0
#define STDOUT_FILENO 1
#define STDERR_FILENO 2
/* should be in some equivalent to */
typedef __int8 int8_t;
typedef __int16 int16_t;
typedef __int32 int32_t;
typedef __int64 int64_t;
typedef unsigned __int8 uint8_t;
typedef unsigned __int16 uint16_t;
typedef unsigned __int32 uint32_t;
typedef unsigned __int64 uint64_t;

#endif /* unistd.h */
#ifndef __GETOPT_H__
/**
* DISCLAIMER
* This file is part of the mingw-w64 runtime package.
*
* The mingw-w64 runtime package and its code is distributed in the hope that it
* will be useful but WITHOUT ANY WARRANTY. ALL WARRANTIES, EXPRESSED OR
* IMPLIED ARE HEREBY DISCLAIMED. This includes but is not limited to
* warranties of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*/
/*
* Copyright (c) 2002 Todd C. Miller
*
* Permission to use, copy, modify, and distribute this software for any
* purpose with or without fee is hereby granted, provided that the above
* copyright notice and this permission notice appear in all copies.
*
* THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
* WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
* MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
* ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
* WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
* ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
* OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
*
* Sponsored in part by the Defense Advanced Research Projects
* Agency (DARPA) and Air Force Research Laboratory, Air Force
* Materiel Command, USAF, under agreement number F39502-99-1-0512.
*/
/*-
* Copyright (c) 2000 The NetBSD Foundation, Inc.
* All rights reserved.
*
* This code is derived from software contributed to The NetBSD Foundation
* by Dieter Baron and Thomas Klausner.
*
* Redistribution and use in source and binary forms, with or wi

Re: [Python-Dev] [Python-checkins] cpython (3.5): Fix os.urandom() using getrandom() on Linux

2016-06-14 Thread Steven D'Aprano

Is this right? I thought we had decided that os.urandom should *not* 
fall back on getrandom on Linux?



On Tue, Jun 14, 2016 at 02:36:27PM +, victor. stinner wrote:
> https://hg.python.org/cpython/rev/e028e86a5b73
> changeset:   102033:e028e86a5b73
> branch:  3.5
> parent:  102031:a36238de31ae
> user:Victor Stinner 
> date:Tue Jun 14 16:31:35 2016 +0200
> summary:
>   Fix os.urandom() using getrandom() on Linux
> 
> Issue #27278: Fix os.urandom() implementation using getrandom() on Linux.
> Truncate size to INT_MAX and loop until we collected enough random bytes,
> instead of casting a directly Py_ssize_t to int.
> 
> files:
>   Misc/NEWS   |  4 
>   Python/random.c |  2 +-
>   2 files changed, 5 insertions(+), 1 deletions(-)
> 
> 
> diff --git a/Misc/NEWS b/Misc/NEWS
> --- a/Misc/NEWS
> +++ b/Misc/NEWS
> @@ -13,6 +13,10 @@
>  Library
>  ---
>  
> +- Issue #27278: Fix os.urandom() implementation using getrandom() on Linux.
> +  Truncate size to INT_MAX and loop until we collected enough random bytes,
> +  instead of casting a directly Py_ssize_t to int.
> +
>  - Issue #26386: Fixed ttk.TreeView selection operations with item id's
>containing spaces.
>  
> diff --git a/Python/random.c b/Python/random.c
> --- a/Python/random.c
> +++ b/Python/random.c
> @@ -143,7 +143,7 @@
> to 1024 bytes */
>  n = Py_MIN(size, 1024);
>  #else
> -n = size;
> +n = Py_MIN(size, INT_MAX);
>  #endif
>  
>  errno = 0;
> 
> -- 
> Repository URL: https://hg.python.org/cpython
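
The idea in the commit message (cap each request, then loop until enough bytes have been collected) can be sketched in Python as follows. This is not the C code in Python/random.c; collect_random, read_chunk, and max_chunk are hypothetical names, and INT_MAX is assumed to be 2**31 - 1.

```python
import os

INT_MAX = 2**31 - 1  # assumed value of C's INT_MAX on common platforms


def collect_random(size, read_chunk=os.urandom, max_chunk=INT_MAX):
    """Collect exactly `size` bytes, never asking for more than `max_chunk` at once."""
    buf = bytearray()
    while len(buf) < size:
        # Truncate the request instead of passing a too-large size through.
        n = min(size - len(buf), max_chunk)
        # A reader may return fewer bytes than requested, hence the loop.
        buf += read_chunk(n)
    return bytes(buf)


data = collect_random(10, max_chunk=4)   # three requests: 4 + 4 + 2 bytes
print(len(data))
```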


[Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Steven D'Aprano

Normally I'd take a question like this to Python-List, but this question
has turned out to be quite divisive, with people having strong opinions
but no definitive answer. So I thought I'd ask here and hope that some
of the core devs would have an idea.

Why does base64 encoding in Python return bytes?

base64.b64encode takes bytes as input and returns bytes. Some people are
arguing that this is wrong behaviour, as RFC 3548 specifies that Base64
should transform bytes to characters:

https://tools.ietf.org/html/rfc3548.html

albeit US-ASCII characters. E.g.:

The encoding process represents 24-bit groups of input bits
as output strings of 4 encoded characters.
[...]
Each 6-bit group is used as an index into an array of 64 printable
characters. The character referenced by the index is placed in the
output string.
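
To make the quoted passage concrete, here is a minimal sketch that encodes one 24-bit group by hand and compares it with the standard library. The helper encode_group is hypothetical and ignores padding; the alphabet is the standard one from the RFC.

```python
import base64
import string

# The 64 printable characters, in index order.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + '+/'


def encode_group(three_bytes):
    """Encode exactly three bytes (24 bits) as four characters."""
    if len(three_bytes) != 3:
        raise ValueError('this sketch only handles full 24-bit groups')
    n = int.from_bytes(three_bytes, 'big')
    # Split the 24 bits into four 6-bit indexes into the alphabet.
    return ''.join(ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0))


by_hand = encode_group(b'Man')
by_stdlib = base64.b64encode(b'Man')
print(by_hand, by_stdlib)
```

Note that the hand-written version naturally produces str, while the standard library returns bytes, which is exactly the difference under discussion.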

Are they misinterpreting the standard? Has Python got it wrong? Is there
a good reason for returning bytes?
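
For reference, the behaviour in question looks like this in Python 3 (a small sketch; the variable names are only illustrative):

```python
import base64

encoded = base64.b64encode(b'hello')
print(type(encoded).__name__, encoded)    # bytes b'aGVsbG8='

# Getting text requires an explicit decode step.
as_text = encoded.decode('ascii')

# Decoding accepts either bytes or an ASCII-only str.
round_trip = base64.b64decode(as_text)

# Encoding a str directly is rejected.
try:
    base64.b64encode('hello')
    str_input_error = None
except TypeError as exc:
    str_input_error = exc
```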

I see that other languages choose different strategies. Microsoft's
languages C#, F# and VB (plus their C++ compiler) take an array of bytes
as input, and output a UTF-16 string:

https://msdn.microsoft.com/en-us/library/dhx0d524%28v=vs.110%29.aspx

Java's base64 encoder takes and returns bytes:

https://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html

and Javascript's Base64 encoder takes input as UTF-16 encoded text and
returns the same:

https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding

I'm not necessarily arguing that Python's strategy is the wrong one, but
I am interested in what (if any) reasons are behind it.

Thanks in advance,

Steve

Re: [Python-Dev] [Python-checkins] cpython (3.5): Fix os.urandom() using getrandom() on Linux

2016-06-14 Thread Jelle Zijlstra

I think this is an issue unrelated to the big discussion from a little
while ago. The problem isn't that os.urandom() uses getrandom(), it's that
it calls it in a mode that may block.

2016-06-14 8:07 GMT-07:00 Steven D'Aprano :

> Is this right? I thought we had decided that os.urandom should *not*
> fall back on getrandom on Linux?

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Joao S. O. Bueno

On 14 June 2016 at 12:19, Steven D'Aprano  wrote:
> Is there
> a good reason for returning bytes?

What about: it returns 0-255 numeric values for each position in a stream, with
no clue whatsoever as to how those values map to text characters beyond
the 32-128 range?

Maybe base64.decode could take an optional "encoding" parameter - or
there could be
a separate "decode_to_text" method that would explicitly take a text codec name.
Otherwise, no, you simply can't take a bunch of bytes and say they
represent text.
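
A sketch of what such a helper might look like. b64decode_to_text is a hypothetical name, not an existing function in the base64 module.

```python
import base64


def b64decode_to_text(data, encoding):
    """Decode base64, then interpret the resulting bytes with an explicit codec."""
    return base64.b64decode(data).decode(encoding)


payload = base64.b64encode('João'.encode('utf-8'))

right = b64decode_to_text(payload, 'utf-8')      # the codec the sender used
wrong = b64decode_to_text(payload, 'latin-1')    # a different guess gives mojibake
print(right, wrong)
```

The same decoded bytes give different text depending on the codec, which is why the codec has to be named explicitly.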

João

(see ^- the "ã" ?)

Re: [Python-Dev] [Python-checkins] cpython (3.5): Fix os.urandom() using getrandom() on Linux

2016-06-14 Thread Victor Stinner

Sorry, I don't have the bandwidth to follow the huge discussion around random
in Python. If you want my help, please write a PEP to summarize the
discussion.

My change fixes an obvious bug. Even if the Python API changes, I don't
expect that all the C code will be removed.

Victor
On 14 June 2016 5:11 PM, "Steven D'Aprano" wrote:

> Is this right? I thought we had decided that os.urandom should *not*
> fall back on getrandom on Linux?

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Victor Stinner

To port OpenStack to Python 3, I wrote 4 (2x2) helper functions which
accept bytes *and* Unicode as input. xxx_as_bytes() functions return bytes,
xxx_as_text() return Unicode:
http://docs.openstack.org/developer/oslo.serialization/api.html
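
A rough standard-library sketch of the 2x2 idea, for illustration only. This is not the oslo.serialization code; the function names, signatures, and the UTF-8 default are assumptions made for the example.

```python
import base64


def _to_bytes(value, encoding='utf-8'):
    # Accept both bytes and Unicode input.
    return value.encode(encoding) if isinstance(value, str) else value


def encode_as_bytes(value, encoding='utf-8'):
    return base64.b64encode(_to_bytes(value, encoding))


def encode_as_text(value, encoding='utf-8'):
    # Base64 output is ASCII-only, so decoding it as ASCII cannot fail.
    return encode_as_bytes(value, encoding).decode('ascii')


def decode_as_bytes(encoded, encoding='utf-8'):
    return base64.b64decode(_to_bytes(encoded, encoding))


def decode_as_text(encoded, encoding='utf-8'):
    return decode_as_bytes(encoded, encoding).decode(encoding)


print(encode_as_bytes('hello'), encode_as_text(b'hello'))
print(decode_as_bytes('aGVsbG8='), decode_as_text(b'aGVsbG8='))
```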

Victor
On 14 June 2016 5:21 PM, "Steven D'Aprano" wrote:

> Normally I'd take a question like this to Python-List, but this question
> has turned out to be quite divisive, with people having strong opinions
> but no definitive answer. So I thought I'd ask here and hope that some
> of the core devs would have an idea.
>
> Why does base64 encoding in Python return bytes?
>
> base64.b64encode takes bytes as input and returns bytes. Some people are
> arguing that this is wrong behaviour, as RFC 3548 specifies that Base64
> should transform bytes to characters:
>
> https://tools.ietf.org/html/rfc3548.html
>
> albeit US-ASCII characters. E.g.:
>
> The encoding process represents 24-bit groups of input bits
> as output strings of 4 encoded characters.
> [...]
> Each 6-bit group is used as an index into an array of 64 printable
> characters.  The character referenced by the index is placed in the
> output string.
>
> Are they misinterpreting the standard? Has Python got it wrong? Is there
> a good reason for returning bytes?
>
> I see that other languages choose different strategies. Microsoft's
> languages C#, F# and VB (plus their C++ compiler) take an array of bytes
> as input, and output a UTF-16 string:
>
> https://msdn.microsoft.com/en-us/library/dhx0d524%28v=vs.110%29.aspx
>
> Java's base64 encoder takes and returns bytes:
>
> https://docs.oracle.com/javase/8/docs/api/java/util/Base64.Encoder.html
>
> and Javascript's Base64 encoder takes input as UTF-16 encoded text and
> returns the same:
>
>
> https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding
>
> I'm not necessarily arguing that Python's strategy is the wrong one, but
> I am interested in what (if any) reasons are behind it.
>
>
> Thanks in advance,
>
>
>
>
> Steve

Re: [Python-Dev] [Python-checkins] cpython (3.5): Fix os.urandom() using getrandom() on Linux

2016-06-14 Thread Victor Stinner

On 14 June 2016 5:28 PM, "Jelle Zijlstra" wrote:
> The problem isn't that os.urandom() uses getrandom(), it's that it calls
> it in a mode that may block.

Unless it changed very recently, os.urandom() doesn't block anymore
thanks to my previous change ;-)

Victor

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Paul Moore

On 14 June 2016 at 16:19, Steven D'Aprano  wrote:
> Why does base64 encoding in Python return bytes?

I seem to recall there was a debate about this around the time of the
Python 3 move. (IIRC, it was related to the fact that there used to be
a base64 "codec", that wasn't available in Python 3 because it wasn't
clear whether it converted bytes to text or bytes). I don't remember
any of the details, let alone if a conclusion was reached, but a
search of the archives may find something.
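
For what it's worth, in current Python 3 releases the codec is reachable again as a bytes-to-bytes transform through the codecs module, while str.encode() refuses it. A small sketch, assuming Python 3.4 or later:

```python
import codecs

encoded = codecs.encode(b'hello', 'base64')   # bytes in, bytes out
decoded = codecs.decode(encoded, 'base64')
print(encoded, decoded)

# str.encode() only allows text encodings, so this lookup is rejected.
try:
    'hello'.encode('base64')
    text_codec_error = None
except LookupError as exc:
    text_codec_error = exc
```

Unlike base64.b64encode(), the codec output ends with a newline.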

Paul

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Toshio Kuratomi

On Jun 14, 2016 8:32 AM, "Joao S. O. Bueno"  wrote:
>
> On 14 June 2016 at 12:19, Steven D'Aprano  wrote:
> > Is there
> > a good reason for returning bytes?
>
> What about: it returns 0-255 numeric values for each position in a
> stream, with
> no clue whatsoever to how those values map to text characters beyond
> the 32-128 range?
>
> Maybe base64.decode could take a "encoding" optional parameter - or
> there could be
> a separate "decode_to_text" method that would explicitly take a text
> codec name.
> Otherwise, no, you simply can't take a bunch of bytes and say they
> represent text.
>
Although it's not explicit, the question seems to be about the output of
encoding (and for symmetry, the input of decoding).  In both of those
cases, valid output will consist only of ascii characters.

The input to encoding would have to remain bytes (that's the main purpose
of base64... to turn bytes into an ascii string).
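
A quick way to check the "only ASCII characters" claim for the standard alphabet (a sketch; ALPHABET is just a local name for the 64 characters plus the padding sign):

```python
import base64
import os
import string

# Byte values of the 64 alphabet characters plus '=' padding.
ALPHABET = set((string.ascii_letters + string.digits + '+/=').encode('ascii'))

raw = os.urandom(256)               # arbitrary binary input
encoded = base64.b64encode(raw)

only_alphabet = set(encoded) <= ALPHABET
as_text = encoded.decode('ascii')   # therefore this can never fail
print(only_alphabet, len(as_text))
```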

-Toshio

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Terry Reedy


On 6/14/2016 11:19 AM, Steven D'Aprano wrote:
> Normally I'd take a question like this to Python-List, but this question
> has turned out to be quite divisive, with people having strong opinions
> but no definitive answer. So I thought I'd ask here and hope that some
> of the core devs would have an idea.
>
> Why does base64 encoding in Python return bytes?

Ultimately, because we never decided to change this in 3.0.

> base64.b64encode takes bytes as input and returns bytes. Some people are
> arguing that this is wrong behaviour, as RFC 3548 specifies that Base64
> should transform bytes to characters:
>
> https://tools.ietf.org/html/rfc3548.html
>
> albeit US-ASCII characters. E.g.:
>
> The encoding process represents 24-bit groups of input bits
> as output strings of 4 encoded characters.

One could argue that 'encoded character' means 'bytes' in Python, but I
don't know what the standard writer meant, as unicode characters always
have some internal encoding.

> [...]
> Each 6-bit group is used as an index into an array of 64 printable
> characters.  The character referenced by the index is placed in the
> output string.

--
Terry Jan Reedy


Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Mark Lawrence via Python-Dev


On 14/06/2016 16:51, Paul Moore wrote:
> On 14 June 2016 at 16:19, Steven D'Aprano  wrote:
>> Why does base64 encoding in Python return bytes?
>
> I seem to recall there was a debate about this around the time of the
> Python 3 move. (IIRC, it was related to the fact that there used to be
> a base64 "codec", that wasn't available in Python 3 because it wasn't
> clear whether it converted bytes to text or bytes). I don't remember
> any of the details, let alone if a conclusion was reached, but a
> search of the archives may find something.
>
> Paul

As I've the time to play detective I'd suggest
https://mail.python.org/pipermail/python-3000/2007-July/008975.html


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


Re: [Python-Dev] mod_python compilation error in VS 2008 for py2.7.1

2016-06-14 Thread Terry Reedy


On 6/14/2016 4:44 AM, asimkon wrote:
> I would like to ask you a technical question regarding python module
> compilation for python 2.7.1.

So you know, python-list, where you cross-posted this, is the right
place for discussion of development *with* Python.

python-dev is for development *of* the Python language and future versions
of CPython, so this is off-topic here.


--
Terry Jan Reedy


Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Joao S. O. Bueno

On 14 June 2016 at 13:32, Toshio Kuratomi  wrote:
>
> On Jun 14, 2016 8:32 AM, "Joao S. O. Bueno"  wrote:
>>
>> On 14 June 2016 at 12:19, Steven D'Aprano  wrote:
>> > Is there
>> > a good reason for returning bytes?
>>
>> What about: it returns 0-255 numeric values for each position in a
>> stream, with
>> no clue whatsoever to how those values map to text characters beyond
>> the 32-128 range?
>>
>> Maybe base64.decode could take a "encoding" optional parameter - or
>> there could be
>> a separate "decode_to_text" method that would explicitly take a text codec
>> name.
>> Otherwise, no, you simply can't take a bunch of bytes and say they
>> represent text.
>>
> Although it's not explicit, the question seems to be about the output of
> encoding (and for symmetry, the input of decoding).  In both of those cases,
> valid output will consist only of ascii characters.
>
> The input to encoding would have to remain bytes (that's the main purpose of
> base64... to turn bytes into an ascii string).
>

Sorry, it is 2016, and I don't think at this point anyone can consider
an ASCII string as a representative pattern of textual data in any field
of application. Bytes are not text. Bytes with an associated, meaningful
encoding are text. I thought this had been worked through when Python 3
came out.

Unless you are working with COBOL-generated data (and intending to keep
the file format), it does not make sense in any real-world field
(supposing your COBOL data is ASCII and not EBCDIC).


> -Toshio

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Paul Sokolovsky

Hello,

On Tue, 14 Jun 2016 16:51:44 +0100
Paul Moore  wrote:

> On 14 June 2016 at 16:19, Steven D'Aprano  wrote:
> > Why does base64 encoding in Python return bytes?
> 
> I seem to recall there was a debate about this around the time of the
> Python 3 move. (IIRC, it was related to the fact that there used to be
> a base64 "codec", that wasn't available in Python 3 because it wasn't
> clear whether it converted bytes to text or bytes). I don't remember
> any of the details, let alone if a conclusion was reached, but a
> search of the archives may find something.

Well, it's easy to remember the conclusion - it was decided to return
bytes. The reason also wouldn't be hard to imagine - regardless of the
fact that base64 uses ASCII codes for digits and letters, it's still
essentially binary data. And the most natural step for it is to send
it down the socket (socket.send() accepts bytes), etc.

I'd find it a bit more surprising that binascii.hexlify() returns
bytes, but I personally got used to it, and consider it a matter of
consistency within the binascii module.

Generally, with Python3 by default using (inefficient) Unicode for
strings, any efficient data processing would use bytes, and then one
appreciates the fact that data encoding/decoding routines also return
bytes, avoiding implicit expensive conversion to strings.
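
A small sketch of that workflow, using socket.socketpair() as a stand-in for a real network connection (an assumption made only to keep the example self-contained):

```python
import base64
import binascii
import socket

raw = b'\x00\xffdata'

hexed = binascii.hexlify(raw)     # bytes, not str
b64 = base64.b64encode(raw)       # bytes, not str

# A connected pair of sockets stands in for a real peer.
left, right = socket.socketpair()
try:
    left.sendall(b64)             # bytes go straight down the socket
    received = right.recv(1024)
finally:
    left.close()
    right.close()

print(hexed, received)
```

No str-to-bytes conversion is needed anywhere between encoding and sending.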

-- 
Best regards,
 Paul  mailto:pmis...@gmail.com

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Random832

On Tue, Jun 14, 2016, at 13:05, Joao S. O. Bueno wrote:
> Sorry, it is 2016, and I don't think at this point anyone can consider
> an ASCII string
> as a representative pattern of textual data in any field of application.
> Bytes are not text. Bytes with an associated, meaningful, encoding are
> text.
>   I thought this had been through when Python 3 was out.

Of all the things that anyone has said in this thread, this makes the
*least* contextual sense. The input to base64 encoding, which is what is
under discussion, is not text in any way. It is images, it is zip files,
it is executables, it could be the output of os.urandom (at least,
provided it doesn't block ;) for all anyone cares.

The *output* is only an ascii string in the sense that it is a text
string consisting of characters within (a carefully chosen subset of)
ASCII's repertoire, but the output wasn't what he was claiming should be
bytes in the sentence you replied to. Is your objection to the phrase
"ascii string"?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Joao S. O. Bueno

On 14 June 2016 at 14:45, Random832  wrote:
> On Tue, Jun 14, 2016, at 13:05, Joao S. O. Bueno wrote:
>> Sorry, it is 2016, and I don't think at this point anyone can consider
>> an ASCII string
>> as a representative pattern of textual data in any field of application.
>> Bytes are not text. Bytes with an associated, meaningful, encoding are
>> text.
>>   I thought this had been through when Python 3 was out.
>
> Of all the things that anyone has said in this thread, this makes the
> *least* contextual sense. The input to base64 encoding, which is what is
> under discussion, is not text in any way. It is images, it is zip files,
> it is executables, it could be the output of os.urandom (at least,
> provided it doesn't block ;) for all anyone cares.
>
> The *output* is only an ascii string in the sense that it is a text
> string consisting of characters within (a carefully chosen subset of)
> ASCII's repertoire, but the output wasn't what he was claiming should be
> bytes in the sentence you replied to. Is your objection to the phrase
> "ascii string"?
Sorry - everything I wrote, I was thinking about _decoding_ base 64.
As for the result of an encoded base64, yes, of course it fits into ASCII.

The arguments about compactness and what is most likely to happen
next (transmission through a binary network protocol) apply,
but the strong objection I had was just because I thought it was
a suggestion of decoding base 64 automatically to text without providing
a text encoding.


Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Random832

On Tue, Jun 14, 2016, at 13:19, Paul Sokolovsky wrote:
> Well, it's easy to remember the conclusion - it was decided to return
> bytes. The reason also wouldn't be hard to imagine - regardless of the
> fact that base64 uses ASCII codes for digits and letters, it's still
> essentially a binary data. 

Only in the sense that all text is binary data. There's nothing in the
definition of base64 specifying ASCII codes. It specifies *characters*
that all happen to be in ASCII's character repertoire.

>And the most natural step for it is to send
> it down the socket (socket.send() accepts bytes), etc.

How is that more natural than to send it to a text buffer that is
ultimately encoded (maybe not even in an ASCII-compatible encoding...
though probably) and sent down a socket or written to a file by a layer
that is outside your control? Yes, everything eventually ends up as
bytes. That doesn't mean that we should obsessively convert things to
bytes as early as possible.

I mean if we were gonna do that why bother even having a unicode string
type at all?

> I'd find it a bit more surprising that binascii.hexlify() returns
> bytes, but I personally got used to it, and consider it a
> consistency thing on binascii module.
> 
> Generally, with Python3 by default using (inefficient) Unicode for
> strings, 

Why is it inefficient?

> any efficient data processing would use bytes, and then one
> appreciates the fact that data encoding/decoding routines also return
> bytes, avoiding implicit expensive conversion to strings.

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread R. David Murray

On Tue, 14 Jun 2016 14:05:19 -0300, "Joao S. O. Bueno"  
wrote:
> On 14 June 2016 at 13:32, Toshio Kuratomi  wrote:
> >
> > On Jun 14, 2016 8:32 AM, "Joao S. O. Bueno"  wrote:
> >>
> >> On 14 June 2016 at 12:19, Steven D'Aprano  wrote:
> >> > Is there
> >> > a good reason for returning bytes?
> >>
> >> What about: it returns 0-255 numeric values for each position in  a
> >> stream, with
> >> no clue whatsoever to how those values map to text characters beyond
> >> the 32-128 range?
> >>
> >> Maybe base64.decode could take an optional "encoding" parameter - or
> >> there could be
> >> a separate "decode_to_text" method that would explicitly take a text codec
> >> name.
> >> Otherwise, no, you simply can't take a bunch of bytes and say they
> >> represent text.
> >>
> > Although it's not explicit, the question seems to be about the output of
> > encoding (and for symmetry, the input of decoding).  In both of those cases,
> > valid output will consist only of ascii characters.
> >
> > The input to encoding would have to remain bytes (that's the main purpose of
> > base64... to turn bytes into an ascii string).
> >
> 
> Sorry, it is 2016, and I don't think at this point anyone can consider
> an ASCII string
> as a representative pattern of textual data in any field of application.
> Bytes are not text. Bytes with an associated, meaningful, encoding are text.
>   I thought this had been through when Python 3 was out.
> 
> Unless you are working with COBOL generated data (and intending to keep
> the file format), it does not make sense in any real-world field
> (supposing your COBOL data is ASCII and not EBCDIC).

The fundamental purpose of the base64 encoding is to take a series
of arbitrary bytes and reversibly turn them into another series of
bytes in which the eighth bit is not significant.  Its utility is for
transmitting eight bit bytes over a channel that is not eight bit clean.
Before unicode, that meant bytes.  Now that we have unicode in use in
lots of places, you can think of unicode as a communications channel
that is not eight bit clean.  So, we might want to use base64 encoding to
transmit arbitrary bytes over a unicode channel.  This gives a legitimate
reason to want unicode output from a base64 encoder.   However, it is
equally legitimate in the Python context to say you should be explicit
about your intentions by decoding the bytes output of the base64 encoder
using the ASCII codec.
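
A minimal sketch of that explicit step, assuming Python 3 and the standard
base64 module (the variable names are illustrative):

```python
import base64

payload = bytes(range(256))                  # arbitrary binary input
wire_ready = base64.b64encode(payload)       # bytes, ready for a byte channel
as_text = wire_ready.decode("ascii")         # explicit step for a unicode channel
round_trip = base64.b64decode(as_text)       # the decoder accepts ASCII-only str
```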

This was indeed discussed at length.  For a while we didn't even allow
unicode input on either side, but we relaxed that.  My understanding of
Python's current stance on functions that handle both bytes and string
is that *either* the function accepts both types and outputs the *same*
type as the input, *or* it accepts both types but always outputs *one*
type or the other.

You can't have unicode output if you give unicode input to the base64
decoder in the general case.  So decode, at least, has to always give
bytes output.  Likewise, there is small to zero utility for using unicode
input to the base64 encoder, since the unicode would have to be ASCII
only and there'd be no point in doing the encoding.  So, the only thing
that makes sense is to follow the "one output type" rule here.
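
A short sketch of how this looks in practice, assuming CPython 3.3 or later
(where b64decode accepts ASCII-only str). It is not a statement of the
documented contract, only of behaviour that can be checked interactively:

```python
import base64

# The decoder takes bytes or ASCII-only str, and returns bytes either way.
from_bytes = base64.b64decode(b"aGVsbG8=")
from_str = base64.b64decode("aGVsbG8=")

# The encoder takes bytes-like input only; str is rejected.
try:
    base64.b64encode("hello")
    encoder_accepts_str = True
except TypeError:
    encoder_accepts_str = False
```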

Now, you can argue whether or not it would make sense for the encoder
to always produce unicode.  However, you then immediately run into the
backward compatibility issue:  the primary use case of the base64 encoding
is to produce *wire ready* bytes.  This is what the email package uses
it for, for example.  So for backward compatibility reasons, which
are consonant with its primary use case, it makes more sense for the
encoder to produce bytes than string.  If you need to transmit bytes
over a unicode channel, you can decode it from ASCII.  That is,
unicode is the *exceptional* use case here, not the rule.  That might
in fact be changing, but for backward compatibility reasons, Python
won't change.

And that should answer Steve's original question :)

--David

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Daniel Holth

IMO this is more a philosophical problem than a programming problem. base64
has a dual-nature. It is both text and bytes. At least it should fit in a
1-byte-per-character efficient Python 3 unicode string also.

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Paul Sokolovsky

Hello,

On Tue, 14 Jun 2016 14:02:02 -0400
Random832  wrote:

> On Tue, Jun 14, 2016, at 13:19, Paul Sokolovsky wrote:
> > Well, it's easy to remember the conclusion - it was decided to
> > return bytes. The reason also wouldn't be hard to imagine -
> > regardless of the fact that base64 uses ASCII codes for digits and
> > letters, it's still essentially a binary data. 
> 
> Only in the sense that all text is binary data. There's nothing in the
> definition of base64 specifying ASCII codes. It specifies *characters*
> that all happen to be in ASCII's character repertoire.
> 
> >And the most natural step for it is to send
> > it down the socket (socket.send() accepts bytes), etc.
> 
> How is that more natural than to send it to a text buffer that is

It's more natural because it's more efficient. It's more natural in the
same sense that the most natural way to get from point A to point B is
a straight line.

> ultimately encoded (maybe not even in an ASCII-compatible encoding...
> though probably) and sent down a socket or written to a file by a
> layer that is outside your control? Yes, everything eventually ends
> up as bytes. That doesn't mean that we should obsessively convert
> things to bytes as early as possible.

It's vice-versa - there's no need to obsessively convert the simple,
primary type of bytes (everything in computers is bytes) to more
complex things like Unicode strings.

> I mean if we were gonna do that why bother even having a unicode
> string type at all?

You're raising a topic that has been the subject of gigantic flame
wars on python-list for years. Here's my summary: not using the unicode
string type *at all* is better than not using the bytes type at all. So,
feel free to use unicode strings *only* when needed, which is
*only* when you accept input from or produce output for a *human* (like
a real human, walking down a street to do grocery shopping). In all
other cases, data should stay bytes (mind - stay, as it's bytes in the
beginning, and it requires extra effort to convert it to strings).

> > I'd find it a bit more surprising that binascii.hexlify() returns
> > bytes, but I personally got used to it, and consider it a
> > consistency thing on binascii module.
> > 
> > Generally, with Python3 by default using (inefficient) Unicode for
> > strings, 
> 
> Why is it inefficient?

Because bytes is the most efficient basic representation of data.
Everything which tries to convert it to something is less efficient in
general. Less efficient == inefficient.

-- 
Best regards,
 Paul  mailto:pmis...@gmail.com

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Terry Reedy


On 6/14/2016 12:32 PM, Toshio Kuratomi wrote:


> The input to encoding would have to remain bytes (that's the main
> purpose of base64... to turn bytes into an ascii string).


The purpose is to turn arbitrary binary data (commonly images) into 
'safe bytes' that will not get mangled on transmission (7 bit channels 
were once common) and that will not mangle a display of data transmitted 
or received.  Ignoring the EBCDIC world, which Python mostly does, the 
set of 'safe bytes' is the set that encodes printable ascii characters. 
Those bytes pass through 7 bit channels and display on ascii-based 
terminals.
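
As a quick check of the 'safe bytes' claim, here is a minimal sketch assuming
Python 3 and the standard alphabet used by base64.b64encode:

```python
import base64
import string

# Encode every possible byte value once.
encoded = base64.b64encode(bytes(range(256)))

# Every output byte fits in 7 bits and is a printable, non-space ASCII code.
high_bit_clear = all(b < 0x80 for b in encoded)
printable = all(0x21 <= b <= 0x7E for b in encoded)

# The output stays inside the 64-character alphabet plus the '=' padding.
allowed = set((string.ascii_letters + string.digits + "+/=").encode("ascii"))
within_alphabet = set(encoded) <= allowed
```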


--
Terry Jan Reedy


Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Paul Sokolovsky

Hello,

On Tue, 14 Jun 2016 18:13:11 +
Daniel Holth  wrote:

> IMO this is more a philosophical problem than a programming problem.
> base64 has a dual-nature. It is both text and bytes. At least it
> should fit in a 1-byte-per-character efficient Python 3 unicode
> string also.

You probably mean a CPython3 1-byte-per-character "efficient" string.
But CPython3 is merely one of a half-dozen Python3 language
implementations. Yup, a special one, but hopefully it's special in the
respect that it doesn't abuse its powers to make language API *changes*
based on its own implementation details. API changes, because API
*decisions* were made long ago already.

-- 
Best regards,
 Paul  mailto:pmis...@gmail.com

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Terry Reedy


On 6/14/2016 12:29 PM, Mark Lawrence via Python-Dev wrote:


> As I've the time to play detective I'd suggest
> https://mail.python.org/pipermail/python-3000/2007-July/008975.html


Thank you for finding that.  I reread it and still believe that bytes
was the right choice.  Base64 is a generic edge encoding for binary
data.  It fits in with the standard paradigm as an edge encoding.


Receive encoded bytes.
Decode bytes to python objects
Manipulate python objects
Encode python objects to bytes
Send bytes.

Receive and send can be from and to either local files or sockets 
usually connected to remote systems.  Transmissions can have blocks with 
different encodings. In the latter case, the bytes need to be parsed 
into blocks with different encodings.


In the (fairly common) special case that a transmission consists 
entirely of text in *1* encoding (ignoring any transmission wrappers), 
decode and encode can be incorporated into a text-mode file object.  If 
a transmission consists entirely or partly of binary, one can open in 
binary mode and .write one or more blocks of encoded bytes, possibly
with encoding data.
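
The receive/decode/manipulate/encode/send steps can be sketched as follows,
assuming Python 3. The handle() function and its "reverse the payload" step
are hypothetical stand-ins for real application logic:

```python
import base64

def handle(received: bytes) -> bytes:
    """Hypothetical edge handler: bytes in, bytes out."""
    raw = base64.b64decode(received)      # decode bytes at the edge
    processed = raw[::-1]                 # manipulate (placeholder step)
    return base64.b64encode(processed)    # encode again before sending

# Stand-in for a block read from a socket or a file opened in binary mode.
reply = handle(base64.b64encode(b"\x00\x01\x02"))
```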


--
Terry Jan Reedy


[Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Stephen J. Turnbull

Steven D'Aprano writes:

 > base64.b64encode take bytes as input and returns bytes. Some people are 
 > arguing that this is wrong behaviour, as RFC 3548

That RFC is obsolete: the replacement is RFC 4648.  However, the text
is essentially unchanged.

 > specifies that Base64  should transform bytes to characters:

Without defining "character" except as a "subset" of ASCII.  That
omission is evidently deliberate.  Unfortunately the RFC is unclear
whether a subset of the ASCII repertoire of (abstract) characters is
meant, or a subset of the ASCII codes.  I believe the latter is meant,
but either way, it does refer to *encoded* characters as the output of
the encoding process:

 > The encoding process represents 24-bit groups of input bits 
 > as output strings of 4 encoded characters. 

and I see no reason to deny that the bytes output by base64.b64encode
are the octets representing the ASCII codes for the characters of the
BASE64 alphabet.
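
To make the quoted sentence concrete, here is a minimal sketch of one 24-bit
group, assuming the standard alphabet from RFC 4648. The helper name
encode_group is made up for illustration:

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three octets as four base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three octets")
    bits = int.from_bytes(three_bytes, "big")    # one 24-bit integer
    # Split into four 6-bit indices, most significant first.
    indices = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)

chars = encode_group(b"Man")        # four abstract characters
octets = chars.encode("ascii")      # their ASCII codes as octets
```

The octets produced in the last line are the same bytes base64.b64encode
returns for that input.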

 > Are they misinterpreting the standard?

I think they are.  As I understand it, the intention of the standard
in using "character" to denote the code unit is similar to that of RFC
3986: BASE encodings are intended to be printable and recognizable to
humans.  If you're using a non-ASCII-superset encoding such as EBCDIC
for text I/O, then you should translate from ASCII to that encoding
for display, and in the (unlikely) case that a human types BASE
encoding from the terminal, the reverse transformation is necessary.

 > Has Python got it wrong?

I can't see anything in the RFC that suggests that.  And, in the end,
an RFC is not concerned with Python's internal fiddling, but rather
with what goes out over the wire.  All of the implementations you
mention will eventually send to the wire octets that are interpreted
as ASCII-encoded characters according to their integer values.

 > Is there a good reason for returning bytes?

I suppose practicality over purity: BASE encodings are normally used
on the wire, and so programs need to encode text to appropriately
encoded octets *before* BASE encoding, and then normally immediately
put the BASE-encoded content on the wire.  Why round-trip from UTF-8
bytes to a str in BASE64 representation, and then do the (trivial)
conversion back to bytes?  OK, it's not that expensive, but still...
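
A small sketch of the round trip in question, assuming Python 3. The BytesIO
object is only a stand-in for a socket, and the sample text is arbitrary:

```python
import base64
import io

wire = io.BytesIO()                          # stand-in for a socket
body = "Grüße".encode("utf-8")               # text is encoded to octets first
wire.write(base64.b64encode(body))           # bytes go straight onto the wire

# If the encoder returned str, the sender would need this extra hop:
detour = base64.b64encode(body).decode("ascii").encode("ascii")
```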


[Python-Dev] ValuesView abc: why doesn't it (officially) inherit from Iterable?

2016-06-14 Thread Alan Franzoni

Hello,
I hope not to bother anyone with a somewhat trivial question, I was
unable to get an answer from other channels.

I was just checking out some docs on ABCs for a project of mine, where
I need to do some type-related work. Those are the official docs about
the ValuesView type, in both Python 2 and 3:

https://docs.python.org/2/library/collections.html#collections.ValuesView
https://docs.python.org/3/library/collections.abc.html

and this is the source (Python 2, but same happens in Python 3)

https://hg.python.org/releases/2.7.11/file/9213c70c67d2/Lib/_abcoll.py#l479

I was very puzzled about the ValuesView interface, because from a
logical standpoint it should inherit from Iterable, IMHO (it's even
got the __iter__ Mixin method); on the contrary the docs say that it
just inherits from MappingView, which inherits from Sized, which
doesn't inherit from Iterable.

So I fired up my 2.7 interpreter:

>>> from collections import Iterable
>>> d = {1:2, 3:4}
>>> isinstance(d.viewvalues(), Iterable)
True
>>>

It looks iterable, after all, because of Iterable's own subclasshook.
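
The same effect can be sketched in Python 3 (collections.abc, and .values()
instead of .viewvalues()). The Countdown class is a made-up example that
defines __iter__ without inheriting from Iterable:

```python
from collections.abc import Iterable

class Countdown:
    """Hypothetical class with __iter__ but no explicit Iterable base."""
    def __iter__(self):
        return iter((3, 2, 1))

values = {1: 2, 3: 4}.values()

# Both checks succeed through Iterable.__subclasshook__, which looks for __iter__.
duck_is_iterable = isinstance(Countdown(), Iterable)
view_is_iterable = isinstance(values, Iterable)

# Yet Iterable is not among Countdown's actual base classes.
explicit_base = Iterable in Countdown.__mro__
```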

But I don't understand why ValuesView isn't explicitly Iterable. Other
ABCs, like Sequence, are explicitly inheriting Iterable. Is there some
arcane reason behind that, or it's just a documentation+implementation
shortcoming (with no real-world impact) for a little-used feature?

Bye,

-- 
www.franzoni.eu - Twitter: @alanfranz
contact me at public@[mysurname].eu

Re: [Python-Dev] ValuesView abc: why doesn't it (officially) inherit from Iterable?

2016-06-14 Thread Brett Cannon

On Tue, 14 Jun 2016 at 13:30 Alan Franzoni  wrote:

> Hello,
> I hope not to bother anyone with a somewhat trivial question, I was
> unable to get an answer from other channels.
>
> I was just checking out some docs on ABCs for a project of mine, where
> I need to do some type-related work. Those are the official docs about
> the ValuesView type, in both Python 2 and 3:
>
> https://docs.python.org/2/library/collections.html#collections.ValuesView
> https://docs.python.org/3/library/collections.abc.html
>
> and this is the source (Python 2, but same happens in Python 3)
>
> https://hg.python.org/releases/2.7.11/file/9213c70c67d2/Lib/_abcoll.py#l479
>
> I was very puzzled about the ValuesView interface, because from a
> logical standpoint it should inherit from Iterable, IMHO (it's even
> got the __iter__ Mixin method); on the contrary the docs say that it
> just inherits from MappingView, which inherits from Sized, which
> doesn't inherit from Iterable.
>
> So I fired up my 2.7 interpreter:
>
> >>> from collections import Iterable
> >>> d = {1:2, 3:4}
> >>> isinstance(d.viewvalues(), Iterable)
> True
> >>>
>
> It looks iterable, after all, because of Iterable's own subclasshook.
>
> But I don't understand why ValuesView isn't explicitly Iterable. Other
> ABCs, like Sequence, are explicitly inheriting Iterable. Is there some
> arcane reason behind that, or it's just a documentation+implementation
> shortcoming (with no real-world impact) for a little-used feature?
>

To add some extra info, both KeysView and ItemsView inherit from Set, which
does inherit from Iterable. I personally don't know why ValuesView doesn't
inherit from Set (although Iterable does override __subclasshook__(), so
there isn't a direct functional loss, which, if this turns out to be a bug,
would explain why no one has noticed until now).

Alan, would you mind filing an issue at bugs.python.org about this?

Re: [Python-Dev] ValuesView abc: why doesn't it (officially) inherit from Iterable?

2016-06-14 Thread Alan Franzoni

ValuesView doesn't inherit from Set because the values in a dictionary
can contain duplicates. That makes sense. It's just the missing
Iterable, which is a weaker contract, that doesn't.
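
A minimal Python 3 sketch of the difference (the sample dict is arbitrary):

```python
from collections.abc import Iterable, Set

d = {"a": 1, "b": 1}

values = list(d.values())                   # duplicates are preserved
keys_are_set = isinstance(d.keys(), Set)    # keys are unique, so Set applies
values_are_set = isinstance(d.values(), Set)
values_are_iterable = isinstance(d.values(), Iterable)

common = d.keys() & {"a", "z"}              # set algebra works on keys views
```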

I'm filing the bug tomorrow.

On Tue, Jun 14, 2016 at 10:44 PM, Brett Cannon  wrote:
> On Tue, 14 Jun 2016 at 13:30 Alan Franzoni  wrote:
>>
>> Hello,
>> I hope not to bother anyone with a somewhat trivial question, I was
>> unable to get an answer from other channels.
>>
>> I was just checking out some docs on ABCs for a project of mine, where
>> I need to do some type-related work. Those are the official docs about
>> the ValuesView type, in both Python 2 and 3:
>>
>> https://docs.python.org/2/library/collections.html#collections.ValuesView
>> https://docs.python.org/3/library/collections.abc.html
>>
>> and this is the source (Python 2, but same happens in Python 3)
>>
>>
>> https://hg.python.org/releases/2.7.11/file/9213c70c67d2/Lib/_abcoll.py#l479
>>
>> I was very puzzled about the ValuesView interface, because from a
>> logical standpoint it should inherit from Iterable, IMHO (it's even
>> got the __iter__ Mixin method); on the contrary the docs say that it
>> just inherits from MappingView, which inherits from Sized, which
>> doesn't inherit from Iterable.
>>
>> So I fired up my 2.7 interpreter:
>>
>> >>> from collections import Iterable
>> >>> d = {1:2, 3:4}
>> >>> isinstance(d.viewvalues(), Iterable)
>> True
>> >>>
>>
>> It looks iterable, after all, because of Iterable's own subclasshook.
>>
>> But I don't understand why ValuesView isn't explicitly Iterable. Other
>> ABCs, like Sequence, are explicitly inheriting Iterable. Is there some
>> arcane reason behind that, or it's just a documentation+implementation
>> shortcoming (with no real-world impact) for a little-used feature?
>
>
> To add some extra info, both KeysView and ItemsView inherit from Set, which
> does inherit from Iterable. I personally don't know why ValuesView doesn't
> inherit from Set (although Iterable does override __subclasshook__(), so
> there isn't a direct functional loss, which, if this turns out to be a bug,
> would explain why no one has noticed until now).
>
> Alan, would you mind filing an issue at bugs.python.org about this?



-- 
My development blog: ollivander.franzoni.eu . @franzeur on Twitter
contact me at public@[mysurname].eu

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Greg Ewing


Joao S. O. Bueno wrote:

> The arguments about compactness and what is most likely to happen
> next apply (transmission through a binary network protocol),


I'm not convinced that this is what is most likely to
happen next *in a Python program*. How many people
implement their own binary network protocols in Python?
It seems to me most people will be using a protocol
library written by someone else.

--
Greg

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Greg Ewing


R. David Murray wrote:

> The fundamental purpose of the base64 encoding is to take a series
> of arbitrary bytes and reversibly turn them into another series of
> bytes in which the eighth bit is not significant.


No, it's not. If that were its only purpose, it would be
called base128, and the RFC would describe it purely in
terms of bit patterns and not mention characters or
character sets at all.

The RFC does *not* do that. It describes the output in
terms of characters, and does not specify any bit patterns
for the output. The intention is clearly to represent
binary data as *text*.

--
Greg

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Steven D'Aprano

On Tue, Jun 14, 2016 at 05:29:12PM +0100, Mark Lawrence via Python-Dev wrote:

> As I've the time to play detective I'd suggest 
> https://mail.python.org/pipermail/python-3000/2007-July/008975.html

Thanks Mark, that's great!



-- 
Steve

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Stephen J. Turnbull

Greg Ewing writes:

 > The RFC does *not* do that. It describes the output in terms of
 > characters, and does not specify any bit patterns for the
 > output.

The RFC is unclear on this point, but I read it as specifying the
ASCII coded character set, not the ASCII repertoire of (abstract)
characters.  Therefore, it specifies an invertible mapping from a
particular set of integers to characters.

 > The intention is clearly to represent binary data as *text*.

It's more subtle than that.  *RFCs do not deal with text.*  Text is
an internal concept of (some) programming environments.  RFCs may
deal with *encoded text*, and RFC 4648 indeed specifically mentions
"encoded characters" as the output of the BASE64 algorithm.[1]

The intention then is to represent binary data with *binary data that
may be conveniently interpreted as text* (ie, without reencoding), eg,
by a terminal or a printer.[2]  It is also desirable that it be likely
to pass unscathed through channels that are not necessarily even 7-bit
clean (file system directories and JIS X 0201, for example) which
*inadvertently* treat it as text.  Both requirements are conveniently
fulfilled by using appropriate ASCII subsets, and encoding on the wire
using the usual bit patterns.  However, I suppose you could also use
EBCDIC or UTF-16, as long as you have agreed with the receiver to do
so.

So I would say that Python can do what it wants with the type that
base64.b64encode returns as far as the RFC is concerned; that's an
internal aspect of Python.  It's purely a matter of our convenience
(as programmer *in* Python) whether we return str or bytes.

My own experience is biased toward email and web (not to be confused
with SMTP and HTTP), and so my experience is that most composers
(1) automatically handle text encodings for the users, and then the
content transfer encoding as necessary for the underlying protocol,
and (2) handle attachments by placing a reference in the composed
content, which is replaced by the object just before transmission (and
any desired content transfer encoding is applied at that time, at the
option of the composing agent, which rarely needs to bother the user
with such trivia).  Bytes seem more convenient to me, and give an on-
the-wire representation consistent with that of Python 2 str.

Footnotes: 
[1]  Admittedly, RFC 3986 (URIs) does stretch the notion of "encoded
text" to the breaking point by including marks on paper.

[2]  Thus, BASE64-encoding resources provides a more efficient,
alternative datagram protocol for the physical links used by RFC 1149
networks.


Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Random832

On Tue, Jun 14, 2016, at 22:58, Stephen J. Turnbull wrote:
> The RFC is unclear on this point, but I read it as specifying the
> ASCII coded character set, not the ASCII repertoire of (abstract)
> characters.  Therefore, it specifies an invertible mapping from a
> particular set of integers to characters.

There are multiple descriptions of base 64 that specifically mention
using it with EBCDIC and with local character sets of unspecified
nature.

>  > The intention is clearly to represent binary data as *text*.
> 
> It's more subtle than that.  *RFCs do not deal with text.*  Text is
> an internal concept of (some) programming environments.

It's also a human concept. Plenty of RFCs deal with human concepts rather
than purely programming topics.

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Greg Ewing

Stephen J. Turnbull wrote:

> it does refer to *encoded* characters as the output of
> the encoding process:
>
>  > The encoding process represents 24-bit groups of input bits
>  > as output strings of 4 encoded characters.

The "encoding" being referred to there is the encoding
from input bytes to output characters, not an encoding
of the output characters as bytes.

Nowhere in RFC 4648 does it refer to the output as
being made up of "bytes" or "octets". It's always
described in terms of "characters".

> As I understand it, the intention of the standard
> in using "character" to denote the code unit is similar to that of RFC
> 3986: BASE encodings are intended to be printable and recognizable to
> humans.

Hmmm... so why then does it say, in section 4:

   The Base 64 encoding is designed to represent arbitrary sequences of
   octets in a form that ... need not be human readable.

> If you're using a non-ASCII-superset encoding such as EBCDIC
> for text I/O, then you should translate from ASCII to that encoding
> for display,

What about the channel you're sending the encoded data over?

Suppose I'm on Windows and I'm embedding the base64 encoded
data in a text message that I'm sending through a mail client
that accepts text in utf-16.

I hope you would agree that, in that situation, encoding the
base64 output in ASCII and giving those bytes directly to
the mail client would be very much the wrong thing to do?
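
A minimal sketch of that scenario, assuming Python 3. Here "utf-16-le" stands
in for whatever UTF-16 flavour the hypothetical mail client expects:

```python
import base64

text = base64.b64encode(b"binary\x00data").decode("ascii")

ascii_octets = text.encode("ascii")        # one octet per base64 character
utf16_octets = text.encode("utf-16-le")    # two octets per base64 character

# Handing the ASCII octets to a consumer that expects UTF-16 garbles them.
misread = ascii_octets.decode("utf-16-le", errors="replace")
```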

--
Greg

Re: [Python-Dev] [Python-checkins] cpython (3.5): Fix os.urandom() using getrandom() on Linux

2016-06-14 Thread Larry Hastings




On 06/14/2016 08:07 AM, Steven D'Aprano wrote:

> Is this right? I thought we had decided that os.urandom should *not*
> fall back on getrandom on Linux?


We decided that os.urandom() should not *block* on Linux.  Which it 
doesn't; we now strictly call getrandom(GRND_NONBLOCK), which will never 
block.  getrandom() is better because it's a system call, instead of 
reading from a file.  So it's much less messy.


If getrandom() wanted to block, instead it'll return EAGAIN, and we'll 
fail over to reading from /dev/urandom directly, just like we did in 3.4 
and before.
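
A rough Python paraphrase of that control flow, for illustration only. The
real logic lives in CPython's C code; this sketch assumes Linux, and relies on
os.getrandom and os.GRND_NONBLOCK, which Python exposes from 3.6 on. The
function name urandom_sketch is made up:

```python
import os

def urandom_sketch(n: int) -> bytes:
    """Illustrative only: non-blocking getrandom() with a /dev/urandom fallback."""
    getrandom = getattr(os, "getrandom", None)     # absent on non-Linux platforms
    if getrandom is not None:
        try:
            data = getrandom(n, os.GRND_NONBLOCK)  # never blocks
            if len(data) == n:
                return data
        except BlockingIOError:
            pass    # EAGAIN: the call would have blocked
        except OSError:
            pass    # assumption: treat an unavailable syscall the same way
    # Fall back to reading the device file, as in 3.4 and before.
    with open("/dev/urandom", "rb") as f:
        return f.read(n)

sample = urandom_sketch(16)
```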


It's all working as intended,


//arry/

Re: [Python-Dev] Why does base64 return bytes?

2016-06-14 Thread Simon Cross

On Tue, Jun 14, 2016 at 8:42 PM, Terry Reedy  wrote:
> Thank you for finding that.  I reread it and still believe that bytes was
> the right choice.  Base64 is a generic edge encoding for binary data.  It
> fits in with the standard paradigm as an edge encoding.

I'd like to me-too Terry's sentiment, but also expand on it a bit.

Base64 encoding is used to convert bytes into a limited set of symbols
for inclusion in a stream of data. Whether bytes or unicode characters
are appropriate depends on whether the stream being constructed is a
byte stream or a unicode character stream.

Many people do deal with byte streams in Python and we have large
sub-communities for whom this use case is important (e.g. Twisted,
Asyncio, anyone using the socket module).

It is also no longer 1980 though, and there are many protocols layered
on top of unicode character streams rather than bytes.
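
JSON is one example of a format defined over unicode text. A minimal sketch,
assuming Python 3 and the standard json module (the payload and the "data"
key are hypothetical):

```python
import base64
import json

raw = b"\x89PNG\r\n"                     # hypothetical binary fragment
encoded = base64.b64encode(raw)          # bytes

# json.dumps works on text, so the bytes result is rejected as-is.
try:
    json.dumps({"data": encoded})
    json_takes_bytes = True
except TypeError:
    json_takes_bytes = False

# The bytes -> bytes -> unicode path: one explicit decode, then embed.
document = json.dumps({"data": encoded.decode("ascii")})
restored = base64.b64decode(json.loads(document)["data"])
```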

Ideally I'd like us to support both options (like we've been
increasingly doing for reading from other external sources such as
file systems or environment variables).

If we only support one, I would prefer it to be bytes since (bytes ->
bytes -> unicode) seems like less overhead and slightly conceptually
clearer than (bytes -> unicode -> bytes), but I consider this a
personal preference rather than any sort of one-true-way.

Schiavo
Simon
