Re: [Python-Dev] Adding bytes.frombuffer() constructor to PEP 467

Serhiy Storchaka Fri, 06 Jan 2017 14:07:54 -0800

On 06.01.17 21:31, Alexander Belopolsky wrote:

On Thu, Jan 5, 2017 at 5:54 PM, Serhiy Storchaka <storch...@gmail.com> wrote:


    On 05.01.17 22:37, Alexander Belopolsky wrote:

        2. For 3.7, I would like to see a drastically simplified bytes(x):
        2.1.  Accept only objects with a __bytes__ method or a sequence
        of ints in range(256).
        2.2.  Expand __bytes__ definition to accept optional encoding
        and errors parameters.  Implement
        str.__bytes__(self, [encoding[, errors]]).


    I think it is better to use the encode() method if you want to
    encode from non-strings.


Possibly, but the goal of my proposal is to lighten the logic in the
bytes(x, [encoding[, errors]])
constructor.  If it detects x.__bytes__, it should just call it with
whatever arguments are given.

I think this would complicate the __bytes__ protocol. I don't know of
any precedent for passing additional optional arguments to a special
method. int() doesn't pass the base argument to __int__, str() doesn't
pass encoding and errors to __str__, and pickle.dumps() passes the
protocol argument to a new special method, __reduce_ex__, instead of
__reduce__.
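
The precedents named above can be checked with a minimal sketch. The
classes Num, Text and Reducible are hypothetical and exist only for
illustration; the behaviour shown is that of current CPython.

```python
import pickle

class Num:
    def __int__(self):
        return 42

class Text:
    def __str__(self):
        return "text"

class Reducible:
    seen_protocols = []          # records what pickle passes in

    def __reduce_ex__(self, protocol):
        Reducible.seen_protocols.append(protocol)
        # Rebuild as a plain builtin str to keep the sketch self-contained.
        return (str, ("reduced",))

# int() calls __int__ without arguments ...
plain_int = int(Num())

# ... and refuses an explicit base for non-string objects
# instead of forwarding it to __int__.
try:
    int(Num(), 16)
except TypeError:
    base_forwarded = False
else:
    base_forwarded = True

# str() with an encoding expects a bytes-like object; it does not
# forward encoding/errors to __str__.
try:
    str(Text(), "utf-8")
except TypeError:
    encoding_forwarded = False
else:
    encoding_forwarded = True

# pickle passes the protocol number to __reduce_ex__, a separate
# special method, rather than adding a parameter to __reduce__.
restored = pickle.loads(pickle.dumps(Reducible(), protocol=2))
```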

    bytes.frombuffer(x) is bytes(memoryview(x)) or memoryview(x).tobytes().
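
A short sketch of the spellings mentioned here, using only calls that
exist today (bytes.frombuffer() itself is only a proposal and is not
used):

```python
from array import array

x = array('h', [1, 2, 3])

# Copy the exporter's memory through a memoryview passed to bytes().
via_constructor = bytes(memoryview(x))

# Copy through the memoryview's own tobytes() method.
via_tobytes = memoryview(x).tobytes()

# Same as the first form, but the view is released deterministically.
with memoryview(x) as m:
    via_context = bytes(m)
```

All three produce the same bytes object contents as x.tobytes().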


I've just tried Inada's patch <http://bugs.python.org/issue29178>:

$ ./python.exe -m timeit -s "from array import array; x=array('f', [0])"
"bytes.frombuffer(x)"
2000000 loops, best of 5: 134 nsec per loop

$ ./python.exe -m timeit -s "from array import array; x=array('f', [0])"
"with memoryview(x) as m: bytes(m)"
500000 loops, best of 5: 436 nsec per loop

A 3x speed-up seems to be worth it.

There is a constant overhead for calling functions. It is dwarfed by
memory copying for large arrays. I'm not sure that 300 ns is worth
adding a new method.
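
One rough way to see how the fixed cost compares with the copying cost
is to time the same expression on a tiny and on a large array. This is
only a sketch: seconds_per_call is a hypothetical helper, the sizes and
repeat counts are arbitrary, and the lambda adds its own call overhead.

```python
from array import array
from timeit import timeit

def seconds_per_call(n_items, number):
    """Average time of bytes(memoryview(x)) for an array of n_items floats."""
    x = array('f', [0.0]) * n_items
    total = timeit(lambda: bytes(memoryview(x)), number=number)
    return total / number

small = seconds_per_call(1, number=20000)        # dominated by call overhead
large = seconds_per_call(1000000, number=50)     # dominated by copying memory

print(f"1 item:          {small * 1e9:12.0f} ns per call")
print(f"1,000,000 items: {large * 1e9:12.0f} ns per call")
```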

        2.4. Implement memoryview.__bytes__ method so that
        bytes(memoryview(x)) works as before.
        2.5.  Implement a fast bytearray.__bytes__ method.


    This wouldn't help for the bytearray constructor, and it wouldn't
    avoid double copying in the constructor of a bytes subclass.


I don't see why bytearray constructor should behave differently from bytes.

The bytes constructor can just return the result of __bytes__. The
bytearray constructor would need to copy twice if it supported
__bytes__: first the data is copied into the bytes object returned by
__bytes__, then its content is copied into the newly created bytearray
object. Creating a bytearray object through the buffer protocol needs
only one copy.

Perhaps this is the reason why support for __bytes__ was never added to
the bytearray constructor.
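
A minimal sketch of the difference, assuming the behaviour of the
CPython versions I am aware of. Payload is a hypothetical class that
defines __bytes__ but does not support the buffer protocol.

```python
class Payload:
    """Hypothetical type that defines only __bytes__."""

    def __init__(self, data):
        self._data = bytes(data)

    def __bytes__(self):
        return self._data

p = Payload(b"abc")

# bytes() looks up __bytes__ and uses its result.
as_bytes = bytes(p)

# bytearray() does not look up __bytes__, so this raises TypeError.
try:
    bytearray(p)
except TypeError:
    bytearray_uses_dunder_bytes = False
else:
    bytearray_uses_dunder_bytes = True

# Going through __bytes__ explicitly: one copy into the intermediate
# bytes object (made in Payload.__init__ here), one into the bytearray.
two_step = bytearray(bytes(p))

# Going through the buffer protocol: the exporter's memory is copied
# straight into the new bytearray.
one_step = bytearray(memoryview(b"abc"))
```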

Compare these two calls:

>>> from array import array
>>> bytes(array('h', [1, 2, 3]))
b'\x01\x00\x02\x00\x03\x00'

and

>>> bytes(array('f', [1, 2, 3]))
b'\x00\x00\x80?\x00\x00\x00@\x00\x00@@'


I don't see a difference.

For me the __bytes__ method is a way for types to specify their bytes
representation that may or may not be the same as memoryview(x).tobytes().

It would be confusing if a type that supports the buffer protocol
implemented __bytes__ so that it returned a result different from
memoryview(x).tobytes(). If you want to get b'\1\2\3' from
array('h', [1, 2, 3]), use bytes(list(x)).
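
The two results can be compared side by side in a short sketch:

```python
from array import array

x = array('h', [1, 2, 3])

# Buffer protocol: a copy of the array's raw memory, x.itemsize bytes
# per element in native byte order.
raw = bytes(x)

# Sequence of ints in range(256): one byte per element value.
values = bytes(list(x))
```

Here values is b'\x01\x02\x03', while raw matches memoryview(x).tobytes()
and depends on the platform's item size and byte order.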


