On 06.01.17 21:31, Alexander Belopolsky wrote:
On Thu, Jan 5, 2017 at 5:54 PM, Serhiy Storchaka <storch...@gmail.com> wrote:
On 05.01.17 22:37, Alexander Belopolsky wrote:
2. For 3.7, I would like to see a drastically simplified bytes(x):
2.1. Accept only objects with a __bytes__ method or a sequence of ints
in range(256).
2.2. Expand __bytes__ definition to accept optional encoding and errors
parameters. Implement str.__bytes__(self, [encoding[, errors]]).
I think it is better to use the encode() method if you want to
encode from non-strings.
Possibly, but the goal of my proposal is to lighten the logic in the
bytes(x, [encoding[, errors]])
constructor. If it detects x.__bytes__, it should just call it with
whatever arguments are given.
I think this would complicate the __bytes__ protocol. I don't know of
any precedent for passing additional optional arguments to a special
method. int() doesn't pass the base argument to __int__, str() doesn't
pass encoding and errors to __str__, and pickle.dumps() passes the
protocol argument to the new special method __reduce_ex__ instead of
__reduce__.
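The precedents listed above can be checked with a small sketch. The class
name Answer and the helper raises_type_error are hypothetical and exist
only for illustration; the point is that the extra arguments never reach
__int__ or __str__.

```python
class Answer:
    """Hypothetical type that defines only __int__ and __str__."""

    def __int__(self):
        return 42

    def __str__(self):
        return "forty-two"


def raises_type_error(func, *args):
    """Return True if func(*args) raises TypeError, False otherwise."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


x = Answer()

plain_int = int(x)   # calls x.__int__() with no extra arguments
plain_str = str(x)   # calls x.__str__() with no extra arguments

# The extra arguments are handled by the constructors themselves and are
# not forwarded to the special methods, so these calls are rejected.
int_with_base_fails = raises_type_error(int, x, 16)
str_with_encoding_fails = raises_type_error(str, x, "utf-8")

# The pickle protocol number goes to __reduce_ex__, not to __reduce__.
reduced = x.__reduce_ex__(2)
```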
bytes.frombuffer(x) is bytes(memoryview(x)) or memoryview(x).tobytes().
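A minimal sketch of the two spellings that exist today. Note that
bytes.frombuffer() is only the method proposed in the patch under
discussion, so it is deliberately not called here.

```python
from array import array

x = array('f', [0])

# Two existing ways to copy the raw buffer contents into a bytes object.
via_constructor = bytes(memoryview(x))
via_tobytes = memoryview(x).tobytes()

# The same copy, releasing the memoryview explicitly when done.
with memoryview(x) as m:
    via_context = bytes(m)
```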
I've just tried Inada's patch <http://bugs.python.org/issue29178>:
$ ./python.exe -m timeit -s "from array import array; x=array('f', [0])" \
    "bytes.frombuffer(x)"
2000000 loops, best of 5: 134 nsec per loop
$ ./python.exe -m timeit -s "from array import array; x=array('f', [0])" \
    "with memoryview(x) as m: bytes(m)"
500000 loops, best of 5: 436 nsec per loop
A 3x speed-up seems to be worth it.
There is a constant overhead for calling functions. It is dwarfed by
memory copying for large arrays. I'm not sure that 300 ns is worth
adding a new method.
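One way to see how the fixed call overhead compares with the copying cost
is to time the existing spelling for a tiny and a large array. This is
only a rough sketch: the helper time_copy and the chosen sizes are
assumptions for illustration, and the absolute numbers depend on the
machine and the build.

```python
import timeit
from array import array


def time_copy(n, repeat=3, number=1000):
    """Best observed time in seconds for one bytes(memoryview(x)) call,
    where x is an array of n floats."""
    x = array('f', [0.0]) * n
    timings = timeit.repeat(lambda: bytes(memoryview(x)),
                            repeat=repeat, number=number)
    return min(timings) / number


small = time_copy(1)                     # dominated by call overhead
large = time_copy(1_000_000, number=20)  # dominated by memory copying

print(f"1 item:         {small * 1e9:.0f} ns per call")
print(f"1,000,000 items: {large * 1e9:.0f} ns per call")
```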
2.4. Implement memoryview.__bytes__ method so that bytes(memoryview(x))
works as before.
2.5. Implement a fast bytearray.__bytes__ method.
This wouldn't help for the bytearray constructor. And it wouldn't make
it possible to avoid double copying in the constructor of a bytes subclass.
I don't see why bytearray constructor should behave differently from bytes.
The bytes constructor can just return the result of __bytes__. The
bytearray constructor would need to copy twice if it supported
__bytes__: first the data is copied into the bytes object returned by
__bytes__, then its content is copied into the newly created bytearray
object. Creating a bytearray object using the buffer protocol needs
only one copy.
Perhaps this is the reason why support for __bytes__ was not added to
the bytearray constructor after all.
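The two paths described above can be spelled out by hand. This is a
sketch only: the class Payload is hypothetical, and the two-step
conversion is written explicitly rather than relying on any __bytes__
support in the bytearray constructor.

```python
from array import array


class Payload:
    """Hypothetical type that exposes its data only through __bytes__."""

    def __init__(self, items):
        self._items = list(items)

    def __bytes__(self):
        # First copy: the data is materialised as a new bytes object.
        return bytes(self._items)


p = Payload([97, 98, 99])

intermediate = bytes(p)                 # copy 1, made inside __bytes__
ba_two_step = bytearray(intermediate)   # copy 2, into the bytearray

# Through the buffer protocol the data is copied once, straight from
# the exporter's buffer into the new bytearray.
a = array('B', [97, 98, 99])
ba_buffer = bytearray(a)
```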
Compare these two calls:
>>> from array import array
>>> bytes(array('h', [1, 2, 3]))
b'\x01\x00\x02\x00\x03\x00'
and
>>> bytes(array('f', [1, 2, 3]))
b'\x00\x00\x80?\x00\x00\x00@\x00\x00@@'
I don't see a difference.
For me the __bytes__ method is a way for types to specify their bytes
representation that may or may not be the same as memoryview(x).tobytes().
It would be confusing if some type that supports the buffer protocol
would implement __bytes__ returning a result different from
memoryview(x).tobytes(). If you want to get b'\1\2\3' from array('h',
[1, 2, 3]), use bytes(list(x)).
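A short sketch of the distinction drawn above: bytes(x) copies the raw
buffer, while bytes(list(x)) treats the items as integer values. The
exact raw bytes depend on the platform's byte order and item size, so
they are not spelled out here.

```python
from array import array

x = array('h', [1, 2, 3])

# Raw memory contents, identical to what the buffer protocol exposes.
raw = bytes(x)

# One byte per item, built from the integer values themselves.
values = bytes(list(x))
```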