On Tue, Sep 13, 2016 at 11:05 AM, Lluís Vilanova <vilan...@ac.upc.edu> wrote:
> Whenever we repr an array using 'S', we can instead show a unicode in py3. That
> keeps the binary representation, but will always show the expected result to
> users, and it's only a handful of lines added to dump_data().
>
> If needed, I could easily add a bytes array to make the alternative explicit
> (where py3 would repr the contents as b'foo').
>
> This would only leave the less-common paths inconsistent across python versions,
> which should not be a problem for most examples/doctests:
>
> * A 'U' array will show u'foo' in py2 and 'foo' in py3.
> * The new binary array will show 'foo' in py2 and b'foo' in py3 (that could also
>   be patched on the repr code).
> * A 'O' array will not be able to do any meaningful repr conversions.
>
>
> A more complex alternative (and actually closer to what I'm proposing) is to
> modify numpy in py3 to restrict 'S' to using 8-bit points in a unicode
> string. It would have the binary compatibility, while being a unicode string in
> practice.

I'm afraid these are both also non-starters at this point. NumPy's string dtype corresponds to bytes on Python 3, and you can use it to store arbitrary binary values. Would it really be an improvement to change the repr, if the scalar value resulting from indexing is still bytes?

The sanest approach is probably a new dtype for one-byte strings. We talked about this a few years ago, but nobody has implemented it yet:
http://numpy-discussion.scipy.narkive.com/3nqDu3Zk/a-one-byte-string-dtype

(normally I would link to the archives on scipy.org, but the certificate for HTTPS has expired so you see a big error message right now...)
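
To illustrate the point about indexing, here is a minimal sketch, assuming Python 3 with NumPy installed. The array contents and variable names are made up for illustration; only the 'S' and 'U' dtypes come from the thread.

```python
import numpy as np

# An 'S' array stores raw bytes; indexing yields a bytes scalar on Python 3.
s_arr = np.array([b"foo", b"bar"], dtype="S3")
s_item = s_arr[0]
s_is_bytes = isinstance(s_item, bytes)   # np.bytes_ subclasses bytes

# A 'U' array stores unicode; indexing yields a str scalar.
u_arr = np.array(["foo", "bar"], dtype="U3")
u_item = u_arr[0]
u_is_str = isinstance(u_item, str)       # np.str_ subclasses str

# 'S' can hold arbitrary binary values, not just text.
raw = np.array([b"\xff\xfe"], dtype="S2")
raw_item = raw[0]

# Getting text out of an 'S' array needs an explicit decode step.
decoded = np.char.decode(s_arr, "ascii")
```

So even if the array repr showed 'foo' for an 'S' array, `s_arr[0] == "foo"` would still be false on Python 3, because the scalar is bytes and compares unequal to str.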