On 17/01/14 13:09, Aldcroft, Thomas wrote:
> I've been playing around with porting a stack of analysis libraries to
> Python 3 and this is a very timely thread and comment. What I
> discovered right away is that all the string data coming from binary
> HDF5 files show up (as expected) as 'S' type, but that trying to make
> everything actually work in Python 3 without converting to 'U' is a big
> mess of whack-a-mole.
>
> Yes, it's possible to change my libraries to use bytestring literals
> everywhere, but the Python 3 user experience becomes horrible because to
> interact with the data all downstream applications need to use
> bytestring literals everywhere. E.g. doing a simple filter like
> `string_array == 'foo'` doesn't work, and this will break all existing
> code when trying to run in Python 3. And every time you try to print
> something it has this horrible "b" in front. Ugly, and it just won't
> work well in the end.

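To make the quoted problem concrete, here is a minimal sketch of the 'S' versus 'U' mismatch under Python 3. The array contents are made up for illustration and stand in for data read from a binary file; exactly what the str comparison returns (and whether it warns) depends on the NumPy version, so the sketch only checks that nothing matches.

```python
import warnings

import numpy as np

# Hypothetical data standing in for strings read from a binary file:
# bytes objects give an 'S' (byte string) dtype.
string_array = np.array([b"foo", b"bar", b"foo"])
print(string_array.dtype.kind)  # 'S'

# Comparing against a str literal matches nothing under Python 3.
# Some NumPy versions also emit a warning here, so it is silenced.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    str_matches = bool(np.any(string_array == "foo"))

# A bytes literal does work, but every caller has to remember the b prefix.
bytes_mask = string_array == b"foo"

# Converting to 'U' once at the boundary lets ordinary str literals work.
# astype("U") assumes the bytes are plain ASCII.
unicode_array = string_array.astype("U")
unicode_mask = unicode_array == "foo"

print(str_matches)
print(bytes_mask)
print(unicode_mask)
```

Converting once when the data is loaded keeps the `b` prefix out of downstream code, at the cost of the extra memory that 'U' arrays need per character.
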
In terms of HDF5 it is interesting to look at how h5py -- which has to go
between NumPy types and HDF5 conventions -- handles the problem, as
described here:

http://www.h5py.org/docs/topics/strings.html

which IMHO got it about right.

Regards, Freddie.
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion