On 17/01/14 13:09, Aldcroft, Thomas wrote:
> I've been playing around with porting a stack of analysis libraries to
> Python 3 and this is a very timely thread and comment.  What I
> discovered right away is that all the string data coming from binary
> HDF5 files show up (as expected) as 'S' type, but that trying to make
> everything actually work in Python 3 without converting to 'U' is a big
> mess of whack-a-mole.  
> 
> Yes, it's possible to change my libraries to use bytestring literals
> everywhere, but the Python 3 user experience becomes horrible because to
> interact with the data all downstream applications need to use
> bytestring literals everywhere.  E.g. doing a simple filter like
> `string_array == 'foo'` doesn't work, and this will break all existing
> code when trying to run in Python 3.  And every time you try to print
> something it has this horrible "b" in front.  Ugly, and it just won't
> work well in the end.

In terms of HDF5 it is interesting to look at how h5py -- which has to
go between NumPy types and HDF5 conventions -- handles the problem as
described here:

  http://www.h5py.org/docs/topics/strings.html

which IMHO got it about right.
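
For anyone who has not hit this yet, the 'S' versus 'U' mismatch from the
quoted message can be sketched roughly as follows. This is only a minimal
illustration, assuming ASCII/UTF-8 data; the array contents and variable
names are made up, and it is not meant to show what h5py does internally.

```python
import numpy as np

# Fixed-width byte strings, similar to what comes back from a binary
# HDF5 file (illustrative values, not real data).
raw = np.array([b"foo", b"bar", b"foo"])   # dtype kind 'S'

# Filtering with a bytes literal works, but forces b"..." on every caller.
mask_bytes = raw == b"foo"

# Decoding once at the I/O boundary gives a 'U' array that compares
# cleanly against ordinary str literals in Python 3.
text = np.char.decode(raw, "utf-8")        # dtype kind 'U'
mask_text = text == "foo"

print(raw.dtype.kind, text.dtype.kind)
print(mask_bytes, mask_text)
```

Decoding at the boundary keeps downstream code free of bytestring
literals, at the cost of a larger in-memory representation for 'U'.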

Regards, Freddie.


_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion