On Fri, Jul 18, 2014 at 3:33 AM, Nathaniel Smith <n...@pobox.com> wrote:
> > 2) A bytes types -- almost the current 'S' type
> >  - A bytes type would map to/from py3 bytes objects (and py2 bytes
> > objects, which are the same as py2strings)
> >  - one way is would differ from a py2str is that there would be no
> > assumption of null-termination (not sure where that is now)
>
> AFAICT this is *exactly* the same as the current 'S' type. What
> differences do you see?

As you mention, it is the same on py3, except maybe for the handling of
null bytes -- you mentioned that you had to do some work-arounds for that.
A proper bytes type would do nothing special with null bytes.

> > 3) A one-byte-per-char text type -- more or less Chuck's current
> > proposal.
> >  - it would map to/from the py3 string -- it is text after all
> >  - it would be null-terminated
>
> Numpy strings types are never null-terminated ATM. They're
> null-padded, which is slightly different. When storing data in an S5,
> for instance, strings of length 5 have no nulls appending, strings of
> length 4 have 1 null appended, strings of length 3 have 2 nulls
> appended, etc. When reading data out of an S5, then all trailing nulls
> are stripped.
>
> So, they may not be null terminated (if the length of the string
> exactly matches the length of the dtype), and the strings being stored
> can contain internal nulls ("foo\x00bar" is fine), but they cannot
> contain trailing nulls ("foo\x00" will come back as just "foo").
>
> Do you actually care about null-termination specifically? Or did you
> just mean "it should work like the other ones, which I vaguely
> remember involves nulls"? ;-)

That's pretty much what I meant, yes ;-) But the key is that when pushing
one of these things to a python string, anything after a null byte is
ignored. Which is why you can't use it for arbitrary bytes.

> >  - it would have a one-byte per-char encoding: ascii, latin-1 or
> > settable (TBA)
>
> Settable is technically very difficult until we redo the dtype
> machinery to allow parametrized types.

Indeed -- we have that a bit with Datetime -- but that's a whole other
kettle of fish.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
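
P.S. For anyone following along, the null-padding rules Nathaniel describes above can be mimicked in a few lines of plain Python. This is only a sketch of the described behavior, not NumPy's implementation: the helper names `store_fixed` and `load_fixed` are made up for illustration, and truncating over-long values is my assumption.

```python
# Hypothetical helpers that mimic the null-padding rules quoted above.
# Illustration only; this is not how NumPy implements the 'S' dtype.

def store_fixed(value: bytes, width: int) -> bytes:
    """Put value into a fixed-width field, padding with null bytes.

    Assumption: values longer than the field are truncated.
    """
    return value[:width].ljust(width, b"\x00")


def load_fixed(field: bytes) -> bytes:
    """Read a field back, stripping all trailing null bytes."""
    return field.rstrip(b"\x00")


# Length 5 in a 5-byte field: no nulls appended.
print(store_fixed(b"abcde", 5))
# Length 4 in a 5-byte field: one null appended.
print(store_fixed(b"abcd", 5))
# Internal nulls survive a round trip.
print(load_fixed(store_fixed(b"foo\x00bar", 7)))
# Trailing nulls do not: they cannot be told apart from padding.
print(load_fixed(store_fixed(b"foo\x00", 5)))
```

The last case is the reason a padded type cannot hold arbitrary bytes: once stored, a trailing null that was part of the data looks the same as padding.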