On Fri, Dec 2, 2011 at 8:23 AM, Thouis (Ray) Jones <tho...@gmail.com> wrote:
> On Thu, Dec 1, 2011 at 17:39, Charles R Harris
> <charlesr.har...@gmail.com> wrote:
> > Given that strings should be the result, this looks like a bug. It's a bit
> > of a corner case that probably slipped through during the recent work on
> > casting. There need to be tests for these sorts of things, so if you find
> > more oddities post them so we can add them.
>
> I'm happy to add a patch and tests, but could use some guidance...
>
> It looks like discover_itemsize() in core/src/multiarray/ctors.c
> should compute the length of the string or unicode representation of
> the object based on the eventual type, but looking at
> UNICODE_setitem() and STRING_setitem() in
> core/src/multiarray/arraytypes.c.src, this is not trivial.
>
> Perhaps the object-to-unicode/string parts of
> UNICODE_setitem/STRING_setitem can be extracted into separate
> functions that can be called from *_setitem as well as
> discover_itemsize. discover_itemsize would also need to know the
> type it's discovering for (string or unicode or user-defined).

After sleeping on this, I think an object array in this situation would be
the better choice and wouldn't result in lost information. This might change
the behavior of some functions though, so it would need testing. I'm not sure
how to handle user-defined types (raise an error?).

> If that's too complicated, maybe discover_itemsize should return -1
> (or warn, but given the danger of truncation, that seems a bit weak)
> if asked to discover from data that doesn't have a length. This would
> result in dtype=object when np.array is handed a mixed int/string
> list.
>
> I wonder, also, if STRING_setitem and UNICODE_setitem shouldn't emit a
> warning if asked to truncate data?

I think a warning would be useful. But I don't use strings much, so input
from a user might carry more weight.

Chuck
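
For illustration only, here is a rough pure-Python sketch of the two options discussed above. It is not NumPy's C code: the function names, the `measure_non_strings` flag, and the warning text are all hypothetical, and the real discover_itemsize() and *_setitem functions work on C-level objects and dtypes rather than Python lists.

```python
import warnings


def discover_itemsize_sketch(items, measure_non_strings=True):
    """Return the string width needed to hold every item.

    Hypothetical stand-in for the idea in the thread, not the real
    discover_itemsize() from ctors.c.
    """
    width = 0
    for obj in items:
        if isinstance(obj, str):
            width = max(width, len(obj))
        elif measure_non_strings:
            # Option 1: size the slot from the eventual string form.
            width = max(width, len(str(obj)))
        else:
            # Option 2: data without a length; -1 means "use dtype=object".
            return -1
    return width


def setitem_sketch(obj, width):
    """Convert obj to text and fit it into a fixed-width slot, warning on truncation."""
    text = obj if isinstance(obj, str) else str(obj)
    if len(text) > width:
        warnings.warn("truncating %r to %d characters" % (text, width))
    return text[:width]


mixed = [1, "abc", 12345]
print(discover_itemsize_sketch(mixed))                             # 5
print(discover_itemsize_sketch(mixed, measure_non_strings=False))  # -1
```

With the first option the slot is wide enough for "12345"; with the second, the caller would see -1 and could fall back to an object array instead of silently truncating.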