I have filed a bug against this, along with a patch that fixes casting to fixed-size string arrays:

http://projects.scipy.org/numpy/ticket/1235

Undefined-sized string arrays are a harder problem, which I'm deferring for later.

Mike

On 09/24/2009 01:19 PM, Michael Droettboom wrote:
> On 09/24/2009 01:02 PM, Christopher Barker wrote:
>> Michael Droettboom wrote:
>>> As I'm looking into fixing a number of bugs in chararray, I'm running
>>> into some surprising behavior.
>>> In [14]: x = np.array(['abcdefgh', 'ijklmnop'], 'O')
>>>
>>> # Without specifying the length, it seems to default to sizeof(int)... ???
>>> In [15]: np.array(x, 'S')
>>> Out[15]:
>>> array(['abcd', 'ijkl'],
>>>       dtype='|S4')
>>
>> This sure looks like a bug, and I'm no expert, but I suspect that it's
>> the size of a pointer (you are on a 32 system -- I am), which makes a
>> bit of sense, as Object arrays store a pointer to the python objects.
>
> That was my guess, too, but I haven't yet delved into the code. I'm on
> 32-bit as well.
>
>> The question is, what should the array constructor do? perhaps the
>> equivalent of:
>>
>> In [41]: np.array(x.tolist())
>> Out[41]:
>> array(['abcdefgh', 'ijklmnop'],
>>       dtype='|S8')
>>
>> which you could use as a work around.
>
> Yes, that's the behaviour I was expecting.
>
>> Do you need to go through object arrays? could you go straight to a
>> string array:
>>
>> np.array(['abcdefgh', 'ijklmnop'], np.string_)
>> Out[35]:
>> array(['abcdefgh', 'ijklmnop'],
>>       dtype='|S8')
>>
>> or just keep the strings in a list.
>
> The background here is that I'm fixing/resurrecting chararray, which
> provides vectorized versions of the standard Python string operations,
> endswith, ljust etc.
>
> I was using object arrays when the length of the output string can't be
> determined ahead of time. For example, the string __mod__ operator. I
> could probably get away with generating a list of strings instead, but
> it's a little bit inconsistent with how I'm doing things elsewhere,
> which is always to generate an array.
>
>> Object arrays are weird, I think there are a lot of corner cases.
>
> Yeah, that's been my experience. But it would be nice to try to plug
> those corner cases up if possible. I'll spend some time investigating
> this particular one.
>
> Cheers,
> Mike
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
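
For anyone who needs a workaround in the meantime, here is a minimal sketch of the two approaches mentioned in the quoted thread: going through tolist(), and casting with an explicit width. It assumes the elements are byte strings (what Python 2 str corresponds to on Python 3). The helper name to_fixed_size is hypothetical and not part of NumPy, and this is not the patch attached to the ticket.

```python
import numpy as np

# Object array of byte strings, as in the quoted example.
x = np.array([b'abcdefgh', b'ijklmnop'], dtype=object)


def to_fixed_size(obj_arr):
    """Hypothetical helper: cast an object array of byte strings to 'S<n>'.

    The width is taken from the longest element so nothing is truncated.
    """
    # Fall back to a width of 1 for empty input; 'S0' is best avoided.
    width = max([len(s) for s in obj_arr.ravel()] + [1])
    return obj_arr.astype('S%d' % width)


# Workaround 1: let the constructor infer the width from a plain list.
via_list = np.array(x.tolist())

# Workaround 2: compute the width up front and cast explicitly.
via_width = to_fixed_size(x)

print(via_list.dtype, via_width.dtype)
```

Both routes avoid asking the constructor to guess an item size from an object array, which is where the truncation in the quoted session came from.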