On 09/24/2009 01:02 PM, Christopher Barker wrote:
> Michael Droettboom wrote:
>
>> As I'm looking into fixing a number of bugs in chararray, I'm running
>> into some surprising behavior.
>> In [14]: x = np.array(['abcdefgh', 'ijklmnop'], 'O')
>>
>> # Without specifying the length, it seems to default to sizeof(int)... ???
>> In [15]: np.array(x, 'S')
>> Out[15]:
>> array(['abcd', 'ijkl'],
>>       dtype='|S4')
>>
> This sure looks like a bug, and I'm no expert, but I suspect that it's
> the size of a pointer (you are on a 32 system -- I am), which makes a
> bit of sense, as Object arrays store a pointer to the python objects.
>
That was my guess, too, but I haven't yet delved into the code. I'm on 32-bit as well.

> The question is, what should the array constructor do? perhaps the
> equivalent of:
>
> In [41]: np.array(x.tolist())
> Out[41]:
> array(['abcdefgh', 'ijklmnop'],
>       dtype='|S8')
>
> which you could use as a work around.
>
Yes, that's the behaviour I was expecting.

> Do you need to go through object arrays? could you go straight to a
> string array:
>
> np.array(['abcdefgh', 'ijklmnop'], np.string_)
> Out[35]:
> array(['abcdefgh', 'ijklmnop'],
>       dtype='|S8')
>
> or just keep the strings in a list.
>
The background here is that I'm fixing/resurrecting chararray, which provides vectorized versions of the standard Python string operations, endswith, ljust etc.
I was using object arrays when the length of the output string can't be determined ahead of time. For example, the string __mod__ operator. I could probably get away with generating a list of strings instead, but it's a little bit inconsistent with how I'm doing things elsewhere, which is always to generate an array.

> Object arrays are weird, I think there are a lot of corner cases.
>
Yeah, that's been my experience. But it would be nice to try to plug those corner cases up if possible. I'll spend some time investigating this particular one.

Cheers,
Mike
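
For reference, the __mod__ case mentioned above can be sketched roughly in plain Python. This is only an illustration of why the output length isn't known ahead of time, not chararray's actual code; the helper names `mod_each` and `fixed_width` are made up for the example. The idea is to compute the results first, then find the width a fixed-length string array would need to hold them without truncation.

```python
def mod_each(templates, values):
    """Apply the string % operator element by element (hypothetical helper)."""
    return [t % v for t, v in zip(templates, values)]


def fixed_width(strings):
    """Smallest item size that holds every string without truncation."""
    return max((len(s) for s in strings), default=0)


templates = ['%d apples', 'item %s']
values = [12345, 'ijklmnop']

# The result lengths depend on the values, so they are only known afterwards.
results = mod_each(templates, values)

# A fixed-width string array built from these results needs this item size;
# anything smaller (e.g. 4) would truncate the strings.
width = fixed_width(results)
```

Building the string array from the list of results, rather than from an object array, is the same route as the `x.tolist()` workaround suggested above.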