On Wed, 2014-01-22 at 07:58 +0100, Dr. Leo wrote:
> Hi,
>
> thanks. Both recarray and itertools.chain work just fine in the example
> case.
>
> However, the real purpose of this is to read strings from a large xml
> file into a pandas DataFrame. But fromiter cannot create arrays of dtype
> 'object'. Fixed length strings may be worth trying. But as the xml
> schema does not guarantee a max. length, and pandas generally uses
> 'object' arrays for strings, I see no better way than creating the array
> through list comprehensions and turning it into a DataFrame.
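
For illustration, the list-based route described in the quoted message could look roughly like the sketch below. This is not code from the thread: `texts` is a hypothetical stand-in for the string values an XML parser would yield, and `strings_to_object_array` is a name made up for this example. It only shows the strings being collected in a list and wrapped in a 1-D array of dtype object.

```python
import numpy as np


def strings_to_object_array(strings):
    """Collect an iterable of str into a 1-D NumPy array of dtype object."""
    # The intermediate list stores only references to the string objects.
    items = list(strings)
    # NumPy does not treat str as a nested sequence, so the result stays 1-D.
    return np.array(items, dtype=object)


# Hypothetical input: variable-length strings, as an XML parser might yield.
texts = ("x" * n for n in range(5))
arr = strings_to_object_array(texts)
```

The resulting array (or the list itself) can then be handed to pandas as a DataFrame column.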

If your datatype is object, I doubt that using an intermediate list is a
real overhead, since the list will use much less memory than the string
objects anyway.

- Sebastian

>
> Maybe a variable length string/unicode type would help in the long term.
>
> Leo
>
> >
> > I would like to write something like:
> >
> > In [25]: iterable=((i, i**2) for i in range(10))
> >
> > In [26]: a=np.fromiter(iterable, int32)
> > ---------------------------------------------------------------------------
> > ValueError                                Traceback (most recent call last)
> > <ipython-input-26-5bcc2e94dbca> in <module>()
> > ----> 1 a=np.fromiter(iterable, int32)
> >
> > ValueError: setting an array element with a sequence.
> >
> >
> > Is there an efficient way to do this?
> >
> Perhaps you could just utilize structured arrays
> (http://docs.scipy.org/doc/numpy/user/basics.rec.html), like:
> iterable= ((i, i**2) for i in range(10))
> a= np.fromiter(iterable, [('a', int32), ('b', int32)], 10)
> a.view(int32).reshape(-1, 2)
>
> You could use itertools:
>
> >>> from itertools import chain
> >>> g = ((i, i**2) for i in range(10))
> >>> import numpy
> >>> numpy.fromiter(chain.from_iterable(g), numpy.int32).reshape(-1, 2)
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
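
As a self-contained sketch of the two suggestions quoted in this thread (a structured dtype, and flattening with itertools.chain), something like the following should run as-is. The helper `pairs` and the field names "a" and "b" are illustrative choices, and `np.int32` is written out explicitly on the assumption that the bare `int32` in the quoted snippets came from a star import.

```python
from itertools import chain

import numpy as np


def pairs():
    """Return a fresh generator of (i, i**2) tuples, as in the question."""
    return ((i, i**2) for i in range(10))


# Option 1: read each tuple as one record with two int32 fields,
# then reinterpret the packed records as a plain (10, 2) int32 array.
record_dtype = np.dtype([("a", np.int32), ("b", np.int32)])
records = np.fromiter(pairs(), dtype=record_dtype, count=10)
via_records = records.view(np.int32).reshape(-1, 2)

# Option 2: flatten the tuples into one stream of scalars, then reshape.
flat = np.fromiter(chain.from_iterable(pairs()), dtype=np.int32)
via_chain = flat.reshape(-1, 2)
```

Both routes give the same (10, 2) array. The structured-dtype version can preallocate because `count` is passed, while the chain version needs no record dtype at all.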