On Jul 9, 2012, at 9:24 PM, Yan Tang wrote:

> Hi,
> 
> I noticed there is an odd issue when I am trying to convert a recarray to 
> list.  See below for the example/test case.
> 
> $ cat a.csv
> date,count
> 2011-07-25,91
> 2011-07-26,118
> $ cat b.csv
> name,count
> foo,1233
> bar,100
> 
> $ python
> 
> >>> from matplotlib import mlab
> >>> import numpy as np
> 
> >>> a = mlab.csv2rec('a.csv')
> >>> b = mlab.csv2rec('b.csv')
> >>> a
> rec.array([(datetime.date(2011, 7, 25), 91), (datetime.date(2011, 7, 26), 
> 118)], 
>       dtype=[('date', '|O8'), ('count', '<i8')])
> >>> b
> rec.array([('foo', 1233), ('bar', 100)], 
>       dtype=[('name', '|S3'), ('count', '<i8')])
> 
> 
> >>> np.array(a.tolist()).tolist()
> [[datetime.date(2011, 7, 25), 91], [datetime.date(2011, 7, 26), 118]]
> >>> np.array(b.tolist()).tolist()
> [['foo', '1233'], ['bar', '100']]
> 
> 
> The odd case is, 1233 becomes a string '1233' in the second command.  But 91 
> is still a number 91.
> 
> Why would this happen?  What's the correct way to do this conversion?

You are trying to convert the record array into a list of lists, I presume?
The tolist() method on the rec.array produces a list of tuples.  Check first
whether a list of tuples already satisfies your requirements --- it might.

Passing this back to np.array makes it look for a single data-type that can
hold every element in the list of tuples.  You are relying here on np.array's
"intelligence" to figure out what kind of array you have.  It tries to do its
best, but it is limited to determining a "primitive" data-type (float, int,
string, object).  It can't always predict what you expect --- especially when
the original data source was a record array like this.  In the first case,
because of the datetime.date objects, it decides the data is an "object"
array, which leaves the integers untouched.  In the second it decides that the
data can all be represented as a "string" and so chooses that, which is why
1233 comes back as '1233'.  The second .tolist() just produces a list of lists
out of the 2-d array.
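
A rough sketch of that inference, for illustration only.  The rows below are
typed in by hand to match the a.tolist() and b.tolist() output from the
question rather than read with mlab.csv2rec, and the variable names are
made up.  The string dtype that np.array picks may differ in width between
NumPy versions, so only its kind is checked here:

```python
import datetime

import numpy as np

# Hand-written rows shaped like a.tolist() and b.tolist() in the question.
rows_a = [(datetime.date(2011, 7, 25), 91), (datetime.date(2011, 7, 26), 118)]
rows_b = [("foo", 1233), ("bar", 100)]

# np.array has to pick ONE dtype for each whole 2-d array.
arr_a = np.array(rows_a)  # dates are arbitrary Python objects -> object dtype
arr_b = np.array(rows_b)  # str and int can both be stored as strings

print(arr_a.dtype.kind)  # 'O' (object)
print(arr_b.dtype.kind)  # a string kind: 'U' on Python 3

# Converting back shows the effect: the ints survive only in the object array.
round_trip_a = arr_a.tolist()
round_trip_b = arr_b.tolist()
print(round_trip_a)
print(round_trip_b)
```

The integers in round_trip_a are still integers because an object array just
holds references to the original Python objects, while every element of arr_b
was converted to a string when the array was built.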

Likely what you want to do is just create a list of lists directly from the
original output of .tolist(), like this:

[list(x) for x in a.tolist()]
[list(x) for x in b.tolist()]

This will be faster as well, since it skips building the intermediate array.
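
A self-contained version of the same idea, as a sketch.  It assumes a record
array built by hand with np.rec.array instead of mlab.csv2rec, and a 'U3'
name field (the original output shows '|S3', which would give bytes on
Python 3); the names b, rows and counts are only for illustration:

```python
import numpy as np

# Stand-in for the record array that csv2rec returned for b.csv.
b = np.rec.array(
    [("foo", 1233), ("bar", 100)],
    dtype=[("name", "U3"), ("count", "i8")],
)

# tolist() on a record array yields one tuple per record, with each field
# converted to the matching Python type (str, int, ...).
rows = [list(rec) for rec in b.tolist()]
print(rows)

# A single column can also be pulled out by field name.
counts = b["count"].tolist()
print(counts)
```

Since no homogeneous 2-d array is ever created, each field keeps its own type.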

Best, 

-Travis

> 
> Thanks.
> 
> -uris-
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

