Re: [Numpy-discussion] Numpy Memory Error with corrcoef

Pierre Haessig Tue, 27 Mar 2012 02:38:39 -0700

Hi Nicole,

On 27/03/2012 11:12, Nicole Stoffels wrote:
> if __name__ == '__main__':
>
>     data_records = random.random((459375, 24))
>     correlation = corrcoef(data_records)


May I assume that your data_records array holds 24 different variables,
of which you have 459375 observations?

If so, and if you expect corrcoef to return a 24*24 matrix, you need to
either transpose data_records:

>>> correlation = corrcoef(data_records.T)

or use the rowvar=0 argument (see the np.corrcoef or np.cov docstring):

>>> correlation = corrcoef(data_records, rowvar = 0)

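Both work on my computer, while your example indeed leads to a
MemoryError: with rowvar left at its default, each of the 459375 rows is
treated as a variable, so the result would be a 459375*459375 matrix,
which is far too big to fit in memory.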
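
To make this concrete, here is a small sketch. It assumes float64 data and
uses a reduced row count (1000, an arbitrary choice of mine) so that it runs
quickly; only the shape (…, 24) comes from the original post. It checks that
both calls give the same 24*24 result and estimates the memory the default
call would need.

```python
import numpy as np

# Smaller stand-in for the 459375 x 24 array from the original post
# (the row count of 1000 is an arbitrary choice to keep the demo fast).
rng = np.random.default_rng(0)
data_records = rng.random((1000, 24))

# Columns as variables: either transpose, or pass rowvar=False.
corr_t = np.corrcoef(data_records.T)
corr_rowvar = np.corrcoef(data_records, rowvar=False)

print(corr_t.shape)                      # (24, 24)
print(np.allclose(corr_t, corr_rowvar))  # True

# Size of the result if each of the 459375 rows were treated as a
# variable, assuming float64 (8 bytes per element).
n_rows = 459375
bytes_needed = n_rows * n_rows * 8
print(bytes_needed / 1e12)               # roughly 1.69 (terabytes)
```

So the default call asks for well over a terabyte for the output alone,
which explains the MemoryError.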

I don't know if it's your case, but for those used to the Matlab (and
textbook) convention of storing variables in columns, the default
behaviour of numpy's covariance function is a bit surprising. I guess
historical reasons are involved in this choice. Just a matter of getting
used to it!

Best,
Pierre


_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
