Hi Pierre,

Thanks for the fast answer!

I actually have timeseries of 24 hours for 459375 gridpoints in Europe. The timeseries of every grid point is stored in a column. That's why in my real program I already transposed the data, so that the correlation is made column by column. What I finally need is the correlation of each gridpoint with every other gridpoint. I'm afraid that this results in a 459375*459375 matrix.

The correlation is actually just an interim result. So I'm currently trying to loop over every gridpoint to get single correlations which will then be processed further. Is this the right approach?

for column in range(len(data_records)):
    for columnnumber in range(len(data_records)):
        correlation = corrcoef(data_records[column], data_records[columnnumber])
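
One possible alternative to the pairwise loop, as a rough sketch only: standardize every row once, then obtain the correlations of a block of gridpoints against all gridpoints with a single matrix product, so the full matrix is never held in memory. The helper names (standardize_rows, correlation_blocks) and the block_size value are made up for illustration, and the small random array stands in for the real data, assumed to have one gridpoint per row.

```python
import numpy as np

def standardize_rows(data):
    """Center each row and scale it to unit Euclidean norm.

    Note: a constant row has zero norm and would produce NaN here.
    """
    centered = data - data.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    return centered / norms

def correlation_blocks(data, block_size=100):
    """Yield (start, block), where block[i, j] is the Pearson correlation
    between row start + i and row j of data."""
    z = standardize_rows(np.asarray(data, dtype=float))
    for start in range(0, z.shape[0], block_size):
        # Dot products of unit-norm, centered rows are correlation coefficients.
        yield start, z[start:start + block_size] @ z.T

rng = np.random.default_rng(0)
data_records = rng.random((500, 24))  # small stand-in: 500 gridpoints, 24 hours

for start, block in correlation_blocks(data_records, block_size=100):
    pass  # process this block of correlations, then let it be discarded
```

Each block holds block_size * number_of_gridpoints floats, so block_size would have to be chosen to fit the available memory.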

Best wishes,
Nicole

On 27.03.2012 11:38, Pierre Haessig wrote:
Hi Nicole,

On 27/03/2012 11:12, Nicole Stoffels wrote:
if __name__ == '__main__':

    data_records = random.random((459375, 24))
    correlation = corrcoef(data_records)

May I assume that your data_records is made of 24 different variables, of which you have 459375 observations?

If this is so and if you expect corrcoef to return a 24*24 matrix, you need to either transpose data_records:

>>> correlation = corrcoef(data_records.T)

or use the rowvar=0 argument (see np.corrcoef or np.cov docstring)

>>> correlation = corrcoef(data_records, rowvar = 0)

Both work on my computer, while your example indeed leads to a MemoryError (because shape 459375*459375 would be a decently big matrix...)
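
As a quick illustration of the two points above, here is a small sketch on a reduced array (1000 observations instead of 459375, chosen only to keep it fast) showing that both calls return a 24*24 matrix, followed by a back-of-the-envelope size estimate for the 459375*459375 case, assuming 8-byte float64 entries.

```python
import numpy as np

rng = np.random.default_rng(0)
small = rng.random((1000, 24))  # 1000 observations of 24 variables

# Both conventions treat the 24 columns as the variables.
by_transpose = np.corrcoef(small.T)
by_rowvar = np.corrcoef(small, rowvar=False)
print(by_transpose.shape)  # (24, 24)

# Size of a full 459375 x 459375 float64 matrix, in bytes.
n = 459375
full_bytes = n * n * 8
print(full_bytes / 1e12)  # roughly 1.69 (terabytes)
```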

I don't know if it's your case, but for those used to the Matlab (and textbook) convention of having variables stored in columns, the default behaviour of numpy's covariance function is a bit surprising. I guess historical reasons are involved in this choice. Just a matter of getting used to it!

Best,
Pierre


_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

--

Dipl.-Met. Nicole Stoffels

Wind Power Forecasting and Simulation

ForWind - Center for Wind Energy Research
Institute of Physics
Carl von Ossietzky University Oldenburg

Ammerländer Heerstr. 136
D-26129 Oldenburg

Tel: +49(0)441 798 - 5079
Fax: +49(0)441 798 - 5099

Web  : www.ForWind.de
Email: nicole.stoff...@forwind.de
