Hi Pierre,
thanks for the fast answer!
I actually have timeseries of 24 hours for 459375 gridpoints in Europe.
The timeseries of every grid point is stored in a column. That's why in
my real program I already transposed the data, so that the correlation
is made column by column. What I finally need is the correlation of each
gridpoint with every other gridpoint. I'm afraid that this results in a
459375*459375 matrix.
The correlation is actually just an interim result. So I'm currently
trying to loop over every gridpoint to get single correlations which
will then be processed further. Is this the right approach?
    for column in range(len(data_records)):
        for columnnumber in range(len(data_records)):
            correlation = corrcoef(data_records[column],
                                   data_records[columnnumber])
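
To make the question more concrete, here is a minimal sketch of a block-wise variant of that double loop. It assumes the array holds one gridpoint per row (459375 x 24) and that no time series is constant, since a constant series would give a zero norm. The names standardize_rows, correlation_blocks and block_size are made up for illustration, and a small random array stands in for the real data:

```python
import numpy as np

def standardize_rows(data):
    """Center each row and scale it to unit Euclidean norm.

    Assumes no row is constant (a constant row has zero norm).
    """
    centered = data - data.mean(axis=1, keepdims=True)
    norms = np.sqrt((centered ** 2).sum(axis=1, keepdims=True))
    return centered / norms

def correlation_blocks(data, block_size=100):
    """Yield (start, stop, block), where block[i, j] is the Pearson
    correlation of row start + i with row j of data."""
    z = standardize_rows(data)
    for start in range(0, z.shape[0], block_size):
        stop = min(start + block_size, z.shape[0])
        # For standardized rows, the dot product equals the correlation.
        yield start, stop, np.dot(z[start:stop], z.T)

# Small stand-in for the real (459375, 24) array.
rng = np.random.default_rng(0)
data_records = rng.random((500, 24))

rows_done = 0
for start, stop, block in correlation_blocks(data_records, block_size=128):
    # Further processing of this block would go here; the block is
    # discarded afterwards, so the full matrix is never held in memory.
    rows_done += block.shape[0]
```

With the full data, each block would hold block_size x 459375 values, so block_size has to be chosen to fit the available memory.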
Best wishes,
Nicole
On 27.03.2012 11:38, Pierre Haessig wrote:
Hi Nicole,
On 27/03/2012 11:12, Nicole Stoffels wrote:
    if __name__ == '__main__':
        data_records = random.random((459375, 24))
        correlation = corrcoef(data_records)
May I assume that your data_records is made of 24 different variables,
of which you have 459375 observations?
If this is so, and if you expect corrcoef to return a 24*24 matrix,
you need to either transpose data_records:
>>> correlation = corrcoef(data_records.T)
or use the rowvar=0 argument (see np.corrcoef or np.cov docstring)
>>> correlation = corrcoef(data_records, rowvar = 0)
Both work on my computer, while your example indeed leads to a
MemoryError (because shape 459375*459375 would be a decently big
matrix...)
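
As a rough check of these sizes, here is a small sketch, assuming float64 entries of 8 bytes each. The array named sample is only a small random stand-in, not the real data:

```python
import numpy as np

n_points, n_hours = 459375, 24
bytes_per_value = np.dtype(np.float64).itemsize  # 8 bytes per float64

# Memory needed for the two possible result shapes.
full_bytes = n_points * n_points * bytes_per_value   # 459375 x 459375
small_bytes = n_hours * n_hours * bytes_per_value    # 24 x 24
print(full_bytes / 1e12, "TB")   # roughly 1.7 TB
print(small_bytes, "bytes")

# Both ways of treating columns as variables give the same 24 x 24 result.
rng = np.random.default_rng(0)
sample = rng.random((1000, n_hours))
by_transpose = np.corrcoef(sample.T)
by_rowvar = np.corrcoef(sample, rowvar=False)
print(by_transpose.shape, np.allclose(by_transpose, by_rowvar))
```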
I don't know if it's your case, but for those used to the Matlab (and
textbooks) convention of having variables stored in columns, the
default behaviour of numpy's covariance function is a bit surprising.
I guess historical reasons are involved in this choice. Just a matter
of getting used to it !
Best,
Pierre
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
--
Dipl.-Met. Nicole Stoffels
Wind Power Forecasting and Simulation
ForWind - Center for Wind Energy Research
Institute of Physics
Carl von Ossietzky University Oldenburg
Ammerländer Heerstr. 136
D-26129 Oldenburg
Tel: +49(0)441 798 - 5079
Fax: +49(0)441 798 - 5099
Web : www.ForWind.de
Email: nicole.stoff...@forwind.de