On 27 March 2012 06:04, Nicole Stoffels <nicole.stoff...@forwind.de> wrote:
> Hi Pierre,
>
> thanks for the fast answer!
>
> I actually have time series of 24 hours for 459375 grid points in
> Europe. The time series of every grid point is stored in a column.
> That's why in my real program I already transposed the data, so that
> the correlation is made column by column. What I finally need is the
> correlation of each grid point with every other grid point. I'm afraid
> that this results in a 459375*459375 matrix.
>
> The correlation is actually just an interim result. So I'm currently
> trying to loop over every grid point to get single correlations, which
> will then be processed further. Is this the right approach?
>
> for column in range(len(data_records)):
>     for columnnumber in range(len(data_records)):
>         correlation = corrcoef(data_records[column],
>                                data_records[columnnumber])
>
> Best wishes,
> Nicole

It may be painfully slow... You should make sure you don't compute each
off-diagonal element twice. Also, if all your computations can be
vectorized, you'll probably get a significant performance boost by
computing your matrix by blocks instead of element by element. Take
blocks as big as can fit in memory.

-=- Olivier
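P.S. For concreteness, here is a rough, untested sketch of the block
idea. The function name and block size are only illustrative, and the
in-memory result array is just to show the indexing: with 459375 grid
points, the full float64 matrix would be on the order of 1.7 TB, so in
practice you would process each block further and discard it rather
than store the whole thing.

import numpy as np

def blockwise_corr(data, block=1000):
    """Correlation of every row with every other row, block by block.

    data  : 2-D array, one time series per row (e.g. 459375 x 24).
    block : rows handled per chunk; pick it so two chunks fit in memory.
    """
    n = data.shape[0]
    # Standardize each row once (zero mean, unit norm), so a Pearson
    # correlation reduces to a plain dot product.
    z = data - data.mean(axis=1)[:, np.newaxis]
    z /= np.sqrt((z ** 2).sum(axis=1))[:, np.newaxis]

    corr = np.empty((n, n))  # caution: 459375**2 float64 values!
    for i in range(0, n, block):
        for j in range(i, n, block):       # upper triangle only
            c = np.dot(z[i:i + block], z[j:j + block].T)
            corr[i:i + block, j:j + block] = c
            if i != j:
                # The matrix is symmetric: mirror the block instead
                # of recomputing it.
                corr[j:j + block, i:i + block] = c.T
    return corr

Each np.dot computes a whole block-by-block tile of correlations in one
vectorized call, and the symmetry trick halves the work compared with
the double loop over single pairs.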