On 27 March 2012 at 06:04, Nicole Stoffels <nicole.stoff...@forwind.de> wrote:

> Hi Pierre,
>
> thanks for the fast answer!
>
> I actually have time series of 24 hours for 459375 grid points in Europe.
> The time series of every grid point is stored in a column. That's why, in
> my real program, I already transposed the data, so that the correlation is
> computed column by column. What I finally need is the correlation of each
> grid point with every other grid point. I'm afraid that this results in a
> 459375*459375 matrix.
>
> The correlation is actually just an interim result. So I'm currently
> trying to loop over every grid point to get single correlations, which
> will then be processed further. Is this the right approach?
>
> from numpy import corrcoef
>
> for i in range(len(data_records)):
>     for j in range(len(data_records)):
>         # corrcoef returns a 2x2 matrix; the [0, 1] entry is the
>         # correlation coefficient itself
>         correlation = corrcoef(data_records[i], data_records[j])[0, 1]
>
> Best wishes,
> Nicole
>

It may be painfully slow... You should at least make sure you don't compute
each off-diagonal element twice: the correlation matrix is symmetric, so
only the upper (or lower) triangle is needed.
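
For instance, an untested sketch of the symmetric loop (assuming
data_records is an (n_points, 24) array, as in your snippet):

import numpy as np

n = len(data_records)
for i in range(n):
    for j in range(i, n):  # start at i: corr(i, j) == corr(j, i)
        correlation = np.corrcoef(data_records[i], data_records[j])[0, 1]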
Also, if all your computations can be vectorized, you'll probably get a
significant performance boost by computing your matrix in blocks instead of
element by element. Take blocks as big as will fit in memory.
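
Roughly like this, as an untested sketch (the block size is arbitrary, and
the normalization is just the standard trick of reducing correlation to a
dot product):

import numpy as np

def correlation_blocks(data, block_size=1000):
    # Normalize each row once (subtract the mean, divide by the norm);
    # the correlation of two normalized rows is then just their dot product.
    z = data - data.mean(axis=1)[:, np.newaxis]
    z /= np.sqrt((z ** 2).sum(axis=1))[:, np.newaxis]
    n = len(z)
    for i in range(0, n, block_size):
        for j in range(i, n, block_size):  # upper-triangular blocks only
            # One dot product yields a whole block of correlations at once.
            yield i, j, np.dot(z[i:i + block_size], z[j:j + block_size].T)

Each yielded block can be processed and discarded, so the full
459375*459375 matrix never has to sit in memory at once.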

-=- Olivier
