Thanks for your reply. The simplified code is as follows. It takes 7 seconds
to process 1000 rows, which is tolerable, but I wonder why it takes so long.
Aren't vectorized operations supposed to run very quickly?

import numpy as np

componentcount = 300000
currSum = np.zeros(componentcount)   # running sum over the columns
row = np.zeros(componentcount)       # buffer for the current row
rowcount = 50000
for i in range(rowcount):
    row[:] = 1                       # stand-in for reading one real row
    currSum = currSum + row          # builds a fresh 300k-element result each pass
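
In case it is useful, here is a self-contained timing sketch of the same loop.
My guess (only a guess) is that currSum = currSum + row pays for allocating a
fresh 300000-element result array on every pass, so an in-place += variant is
included for comparison; the iteration count of 1000 is arbitrary.

import time
import numpy as np

componentcount = 300000
iterations = 1000                    # arbitrary; enough to show a difference

row = np.ones(componentcount)

# Variant A: rebinding -- each pass allocates a new result array.
currSum = np.zeros(componentcount)
t0 = time.time()
for i in range(iterations):
    currSum = currSum + row
print("rebinding: %.2f s" % (time.time() - t0))

# Variant B: in-place -- accumulates into the existing buffer.
currSum = np.zeros(componentcount)
t0 = time.time()
for i in range(iterations):
    currSum += row
print("in-place:  %.2f s" % (time.time() - t0))

Once currSum is filled, the column ordering I am after should just be
np.argsort(currSum), so the loop above is really the whole cost.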



On 11/30/06, Robert Kern <[EMAIL PROTECTED]> wrote:

Fang Fang wrote:
> Hi,
>
> I am writing code to sort the columns according to the sum of each
> column. The dataset is huge (50k rows x 300k cols), so I have to read
> line by line and do the summation to avoid the out-of-memory problem.
> But I don't know why it runs very slowly, and part of the code is as
> follows. Can anyone point out what needs to be modified to make it run
> faster? Thanks in advance!

Nothing leaps out. Generally, it's difficult (or impossible) to answer such
questions without running code. Can you distill the time-consuming part into a
small, self-contained script with fake data that we can run?

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma
that is made terrible by our own mad attempt to interpret it as though it
had
an underlying truth."
-- Umberto Eco