On Tue, Oct 08, 2013 at 01:49:14AM -0700, Matthew Brett wrote:
> Hi,
>
> On Tue, Oct 8, 2013 at 1:06 AM, Ke Sun <[email protected]> wrote:
> > Dear all,
> >
> > I have written the following function to compute the squared distances
> > of a large matrix (each sample is a row). It computes row by row and
> > prints the overall progress. The progress output is important and I
> > didn't use matrix multiplication.
> >
> > I give as input a 70,000x800 matrix. The output should be a
> > 70,000x70,000 matrix. The program runs really slowly (16 hours for
> > 1/3 progress). And it eats 36G of memory (fortunately I have enough).
>
> That is very slow.
>
> As a matter of interest - why didn't you use matrix multiplication?

Because it will cost hours and I want to see the progress and know how far
it has gone. Another concern is to save memory by computing one sample at
a time.
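
The two concerns above (visible progress and bounded working memory) do not rule out matrix multiplication. A minimal sketch, assuming the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b and processing a block of rows at a time; the function name `blockwise_sq_dists` and its parameters `block_rows`, `out`, and `report` are hypothetical and not taken from the original code in this thread:

```python
import numpy as np

def blockwise_sq_dists(X, block_rows=1000, out=None, report=print):
    """Squared Euclidean distances between all rows of X, one block at a time.

    Hypothetical helper for illustration; not the function from the thread.
    """
    n = X.shape[0]
    if out is None:
        # The full n x n result still has to live somewhere.
        out = np.empty((n, n), dtype=X.dtype)
    # ||x_i||^2 for every row, computed once.
    sq_norms = np.einsum("ij,ij->i", X, X)
    for start in range(0, n, block_rows):
        stop = min(start + block_rows, n)
        block = X[start:stop]
        # The cross term is a single matrix product per block.
        d = sq_norms[start:stop, None] + sq_norms[None, :] - 2.0 * np.dot(block, X.T)
        # Rounding can leave tiny negative values; clip them to zero.
        np.maximum(d, 0.0, out=d)
        out[start:stop] = d
        report("%d/%d rows done" % (stop, n))
    return out
```

Only one `block_rows` x n temporary is held besides the output, and a progress line is emitted after each block. Passing an `np.memmap` array as `out` would be one way to keep the full result on disk rather than in RAM, though that has not been tried against the data discussed here.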

> On a machine I had access to it took about 20 minutes.

How? I am using matrix multiplication (the same code as
http://stackoverflow.com/a/4856692) and it runs for around 18 hours.

Best,
Ke Sun

> You've got a 70000 by 70000 element output matrix so I think that's
> 37G already (if the matrix is double precision float).
>
> > Could you give some insights on how to modify the code to be efficient and
> > to eat less memory?
>
> You could try using Cython - but I'm guessing that the BLAS routines
> in numpy will already do this the most efficient way.
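
The memory figures quoted in this thread (36G observed, 37G estimated) can be checked with a short calculation. This is only a sketch of the arithmetic for a dense array; the helper name `dense_matrix_gib` is hypothetical:

```python
import numpy as np

def dense_matrix_gib(n_rows, n_cols, dtype=np.float64):
    """Size in GiB (2**30 bytes) of a dense n_rows x n_cols array."""
    n_bytes = n_rows * n_cols * np.dtype(dtype).itemsize
    return n_bytes / 2.0**30

# 70000 x 70000 output matrix in double and single precision.
print("float64: %.1f GiB" % dense_matrix_gib(70000, 70000))
print("float32: %.1f GiB" % dense_matrix_gib(70000, 70000, np.float32))
```

In double precision the output alone is 70000 * 70000 * 8 = 39.2e9 bytes, about 36.5 GiB, which matches the reported usage. Storing the result as single precision would halve that, at the cost of accuracy.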
