Re: [R] kmeans clustering on large but sparse matrix

2013-12-06 Thread Wuming Gong
Hi Lishu, I ran into a similar large-scale problem recently. I used a parallel SGD k-means, described in this paper, for my problem: http://www.eecs.tufts.edu/~dsculley/papers/fastkmeans.pdf Let n be the number of samples, k be the number of clusters, and m be the number of nodes, 1. First, each node r
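
As a rough illustration of the mini-batch (SGD) k-means update in the linked paper, here is a minimal single-node sketch in pure Python. It does not reproduce the parallel scheme from the message. The function names (`sq_dist`, `nearest`, `minibatch_kmeans`), the default parameters, and the dense list-of-lists input are assumptions made for this example; large sparse data would need sparse row storage instead.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, x):
    """Index of the center closest to x."""
    return min(range(len(centers)), key=lambda j: sq_dist(centers[j], x))


def minibatch_kmeans(data, k, batch_size=10, iters=100, seed=0):
    """Mini-batch k-means with a per-center learning rate of 1/count.

    Illustrative sketch only: names and defaults are hypothetical.
    """
    rng = random.Random(seed)
    # Initialise the centers with k distinct samples drawn from the data.
    centers = [list(x) for x in rng.sample(data, k)]
    counts = [0] * k  # number of points each center has absorbed so far
    for _ in range(iters):
        batch = [rng.choice(data) for _ in range(batch_size)]
        # Assign the whole batch first, then apply the gradient steps.
        assigned = [nearest(centers, x) for x in batch]
        for x, j in zip(batch, assigned):
            counts[j] += 1
            eta = 1.0 / counts[j]  # step size shrinks as the center matures
            centers[j] = [(1 - eta) * c + eta * xi
                          for c, xi in zip(centers[j], x)]
    return centers
```

Each step is a convex combination of the old center and a sample, so the centers always stay inside the range of the data. The per-center step size 1/count makes each center a running mean of the points assigned to it.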

[R] kmeans clustering on large but sparse matrix

2012-01-18 Thread Lishu Liu
Hi, I have a 60k*600k matrix, which exceeds the vector length limit of 2^31-1. But it's rather sparse: only 0.02% of the entries are nonzero. So I saved it as a Matrix Market (mm) file, which is about 300 MB in size. I use readMM in the Matrix package to read it in. When I do so, the data type becomes dgTMatrix in 'Matrix' package i
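
The sizes quoted above can be checked with a few lines of integer arithmetic. This is a back-of-the-envelope sketch: the variable names are made up for the example, and the 16 bytes per entry assumes triplet storage with two 4-byte integer indices and one 8-byte double, ignoring any object overhead.

```python
rows, cols = 60_000, 600_000

cells = rows * cols              # total entries in the dense matrix
int32_max = 2**31 - 1            # length limit of an ordinary R vector

# 0.02% of the entries are nonzero; use integer arithmetic to stay exact.
nonzeros = cells * 2 // 10_000

# Assumed triplet storage: row index (4 B) + column index (4 B) + value (8 B).
triplet_bytes = nonzeros * (4 + 4 + 8)

print(cells, cells > int32_max)  # dense form is far beyond the limit
print(nonzeros)                  # about 7.2 million stored values
print(triplet_bytes / 1e6)       # approximate size in MB
```

Under these assumptions the nonzero entries alone take on the order of a hundred megabytes in memory, so the sparse representation fits comfortably even though the dense matrix cannot be allocated as a single vector.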