Hi, I have a 60k x 600k matrix, which exceeds R's vector length limit of 2^32-1 elements. It is very sparse, though: only 0.02% of the entries are nonzero. So I saved it as a MatrixMarket (mm) file, about 300 MB in size, and read it in with readMM() from the Matrix package. Read this way, the object is a dgTMatrix from the 'Matrix' package rather than an ordinary dense matrix.
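Not part of the original question, but a minimal reproducible sketch of the readMM() round trip described above (the matrix dimensions and values here are toy stand-ins, not the real data):

```r
library(Matrix)

# Small stand-in for the sparse 60k x 600k matrix (~0.02% nonzero).
m <- sparseMatrix(i = c(1, 50, 200), j = c(3, 400, 999),
                  x = c(1.5, 2.0, 3.0), dims = c(600, 6000))

f <- tempfile(fileext = ".mtx")
writeMM(m, f)     # writes MatrixMarket coordinate (triplet) format
m2 <- readMM(f)   # readMM returns a triplet-form sparse matrix (dgTMatrix)
class(m2)
```

Storing only the (row, column, value) triplets is what keeps the file at ~300 MB instead of the ~70 GB a dense CSV would need.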
The problem is that if I run k-means on only part of the data, so that the vector length stays below 2^32-1, it works fine; kmeans() in R accepts this type of matrix. But if I run it on the entire matrix, R says "too many elements specified." I have considered the 'bigmemory' and 'biganalytics' packages, but saving the sparse matrix as a plain CSV file would take approximately 70 GB, with 99.98% of it being zeros. I just don't think it's necessary or efficient to treat it as a dense matrix. Is there any way to deal with the vector length limit? Can I split the whole matrix into smaller pieces and then do k-means? Thanks, Lishu ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
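One hedged sketch of the "split, then k-means" idea asked about above: cluster each row block separately, then run k-means again on the pooled block centers. Note this is an approximation to k-means on the full matrix, not an exact equivalent, and the block size and k below are illustrative choices, not values from the original post. Base kmeans() densifies its input, so only one block is converted with as.matrix() at a time:

```r
library(Matrix)
set.seed(1)

# Toy sparse data standing in for the 60k x 600k matrix.
X <- sparseMatrix(i = sample(1:400, 4000, replace = TRUE),
                  j = sample(1:50, 4000, replace = TRUE),
                  x = rnorm(4000), dims = c(400, 50))

k <- 4            # illustrative number of clusters
block_size <- 100 # illustrative: pick so one dense block fits in memory
blocks <- split(seq_len(nrow(X)),
                ceiling(seq_len(nrow(X)) / block_size))

# Stage 1: k-means within each row block, densifying one block at a time.
centers_list <- lapply(blocks, function(idx) {
  kmeans(as.matrix(X[idx, , drop = FALSE]), centers = k, nstart = 3)$centers
})

# Stage 2: cluster the pooled per-block centers to get global centers.
all_centers <- do.call(rbind, centers_list)
global <- kmeans(all_centers, centers = k, nstart = 5)
dim(global$centers)   # k x ncol(X)
```

Points can then be assigned to the nearest global center block by block, again touching only one dense block at a time.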