Hi,

I have a 60,000 x 600,000 matrix, which exceeds R's vector length limit of
2^32-1 elements. But it's rather sparse: only 0.02% of the entries are
nonzero. So I saved it as a MatrixMarket (.mtx) file, which is about 300 MB,
and read it back in with readMM() from the 'Matrix' package. Doing so gives
an object of class dgTMatrix from 'Matrix' instead of the common dense
matrix type.
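For concreteness, here is a minimal sketch of the round trip (tiny stand-in
dimensions, not the real data; "example.mtx" is just a placeholder filename):

```r
library(Matrix)

## build a small sparse matrix and write it out in MatrixMarket format
m <- sparseMatrix(i = c(1, 3, 5), j = c(2, 4, 1),
                  x = c(1.5, 2.0, 3.0), dims = c(6, 5))
writeMM(m, "example.mtx")

## readMM() returns the triplet (coordinate) representation,
## an object of class dgTMatrix
x <- readMM("example.mtx")
class(x)
```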

The problem is this: if I run k-means on only part of the data, so that the
vector length does not exceed 2^32-1, there's no problem at all, which
suggests that kmeans() in R can handle this type of matrix. But if I run it
on the entire matrix, R stops with "too many elements specified".

I have considered the 'bigmemory' and 'biganalytics' packages, but saving
the sparse matrix as an ordinary CSV file would take approximately 70 GB,
with 99.98% of the entries being zero. I just don't think it's necessary or
efficient to treat it as a dense matrix.

Is there any way to deal with the vector length limit? Can I split the whole
matrix into smaller ones and then do k-means?
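To show what I mean by working on the sparse form directly, here is a rough,
untested sketch of one Lloyd iteration done with sparse algebra, so the full
matrix never has to be densified. Here `x` is assumed to be the matrix
converted to column-compressed form, `centers` a small dense k x ncol(x)
matrix, and the function names are just placeholders of mine:

```r
library(Matrix)

## x: dgCMatrix, e.g. as(readMM("example.mtx"), "CsparseMatrix")
## centers: dense k x ncol(x) matrix of current cluster centers

assign_clusters <- function(x, centers) {
  ## squared Euclidean distance |x|^2 - 2*x*c' + |c|^2;
  ## x %*% t(centers) is sparse-times-dense, so x stays sparse
  cross <- as.matrix(x %*% t(centers))               # n x k
  d2 <- rowSums(x^2) - 2 * cross +
        matrix(rowSums(centers^2), nrow(x), nrow(centers), byrow = TRUE)
  max.col(-d2)                                       # index of nearest center
}

update_centers <- function(x, cl, k) {
  ## mean of the rows in each cluster, again via sparse algebra;
  ## note: an empty cluster would give an NaN row here
  ind <- sparseMatrix(i = cl, j = seq_along(cl), x = 1,
                      dims = c(k, length(cl)))
  as.matrix((ind %*% x) / rowSums(ind))
}
```

Iterating these two steps until the assignments stop changing would be
Lloyd's algorithm, but I don't know if this is the recommended approach.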



Thanks,
Lishu

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.