Try the flow cytometry clustering functions in Bioconductor. -thomas
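
To have a baseline to compare any other method against, a minimal reproducible sketch of the known-cluster silhouette experiment described in the message below might look like the following. It is only an illustration, not the original poster's setup: the group count, series length, 10% flip rate, and all variable names are assumptions, and it uses clara() from the recommended 'cluster' package rather than any Bioconductor function.

```r
library(cluster)  # recommended package shipped with R; provides clara()

set.seed(1)

# Hypothetical test data: 3 known groups of binary series, 50 time points each.
n_groups <- 3
n_per_group <- 200
n_time <- 50
prototypes <- matrix(rbinom(n_groups * n_time, 1, 0.5), nrow = n_groups)
truth <- rep(seq_len(n_groups), each = n_per_group)

# Each series is its group's prototype with every bit flipped with probability 0.1.
flips <- matrix(rbinom(length(truth) * n_time, 1, 0.1), ncol = n_time)
x <- abs(prototypes[truth, ] - flips)  # element-wise XOR of 0/1 values
storage.mode(x) <- "double"

# Try a range of k. Manhattan distance on 0/1 data equals Hamming distance.
# Note that clara() reports silhouette information for its best subsample only,
# not for the full data set.
ks <- 2:8
avg_width <- sapply(ks, function(k) {
  clara(x, k, metric = "manhattan", samples = 20)$silinfo$avg.width
})

# Candidate number of clusters: the k with the largest average silhouette width.
best_k <- ks[which.max(avg_width)]
print(data.frame(k = ks, avg_width = round(avg_width, 3)))
```

Comparing best_k with the known number of groups over several seeds and noise levels gives a concrete measure of how well or badly the silhouette criterion does on data of this kind.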
On Thu, Aug 11, 2011 at 7:07 AM, Ken Hutchison <vicvoncas...@gmail.com> wrote:
> Hello all,
> I am using the clustering functions in R in order to work with large
> masses of binary time series data, however the clustering functions do not
> seem able to fit this size of practical problem. Library 'hclust' is good
> (though it may be sub par for this size of problem, thus doubly poor for
> this application) in that I do not want to make assumptions about the number
> of clusters present, also due to computational resources and time hclust is
> not functionally good enough; furthermore k-means works fine assuming the
> number of clusters within the data, which is not realistic. The silhouette
> functions in 'Pam' and 'Clara' and (if I remember correctly) 'cluster' seem
> to be really bad through very thorough experimentation of data generation
> with known clusters. I am left then with either theoretical abstractions
> such as pruning hclust trees with minimal spanning trees or perhaps
> hand-rolling a hierarchical k-medoids which works extremely efficiently and
> without cluster number assumptions. Anybody have any suggestions as to
> possible libraries which I have missed or suggestions in general? Note: this
> is not a question for 'Bigkmeans' unless there exists a
> 'findbigkmeansnumberofclusters' function also.
> Thank you in advance for your assistance,
> Ken
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland