Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Christian Hennig
PS to my previous posting: Also have a look at kmeansruns in fpc. This runs kmeans for several numbers of clusters and decides the number of clusters by either Calinski&Harabasz or Average Silhouette Width. Christian On Wed, 10 Aug 2011, Ken Hutchison wrote: Hello all, I am using the clust

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Christian Hennig
There is a number of methods in the literature to decide the number of clusters for k-means. Probably the most popular one is the Calinski and Harabasz index, implemented as calinhara in package fpc. A distance based version (and several other indexes to do this) is in function cluster.stats in

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Peter Langfelder
On Wed, Aug 10, 2011 at 12:07 PM, Ken Hutchison wrote: > Hello all, >   I am using the clustering functions in R in order to work with large > masses of binary time series data, however the clustering functions do not > seem able to fit this size of practical problem. Library 'hclust' is good > (t

Re: [R] Clustering Large Applications..sort of

2011-08-10 Thread Thomas Lumley
Try the flow cytometry clustering functions in Bioconductor. -thomas On Thu, Aug 11, 2011 at 7:07 AM, Ken Hutchison wrote: > Hello all, >   I am using the clustering functions in R in order to work with large > masses of binary time series data, however the clustering functions do not > see