Dear All,

I hope I am not asking a FAQ. I am dealing with a graph theory problem [connected components in an undirected graph] and I do not want to reinvent the wheel. I have seen a large number of R packages dealing with, for instance, k-means or hierarchical clustering of spatially distributed data, and I am facing a similar problem.

I am given a set of data which are the positions of particles in 3 dimensions. I define two particles A and B to be directly connected if their Euclidean distance is below a certain threshold d. If A and B are directly connected and B and C are directly connected, then A, B and C belong to the same connected component (physically it means that they are members of the same cluster). All my N particles then split into k disjoint clusters, each containing a certain number of particles, and this is what I want to investigate.

I do not know a priori how many clusters I have (this is my problem with e.g. k-means, since k is an output for me); the only input is the set of 3-dimensional particle positions and a threshold distance. The algorithm/package I am looking for should return the number of clusters and the composition of each cluster, e.g. the fact that the second cluster is made up of particles {R,T,L}. Consider for instance:

# a 2-dimensional example
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(x) <- c("x", "y")

How can I then find out how many connected components I have when my threshold distance is d = 0.5?

Many thanks,

Lorenzo

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
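
For concreteness, here is a minimal sketch of one way the question above might be approached with base R only. It relies on the fact that single-linkage hierarchical clustering, cut at height d, yields the connected components of the threshold graph. It assumes that pairs at exactly distance d count as connected (with continuous random data such ties are not expected) and that the full N x N distance set fits in memory. The seed and the names membership, clusters and sizes are illustrative additions, not part of the original example.

```r
# Sketch: connected components of the "distance <= d" graph via
# single-linkage clustering (stats package, shipped with base R).
set.seed(1)  # assumption: added only so the example is reproducible
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(x) <- c("x", "y")
d <- 0.5     # threshold distance from the question

# Single linkage merges two groups at their smallest inter-point distance,
# so cutting the tree at height d joins exactly those points that are
# linked by a chain of steps no longer than d.
hc <- hclust(dist(x), method = "single")
membership <- cutree(hc, h = d)   # cluster label for each particle

k <- length(unique(membership))                  # number of clusters
clusters <- split(seq_len(nrow(x)), membership)  # particle indices per cluster
sizes <- lengths(clusters)                       # number of particles per cluster
```

Here k is the number of clusters and clusters[[2]] would hold the row indices of the particles in the second cluster. The same calls work unchanged for a 3-column matrix of positions, since dist() computes Euclidean distances over all columns by default.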