new...@r wrote:
>
> Hey Everyone!
>
> I wanted to carry out hierarchical clustering using distance matrices I
> have calculated (instead of Euclidean distance etc.).
>
> I understand as.dist is the function for this, but the distances in the
> dendrogram I got by using the following script (1) were not the distances
> defined in my distance matrices.
>
> script:
> var<-read.table("the distance matrix i calculated", header=TRUE, sep=" ")
> var_HC<-hclust(as.dist(var),method="average")
>
> var_dendro<-as.dendrogram(var_HC)
>
> plot(var_dendro,ylim=c(0,5), nodePar =list(lab.cex = 0.3), header=title("My Distance Matrix"))
>
> I did some research and found this about the hclust function (from the
> hclust help page):
>
> "...Initially, each object is assigned to its own cluster and then the
> algorithm proceeds iteratively, at each stage joining the two most similar
> clusters, continuing until there is just a single cluster. At each stage
> distances between clusters are recomputed by the Lance–Williams
> dissimilarity update formula according to the particular clustering method
> being used. ..."
>
> I am wondering, is there another function that doesn't do "At each stage
> distances between clusters are recomputed by the Lance–Williams
> dissimilarity update formula according to the particular clustering method
> being used"?
>
> I hope my message was clear; any help would be greatly appreciated.
>
> Thanks!!
>
> A.Jadoon
>
> Kings College London
>

If I understand your question correctly, you expected to find the distances in your matrix in the dendrogram?
Well, hierarchical clustering needs some way of calculating the distance between clusters. These cluster distances are computed from the entries of your distance matrix, but in general they do not equal any single entry. The choice "average" you used means that if clusters C1 and C2 are joined, the distance of the joined cluster to another cluster C' is the average of the distances from all elements of C1 and C2 to all elements of C'. Thus, the heights in the dendrogram are averages of groups of distances in your matrix.

The Lance-Williams formula is a general update rule that, with certain special values of its coefficients, reduces to the more intuitive choices like "average", "complete" etc.

Peter Langfelder


Hey Peter!

Precisely what you understood: I had hoped to see the distances in my matrix as the lengths in my dendrogram.

You have been an enormous help! Thank you!!

(I find R documentation a little hard to understand sometimes, probably due to my limited experience with programming and clustering methods!)

Ayesha

--
View this message in context: http://r.789695.n4.nabble.com/Hierarchical-clustering-using-own-distance-matrices-tp2230724p2231501.html
Sent from the R help mailing list archive at Nabble.com.
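
A minimal sketch of the point about "average" linkage, assuming a small made-up 4 x 4 distance matrix (the values and the labels a to d are hypothetical and not taken from the original post). It suggests how the merge heights reported by hclust are averages of groups of entries in the matrix, and how cophenetic() can be used to see the distance the dendrogram implies for each pair:

```r
# Hypothetical symmetric distance matrix, invented for illustration only
d <- matrix(c(0, 1, 4, 5,
              1, 0, 3, 6,
              4, 3, 0, 2,
              5, 6, 2, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(letters[1:4], letters[1:4]))

# Average-linkage clustering on the user-supplied distances
hc <- hclust(as.dist(d), method = "average")

# Merge heights, i.e. the values on the dendrogram's vertical axis
print(hc$height)

# Mean of all distances between the groups {a, b} and {c, d};
# this matches the height of the final merge
print(mean(d[c("a", "b"), c("c", "d")]))

# Distance implied by the dendrogram for every pair of objects
print(cophenetic(hc))
```

With these values, a and b merge at height 1, c and d at height 2, and the final merge sits at 4.5, the mean of the four between-group distances 4, 5, 3 and 6. Merges of two single objects reproduce matrix entries; merges involving larger clusters generally do not.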