Dear Marianna,

The function agnes in the cluster package can compute Ward's method from a raw data matrix (at least, this is what its help page suggests).

Also, you may not be using the most recent version of hclust. The most recent version has a note in its help page that states:

"Two different algorithms are found in the literature for Ward clustering. The one used by option "ward.D" (equivalent to the only Ward option "ward" in R versions <= 3.0.3) does not implement Ward's (1963) clustering criterion, whereas option "ward.D2" implements that criterion (Murtagh and Legendre 2013). With the latter, the dissimilarities are squared before cluster updating. Note that agnes(*, method="ward") corresponds to hclust(*, "ward.D2")."
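The relationship described in that note can be checked on toy data. The following is only a sketch: the data are made up, and the variable names are hypothetical. It compares "ward.D2" on plain Euclidean distances with "ward.D" on distances squared by hand, and, if the cluster package is installed, also runs agnes on the raw data matrix.

```r
# Toy data: 10 points in 3 dimensions (made up for illustration).
set.seed(1)
x <- matrix(rnorm(30), nrow = 10)
d <- dist(x)                               # plain Euclidean distances

h_d2 <- hclust(d,   method = "ward.D2")    # squares the dissimilarities itself
h_d  <- hclust(d^2, method = "ward.D")     # squaring done by hand beforehand

# Compare the merge sequences, and the heights after taking a square root.
same_merges  <- isTRUE(all.equal(h_d2$merge, h_d$merge))
same_heights <- isTRUE(all.equal(h_d2$height, sqrt(h_d$height)))
print(c(same_merges = same_merges, same_heights = same_heights))

# agnes works directly on the raw data matrix; according to the help page
# note it should correspond to "ward.D2" (guarded in case cluster is missing).
if (requireNamespace("cluster", quietly = TRUE)) {
  h_agnes <- as.hclust(cluster::agnes(x, method = "ward"))
  print(all.equal(sort(h_agnes$height), h_d2$height))
}
```

If both comparisons come out TRUE, the two options differ only in whether the squaring is done by the caller or by hclust.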

The Murtagh and Legendre paper has more details on this and is here:
http://arxiv.org/abs/1111.6285
F. Murtagh and P. Legendre, "Ward's hierarchical clustering method: clustering criterion and agglomerative algorithm"

It's not clear to me why one would want to use Ward's method for this kind of data, but that's your decision of course.

Best wishes,
Christian


On Fri, 25 Jul 2014, Marianna Bolognesi wrote:

Hi everybody, I have a problem with a cluster analysis.

I am trying to use hclust, method=ward.

The Ward method works with SQUARED Euclidean distances.

Hclust demands "a dissimilarity structure as produced by dist".

Yet, dist does not seem to produce a table of squared Euclidean distances,
starting from cosines.
In fact, manually computing the squared Euclidean distances from cosines
(d=2(1-cos)) produces a different outcome.

As a consequence, using hclust with the Ward method on a table of cosines
transformed into distances with dist produces a different dendrogram than
other programs for hierarchical clustering with the Ward method (e.g.
MultiDendrograms). Weird, right??

Computing manually the distances and then feeding them to hclust produces
an error message. So, I am wondering, what the hell is this dist function
doing?!

thanks!

marianna


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
c.hen...@ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
