>>>>> Hugo Varet <vareth...@gmail.com>
>>>>>     on Tue, 11 Jun 2013 15:15:36 +0200 writes:
> Dear Martin,
> Thank you for your answer. Here is the exact call to agnes():

> setwd("E:/Hugo")
> library(cluster)
> load("mydata.rda")
> tableauTani <- dist.binary(mydata, method = 4, diag = FALSE, upper = FALSE)
> resAgnes.Tani <- agnes(tableauTani, diss = inherits(tableauTani, "dist"),
>                        method = "ward")
> classe.agnTani.3 <- cutree(resAgnes.Tani, 3)

> I'm going to send you the data in a separate e-mail.

Thank you, Hugo, and I got that alright.

I can see that many of the distances are *identical*, because
your data is completely binary.

From experience, I know that this can lead (for some algorithms)
to "arbitrary" decisions in clustering: when two *pairs* of
observations / clusters have exactly the same distance, it is
somewhat random which pair is "merged" / "fused" first in a
bottom-up hierarchical algorithm such as agnes().

To reproduce your example (above), however, I need to know *where*
you got the dist.binary() function from. It is not part of
standard R nor of the cluster package.

Regards,
Martin

> Regards,
> Hugo

> On Monday, 10 June 2013, Martin Maechler <maech...@stat.math.ethz.ch> wrote:
>> >>>>> Hugo Varet <vareth...@gmail.com>
>> >>>>>     on Sun, 9 Jun 2013 11:43:32 +0200 writes:
>>
>> > Dear R users,
>> > I discovered something strange using the function agnes() of the cluster
>> > package on R 3.0.1 and on R 2.14.1. Indeed, the clusterings obtained are
>> > different whereas I ran exactly the same code.
>>
>> hard to believe... but ..
>>
>> > I quickly looked at the source code of the function and I discovered that
>> > there was an important change: agnes() in R 2.14.1 used a FORTRAN code
>> > whereas agnes() in R 3.0.1 uses a C code.
>>
>> well, it has done so for quite a bit longer, e.g., also in R 2.15.0
>>
>> > Here is one of the contingency tables between R 2.14.1 and R 3.0.1:
>> >                      classe.agnTani.2.14.1
>> > classe.agnTani.3.0.1   1   2   3
>> >                    1  74   0 229
>> >                    2   0 235   0
>> >                    3 120   0  15
>>
>> > So, I was wondering if it was normal that the C and FORTRAN codes give
>> > different results?
>>
>> It's not normal, and I'm pretty sure I have had many many
>> examples which gave identical results.
>>
>> Can you provide a reproducible example, please?
>> If the example is too large [for dput() ], please send me the *.rda
>> file produced from
>>      save(<your data>, file=<the file I need>)
>> *and* the exact call to agnes() for your data.
>>
>> Thank you in advance!
>>
>> Martin Maechler,
>> the one you could have e-mailed directly
>> to using maintainer("cluster") ...
>>
>>
>> > Best regards,
>> > Hugo Varet
>>
>> > [[alternative HTML version deleted]]
>>       ^^^^^^^^^^^^^ try to avoid, please ^^^^^^^^^^^^^^^^^
>>
>> > ______________________________________________
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>   yes indeed, please.
>>
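
For readers of the archive, here is a rough sketch of the tie issue mentioned in the reply at the top of this message. It does not use the poster's data or functions: the matrix `x` is simulated, stats::dist(method = "binary") stands in for dist.binary() (whose origin is not stated in the thread), and hclust() with average linkage stands in for agnes() with Ward's method. All object names are made up for illustration.

```r
## Hypothetical binary data: 30 observations on 8 binary variables
set.seed(1)
x <- matrix(rbinom(30 * 8, size = 1, prob = 0.5), nrow = 30)

## Binary (Jaccard-type) dissimilarities from base R; a stand-in for
## the dist.binary() call in the thread
d <- dist(x, method = "binary")

## With only 8 binary variables, the distances can take only a handful
## of distinct values, so many of the pairwise distances must be tied
n_pairs    <- length(d)
n_distinct <- length(unique(as.vector(d)))
cat(n_pairs, "pairwise distances, but only", n_distinct, "distinct values\n")

## Cluster the same observations in a different row order
perm   <- sample(nrow(x))
d_perm <- dist(x[perm, ], method = "binary")

## hclust() is used here as a stand-in for agnes(); both are
## agglomerative (bottom-up) procedures
cl_orig <- cutree(hclust(d,      method = "average"), k = 3)
cl_perm <- cutree(hclust(d_perm, method = "average"), k = 3)

## Row i of x[perm, ] is original row perm[i]; order(perm) is the inverse
## permutation and puts the labels back into the original row order
cl_perm_back <- cl_perm[order(perm)]

## Cross-tabulate the two partitions of the same observations
print(table(original = cl_orig, permuted = cl_perm_back))
```

If the cross-table is not a relabelled diagonal, the two runs disagree even though the data are the same, which would be the kind of tie-related arbitrariness described above. Whether it shows up depends on the data and on how a given implementation breaks ties, so this is only an illustration, not a diagnosis of the agnes() difference reported in the thread.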