Hello David,

Many thanks - this does exactly what I want, and it lets me see whether the clusters make sense in terms of the pattern of values and where they join a cluster.

Regards

Bob

> Something like this?
>
> > split(FS1, hcli8)
> $`1`
>    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
> 1   1  1  0  1  0  0  1  1  0   1   1   1
> 3   1  0  1  0  0  1  1  0  0   1   0   1
> 4   1  1  0  0  0  0  1  1  1   1   1   1
> 7   0  1  0  1  0  0  1  1  0   1   0   1
> 9   1  1  1  1  0  1  1  0  1   1   1   0
> 12  1  0  0  0  0  1  1  1  1   1   0   1
> 13  0  1  1  1  1  0  0  0  1   1   0   1
> 15  1  0  1  1  0  0  1  0  0   1   0   1
> 16  1  0  1  0  0  1  1  0  1   0   1   1
> 19  0  1  0  0  0  0  1  0  0   1   0   1
> 20  0  1  1  1  0  0  0  1  1   0   0   1
> 24  1  1  0  1  0  0  1  0  1   1   1   0
> 26  1  1  1  1  1  1  0  1  0   1   0   1
> 28  1  0  1  0  1  0  1  1  0   1   1   1
> 33  1  1  0  1  0  0  0  0  1   1   0   0
> 38  1  1  1  0  0  0  0  0  1   1   0   0
> 40  1  0  1  0  0  0  1  0  0   1   1   1
> 41  1  1  0  0  0  0  0  0  1   1   1   1
> 43  0  0  1  0  0  0  1  0  1   1   0   1
> 52  1  1  1  1  0  0  0  1  1   1   0   1
> 53  1  1  0  0  1  0  0  1  1   1   0   1
> 56  1  0  1  0  0  1  1  0  1   0   0   0
> 60  1  1  1  0  1  1  0  1  1   1   0   1
>
> $`2`
>    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
> 2   0  1  1  1  1  1  1  0  0   1   1   0
> 5   0  1  0  1  1  1  0  0  0   1   1   1
> 6   0  0  0  0  1  0  1  0  0   1   1   1
> 10  1  1  1  1  1  0  1  1  0   1   0   0
> 11  0  1  0  1  1  0  1  0  1   1   1   1
> 14  0  0  1  1  1  1  1  1  0   1   1   1
> 17  0  1  0  0  1  0  0  0  0   0   1   1
> 18  1  0  0  1  1  1  1  1  0   0   1   1
> 29  1  1  0  1  0  1  1  1  0   0   1   1
> 37  1  0  0  1  1  0  1  1  0   1   0   0
> 42  1  1  0  1  1  1  1  0  0   0   0   0
> 46  1  1  0  1  0  1  1  0  0   1   0   1
> 48  0  1  0  0  1  0  1  0  0   1   1   0
> 50  0  1  0  1  1  1  1  1  0   0   1   0
> 51  0  0  0  1  1  1  1  0  0   0   1   1
> 54  0  0  0  1  1  1  1  0  0   1   1   0
> 58  0  1  0  1  1  1  1  1  1   1   1   0
> 61  1  0  1  0  1  1  1  1  0   1   0   0
>
> $`3`
>    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
> 8   0  1  1  0  0  1  0  1  1   1   1   0
> 21  0  1  0  0  1  1  0  1  0   1   1   0
> 22  1  1  0  0  0  1  1  1  0   0   1   0
> 25  0  1  0  0  0  1  0  1  0   1   1   0
> 27  1  1  0  0  1  1  0  1  1   0   0   0
> 32  1  1  1  0  1  1  0  1  0   0   1   0
> 36  1  1  0  0  0  1  0  1  0   0   0   0
> 44  1  1  1  1  1  1  0  1  0   0   0   0
> 63  0  1  1  0  1  1  0  0  1   1   1   0
>
> $`4`
>    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
> 23  0  0  1  1  0  0  0  0  0   1   0   0
> 34  0  1  1  1  0  0  0  1  0   1   0   0
>
> $`5`
>    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
> 30  0  0  0  0  1  1  0  0  1   1   0   1
> 31  0  1  1  0  1  0  0  0  1   0   1   1
> 35  0  0  1  0  1  1  0  0  1   1   0   1
> 47  0  0  1  0  1  0  0  0  1   0   0   1
> 49  1  0  0  0  1  1  0  0  1   1   1   0
> 55  1  0  1  0  1  0  0  0  0   1   1   0
> 59  0  0  1  0  1  0  0  0  1   0   1   1
>
> $`6`
>    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
> 39  0  0  0  0  1  0  1  1  0   0   0   0
> 62  0  0  0  0  1  0  1  1  0   0   0   1
>
> $`7`
>    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
> 45  1  1  0  0  0  0  0  0  0   0   1   0
>
> $`8`
>    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
> 57  0  0  1  0  0  1  0  1  0   0   1   1
>
> -------
> David
>
>> -----Original Message-----
>> From: Bob Green [mailto:bgr...@dyson.brisnet.org.au]
>> Sent: Sunday, November 18, 2012 3:22 PM
>> To: dcarl...@tamu.edu; r-help@r-project.org
>> Subject: RE: [R] Examining how cases are similar by cluster, in cluster
>> analysis
>>
>> David,
>>
>> Many thanks, I'm sure this will be helpful. What would also be
>> helpful is if I can extract each cluster and examine id by variable,
>> within the respective cluster. I could index the variables for each
>> cluster and run such an analysis but there must be a more efficient
>> way of doing this (especially as I experiment with different
>> clustering methods).
>>
>> Thanks again,
>>
>> Bob
>>
>> At 06:44 AM 19/11/2012, David L Carlson wrote:
>> >If you just want a summary of the mean for each variable in each
>> >cluster, this will get you there:
>> >
>> > > set.seed(42)
>> > > FS1 <- data.frame(matrix(sample(c(0, 1), 12*63, replace=TRUE),
>> >+     nrow=63, ncol=12))
>> > > dmat <- dist(FS1, method="binary")
>> > > cl.test <- hclust(dmat, method="average")
>> > > plot(cl.test, hang=-1)
>> > > hcli8 <- cutree(cl.test, k=8)
>> > > tbl <- aggregate(FS1, by=list(Group=hcli8), mean)
>> > > print(tbl, digits=4)
>> >  Group     X1     X2     X3     X4     X5     X6     X7     X8     X9
>> >1     1 0.5122 0.6829 0.6829 0.6341 0.5854 0.5854 0.6829 0.6341 0.5366
>> >2     2 0.0000 0.0000 0.0000 1.0000 0.6667 0.6667 0.0000 0.6667 0.0000
>> >3     3 0.9286 0.1429 0.1429 0.1429 0.2857 0.5714 0.7857 0.3571 0.8571
>> >4     4 1.0000 1.0000 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
>> >5     5 0.0000 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000
>> >6     6 1.0000 0.0000 0.0000 0.0000 0.0000 1.0000 0.0000 1.0000 0.0000
>> >7     7 1.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000 0.0000
>> >8     8 0.0000 1.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000
>> >     X10    X11   X12
>> >1 0.4146 0.4634 0.561
>> >2 0.6667 0.0000 0.000
>> >3 0.8571 0.6429 0.500
>> >4 1.0000 0.0000 0.000
>> >5 0.0000 1.0000 0.000
>> >6 0.0000 0.0000 1.000
>> >7 0.0000 0.0000 0.000
>> >8 0.0000 0.0000 0.000
>> > >
>> >----------------------------------------------
>> >David L Carlson
>> >Associate Professor of Anthropology
>> >Texas A&M University
>> >College Station, TX 77843-4352
>> >
>> > > -----Original Message-----
>> > > From: r-help-boun...@r-project.org
>> > > [mailto:r-help-bounces@r-project.org] On Behalf Of Bob Green
>> > > Sent: Sunday, November 18, 2012 5:00 AM
>> > > To: r-help@r-project.org
>> > > Subject: [R] Examining how cases are similar by cluster, in
>> > > cluster analysis
>> > >
>> > > Hello,
>> > >
>> > > I used the following code to perform a cluster analysis on a
>> > > dataframe consisting of 12 variables (coded as 1,0) and 63
>> > > cases.
>> > >
>> > > FS1 <- read.csv("D://Arsontest2.csv",header=T,row.names=1)
>> > >
>> > > str(FS1)
>> > >
>> > > dmat <- dist(FS1, method="binary")
>> > >
>> > > cl.test <- hclust(dist(FS1, method="binary"), "ave")
>> > >
>> > > plot(cl.test, hang = -1)
>> > >
>> > > Each case has an id and the dendrogram identifies the respective
>> > > cases which constitute each cluster. What I am seeking advice on
>> > > is how to examine the variables on which the cases are similar,
>> > > within each cluster.
>> > >
>> > > sort(hcli8 <- cutree(cl.test, k=8)) identifies that cluster 2
>> > > comprises the following cases:
>> > >
>> > > 1641 2295 2594 2654 2799 3213 3510 3513 2958 3294
>> > >    2    2    2    2    2    2    2    2    2    2
>> > >
>> > > This code provides means for the variables by cluster. In
>> > > relation to cluster 2 it appears the cases should have no clear
>> > > motive and be depressed:
>> > >
>> > > round(sapply(x, function(i) colMeans(FS1[i,])),2)
>> > >
>> > >           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
>> > > depressed 0.00 0.33 0.00  0.0    0  0.6 0.00 0.08
>> > > unclear   0.33 1.00 1.00  1.0    0  0.0 0.07 0.12
>> > >
>> > > I can manually examine this variable by variable and look at
>> > > how each of the cases in cluster 2 is similar on the variables.
>> > > I am looking for a more efficient and quicker way to do this.
>> > >
>> > > Bob
>> > >
>> > > ______________________________________________
>> > > R-help@r-project.org mailing list
>> > > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > PLEASE do read the posting guide
>> > > http://www.R-project.org/posting-guide.html
>> > > and provide commented, minimal, self-contained, reproducible
>> > > code.
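
A possible follow-up to the split() approach discussed in this thread: once the cases are split by cluster, one could also list, for each cluster, the variables on which most cases take the same value. The sketch below is only an illustration under stated assumptions. It uses simulated 0/1 data in place of the real file, and the helper shared_vars() and its min_share argument are hypothetical names, not part of any package.

```r
# Simulated 0/1 data standing in for the real data set (an assumption)
set.seed(42)
FS1 <- data.frame(matrix(sample(c(0, 1), 12 * 63, replace = TRUE),
                         nrow = 63, ncol = 12))

# Same clustering steps as in the thread
cl.test <- hclust(dist(FS1, method = "binary"), method = "average")
hcli8 <- cutree(cl.test, k = 8)

# One data frame per cluster; row names keep the case ids
by_cluster <- split(FS1, hcli8)

# Hypothetical helper: variables on which at least `min_share` of the
# cases in a cluster share the same value (mostly 1 or mostly 0)
shared_vars <- function(cluster_df, min_share = 0.75) {
  p <- colMeans(cluster_df)  # proportion of 1s per variable
  list(n_cases     = nrow(cluster_df),
       mostly_one  = names(p)[p >= min_share],
       mostly_zero = names(p)[p <= 1 - min_share])
}

agreement <- lapply(by_cluster, shared_vars)
str(agreement[["1"]])
```

Setting min_share = 1 would list only the variables on which every case in a cluster agrees. Because split() works on the cutree() result, the same lines can be rerun unchanged after trying a different clustering method or a different k.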