Johannes Graumann <[EMAIL PROTECTED]> wrote in news:[EMAIL PROTECTED]:
> But cutree does away with the indexes from the original input, which
> rect.hclust retains.
> I will have no other choice and match that input with the 'values'
> contained in the clusters ...

If you want to retain the original rownames, then try:

> vector
 [1] 0.00 0.45 1.00 2.00 3.00 3.25 3.33 3.75 4.10 5.00 6.00 6.45 7.00 7.10 8.00

#-----start cut-and-pastable-----
# this will "label" individual group membership
# diff(.) returns a vector that is shorter by one than its input,
# so it needs to be augmented with c(1, fn(diff(.)))
grp.v <- cbind(vector, c(1, 1 + cumsum(as.numeric(diff(vector) > 0.5))))

# You can then tally up the counts in groups
tb <- table(grp.v[, 2])
tb
# 1 2 3 4 5 6 7 8
# 2 1 1 5 1 2 2 1

# And apply the counts to the rows by doing a
# "row count" lookup into tb[.]
grp.v <- cbind(grp.v, tb[grp.v[, 2]])
grp.v
#-----end cut-and-pastable-----

  vector
1   0.00 1 2
1   0.45 1 2
2   1.00 2 1
3   2.00 3 1
4   3.00 4 5
4   3.25 4 5
4   3.33 4 5
4   3.75 4 5
4   4.10 4 5
5   5.00 5 1
6   6.00 6 2
6   6.45 6 2
7   7.00 7 2
7   7.10 7 2
8   8.00 8 1

Further processing of the membership "label" might better be
accomplished by converting the matrix to a dataframe, and then working
with the membership "label" as a factor. If you only want to deal with
the rownames and values of vector that have more than <x> values, that
should be straightforward.

-- 
David Winsemius

> Gabor Grothendieck wrote:
>
>> If we don't need any plotting we don't really need rect.hclust at
>> all. Split the output of cutree, instead. Continuing from the
>> prior code:
>>
>> > for(el in split(unname(vv), names(vv))) print(el)
>> [1] 0.00 0.45
>> [1] 1
>> [1] 2
>> [1] 3.00 3.25 3.33 3.75 4.10
>> [1] 5
>> [1] 6.00 6.45
>> [1] 7.0 7.1
>> [1] 8
>>
>> On Dec 21, 2007 3:24 PM, Johannes Graumann <[EMAIL PROTECTED]>
>> wrote:
>>> Hm, hm, rect.hclust doesn't accept "plot=FALSE" and cutree doesn't
>>> retain the indexes of membership ...
>>> anyway short of ripping out the
>>> guts of rect.hclust to achieve the same result without an active
>>> graphics device?
>>>
>>> Joh
>>>
>>> >> # cluster and plot
>>> >> hc <- hclust(dist(v), method = "single")
>>> >> plot(hc, lab = v)
>>> >> cl <- rect.hclust(hc, h = .5, border = "red")
>>> >>
>>> >> # each component of list cl is one cluster. Print them out.
>>> >> for(idx in cl) print(unname(v[idx]))
>>> > [1] 8
>>> > [1] 7.0 7.1
>>> > [1] 6.00 6.45
>>> > [1] 5
>>> > [1] 3.00 3.25 3.33 3.75 4.10
>>> > [1] 2
>>> > [1] 1
>>> > [1] 0.00 0.45
>>> >
>>> >> # a different representation of the clusters
>>> >> vv <- v
>>> >> names(vv) <- ct <- cutree(hc, h = .5)
>>> >> vv
>>> >    1    1    2    3    4    4    4    4    4    5    6    6    7    7    8
>>> > 0.00 0.45 1.00 2.00 3.00 3.25 3.33 3.75 4.10 5.00 6.00 6.45 7.00 7.10 8.00
>>> >
>>> >
>>> > On Dec 21, 2007 4:56 AM, Johannes Graumann
>>> > <[EMAIL PROTECTED]> wrote:
>>> >> <posted & mailed>
>>> >>
>>> >> Dear all,
>>> >>
>>> >> I'm trying to solve the problem, of how to find clusters of
>>> >> values in a vector that are closer than a given value.
>>> >> Illustrated this might look as follows:
>>> >>
>>> >> vector <- c(0,0.45,1,2,3,3.25,3.33,3.75,4.1,5,6,6.45,7,7.1,8)
>>> >>
>>> >> When using '0.5' as the proximity requirement, the following
>>> >> groups would result:
>>> >> 0,0.45
>>> >> 3,3.25,3.33,3.75,4.1
>>> >> 6,6.45
>>> >> 7,7.1
>>> >>
>>> >> Jim Holtman proposed a very elegant solution in
>>> >> http://tolstoy.newcastle.edu.au/R/e2/help/07/07/21286.html, which
>>> >> I have modified and perused since he wrote it to me. The beauty
>>> >> of this approach is that it will not only work for constant
>>> >> proximity requirements as above, but also for overlap-windows
>>> >> defined in terms of ppm around each value.
>>> >> Now I have an
>>> >> additional need and have found no way (short of iteratively step
>>> >> through all the groups returned) to figure out how to do that
>>> >> with Jim's approach: how to figure out that 6,6.45 and 7,7.1 are
>>> >> separate clusters?
>>> >>
>>> >> Thanks for any hints, Joh
>>> >>

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
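
For reference, the gap-based grouping from David Winsemius's reply can be wrapped in a small function that keeps the original positions and drops singleton groups, which is what the original question asks for (6, 6.45 and 7, 7.1 as separate clusters). This is only a minimal sketch in base R: the function name `proximity_clusters` and its arguments `gap` and `min_size` are made up for illustration, and it assumes a non-empty numeric vector without NAs.

```r
# Hypothetical helper (not part of any package): group values whose
# neighbouring gaps are at most `gap`, keeping the original positions
# as names so they can be matched back to the input.
proximity_clusters <- function(x, gap = 0.5, min_size = 2) {
  ord <- order(x)        # original positions, in sorted order
  xs  <- x[ord]
  # A new group starts wherever the gap to the previous value exceeds `gap`;
  # diff() is one element shorter than its input, hence the leading 1.
  grp <- c(1, 1 + cumsum(diff(xs) > gap))
  names(xs) <- ord       # carry the original indexes along
  cl <- split(xs, grp)
  # Keep only groups with at least `min_size` members
  cl[lengths(cl) >= min_size]
}

v  <- c(0, 0.45, 1, 2, 3, 3.25, 3.33, 3.75, 4.1, 5, 6, 6.45, 7, 7.1, 8)
cl <- proximity_clusters(v)
for (el in cl) print(unname(el))
```

For the example vector from the thread this should leave four groups, the same ones listed in the original question. The names of each list element are the positions in `v`, so `as.integer(names(cl[[1]]))` recovers the indexes that cutree does not return directly.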