Try replacing your order() call with the following two lines:

    meanClusterRadius <- ave(distances, kmeans.result$cluster, FUN = mean)
    outliers <- order(distances/meanClusterRadius, decreasing = TRUE)[1:5]

ave(x, group, FUN = fun) applies FUN to each subset of x defined by the group argument(s) and puts each result back into the positions of x that belong to that subset, returning the modified x. Here that gives every object the mean distance of its own cluster, so the ratio distances/meanClusterRadius is the relative distance you describe.

Bill Dunlap
TIBCO Software
wdunlap tibco.com
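A small toy example may make the behaviour of ave() easier to see. The numbers and the `cluster` vector below are made up for illustration (they are not the iris data or real kmeans() output); this is only a sketch of the idea, assuming one distance and one cluster label per object.

```r
# Hypothetical distances to the cluster center for six objects,
# three in cluster 1 and three in cluster 2 (made-up values).
distances <- c(1, 2, 3, 10, 20, 60)
cluster   <- c(1, 1, 1, 2, 2, 2)

# Mean distance within each cluster, repeated for every member of it.
meanClusterRadius <- ave(distances, cluster, FUN = mean)

# Relative distance: each object's distance over its cluster's mean distance.
relative <- distances / meanClusterRadius

# Indices of the two objects with the largest relative distance.
top2 <- order(relative, decreasing = TRUE)[1:2]
print(meanClusterRadius)
print(top2)
```

With these numbers the cluster means are 2 and 30, so ranking by relative distance picks objects 6 and 3, whereas ranking by absolute distance would pick objects 6 and 5. An object in a tight cluster can therefore rank above one that is farther away in absolute terms but sits in a more spread-out cluster.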
On Wed, May 7, 2014 at 1:34 AM, marioger <mario_wieg...@gmx.de> wrote:
> Hi,
>
> I am hoping you can help me with my problem. I am trying to detect outliers
> using the kmeans algorithm. First I run the algorithm and choose as possible
> outliers those objects that have a large distance to their cluster center.
> Instead of using the absolute distance I want to use the relative distance,
> i.e. the ratio of the absolute distance of the object to its cluster center
> and the average distance of all objects of the cluster to their cluster
> center. The code for outlier detection based on absolute distance is the
> following:
>
>> # remove species from the data to cluster
>> iris2 <- iris[,1:4]
>> kmeans.result <- kmeans(iris2, centers=3)
>> # cluster centers
>> kmeans.result$centers
>> # calculate distances between objects and cluster centers
>> centers <- kmeans.result$centers[kmeans.result$cluster, ]
>> distances <- sqrt(rowSums((iris2 - centers)^2))
>> # pick top 5 largest distances
>> outliers <- order(distances, decreasing=TRUE)[1:5]
>> # who are outliers
>> print(outliers)
>
> But how can I use the relative instead of the absolute distance to find
> outliers?
> Thanks in advance.
>
> Mario
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Outlier-Detection-with-k-Means-tp4690098.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.