Re: [R] Help in kmeans

2011-10-16 Thread Christoph Molnar
Hi, no, don't use kmeans with factors. Among other things, the kmeans algorithm calculates the mean of each of the k clusters. But you don't get a useful mean from factors, because the integers used internally are arbitrary. In this case it's 1, 2 and 3. But it could be 42, 7 and 10 as well, whi
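
As a small illustration of the point about arbitrary integer codes, the sketch below builds the same factor with two different level orders and compares the codes. The vector and variable names are made up for this example and are not from the thread.

```r
# Hypothetical toy data, not the poster's file.
species <- factor(c("setosa", "versicolor", "virginica", "setosa"))

# Codes follow the level order (alphabetical by default), not any numeric meaning.
codes_default <- as.integer(species)              # 1 2 3 1

# Same labels, different level order: the codes change, the data does not.
species_reordered <- factor(species,
                            levels = c("virginica", "setosa", "versicolor"))
codes_reordered <- as.integer(species_reordered)  # 2 3 1 2

# Any "mean" of the codes therefore depends on an arbitrary choice.
mean(codes_default)    # 1.75
mean(codes_reordered)  # 2
```

Because a cluster centre is such a mean, feeding factor codes to kmeans would make the result depend on how the levels happen to be ordered.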

Re: [R] Help in kmeans

2011-10-16 Thread raji sankaran
Hi, Thank you. The information was very helpful. Yes, it was meant to be centers=3. Even with that, kmeans gives an error if we give the index of the Species column. So, *is it ok to use kmeans for String data by using cbind*? But kmeans *works even if we give a column which contains distinct String

Re: [R] Help in kmeans

2011-10-16 Thread Christoph Molnar
Hi, I suspect your column Species is of class "factor" (as it is in R's built-in iris dataset). This means that in your case Species is an integer vector with the additional information of the level names. kmeans internally calls as.matrix(), which creates a character matrix of your datafram
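
The effect of as.matrix() on a data frame with a factor column can be checked directly. This is a minimal sketch using R's built-in iris data, whose column names (Sepal.Length and so on) differ from those in the poster's CSV file; it only shows the matrix types and one way to restrict kmeans to the numeric columns.

```r
# Built-in iris data, used here as a stand-in for the poster's Iris.csv.
m_all <- as.matrix(iris)          # Species is a factor, so every cell becomes a string
m_num <- as.matrix(iris[, 1:4])   # only the four measurement columns

is.character(m_all)   # TRUE
is.numeric(m_num)     # TRUE

# Clustering on the numeric columns only.
set.seed(1234)
km <- kmeans(iris[, 1:4], centers = 3)

# Species can still be used afterwards to inspect the clusters.
table(km$cluster, iris$Species)
```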

Re: [R] Help in kmeans

2011-10-16 Thread Raji
Hi All, For executing kmeans on Iris, we found that there were 2 different ways. dataFrame <- read.csv("c:/Iris.csv",header=T) 1. kmeans_model<-kmeans(dataFrame[1:5],size=3) *This gave an error, as its inputs included Species, which is a String column* 2. attach(dataFrame) kmeans_mo

Re: [R] Help in kmeans

2011-04-06 Thread raji sankaran
Hi, I have attached the results of the 2 commands.
> set.seed(1234)
> kmeans_model<-kmeans((SepalLength+SepalWidth+PetalLength+PetalWidth),centers=3)
> kmeans_model$cluster
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 2 3

Re: [R] Help in kmeans

2011-04-06 Thread raji sankaran
Hi, Thanks for the information. But I am already using set.seed(). My problem is that when I use column names instead of column indices, the result seems to be consistently less accurate. Hence, we wanted to understand how kmeans differentiates between column names and column indices. Is there a
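
One possible explanation, going only by the command quoted elsewhere in this thread, is that joining column names with + adds the columns element by element, so kmeans receives a single vector of row sums rather than four separate variables. The sketch below uses the built-in iris data (with its own column names, not the poster's) to compare that input with a cbind() of the same columns.

```r
# Sum of the four columns: one number per row.
x_sum <- with(iris, Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)

# cbind of the four columns: a 150 x 4 numeric matrix.
x_cols <- with(iris, cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width))

length(x_sum)   # 150
dim(x_cols)     # 150 4

set.seed(1234)
km_sum <- kmeans(x_sum, centers = 3)    # clusters the one-dimensional sums

set.seed(1234)
km_cols <- kmeans(x_cols, centers = 3)  # clusters in four dimensions

ncol(km_sum$centers)    # 1
ncol(km_cols$centers)   # 4
```

If that reading is right, the difference comes from what the expression evaluates to before kmeans sees it, not from kmeans treating names and indices differently.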

Re: [R] Help in kmeans

2011-04-06 Thread Christian Hennig
I'm not going to comment on column names, but this is just to make you aware that the results of k-means depend on random initialisation. This means that it is possible that you get different results if you run it several times. It basically gives you a local optimum and there may be more than
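
A common way to reduce the dependence on the random start is to fix the seed and let kmeans try several starts. The sketch below uses the nstart argument of kmeans() from the stats package on the built-in iris data; the value 25 is an arbitrary choice for illustration and is not something recommended in this thread.

```r
x <- iris[, 1:4]   # numeric columns of the built-in iris data

# Same seed, same call: the two runs are reproducible.
set.seed(1234)
km_a <- kmeans(x, centers = 3, nstart = 25)

set.seed(1234)
km_b <- kmeans(x, centers = 3, nstart = 25)

identical(km_a$cluster, km_b$cluster)   # TRUE

# With nstart > 1, kmeans keeps the start with the lowest
# total within-cluster sum of squares.
km_a$tot.withinss
```

Several starts make a poor local optimum less likely, but they do not guarantee the global one.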

[R] Help in kmeans

2011-04-06 Thread Raji
Hi All, I was using the following command to perform kmeans on the Iris dataset. Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3) This was giving proper results for me. But in my application we generate the R commands dynamically, and there was a requirement that the column names will b