John,

On Tue, 2 Oct 2012 11:35:12 -0400 John Sorkin
<jsor...@grecc.umaryland.edu> wrote:

> Windows XP
> R 2.15
>  
> I am running a cluster analysis in which I ask for three clusters (see code 
> below). The analysis nicely tells me what cluster each of the subjects in my 
> input dataset belongs to. I would like two pieces of information:
> (1) for every subject in my input data set, what is the probability of the 
> subject belonging to each of the three clusters?

K-means provides hard clustering: each subject is assigned to whichever
cluster has the closest mean, so no membership probabilities are produced.

> (2) given a new subject, someone who was not in my original dataset, how can 
> I determine their cluster assignment?

Look at the distance between the subject and the cluster means: the
subject is assigned to the cluster whose mean is closest.
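
A minimal sketch of that nearest-mean rule, assuming the new subject is
first put on the same scale as the training data. The simulated data and
the names raw, new_raw, new_scaled, d2 and new_cluster are made up for
illustration; only kmeans(), scale() and fit$centers come from base R.

```r
# Simulated stand-in for the real data (hypothetical values)
set.seed(1)
raw <- rbind(matrix(rnorm(40, mean = 0),  ncol = 2),
             matrix(rnorm(40, mean = 5),  ncol = 2),
             matrix(rnorm(40, mean = 10), ncol = 2))
datascaled <- scale(raw)
fit <- kmeans(datascaled, centers = 3, nstart = 25)

# A new subject measured on the original scale (hypothetical values)
new_raw <- c(5.2, 4.8)

# Apply the centering and scaling that scale() used on the training data
new_scaled <- (new_raw - attr(datascaled, "scaled:center")) /
  attr(datascaled, "scaled:scale")

# Squared Euclidean distance from the new subject to each cluster mean
d2 <- rowSums(sweep(fit$centers, 2, new_scaled)^2)

# The cluster with the closest mean gets the assignment
new_cluster <- unname(which.min(d2))
new_cluster
```

If datascaled was not produced by scale(), the rescaling step would have
to be replaced by whatever transformation was actually used.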

If you are looking for probabilistic clustering (under Gaussian
mixture model assumptions), you could use model-based clustering: one R
package is mclust.
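
A rough sketch of what that could look like, assuming the mclust package
is installed and recent enough to provide a predict() method for Mclust
fits. The simulated data and the names raw, mfit, new_mat and pred are
made up for illustration.

```r
library(mclust)

# Simulated stand-in for the real data (hypothetical values)
set.seed(1)
raw <- rbind(matrix(rnorm(40, mean = 0),  ncol = 2),
             matrix(rnorm(40, mean = 5),  ncol = 2),
             matrix(rnorm(40, mean = 10), ncol = 2))
datascaled <- scale(raw)

# Fit a three-component Gaussian mixture
mfit <- Mclust(datascaled, G = 3)

# One row per subject, one column per cluster: posterior probabilities
head(round(mfit$z, 3))

# Hard assignment implied by the largest posterior probability
head(mfit$classification)

# A new subject (hypothetical values), rescaled like the training data
new_raw <- c(5.2, 4.8)
new_mat <- matrix((new_raw - attr(datascaled, "scaled:center")) /
                    attr(datascaled, "scaled:scale"), nrow = 1)

# Posterior probabilities and classification for the new subject
pred <- predict(mfit, newdata = new_mat)
pred$z
pred$classification
```

These probabilities are only as trustworthy as the Gaussian mixture
assumption behind them.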

Btw, note that kmeans is very sensitive to initialization (as is
mclust): you may want to try several random starts (for kmeans),
at the very least. Use the argument "nstart" with a large number.
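
For illustration, a small sketch comparing a single random start with
many; the data are simulated and the value 50 is an arbitrary choice, not
a recommendation. kmeans() keeps the best of the nstart runs, judged by
total within-cluster sum of squares.

```r
# Simulated stand-in for the real data (hypothetical values)
set.seed(1)
raw <- rbind(matrix(rnorm(40, mean = 0),  ncol = 2),
             matrix(rnorm(40, mean = 5),  ncol = 2),
             matrix(rnorm(40, mean = 10), ncol = 2))
datascaled <- scale(raw)

# One random start versus the best of 50 random starts
fit_one  <- kmeans(datascaled, centers = 3, nstart = 1)
fit_many <- kmeans(datascaled, centers = 3, nstart = 50)

# Lower total within-cluster sum of squares means a tighter solution
c(one = fit_one$tot.withinss, many = fit_many$tot.withinss)
```

On well-separated data like this the two values will often agree; the
difference tends to show up on messier data.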

HTH,
Ranjan


> Thanks,
> John
>  
> # K-Means Cluster Analysis
> jclusters <- 3
> fit       <- kmeans(datascaled, jclusters) # 3 cluster solution
>  
> and fit$cluster tells me what cluster each observation in my input dataset 
> belongs to (output truncated for brevity):
>  
> > fit$cluster
>   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17 . . . .
>   1   1   1   1   3   1   1   1   1   2   1   2   1   1   1   1   1 . . . .
>
> How do I get the probability of being in cluster 1, cluster 2, and cluster 3
> for a given subject, e.g. datascaled[1,]? How do I get the cluster assignment
> for a new subject?
>
> Thanks,
> John
>
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
