John,
On Tue, 2 Oct 2012 11:35:12 -0400 John Sorkin
<[email protected]> wrote:
> Windows XP
> R 2.15
>
> I am running a cluster analysis in which I ask for three clusters (see code
> below). The analysis nicely tells me what cluster each of the subjects in my
> input dataset belongs to. I would like two pieces of information
> (1) for every subject in my input data set, what is the probability of the
> subject belonging to each of the three clusters
K-means provides hard clustering: each subject is assigned to the
cluster whose mean is closest, so it does not give you membership
probabilities.
> (2) given a new subject, someone who was not in my original dataset, how can
> I determine their cluster assignment?
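Look at the distance between the subject and each of the cluster means:
the subject is assigned to the cluster whose mean is closest. Since you
clustered scaled data, put the new subject on the same scale first.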
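A minimal sketch of that nearest-mean rule, in case it helps. The
simulated data, the helper name assign_cluster() and the new subject's
values are all made up for illustration, and I am assuming datascaled
came from scale(), so that the centering and scaling values are stored
as attributes:

```r
# Hypothetical stand-in for the original data; replace with your own.
set.seed(1)
raw <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 4), ncol = 2),
             matrix(rnorm(40, mean = 8), ncol = 2))
datascaled <- scale(raw)  # assumption: the data were scaled with scale()
fit <- kmeans(datascaled, centers = 3, nstart = 25)

# Hypothetical helper: index of the cluster mean closest to one
# (already scaled) observation, using squared Euclidean distance.
assign_cluster <- function(newx, centers) {
  d2 <- rowSums(sweep(centers, 2, newx)^2)
  unname(which.min(d2))
}

# Put the new subject on the same scale as the data that were clustered.
new_raw <- c(4.1, 3.9)
new_scaled <- (new_raw - attr(datascaled, "scaled:center")) /
  attr(datascaled, "scaled:scale")

new_cluster <- assign_cluster(new_scaled, fit$centers)
new_cluster
```

The cluster labels themselves are arbitrary, so the number returned only
has meaning relative to fit$centers from the same fit.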
If you are looking for probabilistic clustering (under Gaussian
mixture model assumptions), you could use model-based clustering: one R
package is mclust.
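A rough sketch of what that could look like with mclust, assuming the
package is installed. The simulated data and the new subject's values
are made up for illustration; Mclust(), its z component and the predict
method are what the package provides, but please check the details
against its documentation:

```r
library(mclust)  # assumes install.packages("mclust") has been run

# Hypothetical stand-in for the original data; replace with your own.
set.seed(1)
raw <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 4), ncol = 2),
             matrix(rnorm(40, mean = 8), ncol = 2))
datascaled <- scale(raw)

# Fit a 3-component Gaussian mixture model.
mfit <- Mclust(datascaled, G = 3)

# One row per subject, one column per cluster: the estimated
# probability that the subject belongs to each cluster.
head(mfit$z)

# Hard assignment (the most probable cluster for each subject).
head(mfit$classification)

# A new subject, already on the scaled scale (made-up values).
new_scaled <- matrix(c(0.1, 0.0), nrow = 1)
pred <- predict(mfit, newdata = new_scaled)
pred$z               # membership probabilities for the new subject
pred$classification  # most probable cluster
```

These probabilities are only as good as the Gaussian mixture assumption,
and the resulting clusters need not match the kmeans ones.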
Btw, note that kmeans is very sensitive to initialization (as is
mclust): you may want to try several random starts (for kmeans),
at the very least. Use the argument "nstart" with a large value;
kmeans then keeps the best of the runs.
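For example, something along these lines. The simulated data and the
value 50 are arbitrary choices for illustration, not a recommendation:

```r
# Hypothetical stand-in for the original data; replace with your own.
set.seed(1)
raw <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 4), ncol = 2),
             matrix(rnorm(40, mean = 8), ncol = 2))
datascaled <- scale(raw)

# A single random start versus the best of 50 random starts.
fit1  <- kmeans(datascaled, centers = 3, nstart = 1)
fit50 <- kmeans(datascaled, centers = 3, nstart = 50)

# Total within-cluster sum of squares: smaller means a tighter fit.
c(single = fit1$tot.withinss, multi = fit50$tot.withinss)
```

With well-separated clusters the two numbers will often agree; on harder
data the single start is more likely to get stuck in a poor solution.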
HTH,
Ranjan
> Thanks,
> John
>
> # K-Means Cluster Analysis
> jclusters <- 3
> fit <- kmeans(datascaled, jclusters) # 3 cluster solution
>
> and fit$cluster tells me what cluster each observation in my input dataset
> belongs to (output truncated for brevity):
>
> > fit$cluster
>  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 . . . .
>  1  1  1  1  3  1  1  1  1  2  1  2  1  1  1  1  1 . . . .
> How do I get the probability of being in cluster 1, cluster 2, and cluster 3
> for a given subject, e.g. datascaled[1,]? How do I get the cluster assignment
> for a new subject?
> Thanks,
> John
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)