Maura: I looked at the scatter plots you sent.
A few thoughts:

1- Patient 3 has a lot of missing data, which will make it hard to group that patient reliably against your other cases. Missing data is very common, and much work has been done in this area. The trivial approach is to forward fill and backward fill the sample data, so that you have the same amount of data for all cases. More advanced approaches include the Expectation-Maximization (EM) algorithm (a Google search on "EM algorithm" will give you a lot of info) and Multiple Imputation (http://www.multiple-imputation.com/). EM appears to be a good solution for your type of data.

2- Looking at your data, Principal Component Analysis (PCA) appears to be your best starting point before clustering. There are many books on this subject, but start with these simple links:

http://en.wikipedia.org/wiki/Karhunen-Lo%C3%A8ve_transform
http://csnet.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf

All the methods mentioned above (PCA, EM) are available in R.

Finally, there is no single right answer among clustering methods (single linkage, complete linkage, Ward's method, et al.). The right choice always depends on the type of data one is analyzing.

Naturally, our fellow R community members might have more and better insights/suggestions! :)

Hope this helps.

Regards,
Neil

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof Brian Ripley
Sent: Monday, October 01, 2007 1:37 PM
To: Maura E Monville
Cc: [EMAIL PROTECTED]
Subject: Re: [R] Clustering techniques using R

On Mon, 1 Oct 2007, Maura E Monville wrote:

> Now that I've loaded a file into an R data.frame and played with
> linear regression until I got a good model, my next step is clustering
> using the coefficients of the regression model (I have many files)
> Thanks to some R experts' guidelines I could find plenty of
> documentation on regression analysis in the "contributed" section.
> Some touch on the concepts of the underlying theory and then show some
> worked out examples (extremely useful).
> I found nothing so nicely explained and laid out about cluster analysis with R.
> I would appreciate some suggestion about reading on techniques for
> clustering using R. Some application examples are very welcome.

Have you looked at MASS (the book, see the FAQ)? Or the CRAN task views at

http://cran.r-project.org/src/contrib/Views/Cluster.html
http://cran.r-project.org/src/contrib/Views/Multivariate.html

(Clustering is 'unsupervised classification')? There is a lot of information there.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
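
For reference, the workflow suggested in Neil's reply (fill the gaps, run PCA, then try several linkage methods) might look roughly like the base R sketch below. It is only an illustration under stated assumptions: the helper fill_na() and the simulated feature matrix are hypothetical and not taken from the thread, and the gap filling uses the trivial forward/backward fill rather than EM or multiple imputation.

```r
# Trivial imputation: forward fill, then backward fill for leading NAs.
# fill_na() is a hypothetical helper written for this sketch.
fill_na <- function(x) {
  # Forward fill: carry the last observed value forward.
  for (i in seq_along(x)[-1]) {
    if (is.na(x[i])) x[i] <- x[i - 1]
  }
  # Backward fill: any NAs left are at the start of the series.
  for (i in rev(seq_along(x))[-1]) {
    if (is.na(x[i])) x[i] <- x[i + 1]
  }
  x
}

# Example series with gaps at the start, middle and end.
filled <- fill_na(c(NA, 1, NA, NA, 4, NA))

# Simulated stand-in for the per-case regression coefficients:
# 10 cases x 3 coefficients, drawn from two well-separated groups.
set.seed(42)
features <- rbind(
  matrix(rnorm(15, mean = 0, sd = 0.5), ncol = 3),
  matrix(rnorm(15, mean = 5, sd = 0.5), ncol = 3)
)

# PCA on centred, scaled features; keep the first two component scores.
pca <- prcomp(features, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:2]

# Hierarchical clustering with three linkage methods on the same distances.
d <- dist(scores)
methods <- c("single", "complete", "ward.D2")
groups <- sapply(methods, function(m) cutree(hclust(d, method = m), k = 2))

# One row per case, one column per linkage method.
print(groups)
```

On real data the linkage methods will often disagree, which is the point of comparing them side by side; summary(pca) shows how much variance the retained components explain.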