I know that there are quite a few packages out there for cluster analysis. The problem I am facing is finding a package that will not force all of my samples into clusters, but will genotype only the samples that meet a threshold (which I have not set yet, and I may need help finding the right level). It should be able to "no call" samples that fall outside the clusters. It also needs to accommodate a negative control sample by not including it in any genotype cluster.
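To make the requirement concrete, here is a rough sketch in base R of the behavior I am after. It is not taken from any package: `call_genotypes()`, `max_dist`, the "NTC" type label, and the simulated data are all made up for illustration. Hierarchical clustering with a fixed distance cut-off is only a stand-in for whatever a proper genotyping package would do, and the cut-off of 0.15 is arbitrary.

```r
# Sketch only: call_genotypes(), max_dist and the "NTC" label are
# hypothetical names, not part of any existing package.
call_genotypes <- function(x, y, type, k = 3, max_dist = 0.15,
                           ntc_label = "NTC") {
  calls <- rep("No Call", length(x))
  is_ntc <- type == ntc_label
  calls[is_ntc] <- "NTC"                # negative controls never join a cluster

  xy <- cbind(x, y)[!is_ntc, , drop = FALSE]

  # Average-linkage hierarchical clustering, cut into k groups
  grp <- cutree(hclust(dist(xy), method = "average"), k = k)

  # Per-cluster median centre (less sensitive to stray samples than the mean)
  cx <- tapply(xy[, 1], grp, median)
  cy <- tapply(xy[, 2], grp, median)

  # Euclidean distance from each sample to the centre of its own cluster
  g <- as.character(grp)
  d <- sqrt((xy[, 1] - cx[g])^2 + (xy[, 2] - cy[g])^2)

  # Samples too far from their cluster centre are left as "No Call"
  calls[!is_ntc] <- ifelse(d <= max_dist, paste0("Cluster", grp), "No Call")
  calls
}

# Simulated single assay: three tight genotype clusters, two stray samples,
# and one negative control near the origin (all values invented)
ang     <- seq(0, 2 * pi, length.out = 11)[-11]
ring    <- cbind(cos(ang), sin(ang)) * 0.03
centres <- rbind(c(0.1, 0.9), c(0.5, 0.5), c(0.9, 0.1))
pts     <- do.call(rbind, lapply(1:3, function(i) sweep(ring, 2, centres[i, ], "+")))

demo <- data.frame(
  x    = c(pts[, 1], 0.30, 0.75, 0.02),
  y    = c(pts[, 2], 0.72, 0.30, 0.02),
  type = c(rep("Unknown", 32), "NTC")
)

demo$call <- call_genotypes(demo$x, demo$y, demo$type)
print(table(demo$call))
```

For a mitochondrial assay I would presumably pass `k = 2`, and for a whole array I would run the function once per assay, for example by splitting the data frame on the Assay column. What I cannot work out is how to choose `k` and the threshold in a principled way, which is why I am asking about packages.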
I'm looking at both nuclear and mitochondrial DNA, so hopefully the package can be sophisticated enough to choose between two and three clusters within the array. These genotyping arrays are either 48 samples x 48 assays, 96x96, or 192x24, and it would be nice if it could accommodate any number of samples and assays. The data headings from the CSV are: ID,Assay,Allele Y,Allele X,Name,Type,Auto,Confidence,Final,Converted,Allele Y,Allele X where Allele Y and Allele X are the plotted values, and the vectors within the data.frame are 9216 (96x96) long. So what would be the recommended package for moving to a more quantifiable method of genotyping using cluster analysis? Thanks again. Forgive me if this is a better question for Bioconductor. I will provide any additional context that I may have forgotten to include here.