I am sorry for the incorrect subject. My subject line autofilled and I did not notice in time. A better subject would be: Calculating proportion of shared occurrences and randomizations.
Grant

2008/4/19 Grant Gillis <[EMAIL PROTECTED]>:
> Hello All,
>
> Once again, thanks for all of the help to date. I am climbing my R
> learning curve, and I have a few more questions that I hope I can get
> some guidance on. I am not sure whether the etiquette is to break up
> multiple questions or not, but I'll keep them together here for now, as
> it may help put the questions in context even though the post may get a
> little long.
>
> Question 1:
>
> My first goal is to calculate the proportion of shared 1) behaviours and
> 2) alleles between numerous individuals. Pasted below ('propshared'
> function) is what I have now, and it works very well for calculating the
> proportion of shared behaviours where the data is formatted with each
> column as a behaviour and each row as an individual. Microsatellite
> genotypes are formatted differently. An example is below. Each row is an
> individual and each column is one allele from a single locus. In the
> values below, L1 and L1.1 each give one copy of an allele for the same
> locus. Occasionally, values from different loci will be the same,
> although these are not actually the same allele.
>
> I would like the calculation of the proportion of shared values for
> alleles to be restricted to the proportion of shared alleles within loci
> for all individuals (pairs of columns L1 and L1.1, L2 and L2.1, ...).
> What I have now calculates the proportion of shared values for alleles
> across loci. A specific example is that I would like the value *2* for
> individual *w* at *L1* to be considered the same as the value *2* for
> individual *y* at *L1.1*, but not the same as the value *2* for any
> other individual within any other pair of columns.
>
> # Duplicate column names are made unique by data.frame (L1, L1.1, ...).
> genos <- data.frame(
>   L1 = c(2, NA, 1, 3),
>   L1 = c(1, NA, 2, 3),
>   L2 = c(5, 2, 5, 3),
>   L2 = c(3, 4, 2, 4),
>   L3 = c(4, 5, 7, 2),
>   L3 = c(4, 6, 6, 6))
>
> rownames(genos) <- c("w", "x", "y", "z")
>
> > genos
>   L1 L1.1 L2 L2.1 L3 L3.1
> w  2    1  5    3  4    4
> x NA   NA  2    4  5    6
> y  1    2  5    2  7    6
> z  3    3  3    4  2    6
>
> propshared <- function(genos) {
>   # For each pair of individuals, count the positions with equal values
>   # and divide by the number of columns.
>   x <- sapply(rownames(genos), function(ind1)
>     sapply(rownames(genos), function(ind2)
>       sum(genos[ind1, ] == genos[ind2, ], na.rm = TRUE))) /
>     length(genos[1, ])
>   is.na(diag(x)) <- TRUE   # blank out self-comparisons
>   x
> }
>
> > propshared(genos)
>           w         x         y         z
> w        NA 0.0000000 0.1666667 0.1666667
> x 0.0000000        NA 0.1666667 0.3333333
> y 0.1666667 0.1666667        NA 0.3333333
> z 0.1666667 0.3333333 0.3333333        NA
>
> The matrix I would like to have would look like this.
>
>             w           x           y           z
> w          NA           0 0.333333333 0.166666667
> x           0          NA 0.166666667 0.166666667
> y 0.333333333 0.166666667          NA 0.166666667
> z 0.166666667 0.166666667 0.166666667          NA
>
> Question 2:
>
> Thanks if you have made it this far. Next, I would like to calculate a
> randomized value of the mean proportion of shared alleles. To do this, I
> thought I would randomize the original data (genos above, say 1000
> times), recalculate the proportion of shared alleles at each step, and
> then take the mean (my attempt is below). When I do this, I get the same
> mean proportion of shared alleles (or behaviours) as the original for
> every randomization. I assume that this is due to some property of
> permuting this type of data that I do not know. Does anyone have a
> recommendation as to how I might get a value of the proportion of shared
> alleles if alleles were distributed (again within loci) at random?
>
> randomize <- function(genos) {
>   # Shuffle each column independently across individuals.
>   x <- apply(genos, 2, sample)
>   rownames(x) <- rownames(genos)
>   x
> }
>
> allele.permute <- function(genos, n) {
>   # n randomized data sets, then one propshared matrix per data set.
>   perms <- replicate(n, randomize(genos), simplify = FALSE)
>   lapply(perms, propshared)
> }
>
> I hope this is clear.
> I appreciate all insights and input.
> Thanks
>
> Grant

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
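
A minimal sketch of one way to approach both questions, not a definitive implementation. It assumes each locus occupies two adjacent columns, and it reads "shared within loci" as: at each locus, count the allele copies two individuals have in common regardless of which of the two columns holds them, matching each copy at most once. The helper names shared_at_locus, propshared_loci and randomize_loci are hypothetical. Under this reading the values need not equal the hand-written target matrix in the post, so they should be checked against the intended definition. On Question 2, a likely reason the mean never changed is that position-wise comparison counts, in each column, the pairs of individuals with equal values, and that count depends only on which values the column contains, not on their order. Shuffling each column separately therefore leaves the mean over all pairs unchanged. The sketch instead pools the alleles of each locus and deals them back out at random.

```r
# Example genotypes from the post, as a matrix; each locus is two adjacent columns.
g <- cbind(L1 = c(2, NA, 1, 3), L1.1 = c(1, NA, 2, 3),
           L2 = c(5, 2, 5, 3),  L2.1 = c(3, 4, 2, 4),
           L3 = c(4, 5, 7, 2),  L3.1 = c(4, 6, 6, 6))
rownames(g) <- c("w", "x", "y", "z")

# Hypothetical helper: allele copies shared by two genotypes at one locus,
# ignoring column order. Each copy is matched at most once; NAs never match.
shared_at_locus <- function(a, b) {
  n <- 0
  for (v in a[!is.na(a)]) {
    hit <- match(v, b)            # first unused copy of v in b, or NA
    if (!is.na(hit)) {
      n <- n + 1
      b <- b[-hit]                # use up the matched copy
    }
  }
  n
}

# Hypothetical variant of propshared(): sharing is counted within loci only.
# Assumption: columns come in consecutive pairs, one pair per locus.
propshared_loci <- function(g) {
  ids <- rownames(g)
  starts <- seq(1, ncol(g), by = 2)
  out <- matrix(NA_real_, length(ids), length(ids), dimnames = list(ids, ids))
  for (i in ids) for (j in ids) {
    if (i == j) next              # leave the diagonal as NA
    shared <- sum(vapply(starts, function(s)
      shared_at_locus(g[i, s:(s + 1)], g[j, s:(s + 1)]), numeric(1)))
    out[i, j] <- shared / ncol(g)
  }
  out
}

# Hypothetical variant of randomize(): pool the alleles of each locus
# (both columns) and deal them back out to individuals at random.
randomize_loci <- function(g) {
  for (s in seq(1, ncol(g), by = 2)) {
    cols <- s:(s + 1)
    g[, cols] <- sample(g[, cols])
  }
  g
}

obs <- propshared_loci(g)

set.seed(1)                       # only to make the sketch reproducible
null_means <- replicate(200,
  mean(propshared_loci(randomize_loci(g)), na.rm = TRUE))
```

Here mean(obs, na.rm = TRUE) gives the observed mean and null_means holds one mean per randomization. Missing values are shuffled along with the alleles, so an individual can end up with a single NA at a locus; whether that is acceptable depends on the data.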