Sorry, I'm not sure that prob is suitable for my purposes (but I'm quite a newbie with R). If I understand correctly, prob lets you set a weight for each row of the original dataset, so that rows are included in the sample on the basis of their weights... I'm not sure I'm understanding this correctly ;-)

In my case all the rows are equally important. I "simply" need my subset to have, in each column, the same frequency of 1 as in the original dataset.

Thank you again
Guido
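To illustrate the distinction drawn above, here is a minimal R sketch, on made-up data, of what prob does compared with equal-probability sampling. The matrix geno, the weight of 3 and the helper freq1 are assumptions for illustration only and do not come from the thread.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Toy genotype matrix: 200 rows, 4 SNP columns coded -1/0/1 (made-up data)
geno <- matrix(sample(c(-1, 0, 1), 200 * 4, replace = TRUE), nrow = 200)

# Equal-probability subsample of 100 rows, as in the original post
idx_uniform <- sample(nrow(geno), 100, replace = FALSE)

# Weighted subsample: rows with a 1 in column 1 get three times the weight
# of the other rows (the value 3 is arbitrary)
w <- ifelse(geno[, 1] == 1, 3, 1)
idx_weighted <- sample(nrow(geno), 100, replace = FALSE, prob = w)

# Frequency of the value 1 in each column
freq1 <- function(m) colMeans(m == 1)

print(rbind(full     = freq1(geno),
            uniform  = freq1(geno[idx_uniform, ]),
            weighted = freq1(geno[idx_weighted, ])))
```

With weights like these, the weighted draw tends to over-represent rows carrying a 1 in column 1, which is why prob on its own does not obviously give "the same frequency of 1 in every column" when all rows are meant to count equally.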
2012/6/14 R. Michael Weylandt <michael.weyla...@gmail.com>

> sample() takes a prob = argument which lets you supply weights, which
> need not sum to one, so, if I understand you, you could just pass TRUEs
> and FALSEs for those rows you want. If I'm wrong about that last bit,
> I'm still pretty confident sample(prob = ) is the way to go.
>
> Best,
> Michael
>
> On Thu, Jun 14, 2012 at 6:02 AM, Guido Leoni <guido.le...@gmail.com> wrote:
> > Dear list,
> > I wish to extract, from a population genotyped for 10 SNPs, a
> > subsample of size n of the same population with similar allele frequencies.
> > Essentially I have a matrix of 200 rows (df) like this:
> > Name,Condition,rs1385699_X,rs6625163_X,rs962458_X,Rs4658627_1,
> > sample01,Case,1,1,1,-1
> > sample02,Control,1,1,1,1
> > sample06,Control,1,-1,1,0
> > sample10,Case,1,1,1,0
> > sample11,Control,1,1,1,1
> > sample24,Control,-1,-1,1,0
> > sample29,Control,1,-1,1,0
> > sample42,Case,-1,-1,1,0
> > sample64,Case,-1,1,1,0
> > ....
> > I'm interested in maintaining in my subsample the same frequencies as those
> > observed for the value 1 in each column.
> > I approached the problem with the sample() function:
> >
> > mysample<-df[sample(1:nrow(df),100,replace=F),]
> > Then I tested, by means of prop.test, that the frequencies of each allele
> > in mysample are not statistically different from those in the initial dataset.
> > This seems to work, but do you know if there is a package that can do the
> > same thing while allowing, for example, stricter control?
> > Thank you very much
> > Guido
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

--
Guido Leoni
National Research Institute on Food and Nutrition (I.N.R.A.N.)
via Ardeatina 546
00178 Rome
Italy
tel + 39 06 51 49 41 (operator)
    + 39 06 51 49 4498 (direct)
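The check described in the quoted message (comparing each column's frequency of 1 between the subsample and the full data with prop.test) could be sketched roughly as below. It is paired with one possible way to get stricter control: redrawing the equal-probability subsample until every column is within a chosen tolerance of the full-data frequency. The redraw loop is an assumption of this sketch and is not proposed anywhere in the thread. The names geno, tol and max_tries, and their values, are hypothetical.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Toy stand-in for the genotype columns of df: 200 rows, 4 SNPs coded -1/0/1
geno <- matrix(sample(c(-1, 0, 1), 200 * 4, replace = TRUE), nrow = 200)

target    <- colMeans(geno == 1)  # frequency of 1 per column in the full data
tol       <- 0.03                 # hypothetical tolerance on each column
max_tries <- 1000                 # hypothetical cap on the number of redraws

found <- FALSE
for (i in seq_len(max_tries)) {
  idx <- sample(nrow(geno), 100, replace = FALSE)
  # Accept the draw only if every column is within tol of the target
  if (all(abs(colMeans(geno[idx, ] == 1) - target) <= tol)) {
    found <- TRUE
    break
  }
}
if (!found) stop("no acceptable subsample found; consider a larger tol")

# Per-column prop.test, subsample versus full data, as in the original post
pvals <- sapply(seq_len(ncol(geno)), function(j) {
  prop.test(x = c(sum(geno[idx, j] == 1), sum(geno[, j] == 1)),
            n = c(length(idx), nrow(geno)))$p.value
})
print(round(pvals, 3))
```

Because the subsample is drawn from the full dataset, the two groups compared by prop.test are not independent, so the p-values are probably best read as a rough guide rather than as a formal test.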