Hello, I have a data frame which looks like this:
> head(a,20) rs pvalue 1: rs185642176 0.267407 2: rs184120752 0.787681 3: rs10904045 0.508162 4: rs35849539 0.875910 5: rs141633513 0.787759 6: rs4468273 0.542171 7: rs4567378 0.539484 8: rs7084251 0.126445 9: rs181605000 0.787838 10: rs12255619 0.192719 11: rs140367257 0.788008 12: rs10904178 0.969814 13: rs7918960 0.436341 14: rs61688896 0.526256 15: rs151283848 0.787284 16: rs140174295 0.989107 17: rs145945079 0.787015 18: rs4881370 0.455089 19: rs183895035 0.787015 20: rs181749526 0.787015 > dim(a) [1] 3859763 2 What I would like to do is to take random subsets of 30 of those rs throughout the dataframe and find out which subsets of those generated have FDR value <0.05 FDR I would calculate I guess with: a$fdr=p.adjust(a$pvalue,method="BH") but I also guess I would be calculating only FDR for a particular subset of 30 randomly chosen rs, not for the whole data set. The result I would like to present like in the attached plot. The x-axis say proportion of SNPs and in my case SNP is equivalent to rs Can you please help with this, I really don't have idea how to go about this. Thanks ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.