Hi all,

I have a large data set (a small part is given below) in which the V1 column contains repeated values: rs941873, rs12307687, ... each appear many times. For each distinct value in V1, I need to keep only one SNP (from the first column, named rs): the one with the smallest Pvalue within that V1 group.

Your help is truly appreciated,
Oslo
| rs | n0 | Pvalue | V1 |
| --- | --- | --- | --- |
| rs941873 | 81139462 | 1.52E-07 | rs941873 |
| rs634552 | 75282052 | 1.08E-01 | rs941873 |
| rs11107175 | 94161719 | 2.85E-02 | rs941873 |
| rs12307687 | 47175866 | 1.23E-01 | rs12307687 |
| rs3917155 | 76444685 | 6.80E-01 | rs941873 |
| rs1600640 | 84603034 | 2.75E-04 | rs12307687 |
| rs2871865 | 99194896 | 7.09E-02 | rs12307687 |
| rs2955250 | 61959740 | 3.17E-02 | rs12307687 |
| rs228758 | 42148205 | 7.72E-02 | rs12307687 |
| rs224333 | 34023962 | 2.10E-02 | rs10071837 |
| rs4681725 | 56692321 | 4.45E-04 | rs10071837 |
| rs7652177 | 171969077 | 6.34E-04 | rs10071837 |
| rs925098 | 17919811 | 5.55E-09 | rs925098 |
| rs1662837 | 82168889 | 8.66E-05 | rs925098 |
| rs10071837 | 33381581 | 5.74E-04 | rs925098 |
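For illustration, here is a minimal base R sketch of one way the selection described above could be done. It assumes the data sit in a data frame (called `dat` here, a made-up name) with the four columns shown, and it rebuilds the sample rows by hand so the snippet is self-contained. The idea is to sort by V1 and then by Pvalue, and keep the first row of each V1 group. This is only one possible approach, not necessarily the best one for a very large data set.

```r
# Sample rows copied from the table in the post; 'dat' is an assumed name.
dat <- data.frame(
  rs = c("rs941873", "rs634552", "rs11107175", "rs12307687", "rs3917155",
         "rs1600640", "rs2871865", "rs2955250", "rs228758", "rs224333",
         "rs4681725", "rs7652177", "rs925098", "rs1662837", "rs10071837"),
  n0 = c(81139462, 75282052, 94161719, 47175866, 76444685,
         84603034, 99194896, 61959740, 42148205, 34023962,
         56692321, 171969077, 17919811, 82168889, 33381581),
  Pvalue = c(1.52e-07, 1.08e-01, 2.85e-02, 1.23e-01, 6.80e-01,
             2.75e-04, 7.09e-02, 3.17e-02, 7.72e-02, 2.10e-02,
             4.45e-04, 6.34e-04, 5.55e-09, 8.66e-05, 5.74e-04),
  V1 = c("rs941873", "rs941873", "rs941873", "rs12307687", "rs941873",
         "rs12307687", "rs12307687", "rs12307687", "rs12307687",
         "rs10071837", "rs10071837", "rs10071837",
         "rs925098", "rs925098", "rs925098"),
  stringsAsFactors = FALSE
)

# Sort by V1, then by ascending Pvalue within each V1 group.
ord <- dat[order(dat$V1, dat$Pvalue), ]

# duplicated() flags every row after the first of each V1 value, so
# negating it keeps the smallest-Pvalue row per group.
# If two rows tie on Pvalue, the one that sorts first is kept.
best <- ord[!duplicated(ord$V1), ]

print(best)
```

On the sample rows this keeps four SNPs, one per distinct V1 value: rs4681725 (for rs10071837), rs1600640 (for rs12307687), rs925098 (for rs925098) and rs941873 (for rs941873).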