Hi

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Pieter Schoonees
> Sent: Friday, October 12, 2012 6:19 PM
> To: Vining, Kelly; r-help@r-project.org
> Subject: Re: [R] average duplicated rows?
>
> You will have to split() the data and unsplit() it after making the
> alterations. Have a look at the plyr package for such functions.
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Vining, Kelly
> > Sent: Friday 12 October 2012 5:42
> > To: r-help@r-project.org
> > Subject: [R] average duplicated rows?
> >
> > Dear useRs,
> >
> > I have a slightly complicated data structure and am stuck trying to
> > extract what I need. I'm pasting an example of this data below. In
> > some cases, there are duplicates in the "gene_id" column because there
> > are two different "sample 1" values for a given "sample 2" value.
> > Where these duplicates exist, I need to average the corresponding "FL_EARLY"
> > values and retain the "FL_LATE" value and replace those two rows with
> > a row containing the "FL_EARLY" average so that I no longer have any
> > "gene_id" duplicates.
> >
> > Seems like this is a job for some version of the apply function, but
> > searching and puzzling over this has not gotten me anywhere. Any help
> > will be much appreciated!

aggregate() is designed for that:

    ex.ag <- aggregate(ex[, c("FL_EARLY", "FL_LATE")], list(ex$gene_id), mean)

You will lose sample_1 and sample_2, but you did not mention whether you want to retain them, or how.

Regards
Petr

> > Example data:
> >
> >              gene_id sample_1 sample_2   FL_EARLY  FL_LATE
> > 763938  Eucgr.A00054   fl_S1E   fl_S1L  13.170800  22.2605
> > 763979  Eucgr.A00101   fl_S1E   fl_S1L   0.367960  14.1202
> > 1273243 Eucgr.A00101    fl_S2   fl_S1L   0.356625  14.1202
> > 764169  Eucgr.A00350   fl_S1E   fl_S1L   7.381070  43.9275
> > 1273433 Eucgr.A00350    fl_S2   fl_S1L  10.674500  43.9275
> > 1273669 Eucgr.A00650    fl_S2   fl_S1L  33.669100  50.0169
> > 764480  Eucgr.A00744   fl_S1E   fl_S1L 132.429000 747.2770
> > 1273744 Eucgr.A00744    fl_S2   fl_S1L 142.659000 747.2770
> > 764595  Eucgr.A00890   fl_S1E   fl_S1L   2.937760  14.9647
> > 764683  Eucgr.A00990   fl_S1E   fl_S1L   8.681250  48.5492
> > 1273947 Eucgr.A00990    fl_S2   fl_S1L  10.553300  48.5492
> > 764710  Eucgr.A01020   fl_S1E   fl_S1L   0.000000  57.9273
> > 1273974 Eucgr.A01020    fl_S2   fl_S1L   0.000000  57.9273
> > 764756  Eucgr.A01073   fl_S1E   fl_S1L   8.504710 101.1870
> > 1274020 Eucgr.A01073    fl_S2   fl_S1L   5.400010 101.1870
> > 764773  Eucgr.A01091   fl_S1E   fl_S1L   3.448910  15.7756
> > 764826  Eucgr.A01152   fl_S1E   fl_S1L  69.565700 198.2320
> > 764831  Eucgr.A01158   fl_S1E   fl_S1L   7.265640  30.9565
> > 764845  Eucgr.A01172   fl_S1E   fl_S1L   3.248020  16.9127
> > 764927  Eucgr.A01269   fl_S1E   fl_S1L  18.710200  76.6918
> >
> > --Kelly V.
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
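
For anyone who wants to try the aggregate() suggestion without the full data set, here is a minimal sketch. It rebuilds `ex` from the first five rows of the example data quoted above. Two things in it are my assumptions, not part of the original reply: the grouping list is named (`gene_id = ...`) so the result column is not called `Group.1`, and the second call (`ex.ag2`) groups on `gene_id` and `FL_LATE` together so that `FL_LATE` is carried through rather than averaged. That variant relies on `FL_LATE` being identical within each gene, as it is in the rows shown.

```r
# First five rows copied from the example data in the original post.
ex <- data.frame(
  gene_id  = c("Eucgr.A00054", "Eucgr.A00101", "Eucgr.A00101",
               "Eucgr.A00350", "Eucgr.A00350"),
  sample_1 = c("fl_S1E", "fl_S1E", "fl_S2", "fl_S1E", "fl_S2"),
  sample_2 = "fl_S1L",
  FL_EARLY = c(13.170800, 0.367960, 0.356625, 7.381070, 10.674500),
  FL_LATE  = c(22.2605, 14.1202, 14.1202, 43.9275, 43.9275),
  stringsAsFactors = FALSE
)

# The suggested call: average both numeric columns within each gene_id.
# Naming the list element (an addition for this sketch) labels the group column.
ex.ag <- aggregate(ex[, c("FL_EARLY", "FL_LATE")],
                   list(gene_id = ex$gene_id), mean)

# Variant (assumption: FL_LATE is constant within a gene): group on both
# gene_id and FL_LATE, so only FL_EARLY is averaged.
ex.ag2 <- aggregate(FL_EARLY ~ gene_id + FL_LATE, data = ex, FUN = mean)

print(ex.ag)
print(ex.ag2)
```

Both results have one row per gene_id. Since the duplicated rows share the same FL_LATE, taking its mean in the first call should leave it unchanged, so the two approaches agree on this sample.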

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.