On Aug 3, 2009, at 7:12 PM, David Winsemius wrote:


On Aug 3, 2009, at 9:24 AM, Rnewbie wrote:


Dear all,

I have a dataset, and I wanted to merge the rows with duplicated IDs by calculating the means or medians from the duplicate rows. I tried using the command duplicated(x), but it only tells where the duplicated rows are.

You might want to look at the ave function. It will calculate a function within IDs and you can assign that as another row in the data frame before you exclude the duplicates.
                        ^^^^^^

err... I meant to say another column.

> tst <- data.frame(ID = sample(c("1234", "4567", "2346"), 10, replace=TRUE), val=rnorm(10))
> tst
     ID         val
1  2346  0.22659389
2  2346  0.46835154
3  2346 -0.53702251
4  2346 -1.00187606
5  1234  0.90843566
6  2346 -0.59654370
7  4567 -0.04355647
8  1234  0.65332120
9  4567 -2.22517105
10 1234 -0.26911187
> tst$IDmn <- ave(tst$val, tst$ID) #default function for ave is mean but others can be used
> tst
     ID         val       IDmn
1  2346  0.22659389 -0.2880994
2  2346  0.46835154 -0.2880994
3  2346 -0.53702251 -0.2880994
4  2346 -1.00187606 -0.2880994
5  1234  0.90843566  0.4308817
6  2346 -0.59654370 -0.2880994
7  4567 -0.04355647 -1.1343638
8  1234  0.65332120  0.4308817
9  4567 -2.22517105 -1.1343638
10 1234 -0.26911187  0.4308817
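
To finish the job, the duplicates still have to be dropped once the per-ID summary is in place. A minimal sketch of that last step follows, assuming a small made-up data frame (the values and the column names IDmed, merged and agg are only illustrative, not from the original post). It also shows the median variant and, as an alternative that skips the extra column, aggregate().

```r
# Hypothetical data with duplicated IDs (values invented for illustration)
tst <- data.frame(ID  = c("2346", "2346", "1234", "4567", "1234", "4567"),
                  val = c(1, 3, 10, 5, 20, 7))

# ave() returns a vector as long as val, holding each row's group summary
tst$IDmn  <- ave(tst$val, tst$ID)                # FUN defaults to mean
tst$IDmed <- ave(tst$val, tst$ID, FUN = median)  # FUN must be named, since
                                                 # unnamed args are grouping factors

# Keep only the first row seen for each ID, along with its summaries
merged <- tst[!duplicated(tst$ID), c("ID", "IDmn", "IDmed")]

# Alternative: collapse directly to one row per ID
agg <- aggregate(val ~ ID, data = tst, FUN = mean)
```

Note that duplicated() keeps rows in their order of first appearance, while aggregate() returns them sorted by ID, so the two results can differ in row order.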


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.