On Aug 3, 2009, at 7:12 PM, David Winsemius wrote:
On Aug 3, 2009, at 9:24 AM, Rnewbie wrote:
Dear all,
I have a dataset, and I wanted to merge the rows with duplicated IDs by
calculating the means or medians from the duplicate rows. I tried using the
command duplicated(x), but it only tells where the duplicated rows are.
You might want to look at the ave function. It will apply a
function within each ID, and you can assign the result as another row in the
data frame before you exclude the duplicates.
err... I meant to say another column, not another row.
> tst <- data.frame(ID = sample(c("1234", "4567", "2346"), 10, replace=TRUE),
+                   val = rnorm(10))
> tst
     ID         val
1  2346  0.22659389
2  2346  0.46835154
3  2346 -0.53702251
4  2346 -1.00187606
5  1234  0.90843566
6  2346 -0.59654370
7  4567 -0.04355647
8  1234  0.65332120
9  4567 -2.22517105
10 1234 -0.26911187
> tst$IDmn <- ave(tst$val, tst$ID)  # ave() defaults to FUN = mean; pass FUN= to use another function
> tst
     ID         val       IDmn
1  2346  0.22659389 -0.2880994
2  2346  0.46835154 -0.2880994
3  2346 -0.53702251 -0.2880994
4  2346 -1.00187606 -0.2880994
5  1234  0.90843566  0.4308817
6  2346 -0.59654370 -0.2880994
7  4567 -0.04355647 -1.1343638
8  1234  0.65332120  0.4308817
9  4567 -2.22517105 -1.1343638
10 1234 -0.26911187  0.4308817
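
To finish the job of collapsing the duplicates, something along these lines might work. This is only a sketch: the seed, the IDmed column and the collapsed and agg names are made up for illustration, and aggregate() is an alternative route that was not part of the suggestion above. Note that ave() takes its function through the named FUN argument, since unnamed arguments after the first are treated as grouping variables.

```r
# Illustrative data; the fixed seed is an assumption added for repeatability
set.seed(42)
tst <- data.frame(ID = sample(c("1234", "4567", "2346"), 10, replace = TRUE),
                  val = rnorm(10))

# Per-ID mean (the default) and per-ID median (FUN must be named)
tst$IDmn  <- ave(tst$val, tst$ID)
tst$IDmed <- ave(tst$val, tst$ID, FUN = median)

# Keep only the first row for each ID, dropping the raw values
collapsed <- tst[!duplicated(tst$ID), c("ID", "IDmn", "IDmed")]
print(collapsed)

# Alternative sketch: aggregate() computes one summary row per ID directly
agg <- aggregate(val ~ ID, data = tst, FUN = mean)
print(agg)
```

The ave() route keeps the data frame at full length until you subset it, which is handy if other columns need to be carried along; aggregate() returns only the grouping column and the summary.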
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.