On Jul 8, 2009, at 12:17 PM, tathta wrote:


From an email suggestion, here are two sample datasets, and my ideal output:

dataA <- data.frame(unique.id=c("A","B","C","B"),x=11:14,y=5:2)
dataB <- data.frame(unique.id=c("A","B","A","B","A","C","D","A"),x=27:20,y=22:29)

## mystery operation(s) happen here....

## ideal output would be:
dataA <- data.frame(unique.id=c("A","B","C","B"),x=11:14,y=5:2,countA=c(1,2,1,2),countB=c(4,2,1,2))


So my mystery operation(s) would count the number of times each unique id
shows up in a given dataset.
My ideal outputs are as follows:
countA is the "mystery operation" applied to dataA (counting occurrences
within the same dataset).
countB is the same operation applied to dataB (counting occurrences within
a second dataset).



My best try so far is to do:
tempA <- aggregate(dataA$unique.id,list(dataA$unique.id),length)

which gives me a data frame with ONE row for each unique.id and its
count...
(and which I thought was kinda cute)
but it only works within a single dataset!

<snip>

Modify my initial proposal:

countA <- as.data.frame(table(dataA$unique.id), responseName = "countA")
countB <- as.data.frame(table(dataB$unique.id), responseName = "countB")

> countA
  Var1 countA
1    A      1
2    B      2
3    C      1

> countB
  Var1 countB
1    A      4
2    B      2
3    C      1
4    D      1


dataA <- merge(dataA, countA, by.x = "unique.id", by.y = "Var1")
dataA <- merge(dataA, countB, by.x = "unique.id", by.y = "Var1")

> dataA
  unique.id  x y countA countB
1         A 11 5      1      4
2         B 12 4      2      2
3         B 14 2      2      2
4         C 13 3      1      1


Note that without 'all.x = TRUE' in the merge() calls, only those unique.id's that are common to both datasets will be in the result. If you want to include unique.id's that are in A, but not in B, use 'all.x = TRUE'.
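
To illustrate the difference, here is a minimal sketch. In the sample data every unique.id in dataA also appears in dataB, so the sketch assumes a hypothetical variant, dataA2, with an extra id "E" that is absent from dataB. The names dataA2, inner and left are made up for this example, and replacing NA with 0 is one possible choice, not something the original question asked for.

```r
# Hypothetical variant of dataA: id "E" does not occur in dataB
dataA2 <- data.frame(unique.id = c("A", "B", "C", "E"), x = 11:14, y = 5:2)
dataB <- data.frame(unique.id = c("A", "B", "A", "B", "A", "C", "D", "A"),
                    x = 27:20, y = 22:29)

countB <- as.data.frame(table(dataB$unique.id), responseName = "countB")

# Default merge: the "E" row is dropped because it has no match in countB
inner <- merge(dataA2, countB, by.x = "unique.id", by.y = "Var1")

# all.x = TRUE keeps every row of dataA2; unmatched ids get NA in countB
left <- merge(dataA2, countB, by.x = "unique.id", by.y = "Var1", all.x = TRUE)

# Optional: treat "never seen in dataB" as a count of zero rather than NA
left$countB[is.na(left$countB)] <- 0
```

Here 'inner' has three rows and 'left' has four, with the "E" row carrying a count of 0 after the last step.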

Note also that by default, 'unique.id' will be alpha sorted in the output.
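
If the original row order of dataA matters (as in the ideal output in the question), one possible alternative to merge() is to look the counts up with match(). This is only a sketch of that idea, using the same countA and countB tables as above; ids with no match would come back as NA.

```r
dataA <- data.frame(unique.id = c("A", "B", "C", "B"), x = 11:14, y = 5:2)
dataB <- data.frame(unique.id = c("A", "B", "A", "B", "A", "C", "D", "A"),
                    x = 27:20, y = 22:29)

countA <- as.data.frame(table(dataA$unique.id), responseName = "countA")
countB <- as.data.frame(table(dataB$unique.id), responseName = "countB")

# match() returns, for each row of dataA, the position of its id in the
# count table, so the rows of dataA are never reordered
dataA$countA <- countA$countA[match(dataA$unique.id, countA$Var1)]
dataA$countB <- countB$countB[match(dataA$unique.id, countB$Var1)]
```

The rows stay in the order A, B, C, B, with countA and countB added as new columns.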

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.