Hello everyone,

My problem is better explained with an example:

> x=data.frame(a=1:4,b=1:4,c=rnorm(4))
> x
 a b          c
1 1 1 -0.8821089
2 2 2 -0.7082583
3 3 3 -0.5948835
4 4 4 -1.8571443
> y=data.frame(a=c(1,3),b=3,c=rnorm(2))
> y
 a b            c
1 1 3 -0.273155973
2 3 3  0.009517862

Now I want to merge x and y by columns a and b, hence creating a data.frame with all a:b combinations observed in x and y. That's easily done with merge:

> merge(x,y,by=c("a","b"),all=T)
 a b        c.x          c.y
1 1 1 -0.8821089           NA
2 1 3         NA -0.273155973
3 2 2 -0.7082583           NA
4 3 3 -0.5948835  0.009517862
5 4 4 -1.8571443           NA

But rather than two c columns I would want the merge to:
- keep the value in x if there is no corresponding value in y
- keep the value in y if there is no corresponding value in x
- prefer the value in y when the a:b combination exists in both x and y

So basically I want my result to look like:
 a b          c
1 1 1 -0.8821089
2 1 3 -0.2731559
3 2 2 -0.7082583
4 3 3  0.0095178
5 4 4 -1.8571443

I can't find a combinations of options for merge that does this. Is there another fonction that would do that or do I have to resort to some post-processing after merge? It seems that it might be something like a "right merge" for data bases but I don't know this world at all. I would be happy to look into sqldf if that allows to do things like that.

Thanks in advance. Sincerely,

JiHO
---
http://maururu.net

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to