Jorge, thank you! that seems to be working, but unfortunately in real life I have thousands of variables (except for a, a2, a3 and b) so that manually selecting columns (as in c(2:4, 8:9)) would be too difficult... Dimitri
On Wed, Jul 11, 2012 at 6:36 PM, Jorge I Velez <jorgeivanve...@gmail.com> wrote: > Hi Dimitri, > > Try creating a key for "x" and "y" and then merging the result by that > variable: > > x$key <- with(x, paste(a, b, sep = "/")) > y$key <- with(y, paste(a2, b, sep = "/")) > merge(x, y, by = 'key')[, c(2:4, 8:9)] > > HTH, > Jorge.- > > > On Wed, Jul 11, 2012 at 6:28 PM, Dimitri Liakhovitski <> wrote: >> >> Dear R-ers, >> >> I feel I am close, but can't get it quite right. >> Thanks a lot for your help! >> >> Dimitri >> >> # I have 2 data frames: >> >> >> x<-data.frame(a=c("aa","aa","ab","ab","ba","ba","bb","bb"),b=c(1:2,1:2,1:2,1:2),d=c(10,20,30,40,50,60,70,80)) >> >> y<-data.frame(a2=c("aa","aa","ba","ba"),a3=c("ab","ab","bb","bb"),b=c(1:2,1:2),e1=c(100,200,300,400),e2=c(101,201,301,401)) >> (x);(y) >> >> # I'd like to merge them so that the result looks like this: >> >> >> desired<-data.frame(a=c("aa","aa","ab","ab","ba","ba","bb","bb"),b=c(1:2,1:2,1:2,1:2),d=c(10,20,30,40,50,60,70,80), >> >> e1=c(100,200,100,200,300,400,300,400),e2=c(101,201,101,201,301,401,301,401)) >> (desired) >> >> # In other words, I want column e1 and e2 entries from data frame y to >> be repeated based on matching of column a from x and columns a2 and >> then a3 from y. >> >> # I am trying step-by-step - first I am using column a2 from data >> frame y for merging: >> out1<-merge(x,y[-2],by.x=c("a","b"),by.y=c("a2","b"),all.x=T,all.y=F) >> (out1) # looking good - half of the job is done >> >> # Step2 - does not work >> >> # next line produces columns e1 and e2 twice (in real life I have tons >> of columns like e1 and e2): >> merge(out1,y[-1],by.x=c("a","b"),by.y=c("a3","b"),all.x=T,all.y=F) >> >> # next line also doesn't do the job: >> >> merge(out1,y[-1],by.x=c("a","b","e1","e2"),by.y=c("a3","b","e1","e2"),all.x=T,all.y=F) >> >> # Finally, I tried this approach: >> out1<-merge(x,y[-2],by.x=c("a","b"),by.y=c("a2","b"),all.x=T,all.y=F) >> out2<-merge(x,y[-1],by.x=c("a","b"),by.y=c("a3","b"),all.x=T,all.y=F) >> (out1); (out2) >> >> # Now I need to merge these 2 - however, the next line doubles the >> number of entries: >> merge(out1,out2,by=names(out1),all.x=T,all.y=T) >> >> -- >> Dimitri Liakhovitski >> marketfusionanalytics.com >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > -- Dimitri Liakhovitski marketfusionanalytics.com ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.