On Feb 26, 2013, at 9:33 PM, Anika Masters <anika.mast...@gmail.com> wrote:
> Thanks Arun and David. Another issue I am running into is memory
> problems when one of the data frames I'm trying to rbind to or merge
> with is "very large". (This is a recurring problem, as I am trying
> to merge/rbind thousands of small dataframes into a single "very
> large" dataframe.)
>
> I'm thinking of creating a function that creates an empty dataframe to
> which I can add data, but will need to first determine and ensure that
> each dataframe has the exact same columns, in the exact same
> "location".
>
> Before I write any new code, are there any pre-existing functions or
> code that might solve this problem of "merging small or medium sized
> dataframes with a "very large" dataframe"?

Consider plyr. Memory can be a problem, but it's a piece of cake to write a one-liner that iterates over a list of data frames and returns them all rbind'd together. Or just: do.call(rbind, list.of.data.frames), which requires every data frame to have the same column names.

If memory is a serious problem then I think it's best to write your own code that fills in each row by index - which avoids copying entire data frames in memory on every rbind.

> On Tue, Feb 26, 2013 at 2:00 PM, David L Carlson <dcarl...@tamu.edu> wrote:
>> Clumsy but it doesn't require any packages:
>>
>> merge2 <- function(x, y) {
>>   # setequal() also copes with differing lengths and no shared names
>>   if (setequal(names(x), names(y))) {
>>     rbind(x, y)                   # same columns: stack the rows
>>   } else merge(x, y, all=TRUE)    # otherwise: full outer join
>> }
>> merge2(df1, df2)
>> df3 <- df1
>> merge2(df1, df3)
>>
>> ----------------------------------------------
>> David L Carlson
>> Associate Professor of Anthropology
>> Texas A&M University
>> College Station, TX 77843-4352
>>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of arun
>>> Sent: Tuesday, February 26, 2013 1:14 PM
>>> To: Anika Masters
>>> Cc: R help
>>> Subject: Re: [R] merging or joining 2 dataframes: merge, rbind.fill, etc.?
>>>
>>> Hi,
>>>
>>> You could also try:
>>> library(gtools)
>>> smartbind(df2,df1)
>>> #  a  b  d
>>> #1 7 99 12
>>> #2 7 99 12
>>>
>>> When df1!=df2
>>> smartbind(df1,df2)
>>> #   a  b  d  x  y  c
>>> #1  7 99 12 NA NA NA
>>> #2 NA 34 88 12 44 56
>>> A.K.
>>>
>>> ----- Original Message -----
>>> From: Anika Masters <anika.mast...@gmail.com>
>>> To: r-help@r-project.org
>>> Cc:
>>> Sent: Tuesday, February 26, 2013 1:55 PM
>>> Subject: [R] merging or joining 2 dataframes: merge, rbind.fill, etc.?
>>>
>>> #I want to "merge" or "join" 2 dataframes (df1 & df2) into a 3rd
>>> (mydf). I want the 3rd dataframe to contain 1 row for each row in df1
>>> & df2, and all the columns in both df1 & df2. The solution should
>>> "work" even if the 2 dataframes are identical, and even if the 2
>>> dataframes do not have the same column names. The rbind.fill function
>>> seems to work. For learning purposes, are there other "good" ways to
>>> solve this problem, using merge or other functions other than
>>> rbind.fill?
>>>
>>> #e.g. These 3 examples all seem to "work" correctly and as I hoped:
>>>
>>> df1 <- data.frame(matrix(data=c(7, 99, 12) , nrow=1 , dimnames =
>>> list( NULL , c('a' , 'b' , 'd') ) ) )
>>> df2 <- data.frame(matrix(data=c(88, 34, 12, 44, 56) , nrow=1 ,
>>> dimnames = list( NULL , c('d' , 'b' , 'x' , 'y', 'c') ) ) )
>>> mydf <- merge(df2, df1, all.y=T, all.x=T)
>>> mydf
>>>
>>> #e.g. this works:
>>> library(reshape)
>>> mydf <- rbind.fill(df1, df2)
>>> mydf
>>>
>>> #This works:
>>> library(reshape)
>>> mydf <- rbind.fill(df1, df2)
>>> mydf
>>>
>>> #But this does not (the 2 dataframes are identical)
>>> df1 <- data.frame(matrix(data=c(7, 99, 12) , nrow=1 , dimnames =
>>> list( NULL , c('a' , 'b' , 'd') ) ) )
>>> df2 <- df1
>>> mydf <- merge(df2, df1, all.y=T, all.x=T)
>>> mydf
>>>
>>> #Any way to get "merge" to work for this final example? Any other good
>>> solutions?
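
On the final merge() example quoted above: when the two data frames are identical, every column is a join key, so merge() matches the two rows to each other and returns one row instead of two. Below is a minimal base-R sketch of one way around that, assuming the goal is simply to stack the rows and fill any missing columns with NA. The helper name rbind_fill_base is hypothetical and does not come from any package.

```r
# Hypothetical helper (not from base R or any package): stack two data
# frames row-wise, filling columns missing from either one with NA.
rbind_fill_base <- function(x, y) {
  all_cols <- union(names(x), names(y))
  # Add each missing column as NA; the loops do nothing if none are missing
  for (col in setdiff(all_cols, names(x))) x[[col]] <- rep(NA, nrow(x))
  for (col in setdiff(all_cols, names(y))) y[[col]] <- rep(NA, nrow(y))
  # Put both data frames in the same column order, then stack them
  rbind(x[all_cols], y[all_cols])
}

df1 <- data.frame(a = 7, b = 99, d = 12)
df2 <- data.frame(d = 88, b = 34, x = 12, y = 44, c = 56)

mydf_diff <- rbind_fill_base(df1, df2)   # different column names
mydf_same <- rbind_fill_base(df1, df1)   # identical data frames
```

Both calls should give two rows, one per input row. This is only a sketch for the two-frame case; for many data frames the packaged functions mentioned in this thread are likely the better choice.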
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
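
To illustrate the two suggestions in the reply (bind a whole list in one call, or fill rows by index), here is a minimal sketch. It assumes the small data frames are already collected in a list and all share the same column names; the toy data is made up for illustration. Whether the fill-by-index version actually saves memory depends on the R version and the data, so it is worth measuring on a realistic sample before committing to it.

```r
# Toy input (made up for illustration): 1000 one-row data frames,
# all with the same columns
list.of.data.frames <- lapply(1:1000, function(i) {
  data.frame(id = i, value = i * 2)
})

# Option 1: bind everything in a single call instead of calling
# rbind() repeatedly inside a loop
big <- do.call(rbind, list.of.data.frames)

# Option 2: preallocate the result and fill rows by index.
# Assumes exactly one row per input and known (numeric) column types.
n <- length(list.of.data.frames)
big2 <- data.frame(id = numeric(n), value = numeric(n))
for (i in seq_len(n)) {
  big2[i, ] <- list.of.data.frames[[i]]
}
```

If the inputs do not share the same columns, they would first need to be brought to a common set of columns (for example with rbind.fill from plyr, as discussed earlier in the thread).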