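I just figured out the reason: the column used as the merge key (the 1st column in each data frame, "gene.name") does not have unique values. Some values are repeated, so every merge pairs each repeated row in one data frame with each repeated row in the other, and the number of rows for those keys multiplies with every additional merge. That is why the data frame gets bigger and bigger, roughly exponentially in the number of files merged.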
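For the record, here is a minimal sketch of the effect with made-up toy data. The data frames a, b and c3 and their values are hypothetical stand-ins, not the actual array files; only the key column name "gene.name" comes from the real data.

```r
# Toy data (hypothetical): the key "g1" appears twice in every data frame.
a  <- data.frame(gene.name = c("g1", "g1", "g2"), x = 1:3)
b  <- data.frame(gene.name = c("g1", "g1", "g2"), y = 4:6)
c3 <- data.frame(gene.name = c("g1", "g1", "g2"), z = 7:9)

# Each merge pairs every "g1" row on the left with every "g1" row on the right.
ab <- merge(a, b, by = "gene.name", all = TRUE)
nrow(ab)    # 5: 2 * 2 rows for "g1" plus 1 row for "g2"

abc <- merge(ab, c3, by = "gene.name", all = TRUE)
nrow(abc)   # 9: 4 * 2 rows for "g1" plus 1 row for "g2"

# Quick checks for repeated keys before merging.
anyDuplicated(a$gene.name) > 0   # TRUE if any key is repeated
counts <- table(a$gene.name)
counts[counts > 1]               # which keys are repeated, and how often
```

With a key repeated k times in each of n data frames, the merged result has k^n rows for that key, so it may be worth running a check like anyDuplicated() on the key column before merging.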
Sorry to bother all.

John

________________________________
From: J Toll <jct...@gmail.com>
Cc: "r-help@r-project.org" <r-help@r-project.org>
Sent: Friday, January 11, 2013 1:35 PM
Subject: Re: [R] weird merge()

Hi,
>
>I have some protein array data, each array in a separate text file. So I read
>them in and try to combine them into a single data frame by using merge(). See
>the code below (if you download the attached data files into a specific folder,
>the code below should work):
>
>
>fls<-list.files("C:\\folder_of_download",full.names=TRUE)  ## get file names
>prot<-list()  ## a list to contain individual files
>ind<-1
>for (i in fls[c(1:11)]) {
>  cat(ind, " ")
>
>  tmp<-read.delim(i,header=TRUE,row.names=NULL,na.strings='null')
>  colnames(tmp)[4]<-as.character(tmp$barcode[1])
>  prot[[ind]]<-tmp[,-(1:2)]
>  ind<-ind+1
>}
>
>  ## try to merge them together
>  ## not done in a loop so I can see where the problem occurs
>pro<-merge(prot[[1]],prot[[2]],by.x=1,by.y=1,all=TRUE)
>pro<-merge(pro,prot[[3]],by.x=1,by.y=1,all=TRUE)
>pro<-merge(pro,prot[[4]],by.x=1,by.y=1,all=TRUE)
>pro<-merge(pro,prot[[5]],by.x=1,by.y=1,all=TRUE)
>pro<-merge(pro,prot[[6]],by.x=1,by.y=1,all=TRUE)
>pro<-merge(pro,prot[[7]],by.x=1,by.y=1,all=TRUE)
>pro<-merge(pro,prot[[8]],by.x=1,by.y=1,all=TRUE)
>pro<-merge(pro,prot[[9]],by.x=1,by.y=1,all=TRUE)
>pro<-merge(pro,prot[[10]],by.x=1,by.y=1,all=TRUE)
>pro<-merge(pro,prot[[11]],by.x=1,by.y=1,all=TRUE)
>
>
>I noticed that starting with file #8, the merge becomes slower and slower, and
>by file #11 the computer was stuck! Originally I thought something was
>wrong with the later files, but when I changed the order of merging, the
>slow-down still happened at the 8th file to be merged.
>
>Can anyone suggest what's going on with merging?
>

I'm not sure exactly what you're trying to do with all that code, but if you're just trying to get all eleven files into a data.frame, you could do this:

allFilesAsList <- lapply(1:11, function(i) read.delim(paste("p", i, ".txt", sep = "")))
oneBigDataFrame <- do.call(rbind, allFilesAsList)

You may need to fix the column names. Is that anything like what you were trying to do?

James
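
A small sketch of what "fix the column names" could look like, using two made-up data frames in place of the real files. The values and the column names A001, A002, array.id and value are hypothetical. Note that this stacks the arrays in long format, which is a different layout from the wide table that merge() would produce.

```r
# Two toy data frames standing in for two of the files (hypothetical values).
f1 <- data.frame(gene.name = c("g1", "g2"), A001 = c(0.1, 0.2))
f2 <- data.frame(gene.name = c("g1", "g2"), A002 = c(0.3, 0.4))
allFilesAsList <- list(f1, f2)

# rbind() on data frames matches columns by name, so give every piece the
# same column names first and keep the per-array column name as an id.
fixed <- lapply(allFilesAsList, function(d) {
  data.frame(gene.name = d[[1]], array.id = names(d)[2], value = d[[2]])
})

oneBigDataFrame <- do.call(rbind, fixed)
dim(oneBigDataFrame)   # 4 rows, 3 columns
```

Because nothing is joined on gene.name here, repeated gene names do not multiply the rows; each input row appears exactly once in the result.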
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.