Hi Dennis,

Actually, I am trying to combine them by COLUMN, which is why I am 
using merge(). The first loop simply reads the protein data into R as 
11 data frames, each 165 x 2. I then use the individual merge() 
statements to combine these into one big data frame by column. I didn't 
do it in a loop because I wanted to see at what point merge() starts to 
fail. It turns out that merging the first 7 data frames works fine. 
Starting with the 8th data frame, it gets slower and slower, and at the 
11th data frame it appeared to hang on my computer.
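
One possible explanation for this pattern, which cannot be confirmed without 
the data files, is repeated values in the key column. When a key occurs r 
times in each of n data frames, merge() returns every combination of the 
matching rows, so that key alone contributes r^n rows. The sketch below uses 
made-up data (make_frame and the column names are hypothetical) to show the 
row count roughly doubling with every merge, and a quick way to count 
repeated keys.

```r
# Toy data (hypothetical): three distinct keys, with key "a" repeated
# twice in every frame.
make_frame <- function(i) {
  df <- data.frame(id = c("a", "a", "b", "c"), value = seq_len(4) * i)
  names(df)[2] <- paste0("array", i)  # unique value column per frame
  df
}
frames <- lapply(1:8, make_frame)

# Merge one frame at a time and record the row count after each step.
merged <- frames[[1]]
row_counts <- nrow(merged)
for (k in 2:8) {
  merged <- merge(merged, frames[[k]], by = 1, all = TRUE)
  row_counts <- c(row_counts, nrow(merged))
}
print(row_counts)

# Diagnostic: number of repeated key values in column 1 of each frame.
dup_counts <- sapply(frames, function(d) sum(duplicated(d[[1]])))
print(dup_counts)
```

If the same diagnostic on the real list gives nonzero counts, aggregating or 
deduplicating per key before merging would keep the result at one row per key.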

Thanks

John






________________________________
 From: Dennis Murphy <djmu...@gmail.com>

Sent: Friday, January 11, 2013 1:25 PM
Subject: Re: [R] weird merge()

Hi John:

This doesn't look right. What are you trying to do? [BTW, the variable
names in the attachments have spaces, so most of R's read functions
should choke on them. At the very least, replace the spaces with
underscores.]
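
Whether spaces in header names cause trouble depends on the separator. With 
whitespace-separated input a name such as "protein name" splits into two 
fields, whereas read.delim() splits on tabs and, by default, passes the names 
through make.names(). A minimal sketch with made-up tab-separated input (the 
column names are hypothetical, not taken from the attachments):

```r
# Hypothetical two-column, tab-separated input whose header contains spaces.
txt <- "protein name\tsignal value\nP1\t0.5\nP2\t1.2\n"

# Default check.names = TRUE: header names are converted by make.names().
d1 <- read.delim(text = txt)
print(names(d1))

# check.names = FALSE keeps the headers verbatim; such columns then need
# backticks or [[ ]] to be accessed.
d2 <- read.delim(text = txt, check.names = FALSE)
print(names(d2))
print(d2[["signal value"]])
```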

If all you are trying to do is row concatenate them (since the two or
three I looked at appear to have the same structure), then it's as
simple as

pro <- do.call(rbind, prot)

If this is what you want along with an indicator for each data file,
then the ldply() function in the plyr package might be useful as an
alternative to do.call. It should return an additional variable .id
whose value corresponds to the number (or name) of the list component.

library(plyr)
pro2 <- ldply(prot, rbind)
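
For comparison, a base-R sketch of the same idea (row-binding with a source 
indicator) that does not depend on plyr. The list prot_demo and its column 
names are made up for illustration, and the approach assumes every data frame 
has identical column names:

```r
# Hypothetical stand-in for the list 'prot': three small frames with
# identical columns.
prot_demo <- list(
  f1 = data.frame(id = c("a", "b"), value = c(1, 2)),
  f2 = data.frame(id = c("a", "b"), value = c(3, 4)),
  f3 = data.frame(id = c("a", "b"), value = c(5, 6))
)

# Tag each frame with the name of its list component, then stack the rows.
tagged <- Map(function(d, nm) cbind(source = nm, d),
              prot_demo, names(prot_demo))
stacked <- do.call(rbind, tagged)
rownames(stacked) <- NULL
print(stacked)
```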

If you want something different, then be more explicit about what you
want, because your merge() code doesn't make a lot of sense to me.


Dennis

PS: Just a little hint: if you're using (almost) the same code
repeatedly, there's probably a more efficient way to do it in R. CS
types call this the DRY principle: Don't Repeat Yourself. I know you
know this, but a little reminder doesn't hurt :)
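
As an illustration of the DRY point applied to a column-wise combination, 
a chain of repeated merge() calls can be folded over the list with Reduce(). 
This is only a sketch on made-up data: prot_demo, merge_by_first and the 
column names are hypothetical, and the keys in column 1 are assumed to be 
unique within each frame.

```r
# Hypothetical stand-in for 'prot': frames sharing a key in column 1,
# each with one uniquely named value column and no repeated keys.
prot_demo <- lapply(1:4, function(i) {
  d <- data.frame(id = c("a", "b", "c"), v = c(1, 2, 3) * i)
  names(d)[2] <- paste0("array", i)
  d
})

# Fold merge() over the list: ((1 + 2) + 3) + 4, a full outer join each time.
merge_by_first <- function(x, y) merge(x, y, by = 1, all = TRUE)
wide <- Reduce(merge_by_first, prot_demo)
print(dim(wide))

# accumulate = TRUE keeps every intermediate result, so the step at which
# the row count starts to grow stays visible without writing out each call.
steps <- Reduce(merge_by_first, prot_demo, accumulate = TRUE)
print(sapply(steps, nrow))
```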



> Hi,
>
> I have some protein array data, each array in a separate text file. I read 
> them in and try to combine them into a single data frame using merge(). 
> See the code below (if you download the attached data files into a specific 
> folder, the code should work):
>
>
> fls <- list.files("C:\\folder_of_download", full.names = TRUE)  ## get file names
> prot <- list()  ## a list to hold the individual data frames
> ind <- 1
> for (i in fls[1:11]) {
>     cat(ind, " ")  ## progress indicator
>
>     tmp <- read.delim(i, header = TRUE, row.names = NULL, na.strings = "null")
>     ## name the 4th column after this array's barcode
>     colnames(tmp)[4] <- as.character(tmp$barcode[1])
>     prot[[ind]] <- tmp[, -(1:2)]  ## drop the first two columns
>     ind <- ind + 1
> }
>
>         ## try to merge them together
>         ## not do this in a loop so I can see where the problem occurs
> pro<-merge(prot[[1]],prot[[2]],by.x=1,by.y=1,all=T)
> pro<-merge(pro,prot[[3]],by.x=1,by.y=1,all=T)
> pro<-merge(pro,prot[[4]],by.x=1,by.y=1,all=T)
> pro<-merge(pro,prot[[5]],by.x=1,by.y=1,all=T)
> pro<-merge(pro,prot[[6]],by.x=1,by.y=1,all=T)
> pro<-merge(pro,prot[[7]],by.x=1,by.y=1,all=T)
> pro<-merge(pro,prot[[8]],by.x=1,by.y=1,all=T)
> pro<-merge(pro,prot[[9]],by.x=1,by.y=1,all=T)
> pro<-merge(pro,prot[[10]],by.x=1,by.y=1,all=T)
> pro<-merge(pro,prot[[11]],by.x=1,by.y=1,all=T)
>
>
> I noticed that starting with file #8, the merge becomes slower and slower, and 
> by file #11 the computer was stuck! Originally I thought something was 
> wrong with the later files, but when I change the order of merging, the 
> slow-down still happens at the 8th file to be merged.
>
> Can anyone suggest what's going on with merging?
>
> Thanks
>
> John
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>