Jim,
d.frame[[i]] is a list of data.frames and seqFile is a data.frame. I have converted them to vectors/matrices and the timing is the same as with data.frames. 'index' is unique in both structures. The list is subset into data.frame/matrix structures.
Lana

-----Original Message-----
From: jim holtman [mailto:[EMAIL PROTECTED]
Sent: Friday, June 13, 2008 9:45 AM
To: Lana Schaffer
Cc: r-help@r-project.org
Subject: Re: [R] alternative to matching/merge?

What is the structure of 'd.frame' and 'seqFile'? Run Rprof so that we can see which of the functions it is spending its time in. What happens if x$index is not in seqFile$index? Are the values in 'index' unique in both structures? Subsetting a data frame can be expensive compared to using a matrix. Could you use a matrix instead of a data frame; are all the columns the same mode? Again, either a subset of the data or an 'str' of the data objects being used would be helpful, so that we can understand what they are.

On Fri, Jun 13, 2008 at 12:03 PM, Lana Schaffer <[EMAIL PROTECTED]> wrote:
> Jim,
> My code is this:
> mergefunc <- function(x,seqFile){
>   # merge(seqFile,x)
>   cbind(x, seqFile[ match(as.vector(x$index), as.vector(seqFile$index)), ])
> }
> LIX <- lapply(d.frame[[1]], mergefunc,seqFile=seqFile)
> Each matrix/data.frame takes 0.2 seconds and then to do this 1240 times
> takes ~4 minutes.
> Thanks,
> Lana
>
> -----Original Message-----
> From: jim holtman [mailto:[EMAIL PROTECTED]
> Sent: Thursday, June 12, 2008 6:40 PM
> To: Lana Schaffer
> Cc: r-help@r-project.org
> Subject: Re: [R] alternative to matching/merge?
>
> It would be nice if you at least included the code that you are using
> and a subset of the data. Have you run Rprof to determine which of
> the functions is consuming the time?
>
> On Thu, Jun 12, 2008 at 3:25 PM, Lana Schaffer <[EMAIL PROTECTED]> wrote:
>>
>> Greetings,
>> I am doing matching/merge for a table (40919x3) to data which is in
>> the form of a list of 1268 data.frames. Using lapply this is taking
>> ~5 minutes. I know that the match/merge functions are time consuming,
>> so is there an alternative way to accomplish this goal? Is lapply
>> not efficient?
>>
>> Lana Schaffer
>> Biostatistics/Informatics
>> The Scripps Research Institute
>> DNA Array Core Facility
>> La Jolla, CA 92037
>> (858) 784-2263
>> (858) 784-2994
>> [EMAIL PROTECTED]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
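
One possible alternative to the per-element match/subset discussed in this thread, given only as a sketch: stack the 'index' columns of all list elements, call match() once, subset seqFile once, and split the result back per element. The toy objects below ('frames' standing in for d.frame[[1]], and the column names 'value', 'chr' and 'position') are hypothetical, since the real data were never posted. Whether this is actually faster depends on where Rprof shows the time going, which has not been measured here.

```r
# Hypothetical toy data; the real seqFile and d.frame[[1]] are not in the thread.
set.seed(1)
seqFile <- data.frame(
  index    = sample(100000, 5000),              # unique, as stated in the thread
  chr      = sample(1:22, 5000, replace = TRUE),
  position = runif(5000)
)
frames <- lapply(1:20, function(i) {
  data.frame(index = sample(seqFile$index, 300), value = rnorm(300))
})

# Approach from the thread: one match() and one data.frame subset per element.
mergefunc <- function(x, seqFile) {
  cbind(x, seqFile[match(as.vector(x$index), as.vector(seqFile$index)), ])
}
slow <- lapply(frames, mergefunc, seqFile = seqFile)

# Sketch of the alternative: match once over all elements, subset once, split.
n         <- vapply(frames, nrow, integer(1))
all_index <- unlist(lapply(frames, `[[`, "index"), use.names = FALSE)
hit       <- match(all_index, seqFile$index)    # NA where an index is absent
ann       <- seqFile[hit, , drop = FALSE]
rownames(ann) <- NULL

# A factor with explicit levels keeps empty list elements aligned after split().
grp       <- factor(rep(seq_along(frames), times = n),
                    levels = seq_along(frames))
ann_split <- split(ann, grp)
fast      <- Map(function(x, a) cbind(x, a), frames, ann_split)
```

The row names of the two results can differ, so any comparison should be made on column values rather than with identical() on whole data frames. Wrapping each variant in system.time(), or running Rprof() around the slow one as Jim suggests, would show whether match() or the data.frame subsetting dominates.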