Try this one; it processes a list of 7000 sequences in under 2 seconds:

    sequences <- list(
        c("M","G","L","W","I","S","F","G","T","P","P","S","Y","T","Y","L","L","I","M",
          "N","H","K","L","L","L","I","N","N","N","N","L","T","E","V","H","T","Y","F",
          "N","I","N","I","N","I","D","K","M","Y","I","H","*")
    )

    indexes <- list(
        list(
            c(1,22),c(22,46),c(46, 51),c(1,46),c(22,51),c(1,51)
        )
    )

    indexes <- rep(indexes,10)
    sequences <- rep(sequences,7000)

    system.time({
        fragments <- lapply(indexes, function(.seq){
            lapply(.seq, function(.range){
                .range <- seq(.range[1], .range[2])  # save since we use several times
                lapply(sequences, '[', .range)
            })
        })
    })
    #    user  system elapsed
    #    1.24    0.00    1.26

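Note that the timing above applies every set of ranges to every sequence. If each sequence should instead be cut only by its own entry in "indexes" (sequences[[i]] with indexes[[i]], as in the original question), something along these lines might work. This is only a minimal sketch, assuming the two lists have the same length; the toy data below are made up for illustration and are not from the thread.

```r
# Hypothetical toy data: two sequences, each with its own list of ranges.
sequences <- list(
    c("M", "G", "L", "W", "I", "S"),
    c("N", "H", "K", "L", "L")
)
indexes <- list(
    list(c(1, 3), c(3, 6), c(1, 6)),
    list(c(1, 2), c(2, 5))
)

# Map() walks both lists in step, so sequences[[i]] is paired with indexes[[i]]
# and no explicit "for" loop or growing of "fragments" is needed.
fragments <- Map(
    function(.seq, .ranges) {
        lapply(.ranges, function(.range) .seq[seq.int(.range[1], .range[2])])
    },
    sequences, indexes
)

fragments[[2]][[2]]   # elements 2 to 5 of the second sequence
```

The result is a list with one entry per sequence, each holding one fragment per range, which is the same shape the original loop builds.
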
On Fri, Jan 16, 2009 at 3:16 PM, Johannes Graumann <johannes_graum...@web.de> wrote:
> Thanks. Very elegant, but doesn't solve the problem of the outer "for" loop,
> since I now would rewrite the code like so:
>
> fragments <- list()
> for(iN in seq(length(sequences))){
>   cat(paste(iN,"\n"))
>   fragments[[iN]] <-
>     lapply(indexes[[1]], function(g)sequences[[1]][do.call(seq, as.list(g))])
> }
>
> still very slow for length(sequences) ~ 7000.
>
> Joh
>
> On Friday 16 January 2009 14:23:47 Henrique Dallazuanna wrote:
>> Try this:
>>
>> lapply(indexes[[1]], function(g)sequences[[1]][do.call(seq, as.list(g))])
>>
>> On Fri, Jan 16, 2009 at 11:06 AM, Johannes Graumann
>> <johannes_graum...@web.de> wrote:
>> > Hello,
>> >
>> > I have a list of character vectors like this:
>> >
>> > sequences <- list(
>> >   c("M","G","L","W","I","S","F","G","T","P","P","S","Y","T","Y","L","L","I","M",
>> >     "N","H","K","L","L","L","I","N","N","N","N","L","T","E","V","H","T","Y","F",
>> >     "N","I","N","I","N","I","D","K","M","Y","I","H","*")
>> > )
>> >
>> > and another list of subset ranges like this:
>> >
>> > indexes <- list(
>> >   list(
>> >     c(1,22),c(22,46),c(46, 51),c(1,46),c(22,51),c(1,51)
>> >   )
>> > )
>> >
>> > What I now want to do is to subset each entry in "sequences"
>> > (sequences[[1]]) with all ranges in the corresponding low level list in
>> > "indexes" (indexes[[1]]). Here is what I came up with.
>> >
>> > fragments <- list()
>> > for(iN in seq(length(sequences))){
>> >   cat(paste(iN,"\n"))
>> >   tmpFragments <- sapply(
>> >     indexes[[iN]],
>> >     function(x){
>> >       sequences[[iN]][seq.int(x[1],x[2])]
>> >     }
>> >   )
>> >   fragments[[iN]] <- tmpFragments
>> > }
>> >
>> > This works fine, but "sequences" contains thousands of entries and the
>> > corresponding "indexes" are sometimes hundreds of ranges long, so this whole
>> > process is EXTREMELY inefficient.
>> >
>> > Does somebody out there take the challenge and show me a way on how to
>> > speed this up?
>> >
>> > Thanks for any hints,
>> >
>> > Joh
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?