Thanks for the suggestion. The solution below is much better than my roundabout way.

    combn(outcomes, 2, list )

I can't do much about the speed of combn() so I wanted to trim the fat
wherever else I could.

C

On 17 November 2010 15:10, Charles C. Berry <cbe...@tajo.ucsd.edu> wrote:

> On Wed, 17 Nov 2010, Chris Carleton wrote:
>
>> Hi List,
>>
>> I'm hoping to get opinions for enhancing the efficiency of the following
>> code designed to take a vector of probabilities (outcomes) and calculate a
>> union of the probability space. As part of the union calculation, combn()
>> must be used, which returns a matrix, and the parallelized version of
>> lapply() provided in the multicore package requires a list. I've found that
>> parallelization is very necessary for vectors of outcomes greater in length
>> than about 10 or 15 elements, which is why I need to make use of multicore
>> (and, therefore, convert the combn() matrix to a list). It would speed the
>> process up if there was a more direct way to convert the columns of combn()
>> to elements of a single list.
>
> I think you are mistaken.
>
> Is this what Rprof() tells you?
>
> On my system, combn() is the culprit
>
> > Rprof()
> > outcomes <- 1:25
> > nada <- replicate(200, {apply(combn(outcomes,2),2,column2list);NULL})
> > Rprof(NULL)
> > summaryRprof()
>
> $by.self
>           self.time self.pct total.time total.pct
> "combn"        0.64    61.54       0.70     67.31
> "apply"        0.20    19.23       1.04    100.00
> "FUN"          0.10     9.62       1.04    100.00
> "!="           0.04     3.85       0.04      3.85
> "<"            0.02     1.92       0.02      1.92
> "-"            0.02     1.92       0.02      1.92
> "is.null"      0.02     1.92       0.02      1.92
>
> And it hardly takes any time at that!
>
> HTH,
>
> Chuck
>
> p.s. Isn't
>
>     as.data.frame( combn( outcomes, 2 ) )
> or
>     combn(outcomes, 2, list )
>
> good enough?
>
>> Any constructive suggestions will be greatly
>> appreciated.
>> Thanks for your consideration,
>>
>> C
>>
>> code:
>> ------------
>> unionIndependant <- function(outcomes) {
>>     intsctn <- c()
>>     column2list <- function(x){list(x)}
>>     pb <- ProgressBar(max=length(outcomes),stepLength=1,newlineWhenDone=TRUE)
>>     for (i in 2:length(outcomes)){
>>         increase(pb)
>>         outcomes_ <- apply(combn(outcomes,i),2,column2list)
>>         for (j in 1:length(outcomes_)){outcomes_[[j]] <- outcomes_[[j]][[1]]}
>>         outcomes_container <- mclapply(outcomes_,prod,mc.cores=3)
>>         intsctn[i] <- sum(unlist(outcomes_container))
>>     }
>>     intsctn <- intsctn[-1]
>>     return(sum(outcomes)
>>            - sum(intsctn[which(which((intsctn %in% intsctn)) %% 2 == 1)])
>>            + sum(intsctn[which(which((intsctn %in% intsctn)) %% 2 == 0)])
>>            + ((-1)^length(intsctn) * prod(outcomes)))
>> }
>> ------------
>> PS This code has been tested on vectors of up to length(outcomes) == 25 and
>> it should be noted that ProgressBar() requires the R.utils package.
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> Charles C. Berry                            Dept of Family/Preventive Medicine
> cbe...@tajo.ucsd.edu                        UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
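
For anyone reading this thread later, here is a minimal sketch of the idea discussed above: letting combn() hand back a list, or apply a function to each combination itself, so that no matrix-to-list conversion is needed. The example values and the helper name union_independent are made up for illustration and are not the code from the original post. The helper assumes the events are independent and that the vector has at least two elements.

```r
# Hypothetical example values: probabilities of three independent events.
outcomes <- c(0.1, 0.2, 0.3)

# simplify = FALSE makes combn() return a plain list, one vector per
# combination, so it can be passed straight to lapply()-style functions.
pairs <- combn(outcomes, 2, simplify = FALSE)

# Passing FUN applies the function to each combination inside combn().
pair_products <- combn(outcomes, 2, FUN = prod)

# Illustrative helper (an assumption, not the thread's code):
# inclusion-exclusion over every combination size k = 1..n.
union_independent <- function(p) {
  stopifnot(length(p) >= 2)
  signed_sums <- vapply(
    seq_along(p),
    function(k) (-1)^(k + 1) * sum(combn(p, k, FUN = prod)),
    numeric(1)
  )
  sum(signed_sums)
}

union_independent(outcomes)
1 - prod(1 - outcomes)  # closed form for independent events, as a cross-check
```

The number of combinations still grows exponentially with length(outcomes), so this only removes the conversion overhead; it does not make combn() itself any faster.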