On Wed, 17 Nov 2010, Chris Carleton wrote:

Hi List,

I'm hoping to get opinions on improving the efficiency of the following
code, which takes a vector of probabilities (outcomes) and calculates the
probability of their union. As part of the union calculation, combn()
must be used, which returns a matrix, whereas the parallelized version of
lapply() provided in the multicore package requires a list. I've found that
parallelization is necessary for outcome vectors longer than about 10 or 15
elements, which is why I need multicore (and, therefore, need to convert the
combn() matrix to a list). It would speed the process up if there were a
more direct way to convert the columns of combn() to elements of a single
list.


I think you are mistaken.

Is this what Rprof() tells you?

On my system, combn() is the culprit:

column2list <- function(x) list(x)   # helper taken from the posted code
Rprof()
outcomes <- 1:25
nada <- replicate(200, {apply(combn(outcomes, 2), 2, column2list); NULL})
Rprof(NULL)
summaryRprof()
$by.self
          self.time self.pct total.time total.pct
"combn"        0.64    61.54       0.70     67.31
"apply"        0.20    19.23       1.04    100.00
"FUN"          0.10     9.62       1.04    100.00
"!="           0.04     3.85       0.04      3.85
"<"            0.02     1.92       0.02      1.92
"-"            0.02     1.92       0.02      1.92
"is.null"      0.02     1.92       0.02      1.92


And it hardly takes any time at that!


HTH,

Chuck

p.s. Isn't

        as.data.frame( combn( outcomes, 2 ) )
or
        combn(outcomes, 2, list )

good enough?
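
A minimal sketch of those one-step conversions, plus combn()'s
simplify = FALSE argument, which returns a plain list with one element per
combination. The 5-element vector is an arbitrary example, not data from the
thread. Passing prod as FUN, shown last, skips the list entirely when only
the per-combination products are needed.

```r
outcomes <- c(0.1, 0.2, 0.3, 0.4, 0.5)  # arbitrary example probabilities

# simplify = FALSE makes combn() return a list, one element per combination
pairs_list <- combn(outcomes, 2, simplify = FALSE)

# A data frame is also a list: each column (one combination) is an element
pairs_df <- as.data.frame(combn(outcomes, 2))

# If only the products are needed, let combn() apply prod() to each combination
pair_products <- combn(outcomes, 2, FUN = prod)
```

Either list form can be passed straight to lapply() or a parallel
equivalent; the FUN = prod form avoids the conversion step altogether.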


Any constructive suggestions will be greatly
appreciated. Thanks for your consideration,

C

code:
------------
library(R.utils)    # ProgressBar(), increase()
library(multicore)  # mclapply()
unionIndependant <- function(outcomes) {
   intsctn <- c()
   column2list <- function(x){list(x)}
   pb <- ProgressBar(max=length(outcomes),stepLength=1,newlineWhenDone=TRUE)
   for (i in 2:length(outcomes)){
       increase(pb)
       outcomes_ <- apply(combn(outcomes,i),2,column2list)
       for (j in 1:length(outcomes_)){outcomes_[[j]] <- outcomes_[[j]][[1]]}
       outcomes_container <- mclapply(outcomes_,prod,mc.cores=3)
       intsctn[i] <- sum(unlist(outcomes_container))
   }
   intsctn <- intsctn[-1]
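   # Inclusion-exclusion: after dropping the unused first slot, odd positions
   # hold the even-order terms (subtracted) and even positions the odd-order
   # terms (added). The order-n term, prod(outcomes), is already the last
   # element of intsctn, so it must not be added a second time.
   odd <- seq_along(intsctn) %% 2 == 1
   return(sum(outcomes) - sum(intsctn[odd]) + sum(intsctn[!odd]))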
}
------------
PS This code has been tested on vectors of up to length(outcomes) == 25 and
it should be noted that ProgressBar() requires the R.utils package.
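
For comparison, a minimal sketch of the same inclusion-exclusion sum with no
intermediate list: combn() applies prod() to each combination directly. It
assumes the outcomes are independent, as the function name suggests; under
that assumption the union also has the closed form 1 - prod(1 - outcomes),
which serves as a check. The names union_incl_excl and union_closed are
hypothetical and do not come from the posted code.

```r
# Hypothetical helper: inclusion-exclusion over all combination sizes k = 1..n
union_incl_excl <- function(outcomes) {
  n <- length(outcomes)
  terms <- vapply(seq_len(n), function(k) {
    # sum of products over all size-k combinations, with alternating sign
    (-1)^(k + 1) * sum(combn(outcomes, k, FUN = prod))
  }, numeric(1))
  sum(terms)
}

# Closed form, valid only when the events are independent
union_closed <- function(outcomes) 1 - prod(1 - outcomes)

p <- c(0.1, 0.2, 0.3, 0.4)  # arbitrary example probabilities
all.equal(union_incl_excl(p), union_closed(p))
```

The closed form costs one pass over the vector, whereas inclusion-exclusion
visits all 2^n - 1 non-empty subsets, so it is the cheaper route whenever
independence holds.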


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry                            Dept of Family/Preventive Medicine
cbe...@tajo.ucsd.edu                        UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
