On Wed, 17 Nov 2010, Chris Carleton wrote:
Hi List,
I'm hoping to get opinions for enhancing the efficiency of the following
code designed to take a vector of probabilities (outcomes) and calculate a
union of the probability space. As part of the union calculation, combn()
must be used, which returns a matrix, and the parallelized version of
lapply() provided in the multicore package requires a list. I've found that
parallelization is necessary for outcome vectors longer than about 10 or 15
elements, which is why I need to use multicore (and, therefore, convert the
combn() matrix to a list). It would speed the process up if there were a more
direct way to convert the columns of combn() to elements of a single list.
I think you are mistaken.
Is this what Rprof() tells you?
On my system, combn() is the culprit:
column2list <- function(x){list(x)}   # helper from the original post
Rprof()
outcomes <- 1:25
nada <- replicate(200, {apply(combn(outcomes,2),2,column2list);NULL})
Rprof(NULL)
summaryRprof()
$by.self
          self.time self.pct total.time total.pct
"combn"        0.64    61.54       0.70     67.31
"apply"        0.20    19.23       1.04    100.00
"FUN"          0.10     9.62       1.04    100.00
"!="           0.04     3.85       0.04      3.85
"<"            0.02     1.92       0.02      1.92
"-"            0.02     1.92       0.02      1.92
"is.null"      0.02     1.92       0.02      1.92
And it hardly takes any time at that!
HTH,
Chuck
p.s. Isn't
as.data.frame( combn( outcomes, 2 ) )
or
combn(outcomes, 2, list )
good enough?
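
A minimal sketch of the two combn() arguments that bear on the question, assuming a small illustrative probability vector (the values and the names pair_list and pair_products are made up for this example). simplify = FALSE returns the combinations as a list, and FUN is applied to each combination as it is generated, so the matrix-to-list step may not be needed at all:

```r
outcomes <- c(0.1, 0.2, 0.3, 0.4)   # illustrative probabilities, not from the post

# simplify = FALSE: combn() returns a list, one element per combination
pair_list <- combn(outcomes, 2, simplify = FALSE)
length(pair_list)                   # choose(4, 2) = 6 combinations

# FUN = prod: the product of each pair, with no intermediate list
pair_products <- combn(outcomes, 2, FUN = prod)
sum(pair_products)                  # sum of all pairwise products
```

A list built this way can be passed to mclapply() directly, although for something as cheap as prod() the overhead of parallel dispatch may outweigh any gain.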
Any constructive suggestions will be greatly
appreciated. Thanks for your consideration,
C
code:
------------
library(multicore)  # mclapply()
library(R.utils)    # ProgressBar(), increase()

unionIndependant <- function(outcomes) {
  intsctn <- c()
  column2list <- function(x){list(x)}
  pb <- ProgressBar(max=length(outcomes),stepLength=1,newlineWhenDone=TRUE)
  for (i in 2:length(outcomes)){
    increase(pb)
    # one list element per i-element combination of outcomes
    outcomes_ <- apply(combn(outcomes,i),2,column2list)
    for (j in 1:length(outcomes_)){
      outcomes_[[j]] <- outcomes_[[j]][[1]]
    }
    outcomes_container <- mclapply(outcomes_,prod,mc.cores=3)
    # sum of the probabilities of all i-way intersections
    intsctn[i] <- sum(unlist(outcomes_container))
  }
  intsctn <- intsctn[-1]   # drop the unused first slot; holds the 2-way, 3-way, ... sums
  # inclusion-exclusion: subtract the 2-way sum, add the 3-way sum, and so on.
  # The i == length(outcomes) pass already yields prod(outcomes), so it is not
  # added a second time here.
  signs <- (-1)^seq_along(intsctn)
  return(sum(outcomes) + sum(signs * intsctn))
}
------------
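
As a cross-check on any inclusion-exclusion code of this kind: for independent events the union also equals one minus the probability that none of the events occurs, 1 - prod(1 - p). The sketch below compares the two on a small made-up vector; the function names union_incl_excl and union_complement are hypothetical and not part of the original post.

```r
# Inclusion-exclusion over all subset sizes, letting combn() apply FUN directly.
union_incl_excl <- function(p) {
  n <- length(p)
  # e[i]: sum over all i-element subsets of the product of their probabilities.
  # combn(n, i, ...) enumerates index subsets of 1..n.
  e <- vapply(seq_len(n),
              function(i) sum(combn(n, i, function(idx) prod(p[idx]))),
              numeric(1))
  # alternating signs: + for single events, - for pairs, + for triples, ...
  sum((-1)^(seq_len(n) + 1) * e)
}

# Complement rule for independent events: P(at least one) = 1 - P(none).
union_complement <- function(p) 1 - prod(1 - p)

p <- c(0.1, 0.2, 0.3, 0.4)   # illustrative probabilities
all.equal(union_incl_excl(p), union_complement(p))
```

The complement form needs length(p) multiplications, whereas inclusion-exclusion visits all 2^n - 1 non-empty subsets, so where independence really holds the closed form may remove the need for parallelization altogether.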
PS This code has been tested on vectors of up to length(outcomes) == 25 and
it should be noted that ProgressBar() requires the R.utils package.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Charles C. Berry                            Dept of Family/Preventive Medicine
cbe...@tajo.ucsd.edu                        UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901