Hi, It seemed to me like the calculations were implemented in a bit of a redundant way, so I tried to simplify it algebraically. doit1 is Phil's version of your original calculations, and doit2 is my simplified way.
doit1 <- function(i, j) { exp(sum(log((exp(tmp[i,] * (qq$nodes[j]-b)))/ (factorial(tmp[i,])*exp(exp(qq$nodes[j]-b))))))* ((1/(s*sqrt(2*pi)))*exp(-((qq$nodes[j]-0)^2/(2*s^2))))/ dnorm(qq$nodes[j])*qq$weights[j] } doit2 <- function(i, j) { (prod(exp((tmp[i,]*(qq$nodes[j]-b))-exp(qq$nodes[j]-b))/ factorial(tmp[i,]))*qq$weights[j])/ (s*sqrt(2*pi)*dnorm(qq$nodes[j])*exp((qq$nodes[j]-0)^2/(2*s^2))) } system.time(t(outer(1:300,1:2,Vectorize(doit1)))) system.time(t(outer(1:300,1:2,Vectorize(doit2)))) New vs. Old Characters (for the calculations): 154 vs. 177 Function calls for calculations (i.e., excluding subseting/dnorm, etc.): 22 vs. 27 Timing difference: Not appreciable on my system, but I spent so long wading through exactly what was happening, I figured why not share it. Moving tmp up to 300 rows, I got: > system.time(t(outer(1:300,1:2,Vectorize(doit1)))) user system elapsed 9.044 0.008 10.135 > system.time(t(outer(1:300,1:2,Vectorize(doit2)))) user system elapsed 8.413 0.008 8.893 At this rate, the difference between a few million rows might save you enough time you can stretch your hands once, and those 20 characters should really free up space on your hard disk....sigh. Josh On Mon, Oct 4, 2010 at 11:22 AM, Doran, Harold <hdo...@air.org> wrote: > Ahhhh, I see that you wrapped apply around mapply. I was toying with both; > but didn't think to use mapply inside apply. As always, thank you, Phil > > -----Original Message----- > From: Phil Spector [mailto:spec...@stat.berkeley.edu] > Sent: Monday, October 04, 2010 2:20 PM > To: Doran, Harold > Cc: Gabriela Cendoya; R-help > Subject: Re: [R] Help with apply > > Harold - > The first way that comes to mind is > > doit = function(i,j)exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / > (factorial(tmp[i,]) * exp(exp(qq$nodes[j]-b)))))) * > ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2))))/ > dnorm(qq$nodes[j]) * qq$weights[j] > t(outer(1:3,1:2,Vectorize(doit))) > > but you said you wanted to use apply. That leads to > > doit1 = function(tmpi,nod,wt) > exp(sum(log((exp(tmpi*(nod-b))) / (factorial(tmpi) * > exp(exp(nod-b)))))) * ((1/(s*sqrt(2*pi))) * > exp(-((nod-0)^2/(2*s^2))))/dnorm(nod) * qq$weights[j] > apply(tmp,1,function(x)mapply(function(n,w)doit1(x,n,w),qq$nodes,qq$weights)) > > Both seem to agree with your rr1. > > - Phil Spector > Statistical Computing Facility > Department of Statistics > UC Berkeley > spec...@stat.berkeley.edu > > On Mon, 4 Oct 2010, Doran, Harold wrote: > >> Sorry about that; s <- 1 >> >> -----Original Message----- >> From: Gabriela Cendoya [mailto:gabrielacendoya.rl...@gmail.com] >> Sent: Monday, October 04, 2010 1:44 PM >> To: Doran, Harold >> Cc: R-help >> Subject: Re: [R] Help with apply >> >> You are missing "s" in your definitions so I can't reproduce your code. >> >>> tmp <- data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = >>> sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = >>> TRUE)) >>> >>> str(tmp) >> 'data.frame': 3 obs. of 3 variables: >> $ var1: int 9 3 9 >> $ var2: int 4 6 2 >> $ var3: int 2 9 3 >>> >>> #I can run the following double loop and yield what I want in the end (rr1) >>> as: >>> >>> library(statmod) >>> Q <- 2 >>> b <- runif(3) >>> qq <- gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) >>> rr1 <- matrix(0, nrow = Q, ncol = nrow(tmp)) >>> L <- nrow(tmp) >>> for(j in 1:Q){ >> + for(i in 1:L){ >> + rr1[j,i] <- exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / >> (factorial(tmp[i,]) * >> + exp(exp(qq$nodes[j]-b)))))) * ((1/(s*sqrt(2*pi))) * >> exp(-((qq$nodes[j]-0)^2/(2*s^2))))/dnorm(qq$nodes[j]) * qq$weights[j] >> + } >> + } >> Error: objeto 's' no encontrado >>> rr1 >> [,1] [,2] [,3] >> [1,] 0 0 0 >> [2,] 0 0 0 >>> >> >> Gabriela >> >> >> 2010/10/4, Doran, Harold <hdo...@air.org>: >>> Suppose I have the following data: >>> >>> tmp <- data.frame(var1 = sample(c(0:10), 3, replace = TRUE), var2 = >>> sample(c(0:10), 3, replace = TRUE), var3 = sample(c(0:10), 3, replace = >>> TRUE)) >>> >>> I can run the following double loop and yield what I want in the end (rr1) >>> as: >>> >>> library(statmod) >>> Q <- 2 >>> b <- runif(3) >>> qq <- gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1) >>> rr1 <- matrix(0, nrow = Q, ncol = nrow(tmp)) >>> L <- nrow(tmp) >>> for(j in 1:Q){ >>> for(i in 1:L){ >>> rr1[j,i] <- >>> exp(sum(log((exp(tmp[i,]*(qq$nodes[j]-b))) / (factorial(tmp[i,]) * >>> exp(exp(qq$nodes[j]-b)))))) * >>> >>> ((1/(s*sqrt(2*pi))) * exp(-((qq$nodes[j]-0)^2/(2*s^2))))/dnorm(qq$nodes[j]) >>> * qq$weights[j] >>> } >>> } >>> rr1 >>> >>> But, I think this is so inefficient for large Q and nrow(tmp). The function >>> I am looping over is: >>> >>> fn <- function(x, nodes, weights, s, b) { >>> exp(sum(log((exp(x*(nodes-b))) / (factorial(x) * >>> exp(exp(nodes-b)))))) * >>> ((1/(s*sqrt(2*pi))) * >>> exp(-((nodes-0)^2/(2*s^2))))/dnorm(nodes) * weights >>> } >>> >>> I've tried using apply in a few ways, but I can't replicate rr1 from the >>> double loop. I can go through each value of Q one step at a time and get >>> matching values in the rr1 table, but this would still require a loop and >>> that I think can be avoided. >>> >>> apply(tmp, 1, fn, nodes = qq$nodes[1], weights = qq$weights[1], s = 1, b = >>> b) >>> >>> Does anyone see an efficient way to compute rr1 without a loop? >>> >>> Harold >>> >>>> sessionInfo() >>> R version 2.10.1 (2009-12-14) >>> i386-pc-mingw32 >>> >>> locale: >>> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United >>> States.1252 LC_MONETARY=English_United States.1252 >>> [4] LC_NUMERIC=C LC_TIME=English_United >>> States.1252 >>> >>> attached base packages: >>> [1] stats graphics grDevices utils datasets methods base >>> >>> other attached packages: >>> [1] minqa_1.1.9 Rcpp_0.8.6 MiscPsycho_1.6 statmod_1.4.6 >>> lattice_0.17-26 gdata_2.8.0 >>> >>> loaded via a namespace (and not attached): >>> [1] grid_2.10.1 gtools_2.6.2 tools_2.10.1 >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> >> -- >> _________________________ >> Lic. María Gabriela Cendoya >> Magíster en Biometría >> Profesor Adjunto >> Facultad de Ciencias Agrarias >> UNMdP - Argentina >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.