On Thu, Dec 8, 2016 at 3:59 PM, David Winsemius <dwinsem...@comcast.net> wrote:
>
>> On Dec 8, 2016, at 12:09 PM, John P. Nolan <jpno...@american.edu> wrote:
>>
>> Dear All,
>>
>> I regularly want to "apply" some function to an array in a way that the
>> arguments to the user function depend on the index on which the apply is
>> working.  A simple example is:
>>
>> A <- array( runif(160), dim=c(5,4,8) )
>> x <- matrix( runif(32), nrow=4, ncol=8 )
>> b <- runif(8)
>> f1 <- function( A, x, b ) { sum( A %*% x ) + b }
>> result <- rep(0.0,8)
>> for (i in 1:8) {
>>   result[i] <- f1( A[,,i], x[,i] , b[i] )
>> }
>>
>> This works, but is slow.  I'd like to be able to do something like:
>>
>>   generalized.apply( A, MARGIN=3, FUN=f1, list(x=x,MARGIN=2),
>>                      list(b=b,MARGIN=1) ),
>>
>> where the lists tell generalized.apply to pass x[,i] and b[i] to FUN in
>> addition to A[,,i].
>>
>> Does such a generalized.apply already exist somewhere?  While I can write a
>> C function to do a particular case, it would be nice if there was a fast,
>> general way to do this.
>
> I would have thought that this would achieve the same result:
>
> result <- sapply( seq_along(b) , function(i) { f1( A[,,i], x[,i] , b[i] )} )
>
> Or:
>
> result <- sapply( seq.int( dim(A)[3] ) , function(i) { f1( A[,,i], x[,i] , b[i] )} )
>
> (I doubt it will be any faster, but if 'i' is large, parallelism might help.
> The inner function appears to be fairly efficient.)

You're right, it's slower.  Despite how often it's repeated that "loops
in R are slow", they're not *that* slow.  They're often faster than the
*apply functions, especially if they have been "compiled" by
compiler::cmpfun().

You really need to know *why* code is slow before trying to make it
faster.  I profiled an example with a loop of 1e6 iterations, and 80%+
of the time was still spent inside f1().

set.seed(21)
nc <- 1e6
nr <- 10
A <- array( runif(5*nr*nc), dim=c(5,nr,nc) )
x <- matrix( runif(nr*nc), nrow=nr, ncol=nc )
b <- runif(nc)
f1 <- compiler::cmpfun(function( A, x, b ) { sum( A %*% x ) + b })
f2 <- compiler::cmpfun({
  function(A, x, b, FUN) {
    result <- numeric(length(b))
    for (i in seq_along(b)) {
      result[i] <- FUN( A[,,i], x[,i] , b[i] )
    }
    return(result)
  }
})
Rprof(interval=0.01)
result <- f2(A,x,b,f1)
Rprof(NULL)
summaryRprof()

$by.self
      self.time self.pct total.time total.pct
"FUN"      4.29    84.28       4.76     93.52
"%*%"      0.47     9.23       0.47      9.23
"f2"       0.33     6.48       5.09    100.00

$by.total
      total.time total.pct self.time self.pct
"f2"        5.09    100.00      0.33     6.48
"FUN"       4.76     93.52      4.29    84.28
"%*%"       0.47      9.23      0.47      9.23

$sample.interval
[1] 0.01

$sampling.time
[1] 5.09

In this case, almost all the time is spent evaluating f1() ("FUN"), even
after calling compiler::cmpfun() on f1() and on a function containing
the loop.  Making the looping construct faster is not going to improve
the performance of this code by a significant amount.  I.e., dropping to
compiled code will only help if you avoid the R function call, but then
that's not a general solution...

> --
>
> David Winsemius
> Alameda, CA, USA
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com
R/Finance 2016 | www.rinfinance.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
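
To illustrate the point about avoiding the R function call: for this
particular f1() the per-slice call can be removed altogether, because
sum(A[,,i] %*% x[,i]) equals the sum over j of colSums(A[,,i])[j] * x[j,i].
The following is a minimal sketch under that assumption.  The names
loop_version() and vectorized_version() are hypothetical, and nc is
reduced to 1000 so it runs quickly.  It relies on f1() being plain
linear algebra, so it is not a general replacement for the loop.

```r
# Smaller version of the example above (nc = 1000 is an assumption
# chosen only to keep the run time short).
set.seed(21)
nc <- 1000
nr <- 10
A <- array(runif(5 * nr * nc), dim = c(5, nr, nc))
x <- matrix(runif(nr * nc), nrow = nr, ncol = nc)
b <- runif(nc)

f1 <- function(A, x, b) sum(A %*% x) + b

# Hypothetical helper: the explicit loop, one R function call per slice.
loop_version <- function(A, x, b, FUN) {
  result <- numeric(length(b))
  for (i in seq_along(b)) {
    result[i] <- FUN(A[, , i], x[, i], b[i])
  }
  result
}

# Hypothetical helper: no per-slice function call at all.
# colSums(A, dims = 1) sums over the first dimension and returns an
# nr x nc matrix whose column i is colSums(A[, , i]).
vectorized_version <- function(A, x, b) {
  col_totals <- colSums(A, dims = 1)
  colSums(col_totals * x) + b
}

r_loop <- loop_version(A, x, b, f1)
r_vec  <- vectorized_version(A, x, b)

# Both approaches should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(r_loop, r_vec)))

# Timings depend on the machine, so none are quoted here.
print(system.time(loop_version(A, x, b, f1)))
print(system.time(vectorized_version(A, x, b)))
```

The speed-up here, if any, comes from eliminating the calls to f1(),
not from a faster looping construct, which is consistent with the
profile above.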