On Feb 22, 2011, at 1:55 AM, Jason Rudy wrote: > I just tried that myself and the .Call version is substantially > faster. It seems like there is a lot more going on in the .Call C > code than the .C. Why is .Call faster? Does it have to do with the > way arguments are passed into the C function? I tried with DUP=FALSE > and NAOK=TRUE, but .Call was still the winner. >
.Call is the "native" interface so it passes all objects directly as references - essentially the same way that R uses internally (.C with DUP=FALSE and NAOK=TRUE comes close to that, though; the fastest is .External in this respect). Also you're allocating objects as you need them so there will be less copying afterwards. However, I suspect that major part of the difference is also in the code preceding the C code involved in this -- for example as.double() will implicitly create a copy since it needs to strip attributes whereas coerceVector is a no-op if the type matches. .C is there for historical compatibility - it is essentially equivalent to .Fortran which was the original (and only) interface way back when. .Call is in comparison a more recent addition, that's why you still find a lot of .C code. Personally I don't use .C at all because compared to .Call it is so cumbersome and error-prone (you can't even tell the length of the passed vectors in C!), but others have different preferences. Cheers, Simon > On Sun, Feb 20, 2011 at 4:27 PM, Simon Urbanek > <simon.urba...@r-project.org> wrote: >> Jason, >> >> FWIW the direct interface (.Call) is more efficient and makes passing things >> from R simpler: >> >> C_matrix_multiply = function(A,B) .Call("R_matrix_multiply", A, B) >> >> The drawback is a bit more legwork on the C side, but it also gives you more >> flexibility: >> >> SEXP R_matrix_multiply(SEXP A, SEXP B) { >> double one = 1.0; >> double zero = 0.0; >> int *dimA = INTEGER(getAttrib(A, R_DimSymbol)); >> int *dimB = INTEGER(getAttrib(B, R_DimSymbol)); >> SEXP sDimC = PROTECT(allocVector(INTSXP, 2)); >> int *dimC = INTEGER(sDimC); >> SEXP C = PROTECT(allocVector(REALSXP, dimA[0] * dimB[1])); >> if (dimA[1] != dimB[0]) error("incompatible matrices!"); >> dimC[0] = dimA[0]; >> dimC[1] = dimB[1]; >> setAttrib(C, R_DimSymbol, sDimC); >> A = PROTECT(coerceVector(A, REALSXP)); >> B = PROTECT(coerceVector(B, REALSXP)); >> >> F77_CALL(dgemm)("N","N",dimA,dimB+1,dimA+1,&one,REAL(A),dimA,REAL(B),dimA+1,&zero,REAL(C),dimA); >> UNPROTECT(4); >> return C; >> } >> >> For comparison: >>> A=matrix(rnorm(1e5),500) >>> B=matrix(rnorm(1e5),,500) >> >> .Call: >> >>> system.time(for (i in 1:10) C_matrix_multiply(A,B)) >> user system elapsed >> 0.656 0.008 0.686 >> >> .C: >> >>> system.time(for (i in 1:10) CC_matrix_multiply(A,B)) >> user system elapsed >> 0.886 0.044 0.943 >> >> >> in fact .Call is even a tiny bit faster than %*%: >> >>> system.time(for (i in 1:10) A %*% B) >> user system elapsed >> 0.658 0.004 0.665 >> >> (it's not just a measurement error - it's consistent for more replications >> etc. - but it's really negligible - possibly just due to dispatch of %*%) >> >> Cheers, >> Simon >> >> >> On Feb 20, 2011, at 5:23 PM, Jason Rudy wrote: >> >>> It was indeed a simple problem! I took a look at that array.c as you >>> suggested and that cleared it right up. So, the correct C code is: >>> >>> #include <R.h> >>> #include <R_ext/Utils.h> >>> #include <R_ext/Lapack.h> >>> #include <R_ext/BLAS.h> >>> >>> void R_matrix_multiply(double * A, double * B, int * m, int *n, int * >>> p, double * C){ >>> >>> double one = 1.0; >>> double zero = 0.0; >>> >>> //Just printing the input arguments >>> Rprintf("m = %d, n = %d, p = %d\n",*m,*n,*p); >>> int i; >>> for(i=0;i<(*m**n);i++){ >>> Rprintf("%f ",A[i]); >>> } >>> Rprintf("\n"); >>> for(i=0;i<(*n**p);i++){ >>> Rprintf("%f ",B[i]); >>> } >>> Rprintf("\n"); >>> for(i=0;i<(*m**p);i++){ >>> Rprintf("%f ",C[i]); >>> } >>> Rprintf("\n"); >>> >>> //Here is the actual multiplication >>> F77_CALL(dgemm)("N","N",m,p,n,&one,A,m,B,n,&zero,C,m); >>> } >>> >>> The only difference being that I had the 4th and 5th arguments (n and >>> p) mixed up. There was also a problem in my R code after the >>> multiplication took place. For the record, the correct R code is: >>> >>> C_matrix_multiply = function(A,B){ >>> C <- matrix(0,nrow(A),ncol(B)) >>> cout <- >>> .C("R_matrix_multiply",as.double(A),as.double(B),nrow(A),ncol(A),ncol(B),as.double(C)) >>> return(matrix(cout[[6]],nrow(A),ncol(B))) >>> } >>> >>> Thanks for the help. Now that I have a functioning example I am well >>> on my way to completing this project. >>> >>> -Jason >>> >>> On Sun, Feb 20, 2011 at 7:42 AM, Prof Brian Ripley >>> <rip...@stats.ox.ac.uk> wrote: >>>> Look a close look at matprod in src/main/array in the R sources. >>>> Hint: it is the other dimensions you have wrong. >>>> >>>> And as BLAS is Fortran, counts do start at 1. >>>> >>>> On Sat, 19 Feb 2011, Jason Rudy wrote: >>>> >>>>> Dear R-devel, >>>>> >>>>> I've written a numerical solver for SOCPs (second order cone programs) >>>>> in R, and now I want to move most of the solver code into C for speed. >>>>> I've written combined R/C packages before, but in this case I need to >>>>> do matrix operations in my C code. As I have never done that before, >>>>> I'm trying to write some simple examples to make sure I understand the >>>>> basics. I am stuck on the first one. I'm trying to write a function >>>>> to multiply two matrices using the blas routine dgemm. The name of my >>>>> example package is CMATRIX. My code is as follows. >>>>> >>>>> I have a file matrix.c in my src directory: >>>>> >>>>> #include <R.h> >>>>> #include <R_ext/Utils.h> >>>>> #include <R_ext/Lapack.h> >>>>> #include <R_ext/BLAS.h> >>>>> >>>>> //Computes C = A*B >>>>> void R_matrix_multiply(double * A, double * B, int * m, int *n, int * >>>>> p, double * C){ >>>>> double one = 1.0; >>>>> double zero = 0.0; >>>>> >>>>> //Just printing the input arguments >>>>> Rprintf("m = %d, n = %d, p = %d\n",*m,*n,*p); >>>>> int i; >>>>> for(i=0;i<(*m**n);i++){ >>>>> Rprintf("%f ",A[i]); >>>>> } >>>>> Rprintf("\n"); >>>>> for(i=0;i<(*n**p);i++){ >>>>> Rprintf("%f ",B[i]); >>>>> } >>>>> Rprintf("\n"); >>>>> for(i=0;i<(*m**p);i++){ >>>>> Rprintf("%f ",C[i]); >>>>> } >>>>> Rprintf("\n"); >>>>> >>>>> >>>>> //Here is the actual multiplication >>>>> F77_CALL(dgemm)("N","N",m,n,p,&one,A,m,B,n,&zero,C,m); >>>>> } >>>>> >>>>> And the file C_matrix_multiply.R in my R directory: >>>>> >>>>> C_matrix_multiply = function(A,B){ >>>>> C <- matrix(0,nrow(A),ncol(B)) >>>>> cout <- >>>>> .C("R_matrix_multiply",as.double(A),as.double(B),nrow(A),ncol(A),ncol(B),as.double(C)) >>>>> return(matrix(cout$C,nrowA,ncol(B))) >>>>> >>>>> } >>>>> >>>>> My namespace file is: >>>>> >>>>> export("C_matrix_multiply") >>>>> useDynLib(CMATRIX.so,R_matrix_multiply) >>>>> >>>>> I'm not sure if it's necessary, but I've also included a Makevars.in >>>>> file in my src directory: >>>>> >>>>> PKG_CPPFLAGS=@PKG_CPPFLAGS@ >>>>> PKG_CFLAGS=@PKG_CFLAGS@ >>>>> PKG_LIBS=@PKG_LIBS@ ${LAPACK_LIBS} ${BLAS_LIBS} ${FLIBS} >>>>> >>>>> which I simply copied from the diversitree package, which seems to use >>>>> a lot of fortran. I have the same problem (which I am getting to) >>>>> with or without this Makevars.in file. >>>>> >>>>> I install my package using: >>>>> >>>>> R CMD INSTALL CMATRIX >>>>> >>>>> Then I start up R and attempt to run the following code: >>>>> >>>>> #Make some random matrices >>>>> A = matrix(rnorm(8),4,2) >>>>> B = matrix(rnorm(6),2,3) >>>>> >>>>> #Load my package >>>>> library(CMATRIX) >>>>> >>>>> #Print the matrices >>>>> A >>>>> B >>>>> >>>>> #Try to multiply them >>>>> product = C_matrix_multiply(A,B) >>>>> >>>>> What I want, and what according to my understanding should happen, is >>>>> for product to contain the same matrix as would result from A %*% B. >>>>> Instead, I get the following: >>>>> >>>>>> A = matrix(rnorm(8),4,2) >>>>>> B = matrix(rnorm(6),2,3) >>>>>> library(CMATRIX) >>>>>> A >>>>> >>>>> [,1] [,2] >>>>> [1,] -0.4981664 -0.7243532 >>>>> [2,] 0.1428766 -1.5501623 >>>>> [3,] -2.0624701 1.5104507 >>>>> [4,] -0.5871962 0.3049442 >>>>>> >>>>>> B >>>>> >>>>> [,1] [,2] [,3] >>>>> [1,] 0.02477964 0.5827084 1.8434375 >>>>> [2,] -0.20200104 1.7294264 0.9071397 >>>>>> >>>>>> C_matrix_multiply(A,B) >>>>> >>>>> m = 4, n = 2, p = 3 >>>>> -0.498166 0.142877 -2.062470 -0.587196 -0.724353 -1.550162 1.510451 >>>>> 0.304944 >>>>> 0.024780 -0.202001 0.582708 1.729426 1.843437 0.907140 >>>>> 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 >>>>> 0.000000 0.000000 0.000000 0.000000 0.000000 >>>>> Parameter 10 to routine DGEMM was incorrect >>>>> Mac OS BLAS parameter error in DGEMM , parameter #0, (unavailable), is 0 >>>>> >>>>> and R immediately dies. I know the arguments are being passed into >>>>> the C code and everything up to my F77_CALL is functioning based on >>>>> the printed output. The problem is definitely something to do with my >>>>> F77_CALL(dgemm) line. My understanding is that parameter 10 should be >>>>> the leading dimension of the matrix B, which in this case should be >>>>> equal to 2, the number of rows in that matrix, which is what I am >>>>> doing. I have also considered that parameter numbering starts at 0, >>>>> in which case the incorrect parameter is &zero, but again that seems >>>>> correct to me. All of my reading and research suggests I am doing >>>>> everything correctly, so I am somewhat stumped. Perhaps I am missing >>>>> something simple or obvious, as I have never done this before and am >>>>> proceeding with only google and the R docs as my guide. I am >>>>> wondering if anybody can see what I'm doing wrong here, or perhaps >>>>> something I could do to try to fix it. Any assistance would be >>>>> greatly appreciated. >>>>> >>>>> Best Regards, >>>>> >>>>> Jason Rudy >>>>> Graduate Student >>>>> Bioinformatics and Medical Informatics Program >>>>> San Diego State University >>>>> >>>>> ______________________________________________ >>>>> R-devel@r-project.org mailing list >>>>> https://stat.ethz.ch/mailman/listinfo/r-devel >>>>> >>>> >>>> -- >>>> Brian D. Ripley, rip...@stats.ox.ac.uk >>>> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ >>>> University of Oxford, Tel: +44 1865 272861 (self) >>>> 1 South Parks Road, +44 1865 272866 (PA) >>>> Oxford OX1 3TG, UK Fax: +44 1865 272595 >>>> >>> >>> ______________________________________________ >>> R-devel@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-devel >>> >>> >> >> > > ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel