[Rd] Matrix memory layout R vs. C
Hi everybody, I'm trying to pass a matrix from R to C, where some computation is done for performance reasons, and back to R for evaluation. But I've run into the problem that R and C seem to have different ways of representing the matrix in main memory. The C representation of a 2D matrix in linear memory is concatenation of the rows whereas in R, it's a concatenation of the columns. That leads to the problem. that an R-matrix, for example 123 456 789 is seen by C as 147 258 369 and vice versa. Here's an example of C code that simply prints the matrix it gets from R: #include #include "R.h" void printMatrix(int *mPtr, int *m, int *n) { int (*matrix)[*n] = mPtr; int j,k; for(j = 0; j < *m; j++){ for(k = 0; k < *n; k++) { printf("%d", matrix[j][k]); } printf("\n"); } } And here's what happens when I call the function in R: m <- 3; n <- 3 mat <- matrix(c(1:9), nrow=m, ncol=n, byrow=TRUE) mat [,1] [,2] [,3] [1,]123 [2,]456 [3,]789 mat <- .C("printMatrix", mat, as.integer(m), as.integer(n))[[1]] 147 258 369 No matter if you create the matrix with byrow=TRUE or FALSE, C always interprets it the other way round. Is there a way to avoid this? I've read previous posts on passing a matrix from R to C, but the essence of the answers was that "a matrix in R is just a vector with attributes", but I don't see how this helps. Maybe someone can clarify. Thanks a lot in advance! Cheers Larissa Here's the C main function showing that the C code itself is correct: #include void printMatrix(int *mPtr, int *m, int *n); int main(void) { int m, n, i; int *mPtr, *nPtr; m = 3; n = 3; mPtr = &m; nPtr = &n; int *M = malloc(m * n * sizeof(int)); for (i = 0; i < m * n; i++){ M[i] = i + 1; } printMatrix(M, mPtr, nPtr); return EXIT_SUCCESS; __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Matrix memory layout R vs. C
Larissa, So is the problem "in the matrix reference is mat[col][row] whereas in R it is mar[row, col]?" The solution is just recognizing the difference in references. Dave On Fri, Dec 6, 2013 at 7:21 AM, Larissa Hauer wrote: > > Hi everybody, > > I'm trying to pass a matrix from R to C, where some computation is done > for performance reasons, and back to R for evaluation. But I've run into > the problem that R and C seem to have different ways of representing the > matrix in main memory. The C representation of a 2D matrix in linear memory > is concatenation of the rows whereas in R, it's a concatenation of the > columns. That leads to the problem. that an R-matrix, for example > 123 > 456 > 789 > is seen by C as > 147 > 258 > 369 > and vice versa. > > Here's an example of C code that simply prints the matrix it gets from R: > > #include > #include "R.h" > > void printMatrix(int *mPtr, int *m, int *n) { > int (*matrix)[*n] = mPtr; > > int j,k; > > for(j = 0; j < *m; j++){ > for(k = 0; k < *n; k++) { > printf("%d", matrix[j][k]); > } > printf("\n"); > } > } > > And here's what happens when I call the function in R: > > m <- 3; n <- 3 >> mat <- matrix(c(1:9), nrow=m, ncol=n, byrow=TRUE) >> mat >> > [,1] [,2] [,3] > [1,]123 > [2,]456 > [3,]789 > >> mat <- .C("printMatrix", mat, as.integer(m), as.integer(n))[[1]] >> > 147 > 258 > 369 > > > No matter if you create the matrix with byrow=TRUE or FALSE, C always > interprets it the other way round. Is there a way to avoid this? I've read > previous posts on passing a matrix from R to C, but the essence of the > answers was that "a matrix in R is just a vector with attributes", but I > don't see how this helps. Maybe someone can clarify. > > Thanks a lot in advance! > > Cheers > Larissa > > Here's the C main function showing that the C code itself is correct: > > #include > > void printMatrix(int *mPtr, int *m, int *n); > > int main(void) { > int m, n, i; > int *mPtr, *nPtr; > m = 3; > n = 3; > mPtr = &m; > nPtr = &n; > > int *M = malloc(m * n * sizeof(int)); > > for (i = 0; i < m * n; i++){ > M[i] = i + 1; > } > > printMatrix(M, mPtr, nPtr); > > return EXIT_SUCCESS; > > __ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Matrix memory layout R vs. C
On 06/12/2013 8:21 AM, Larissa Hauer wrote: Hi everybody, I'm trying to pass a matrix from R to C, where some computation is done for performance reasons, and back to R for evaluation. But I've run into the problem that R and C seem to have different ways of representing the matrix in main memory. The C representation of a 2D matrix in linear memory is concatenation of the rows whereas in R, it's a concatenation of the columns. That leads to the problem. that an R-matrix, for example 123 456 789 is seen by C as 147 258 369 and vice versa. Here's an example of C code that simply prints the matrix it gets from R: #include #include "R.h" void printMatrix(int *mPtr, int *m, int *n) { int (*matrix)[*n] = mPtr; int j,k; for(j = 0; j < *m; j++){ for(k = 0; k < *n; k++) { printf("%d", matrix[j][k]); } printf("\n"); } } And here's what happens when I call the function in R: > m <- 3; n <- 3 > mat <- matrix(c(1:9), nrow=m, ncol=n, byrow=TRUE) > mat [,1] [,2] [,3] [1,]123 [2,]456 [3,]789 > mat <- .C("printMatrix", mat, as.integer(m), as.integer(n))[[1]] 147 258 369 No matter if you create the matrix with byrow=TRUE or FALSE, C always interprets it the other way round. Is there a way to avoid this? I've read previous posts on passing a matrix from R to C, but the essence of the answers was that "a matrix in R is just a vector with attributes", but I don't see how this helps. Maybe someone can clarify. I would not assume that a 2D matrix in C doesn't have gaps in it between the rows. Let C treat it as a vector, and write a little macro that does the indexing. For example, #define INDEX(i,j) (i) + rows*(j) Then mPtr[INDEX(i,j)] will do R-style indexing (except it will be 0-based, not 1-based. You could fix that too if you wanted.) Duncan Murdoch Thanks a lot in advance! Cheers Larissa Here's the C main function showing that the C code itself is correct: #include void printMatrix(int *mPtr, int *m, int *n); int main(void) { int m, n, i; int *mPtr, *nPtr; m = 3; n = 3; mPtr = &m; nPtr = &n; int *M = malloc(m * n * sizeof(int)); for (i = 0; i < m * n; i++){ M[i] = i + 1; } printMatrix(M, mPtr, nPtr); return EXIT_SUCCESS; __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Matrix memory layout R vs. C
On Fri, Dec 6, 2013 at 9:38 AM, Duncan Murdoch wrote: > On 06/12/2013 8:21 AM, Larissa Hauer wrote: [...] > > > I would not assume that a 2D matrix in C doesn't have gaps in it between the > rows. Let C treat it as a vector, and write a little macro that does the > indexing. For example, > > #define INDEX(i,j) (i) + rows*(j) I would make this #define INDEX(i,j) ((i) + rows*(j)) just to be on the safe side. Gabor [...] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Matrix memory layout R vs. C
On 06/12/2013 14:42, Gábor Csárdi wrote: On Fri, Dec 6, 2013 at 9:38 AM, Duncan Murdoch wrote: On 06/12/2013 8:21 AM, Larissa Hauer wrote: [...] I would not assume that a 2D matrix in C doesn't have gaps in it between the rows. Let C treat it as a vector, and write a little macro that does the indexing. For example, #define INDEX(i,j) (i) + rows*(j) I would make this #define INDEX(i,j) ((i) + rows*(j)) just to be on the safe side. And to be safer on a 64-bit platform #define INDEX(i,j) ((i) + rows*(R_xlen_t)(j)) since rows*j might overflow there. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Matrix memory layout R vs. C
On Fri, Dec 6, 2013 at 10:42 AM, Prof Brian Ripley wrote: > On 06/12/2013 14:42, Gábor Csárdi wrote: >> >> On Fri, Dec 6, 2013 at 9:38 AM, Duncan Murdoch >> wrote: >>> >>> On 06/12/2013 8:21 AM, Larissa Hauer wrote: >> >> [...] >>> >>> >>> >>> I would not assume that a 2D matrix in C doesn't have gaps in it between >>> the >>> rows. Let C treat it as a vector, and write a little macro that does the >>> indexing. For example, >>> >>> #define INDEX(i,j) (i) + rows*(j) >> >> >> I would make this >> >> #define INDEX(i,j) ((i) + rows*(j)) >> >> just to be on the safe side. > > > And to be safer on a 64-bit platform > > #define INDEX(i,j) ((i) + rows*(R_xlen_t)(j)) > > since rows*j might overflow there. Indeed. Of course this still does not save you from indexing out-of-range and integer overflow in the addition. Gabor [...] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Matrix memory layout R vs. C
Hi Larissa, > I'm trying to pass a matrix from R to C, where some computation is > done for performance reasons, and back to R for evaluation. But I've > run into the problem that R and C seem to have different ways of > representing the matrix in main memory. The C representation of a 2D > matrix in linear memory is concatenation of the rows whereas in R, > it's a concatenation of the columns. That leads to the problem. R uses column-major order [1] because that's the order used by FORTRAN and R uses many libraries with a FORTRAN interface (most important: BLAS and LAPACK for numerical linear algebra.) That's the situation with many other languages / libraries that use similar interfaces, such as MATLAB, Octave, Julia, and Scilab [1]. So, there's no way around it and you just have to get used to referencing matrix entries in col-major order. [1] http://en.wikipedia.org/wiki/Row-major_order > Here's an example of C code that simply prints the matrix it gets from R: Try this instead: #include #include "R.h" void printMatrix(int *mPtr, int m, int n) { int j,k; for(j = 0; j < m; j++){ for(k = 0; k < n; k++) { printf("%d\t", mPtr[j + m * k]); } printf("\n"); } } > No matter if you create the matrix with byrow=TRUE or FALSE, C > always interprets it the other way round. Is there a way to avoid > this? I've read previous posts on passing a matrix from R to C, but > the essence of the answers was that "a matrix in R is just a vector > with attributes", but I don't see how this helps. Maybe someone can > clarify. Specifying byrow=TRUE only changes how the matrix is *read*, not how it's stored. A matrix -- and, more generally, an array -- is in fact just a vector with (dimension) attributes, but that just specifies the memory layout of the matrix, and not its representation (that is, it doesn't help.) Unfortunately, there's no way to avoid this, but it shouldn't be too bad to get used to it. :) Cheers, Luis -- Computers are useless. They can only give you answers. -- Pablo Picasso -- Luis Carvalho (Kozure) lua -e 'print((("lexcarva...@no.gmail.spam.com"):gsub("(%u+%.)","")))' __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Matrix memory layout R vs. C
On 06/12/2013 15:49, Gábor Csárdi wrote: On Fri, Dec 6, 2013 at 10:42 AM, Prof Brian Ripley wrote: On 06/12/2013 14:42, Gábor Csárdi wrote: On Fri, Dec 6, 2013 at 9:38 AM, Duncan Murdoch wrote: On 06/12/2013 8:21 AM, Larissa Hauer wrote: [...] I would not assume that a 2D matrix in C doesn't have gaps in it between the rows. Let C treat it as a vector, and write a little macro that does the indexing. For example, #define INDEX(i,j) (i) + rows*(j) I would make this #define INDEX(i,j) ((i) + rows*(j)) just to be on the safe side. And to be safer on a 64-bit platform #define INDEX(i,j) ((i) + rows*(R_xlen_t)(j)) since rows*j might overflow there. Indeed. Of course this still does not save you from indexing out-of-range and integer overflow in the addition. To a large extent it does. It means the arithmetic is done in R_xlen_t, 64-bit on 64-bit machines, and so any legitimate index gets computed correctly. Gabor [...] -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Matrix memory layout R vs. C
> And to be safer on a 64-bit platform > > #define INDEX(i,j) ((i) + rows*(R_xlen_t)(j)) > > since rows*j might overflow there. Shouldn't 'rows' be also a parameter? #define INDEX(rows,i,j) ((i) + (rows)*((R_xlen_t)(j))) Cheers, Luis -- Computers are useless. They can only give you answers. -- Pablo Picasso -- Luis Carvalho (Kozure) lua -e 'print((("lexcarva...@no.gmail.spam.com"):gsub("(%u+%.)","")))' __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Matrix memory layout R vs. C
On 06/12/2013 15:51, Luis Carvalho wrote: And to be safer on a 64-bit platform #define INDEX(i,j) ((i) + rows*(R_xlen_t)(j)) since rows*j might overflow there. Shouldn't 'rows' be also a parameter? This is a macro, not a function. 'rows' (I would have use nr or nrows) is going to be the same at all invocations. #define INDEX(rows,i,j) ((i) + (rows)*((R_xlen_t)(j))) Cheers, Luis -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Depending/Importing data only packages
Hi all, What should you do when you rely on a data only package. If you just "Depend" on it, you get the following from R CMD check: Package in Depends field not imported from: 'hflights' These packages needs to imported from for the case when this namespace is loaded but not attached. But there's nothing in the namespace to import, so adding it to imports doesn't seem like the right answer. Is that just a spurious note? Hadley -- http://had.co.nz/ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Linking to native routines in other packages
> But now if I do Depends: Rcpp or Imports: Rcpp for the sole purpose of this > LinkingTo mechanism, I'm getting > > * checking dependencies in R code ... NOTE > Namespace in Imports field not imported from: ‘Rcpp’ > All declared Imports should be used. > See the information on DESCRIPTION files in the chapter ‘Creating R > packages’ of the ‘Writing R Extensions’ manual. This is just a note, so perhaps it's spurious, and can be ignored as long as you provide an explanation when submitting to CRAN. Hadley -- http://had.co.nz/ __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Linking to native routines in other packages
Le 06/12/2013 22:29, Hadley Wickham a écrit : But now if I do Depends: Rcpp or Imports: Rcpp for the sole purpose of this LinkingTo mechanism, I'm getting * checking dependencies in R code ... NOTE Namespace in Imports field not imported from: ‘Rcpp’ All declared Imports should be used. See the information on DESCRIPTION files in the chapter ‘Creating R packages’ of the ‘Writing R Extensions’ manual. This is just a note, so perhaps it's spurious, and can be ignored as long as you provide an explanation when submitting to CRAN. Hadley The problem is that I'd have to ask every package maintainer to negociate that when they release a package that depends on Rcpp. Perhaps that's alright. Romain -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel