[Rd] Matrix memory layout R vs. C

2013-12-06 Thread Larissa Hauer


Hi everybody,

I'm trying to pass a matrix from R to C, where some computation is done 
for performance reasons, and back to R for evaluation. But I've run into 
the problem that R and C seem to have different ways of representing the 
matrix in main memory. The C representation of a 2D matrix in linear 
memory is concatenation of the rows whereas in R, it's a concatenation 
of the columns.  That leads to the problem. that an R-matrix, for example

123
456
789
is seen by C as
147
258
369
and vice versa.

Here's an example of C code that simply prints the matrix it gets from R:

#include 
#include "R.h"

void printMatrix(int *mPtr, int *m, int *n) {
  int (*matrix)[*n] = mPtr;

  int j,k;

  for(j = 0; j < *m; j++){
for(k = 0; k < *n; k++) {
  printf("%d", matrix[j][k]);
}
  printf("\n");
  }
}

And here's what happens when I call the function in R:


m <- 3; n <- 3
mat <- matrix(c(1:9), nrow=m, ncol=n, byrow=TRUE)
mat

 [,1] [,2] [,3]
[1,]123
[2,]456
[3,]789

mat <- .C("printMatrix", mat, as.integer(m), as.integer(n))[[1]]

147
258
369


No matter if you create the matrix with byrow=TRUE or FALSE, C always 
interprets it the other way round. Is there a way to avoid this? I've 
read previous posts on passing a matrix from R to C, but the essence of 
the answers was that "a matrix in R is just a vector with attributes", 
but I don't see how this helps. Maybe someone can clarify.


Thanks a lot in advance!

Cheers
Larissa

Here's the C main function showing that the C code itself is correct:

#include 

void printMatrix(int *mPtr, int *m, int *n);

int main(void) {
  int m, n, i;
  int *mPtr, *nPtr;
  m = 3;
  n = 3;
  mPtr = &m;
  nPtr = &n;

  int *M = malloc(m * n * sizeof(int));

  for (i = 0; i < m * n; i++){
  M[i] = i + 1;
  }

  printMatrix(M, mPtr, nPtr);

  return EXIT_SUCCESS;

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Matrix memory layout R vs. C

2013-12-06 Thread Lorenz, David
Larissa,
  So is the problem "in the matrix reference is mat[col][row] whereas in R
it is mar[row, col]?"
  The solution is just recognizing the difference in references.
Dave


On Fri, Dec 6, 2013 at 7:21 AM, Larissa Hauer
wrote:

>
> Hi everybody,
>
> I'm trying to pass a matrix from R to C, where some computation is done
> for performance reasons, and back to R for evaluation. But I've run into
> the problem that R and C seem to have different ways of representing the
> matrix in main memory. The C representation of a 2D matrix in linear memory
> is concatenation of the rows whereas in R, it's a concatenation of the
> columns.  That leads to the problem. that an R-matrix, for example
> 123
> 456
> 789
> is seen by C as
> 147
> 258
> 369
> and vice versa.
>
> Here's an example of C code that simply prints the matrix it gets from R:
>
> #include 
> #include "R.h"
>
> void printMatrix(int *mPtr, int *m, int *n) {
>   int (*matrix)[*n] = mPtr;
>
>   int j,k;
>
>   for(j = 0; j < *m; j++){
> for(k = 0; k < *n; k++) {
>   printf("%d", matrix[j][k]);
> }
>   printf("\n");
>   }
> }
>
> And here's what happens when I call the function in R:
>
>  m <- 3; n <- 3
>> mat <- matrix(c(1:9), nrow=m, ncol=n, byrow=TRUE)
>> mat
>>
>  [,1] [,2] [,3]
> [1,]123
> [2,]456
> [3,]789
>
>> mat <- .C("printMatrix", mat, as.integer(m), as.integer(n))[[1]]
>>
> 147
> 258
> 369
>
>
> No matter if you create the matrix with byrow=TRUE or FALSE, C always
> interprets it the other way round. Is there a way to avoid this? I've read
> previous posts on passing a matrix from R to C, but the essence of the
> answers was that "a matrix in R is just a vector with attributes", but I
> don't see how this helps. Maybe someone can clarify.
>
> Thanks a lot in advance!
>
> Cheers
> Larissa
>
> Here's the C main function showing that the C code itself is correct:
>
> #include 
>
> void printMatrix(int *mPtr, int *m, int *n);
>
> int main(void) {
>   int m, n, i;
>   int *mPtr, *nPtr;
>   m = 3;
>   n = 3;
>   mPtr = &m;
>   nPtr = &n;
>
>   int *M = malloc(m * n * sizeof(int));
>
>   for (i = 0; i < m * n; i++){
>   M[i] = i + 1;
>   }
>
>   printMatrix(M, mPtr, nPtr);
>
>   return EXIT_SUCCESS;
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Matrix memory layout R vs. C

2013-12-06 Thread Duncan Murdoch

On 06/12/2013 8:21 AM, Larissa Hauer wrote:

Hi everybody,

I'm trying to pass a matrix from R to C, where some computation is done
for performance reasons, and back to R for evaluation. But I've run into
the problem that R and C seem to have different ways of representing the
matrix in main memory. The C representation of a 2D matrix in linear
memory is concatenation of the rows whereas in R, it's a concatenation
of the columns.  That leads to the problem. that an R-matrix, for example
123
456
789
is seen by C as
147
258
369
and vice versa.

Here's an example of C code that simply prints the matrix it gets from R:

#include 
#include "R.h"

void printMatrix(int *mPtr, int *m, int *n) {
int (*matrix)[*n] = mPtr;

int j,k;

for(j = 0; j < *m; j++){
  for(k = 0; k < *n; k++) {
printf("%d", matrix[j][k]);
  }
printf("\n");
}
}

And here's what happens when I call the function in R:

> m <- 3; n <- 3
> mat <- matrix(c(1:9), nrow=m, ncol=n, byrow=TRUE)
> mat
   [,1] [,2] [,3]
[1,]123
[2,]456
[3,]789
> mat <- .C("printMatrix", mat, as.integer(m), as.integer(n))[[1]]
147
258
369


No matter if you create the matrix with byrow=TRUE or FALSE, C always
interprets it the other way round. Is there a way to avoid this? I've
read previous posts on passing a matrix from R to C, but the essence of
the answers was that "a matrix in R is just a vector with attributes",
but I don't see how this helps. Maybe someone can clarify.


I would not assume that a 2D matrix in C doesn't have gaps in it between 
the rows.  Let C treat it as a vector, and write a little macro that 
does the indexing.  For example,


#define INDEX(i,j) (i) + rows*(j)

Then mPtr[INDEX(i,j)] will do R-style indexing (except it will be 
0-based, not 1-based.  You could fix that too if you wanted.)


Duncan Murdoch


Thanks a lot in advance!

Cheers
Larissa

Here's the C main function showing that the C code itself is correct:

#include 

void printMatrix(int *mPtr, int *m, int *n);

int main(void) {
int m, n, i;
int *mPtr, *nPtr;
m = 3;
n = 3;
mPtr = &m;
nPtr = &n;

int *M = malloc(m * n * sizeof(int));

for (i = 0; i < m * n; i++){
M[i] = i + 1;
}

printMatrix(M, mPtr, nPtr);

return EXIT_SUCCESS;

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Matrix memory layout R vs. C

2013-12-06 Thread Gábor Csárdi
On Fri, Dec 6, 2013 at 9:38 AM, Duncan Murdoch  wrote:
> On 06/12/2013 8:21 AM, Larissa Hauer wrote:
[...]
>
>
> I would not assume that a 2D matrix in C doesn't have gaps in it between the
> rows.  Let C treat it as a vector, and write a little macro that does the
> indexing.  For example,
>
> #define INDEX(i,j) (i) + rows*(j)

I would make this

#define INDEX(i,j) ((i) + rows*(j))

just to be on the safe side.

Gabor

[...]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Matrix memory layout R vs. C

2013-12-06 Thread Prof Brian Ripley

On 06/12/2013 14:42, Gábor Csárdi wrote:

On Fri, Dec 6, 2013 at 9:38 AM, Duncan Murdoch  wrote:

On 06/12/2013 8:21 AM, Larissa Hauer wrote:

[...]



I would not assume that a 2D matrix in C doesn't have gaps in it between the
rows.  Let C treat it as a vector, and write a little macro that does the
indexing.  For example,

#define INDEX(i,j) (i) + rows*(j)


I would make this

#define INDEX(i,j) ((i) + rows*(j))

just to be on the safe side.


And to be safer on a 64-bit platform

#define INDEX(i,j) ((i) + rows*(R_xlen_t)(j))

since rows*j might overflow there.

--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Matrix memory layout R vs. C

2013-12-06 Thread Gábor Csárdi
On Fri, Dec 6, 2013 at 10:42 AM, Prof Brian Ripley
 wrote:
> On 06/12/2013 14:42, Gábor Csárdi wrote:
>>
>> On Fri, Dec 6, 2013 at 9:38 AM, Duncan Murdoch 
>> wrote:
>>>
>>> On 06/12/2013 8:21 AM, Larissa Hauer wrote:
>>
>> [...]
>>>
>>>
>>>
>>> I would not assume that a 2D matrix in C doesn't have gaps in it between
>>> the
>>> rows.  Let C treat it as a vector, and write a little macro that does the
>>> indexing.  For example,
>>>
>>> #define INDEX(i,j) (i) + rows*(j)
>>
>>
>> I would make this
>>
>> #define INDEX(i,j) ((i) + rows*(j))
>>
>> just to be on the safe side.
>
>
> And to be safer on a 64-bit platform
>
> #define INDEX(i,j) ((i) + rows*(R_xlen_t)(j))
>
> since rows*j might overflow there.

Indeed. Of course this still does not save you from indexing
out-of-range and integer overflow in the addition.

Gabor

[...]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Matrix memory layout R vs. C

2013-12-06 Thread Luis Carvalho
Hi Larissa,

> I'm trying to pass a matrix from R to C, where some computation is
> done for performance reasons, and back to R for evaluation. But I've
> run into the problem that R and C seem to have different ways of
> representing the matrix in main memory. The C representation of a 2D
> matrix in linear memory is concatenation of the rows whereas in R,
> it's a concatenation of the columns.  That leads to the problem.



R uses column-major order [1] because that's the order used by FORTRAN and R
uses many libraries with a FORTRAN interface (most important: BLAS and LAPACK
for numerical linear algebra.) That's the situation with many other languages
/ libraries that use similar interfaces, such as MATLAB, Octave, Julia, and
Scilab [1]. So, there's no way around it and you just have to get used to
referencing matrix entries in col-major order.

[1] http://en.wikipedia.org/wiki/Row-major_order


> Here's an example of C code that simply prints the matrix it gets from R:



Try this instead:

#include 
#include "R.h"

void printMatrix(int *mPtr, int m, int n) {
  int j,k;

  for(j = 0; j < m; j++){
for(k = 0; k < n; k++) {
  printf("%d\t", mPtr[j + m * k]);
}
printf("\n");
  }
}

> No matter if you create the matrix with byrow=TRUE or FALSE, C
> always interprets it the other way round. Is there a way to avoid
> this? I've read previous posts on passing a matrix from R to C, but
> the essence of the answers was that "a matrix in R is just a vector
> with attributes", but I don't see how this helps. Maybe someone can
> clarify.

Specifying byrow=TRUE only changes how the matrix is *read*, not how it's
stored. A matrix -- and, more generally, an array -- is in fact just a vector
with (dimension) attributes, but that just specifies the memory layout of the
matrix, and not its representation (that is, it doesn't help.) Unfortunately,
there's no way to avoid this, but it shouldn't be too bad to get used to it.
:)

Cheers,
Luis

-- 
Computers are useless. They can only give you answers.
-- Pablo Picasso

-- 
Luis Carvalho (Kozure)
lua -e 'print((("lexcarva...@no.gmail.spam.com"):gsub("(%u+%.)","")))'

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Matrix memory layout R vs. C

2013-12-06 Thread Prof Brian Ripley

On 06/12/2013 15:49, Gábor Csárdi wrote:

On Fri, Dec 6, 2013 at 10:42 AM, Prof Brian Ripley
 wrote:

On 06/12/2013 14:42, Gábor Csárdi wrote:


On Fri, Dec 6, 2013 at 9:38 AM, Duncan Murdoch 
wrote:


On 06/12/2013 8:21 AM, Larissa Hauer wrote:


[...]




I would not assume that a 2D matrix in C doesn't have gaps in it between
the
rows.  Let C treat it as a vector, and write a little macro that does the
indexing.  For example,

#define INDEX(i,j) (i) + rows*(j)



I would make this

#define INDEX(i,j) ((i) + rows*(j))

just to be on the safe side.



And to be safer on a 64-bit platform

#define INDEX(i,j) ((i) + rows*(R_xlen_t)(j))

since rows*j might overflow there.


Indeed. Of course this still does not save you from indexing
out-of-range and integer overflow in the addition.


To a large extent it does.  It means the arithmetic is done in R_xlen_t, 
64-bit on 64-bit machines, and so any legitimate index gets computed 
correctly.





Gabor

[...]




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Matrix memory layout R vs. C

2013-12-06 Thread Luis Carvalho
> And to be safer on a 64-bit platform
> 
> #define INDEX(i,j) ((i) + rows*(R_xlen_t)(j))
> 
> since rows*j might overflow there.

Shouldn't 'rows' be also a parameter?

#define INDEX(rows,i,j) ((i) + (rows)*((R_xlen_t)(j)))

Cheers,
Luis

-- 
Computers are useless. They can only give you answers.
-- Pablo Picasso

-- 
Luis Carvalho (Kozure)
lua -e 'print((("lexcarva...@no.gmail.spam.com"):gsub("(%u+%.)","")))'

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Matrix memory layout R vs. C

2013-12-06 Thread Prof Brian Ripley

On 06/12/2013 15:51, Luis Carvalho wrote:

And to be safer on a 64-bit platform

#define INDEX(i,j) ((i) + rows*(R_xlen_t)(j))

since rows*j might overflow there.


Shouldn't 'rows' be also a parameter?


This is a macro, not a function.  'rows' (I would have use nr or nrows) 
is going to be the same at all invocations.




#define INDEX(rows,i,j) ((i) + (rows)*((R_xlen_t)(j)))

Cheers,
Luis




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Depending/Importing data only packages

2013-12-06 Thread Hadley Wickham
Hi all,

What should you do when you rely on a data only package. If you just
"Depend" on it, you get the following from R CMD check:

Package in Depends field not imported from: 'hflights'
  These packages needs to imported from for the case when
  this namespace is loaded but not attached.

But there's nothing in the namespace to import, so adding it to
imports doesn't seem like the right answer.  Is that just a spurious
note?

Hadley

-- 
http://had.co.nz/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Linking to native routines in other packages

2013-12-06 Thread Hadley Wickham
> But now if I do Depends: Rcpp or Imports: Rcpp for the sole purpose of this
> LinkingTo mechanism, I'm getting
>
> * checking dependencies in R code ... NOTE
> Namespace in Imports field not imported from: ‘Rcpp’
>   All declared Imports should be used.
> See the information on DESCRIPTION files in the chapter ‘Creating R
> packages’ of the ‘Writing R Extensions’ manual.

This is just a note, so perhaps it's spurious, and can be ignored as
long as you provide an explanation when submitting to CRAN.

Hadley

-- 
http://had.co.nz/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Linking to native routines in other packages

2013-12-06 Thread Romain Francois

Le 06/12/2013 22:29, Hadley Wickham a écrit :

But now if I do Depends: Rcpp or Imports: Rcpp for the sole purpose of this
LinkingTo mechanism, I'm getting

* checking dependencies in R code ... NOTE
Namespace in Imports field not imported from: ‘Rcpp’
   All declared Imports should be used.
See the information on DESCRIPTION files in the chapter ‘Creating R
packages’ of the ‘Writing R Extensions’ manual.


This is just a note, so perhaps it's spurious, and can be ignored as
long as you provide an explanation when submitting to CRAN.

Hadley


The problem is that I'd have to ask every package maintainer to 
negociate that when they release a package that depends on Rcpp.


Perhaps that's alright.

Romain

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel