If nm is double, very likely INTEGER(nm)[0] is negative and your C code does nothing at all. Hence it could be fast but useless.
On Mon, 3 Oct 2005, Heather Turner wrote: > Hi, > > I am trying to speed up part of an algorithm in which certain columns of > a large matrix (X) are replaced by the element-wise product of a matrix > (M) and a vector (v). In R, the code might be > > X[, ind] <- M * v > > I have written a small C routine to do this for me, but the timing > depends on how I define the storage.mode of the objects in R and the > data types in C, in a way which I don't understand. > > To illustrate, suppose I have the following for X, M and v: > nr <- 10000 > X <- matrix(1, nr, 500) > M <- matrix(1:(nr*450), nr, 450) > v <- 1:nr > storage.mode(X) <- storage.mode(M) <- storage.mode(v) <- "double" > > Then I have the following integers required by my C code - these are the > objects which I'm not sure how to handle > a <- 50*nr #index of X indicating where to start replacing values > nm <- length(M) > nv <- length(v) > > I would have thought I wanted > > storage.mode(a) <- storage.mode(nm) <- storage.mode(nv) <- "integer" > > to go with the following C code > > # include <Rinternals.h> > > SEXP prod_integer(SEXP X, SEXP M, SEXP v, SEXP a, SEXP nm, SEXP nv) { > int i = INTEGER(a)[0], i1 = 0, i2 = 0; > > for ( ; i1 < INTEGER(nm)[0]; i2 = (++i2 == INTEGER(nv)[0]) ? 0 : i2) { > REAL(X)[i++] = REAL(M)[i1++] * REAL(v)[i2]; > } > return(X); > } > > Running this is R gives the following timings on my PC > >> dyn.load("D:/C_routines/prod_integer") >> for(i in 1:3) {print(system.time(.Call("prod_integer", X, M, v, a, nm, nv)))} > [1] 0.17 0.00 0.18 NA NA > [1] 0.18 0.00 0.17 NA NA > [1] 0.15 0.00 0.17 NA NA > > But strangely (to me) if I change the storage mode of my integers to > "double", I get > >> storage.mode(a) <- storage.mode(nm) <- storage.mode(nv) <- "double" >> for(i in 1:3) {print(system.time(.Call("prod_integer", X, M, v, a, nm, nv)))} > [1] 0 0 0 NA NA > [1] 0 0 0 NA NA > [1] 0 0 0 NA NA > > If on the other hand I use REAL instead of INTEGER in my C code: > > # include <Rinternals.h> > > SEXP prod_real(SEXP X, SEXP M, SEXP v, SEXP a, SEXP nm, SEXP nv) { > int i = REAL(a)[0], i1 = 0, i2 = 0; > > for ( ; i1 < REAL(nm)[0]; i2 = (++i2 == REAL(nv)[0]) ? 0 : i2) { > REAL(X)[i++] = REAL(M)[i1++] * REAL(v)[i2]; > } > return(X); > } > > The reverse is true: >> storage.mode(a) <- storage.mode(nm) <- storage.mode(nv) <- "integer" >> for(i in 1:3) {print(system.time(.Call("prod_real", X, M, v, a, nm, nv)))} > [1] 0 0 0 NA NA > [1] 0 0 0 NA NA > [1] 0 0 0 NA NA >> storage.mode(a) <- storage.mode(nm) <- storage.mode(nv) <- "double" >> for(i in 1:3) {print(system.time(.Call("prod_real", X, M, v, a, nm, nv)))} > [1] 0.22 0.00 0.22 NA NA > [1] 0.21 0.00 0.20 NA NA > [1] 0.21 0.00 0.22 NA NA >> identical(.Call("prod_integer", X, M, v, a, nm, nv), .Call("prod_real", X, >> M, v, a, nm, nv)) > [1] TRUE > > So I seem to get the fastest results if I store a, nm and nv as doubles in R > and treat them as integers in C or if I store them as integers in R and treat > them as doubles in C, rather than matching the storage in R to the data type > in C. > > I must be misunderstanding something here. Can someone explain what's going > on - please note I have only just begun to learn C, apologies if this is a > basic C issue. > > Thanks, > > Heather > > Dr H Turner > Research Assistant > Dept. of Statistics > The University of Warwick > Coventry > CV4 7AL > > Tel: 024 76575870 > Url: www.warwick.ac.uk/go/heatherturner > > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > > -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel