Re: [Rd] Objects created by more than one data call?
On 5/21/2013 3:03 PM, William Dunlap wrote: If you look at data(package="Ecat")$results[,"Item"] you will see the items "Hstarts", "Hstarts (Intratesm)", and "Hstarts (Intratesq)" which I think means that the dataset Hstarts is found in 3 .rda files, "Hstarts.rda", "Intratesq.rda", and "Intratesm.rda". There are duplicate, modulo (filename), items for "MedExp" as well. Thanks for this. I may get me closer, but I still don't see it: (data(Intratesm)) imports only the object Intratesm, etc. For more details, see below. Any other suggestions? Thanks again, Spencer > Ecdat.data <- data(package="Ecdat")$results > (Hstarts2 <- grep('Hstarts', Ecdat.data[, 'Item'])) [1] 47 48 49 > (MedExp2 <- grep('MedExp', Ecdat.data[, 'Item'])) [1] 67 68 > Ecdat.data[Hstarts2, ] Package LibPath Item [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts" [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesm)" [3,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesq)" Title [1,] "Housing Starts" [2,] "Housing Starts" [3,] "Housing Starts" > Ecdat.data[MedExp2,] Package LibPath Item [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp" [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp (VietNamH)" Title [1,] "Structure of Demand for Medical Care" [2,] "Structure of Demand for Medical Care" > library(Ecdat) > (data(Intratesm)) [1] "Intratesm" > (data(Intratesq)) [1] "Intratesq" Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Spencer Graves Sent: Tuesday, May 21, 2013 12:21 PM To: Prof Brian Ripley Cc: r-devel@r-project.org Subject: Re: [Rd] Objects created by more than one data call? On 5/21/2013 9:03 AM, Prof Brian Ripley wrote: On 21/05/2013 16:51, Spencer Graves wrote: On 5/21/2013 7:47 AM, Prof Brian Ripley wrote: On 21/05/2013 15:28, Spencer Graves wrote: On 5/20/2013 10:10 PM, Prof Brian Ripley wrote: On 21/05/2013 00:12, Spencer Graves wrote: Hello, All: If I use LazyData with the Ecdat package on R-Forge, "R CMD check" reports "no visible binding for global variable 'nonEnglishNames'", where 'nonEnglishNames' is a dataset in Ecdat used as the default argument for a function. With LazyData, that NOTE disappears. However, then I get, "Warning: objects 'Hstarts', 'Hstarts', 'MedExp' are created by more than one data call". What do you suggest I do to fix this problem? Not create the objects in more than one data() call. Check what each of your data() calls produces. Thanks. How do I do that? Call data() on each in turn, and see what files get added to an empty workspace. Like the following? You missed the 'empty'. Look at tools:::data2LazyLoadDB to see how this is checked. Thanks for the suggestion. Unfortunately, I tried that function, including stepping through it line by line, fixing references to other functions not exported from tools, without enlightenment; see below. Thanks again, Spencer > lib.loc = NULL > package='Ecdat' > pkgpath <- find.package(package, lib.loc, quiet = TRUE) > pkgpath [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat" > dataDir <- file.path(pkgpath, "data") > dataDir [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data" > enc <- tools:::.read_description(file.path(pkgpath, "DESCRIPTION"))["Encoding"] > enc NA > if (!is.na(enc)) { + op <- options(encoding = enc) + on.exit(options(encoding = op[[1L]])) + } > file_test("-d", dataDir) [1] TRUE > file.path(dataDir, "Rdata.rds") [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data/Rdata.rds" > (file.exists(file.path(dataDir, "Rdata.rds")) && file.exists(file.path(dataDir, + paste(package, "rdx", sep = "."))) && file.exists(file.path(dataDir, + paste(package, "rdb", sep = "." [1] FALSE > file.exists(file.path(dataDir, + paste(package, "rdx", sep = "."))) [1] FALSE > file.path(dataDir, + paste(package, "rdx", sep = ".")) [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data/Ecdat.rdx" > dataEnv <- new.env(hash = TRUE) > tmpEnv <- new.env() > f0 <- files <- list_files_with_type(dataDir, "data") Error: could not find function "list_files_with_type" > f0 <- files <- tools:::list_files_with_type(dataDir, "data") > files <- unique(basename(file_path_sans_ext(files, + TRUE))) Error in basename(file_path_sans_ext(files, TRUE)) : could not find function "file_path_sans_ext" > files <- unique(basename(tools:::file_path_sans_ext(files, + TRUE))) > dlist <- vector("list", length(files)) > files character(0) > names(dlist) <- files > loaded <- character(0L) > loaded character(0) > for (f in files) { + utils::data(lis
Re: [Rd] Objects created by more than one data call?
I used svn to copy the current version of Ecdat from Rforge to my PC C:\temp\packages>svn checkout svn://r-forge.r-project.org/svnroot/ecdat/ then fired up R to look at the rda files in it. > setwd("c:/temp/packages/Ecdat/ecdat/pkg/data") > read.dcf("../DESCRIPTION")[, c("Package","Version")] Package Version "Ecdat" "0.2-3" > dir.rda <- function(rdaFile) { e <- new.env() ; load(rdaFile, envir=e) ; > objects(e, all=TRUE)} > dir.rda("VietNamH.rda") [1] "MedExp" > rdas <- dir(pattern="\\.rda$") > names(rdas) <- rdas > z <- lapply(rdas, dir.rda) > tab <- table(unlist(z)) > tab[tab>1] Hstarts MedExp 3 2 > z[sapply(z, function(zi)"Hstarts" %in% zi)] $Hstarts.rda [1] "Hstarts" $Intratesm.rda [1] "Hstarts" $Intratesq.rda [1] "Hstarts" > z[sapply(z, function(zi)"MedExp" %in% zi)] $MedExp.rda [1] "MedExp" $VietNamH.rda [1] "MedExp" It looks some files don't contain what their names suggest: > dir.rda("VietNamH.rda") [1] "MedExp" The two versions of MedExp are quite different: > load("VietNamH.rda", envViet <- new.env(parent=emptyenv())) > load("MedExp.rda", envMed <- new.env(parent=emptyenv())) > objects(envViet) [1] "MedExp" > objects(envMed) [1] "MedExp" > all.equal(envViet$MedExp, envMed$MedExp) [1] "Names: 11 string mismatches" [2] "Length mismatch: comparison on first 11 components" [3] "Component 1: 'current' is not a factor" ... [18] "Component 10: Numeric: lengths (5999, 5574) differ" [19] "Component 11: 'current' is not a factor" Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -Original Message- > From: Spencer Graves [mailto:spencer.gra...@prodsyse.com] > Sent: Wednesday, May 22, 2013 1:27 PM > To: William Dunlap > Cc: r-devel@r-project.org > Subject: Re: [Rd] Objects created by more than one data call? > > On 5/21/2013 3:03 PM, William Dunlap wrote: > > If you look at > > data(package="Ecat")$results[,"Item"] > > you will see the items "Hstarts", "Hstarts (Intratesm)", and "Hstarts > > (Intratesq)" > > which I think means that the dataset Hstarts is found in 3 .rda files, > > "Hstarts.rda", > > "Intratesq.rda", and "Intratesm.rda". There are duplicate, modulo > > (filename), > > items for "MedExp" as well. > > >Thanks for this. I may get me closer, but I still don't see it: > (data(Intratesm)) imports only the object Intratesm, etc. For more > details, see below. > > >Any other suggestions? > > >Thanks again, >Spencer > > > > Ecdat.data <- data(package="Ecdat")$results > > (Hstarts2 <- grep('Hstarts', Ecdat.data[, 'Item'])) > [1] 47 48 49 > > (MedExp2 <- grep('MedExp', Ecdat.data[, 'Item'])) > [1] 67 68 > > Ecdat.data[Hstarts2, ] > Package LibPath Item > [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts" > [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesm)" > [3,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesq)" > Title > [1,] "Housing Starts" > [2,] "Housing Starts" > [3,] "Housing Starts" > > Ecdat.data[MedExp2,] > Package LibPath Item > [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp" > [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp (VietNamH)" > Title > [1,] "Structure of Demand for Medical Care" > [2,] "Structure of Demand for Medical Care" > > library(Ecdat) > > (data(Intratesm)) > [1] "Intratesm" > > (data(Intratesq)) > [1] "Intratesq" > > > > Bill Dunlap > > Spotfire, TIBCO Software > > wdunlap tibco.com > > > > > >> -Original Message- > >> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] > >> On > Behalf > >> Of Spencer Graves > >> Sent: Tuesday, May 21, 2013 12:21 PM > >> To: Prof Brian Ripley > >> Cc: r-devel@r-project.org > >> Subject: Re: [Rd] Objects created by more than one data call? > >> > >> On 5/21/2013 9:03 AM, Prof Brian Ripley wrote: > >>> On 21/05/2013 16:51, Spencer Graves wrote: > On 5/21/2013 7:47 AM, Prof Brian Ripley wrote: > > On 21/05/2013 15:28, Spencer Graves wrote: > >> On 5/20/2013 10:10 PM, Prof Brian Ripley wrote: > >>> On 21/05/2013 00:12, Spencer Graves wrote: > Hello, All: > > > If I use LazyData with the Ecdat package on R-Forge, "R CMD > check" reports "no visible binding for global variable > 'nonEnglishNames'", where 'nonEnglishNames' is a dataset in Ecdat > used > as the default argument for a function. With LazyData, that NOTE > disappears. However, then I get, "Warning: objects 'Hstarts', > 'Hstarts', 'MedExp' are created by more than one data call". > > > What do you suggest I do to fix this problem? > >>> Not create the objects in more than one data() call. > >>> > >>> Check what each of your data() calls produces. > >> > >> Thanks
Re: [Rd] Objects created by more than one data call?
Dear Bill: On 5/22/2013 2:12 PM, William Dunlap wrote: I used svn to copy the current version of Ecdat from Rforge to my PC C:\temp\packages>svn checkout svn://r-forge.r-project.org/svnroot/ecdat/ then fired up R to look at the rda files in it. setwd("c:/temp/packages/Ecdat/ecdat/pkg/data") read.dcf("../DESCRIPTION")[, c("Package","Version")] Package Version "Ecdat" "0.2-3" dir.rda <- function(rdaFile) { e <- new.env() ; load(rdaFile, envir=e) ; objects(e, all=TRUE)} dir.rda("VietNamH.rda") [1] "MedExp" rdas <- dir(pattern="\\.rda$") names(rdas) <- rdas z <- lapply(rdas, dir.rda) tab <- table(unlist(z)) tab[tab>1] Hstarts MedExp 3 2 z[sapply(z, function(zi)"Hstarts" %in% zi)] $Hstarts.rda [1] "Hstarts" $Intratesm.rda [1] "Hstarts" $Intratesq.rda [1] "Hstarts" z[sapply(z, function(zi)"MedExp" %in% zi)] $MedExp.rda [1] "MedExp" $VietNamH.rda [1] "MedExp" It looks some files don't contain what their names suggest: dir.rda("VietNamH.rda") [1] "MedExp" The two versions of MedExp are quite different: load("VietNamH.rda", envViet <- new.env(parent=emptyenv())) load("MedExp.rda", envMed <- new.env(parent=emptyenv())) objects(envViet) [1] "MedExp" objects(envMed) [1] "MedExp" all.equal(envViet$MedExp, envMed$MedExp) [1] "Names: 11 string mismatches" [2] "Length mismatch: comparison on first 11 components" [3] "Component 1: 'current' is not a factor" ... [18] "Component 10: Numeric: lengths (5999, 5574) differ" [19] "Component 11: 'current' is not a factor" Thanks very much. Now I understand the problem. For a solution, I need to consult with the package author (Yves Croissant; I'm only the maintainer). However, your work at least makes it easy for me to describe the problem to him. Thanks again. Spencer Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: Spencer Graves [mailto:spencer.gra...@prodsyse.com] Sent: Wednesday, May 22, 2013 1:27 PM To: William Dunlap Cc: r-devel@r-project.org Subject: Re: [Rd] Objects created by more than one data call? On 5/21/2013 3:03 PM, William Dunlap wrote: If you look at data(package="Ecat")$results[,"Item"] you will see the items "Hstarts", "Hstarts (Intratesm)", and "Hstarts (Intratesq)" which I think means that the dataset Hstarts is found in 3 .rda files, "Hstarts.rda", "Intratesq.rda", and "Intratesm.rda". There are duplicate, modulo (filename), items for "MedExp" as well. Thanks for this. I may get me closer, but I still don't see it: (data(Intratesm)) imports only the object Intratesm, etc. For more details, see below. Any other suggestions? Thanks again, Spencer > Ecdat.data <- data(package="Ecdat")$results > (Hstarts2 <- grep('Hstarts', Ecdat.data[, 'Item'])) [1] 47 48 49 > (MedExp2 <- grep('MedExp', Ecdat.data[, 'Item'])) [1] 67 68 > Ecdat.data[Hstarts2, ] Package LibPath Item [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts" [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesm)" [3,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesq)" Title [1,] "Housing Starts" [2,] "Housing Starts" [3,] "Housing Starts" > Ecdat.data[MedExp2,] Package LibPath Item [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp" [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp (VietNamH)" Title [1,] "Structure of Demand for Medical Care" [2,] "Structure of Demand for Medical Care" > library(Ecdat) > (data(Intratesm)) [1] "Intratesm" > (data(Intratesq)) [1] "Intratesq" Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Spencer Graves Sent: Tuesday, May 21, 2013 12:21 PM To: Prof Brian Ripley Cc: r-devel@r-project.org Subject: Re: [Rd] Objects created by more than one data call? On 5/21/2013 9:03 AM, Prof Brian Ripley wrote: On 21/05/2013 16:51, Spencer Graves wrote: On 5/21/2013 7:47 AM, Prof Brian Ripley wrote: On 21/05/2013 15:28, Spencer Graves wrote: On 5/20/2013 10:10 PM, Prof Brian Ripley wrote: On 21/05/2013 00:12, Spencer Graves wrote: Hello, All: If I use LazyData with the Ecdat package on R-Forge, "R CMD check" reports "no visible binding for global variable 'nonEnglishNames'", where 'nonEnglishNames' is a dataset in Ecdat used as the default argument for a function. With LazyData, that NOTE disappears. However, then I get, "Warning: objects 'Hstarts', 'Hstarts', 'MedExp' are created by more than one data call". What do you suggest I do to fix this problem? Not create the objects in more than one data() call. Check what each of your data() calls produces. Thanks. How do I do that? Call data() on each in turn, and see what files get added to an empty workspace. Like the following? You missed the 'empty'. Look at
Re: [Rd] Inconsistent results from .C()
Update: I did eventually discover an error in the C code which is probably ultimately responsible for the bug (the variable eff_size is allowed to get too large, and overrun the end of lev2[]). However the behaviour of R (or Rstudio) in response to this is still somewhat mystifying to me! Robin On 21 May 2013 14:10, Robin Evans wrote: > Sure! C code is below if it helps. The gist is that the function > oneMargin forms two matrices M and C, mostly by repeatedly taking > Kronecker products. > > Robin > > void kronecker (int *A, int *B, int *dima, int *dimb, int *C) { > int k = 0; > int i1,i2,j1,j2; > > for (i2 = 0; i2 < dima[1]; i2++) { > for (j2 = 0; j2 < dimb[1]; j2++) { > for (i1 = 0; i1 < dima[0]; i1++) { > for (j1 = 0; j1 < dimb[0]; j1++) { > C[k] = A[i1 + dima[0]*i2]*B[j1 + dimb[0]*j2]; > k++; > > } > > void iterate (int *x, int *lev) { > bool ok = FALSE; > int i=0; > > do { > if(x[i] < lev[i]-1) { > x[i]++; > ok = TRUE; > } > else { > x[i] = 0; > } > i++; > } > while (!ok); > } > > void oneMargin(int *Mar, int *Eff, int *neff, int *lev, int *nvar, int > *M, int *C) { > > int *M2, *mult, *Cj, *Cj2; > int i,j=1,k,l,m; > > /* determine size of M to output */ > for (i = 0; i < nvar[0]; i++) { > j *= (Mar[i] == 1 ? lev[i]*lev[i] : lev[i]); > } > M2 = malloc(sizeof(int)*j); > mult = malloc(sizeof(int)*j); > > M[0] = 1; > int M_size[2] = {1,1}, mult_size[2], Cj_size[2], C_size[2] = {0,0}; > > for (i = nvar[0]-1; i >= 0; i--) { > if (Mar[i] == 1) { > for (j = 0; j < lev[i]*lev[i]; j++) { > mult[j] = ((j % (lev[i]+1)) == 0) ? 1 : 0; > } > mult_size[0] = mult_size[1] = lev[i]; > } > else { > for (j = 0; j < lev[i]; j++) { > mult[j] = 1; > } > mult_size[0] = 1; > mult_size[1] = lev[i]; > } > > kronecker(M, mult, M_size, mult_size, M2); > M_size[0] *= mult_size[0]; > M_size[1] *= mult_size[1]; > > /* copy M2 over to M */ > for (j=0; j < M_size[0]*M_size[1]; j++) { > M[j] = M2[j]; > } > } > > free(M2); > > /* Create C matrix */ > int index[nvar[0]]; > int prod; > int eff_size = 0; > /*swapped = 0; */ > mult_size[0] = 1; > > C_size[0] = 1; > for (j = 0; j < nvar[0]; j++) { > C_size[0] *= (Mar[j] == 1 ? lev[j] : 1); > } > > Cj = malloc(sizeof(int)*C_size[0]); > Cj2 = malloc(sizeof(int)*C_size[0]); > int *lev2; > lev2 = malloc(sizeof(int)*nvar[0]); > > /* loop over effects */ > for (i = 0; i < neff[0]; i++) { > prod = 1; > for (j = 0; j < nvar[0]; j++) { > if (Eff[j + i*nvar[0]]) { > prod *= lev[j]-1; > lev2[eff_size] = lev[j]-1; > index[eff_size] = 0; > eff_size++; >} > } > > /* loop over states of effect (excluding corner points) */ > for (j = 0; j < prod; j++) { > Cj[0] = 1; > Cj_size[0] = Cj_size[1] = 1; > > k = eff_size; > /* loop over variables */ > for (l = nvar[0]-1; l >= 0; l--) { > /* skip variables not in margin */ > if (Mar[l] == 0) continue; > mult_size[1] = lev[l]; > > /* kronecker factor depends on whether or not variable is in > effect */ > if (Eff[i*nvar[0]+l] == 0) { > /* for variables not in effect, just repeat matrix */ > for (m = 0; m < lev[l]; m++) { > mult[m] = 1; > } > } > else { > /* otherwise multiply based on state */ > k--; > for (m = 0; m < lev[l]; m++) { > mult[m] = (m == index[k]) ? lev[l] - 1 : -1; > } > } > kronecker(Cj, mult, Cj_size, mult_size, Cj2); > > Cj_size[1] *= mult_size[1]; > > /* copy Cj2 over to Cj */ > for (k=0; k < Cj_size[0]*Cj_size[1]; k++) { > Cj[k] = Cj2[k]; > } > if (Cj_size[0]*Cj_size[1] > C_size[0]) Rprintf("pointer screwup"); > } > > /* copy row over to C */ > for (m = 0; m < C_size[0]; m++) C[C_size[0]*C_size[1]+m] = Cj[m]; > C_size[1] += 1; > iterate(index, lev2); > } > } > > free(lev2); > free(Cj); > free(Cj2); > free(mult); > } > > > On 21 May 2013 14:04, R. Michael Weylandt > wrote: > > > > It might also help if you can point us to the C code to help debug. > > > > MW > > > > On Tue, May 21, 2013 at 10:53 AM, Robin Evans wrote: > > > I should add to this that I'm running on Scientific Linux 6. I later > > > noticed that the bug only seems to occur when I run the code from > Rstudio, > > > and not if I use the terminal directly, so this may be the key to the > > > problem. > > > > > > Robin > > > > > > On 20 May 2013 16:12, Robin Evans wrote: > > > > > >> Hello, > > >> > > >> I've run into a problem which is both maddening and rather hard to > > >> replicate, therefore I wondered if someone might know of a plausible > > >> explanation for it. I
Re: [Rd] Inconsistent results from .C()
Had you tried using the new-to-3.0.0 options(CBoundCheck=TRUE)? [from the NEWS file] There is a new option, options(CBoundsCheck=), which controls how .C() and .Fortran() pass arguments to compiled code. If true (which can be enabled by setting the environment variable R_C_BOUNDS_CHECK to yes), raw, integer, double and complex arguments are always copied, and checked for writing off either end of the array on return from the compiled code (when a second copy is made). This also checks individual elements of character vectors passed to .C(). valgrind helps find memory misuse also. When you write on memory that you didn't allocate, expect to get mystifying behavior. Nonrepeatable results are a sign of memory misuse. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > -Original Message- > From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On > Behalf > Of Robin Evans > Sent: Wednesday, May 22, 2013 4:16 PM > To: R Devel List > Subject: Re: [Rd] Inconsistent results from .C() > > Update: I did eventually discover an error in the C code which is probably > ultimately responsible for the bug (the variable eff_size is allowed to get > too large, and overrun the end of lev2[]). However the behaviour of R (or > Rstudio) in response to this is still somewhat mystifying to me! > > Robin > > On 21 May 2013 14:10, Robin Evans wrote: > > > Sure! C code is below if it helps. The gist is that the function > > oneMargin forms two matrices M and C, mostly by repeatedly taking > > Kronecker products. > > > > Robin > > > > void kronecker (int *A, int *B, int *dima, int *dimb, int *C) { > > int k = 0; > > int i1,i2,j1,j2; > > > > for (i2 = 0; i2 < dima[1]; i2++) { > > for (j2 = 0; j2 < dimb[1]; j2++) { > > for (i1 = 0; i1 < dima[0]; i1++) { > > for (j1 = 0; j1 < dimb[0]; j1++) { > > C[k] = A[i1 + dima[0]*i2]*B[j1 + dimb[0]*j2]; > > k++; > > > > } > > > > void iterate (int *x, int *lev) { > > bool ok = FALSE; > > int i=0; > > > > do { > > if(x[i] < lev[i]-1) { > > x[i]++; > > ok = TRUE; > > } > > else { > > x[i] = 0; > > } > > i++; > > } > > while (!ok); > > } > > > > void oneMargin(int *Mar, int *Eff, int *neff, int *lev, int *nvar, int > > *M, int *C) { > > > > int *M2, *mult, *Cj, *Cj2; > > int i,j=1,k,l,m; > > > > /* determine size of M to output */ > > for (i = 0; i < nvar[0]; i++) { > > j *= (Mar[i] == 1 ? lev[i]*lev[i] : lev[i]); > > } > > M2 = malloc(sizeof(int)*j); > > mult = malloc(sizeof(int)*j); > > > > M[0] = 1; > > int M_size[2] = {1,1}, mult_size[2], Cj_size[2], C_size[2] = {0,0}; > > > > for (i = nvar[0]-1; i >= 0; i--) { > > if (Mar[i] == 1) { > > for (j = 0; j < lev[i]*lev[i]; j++) { > > mult[j] = ((j % (lev[i]+1)) == 0) ? 1 : 0; > > } > > mult_size[0] = mult_size[1] = lev[i]; > > } > > else { > > for (j = 0; j < lev[i]; j++) { > > mult[j] = 1; > > } > > mult_size[0] = 1; > > mult_size[1] = lev[i]; > > } > > > > kronecker(M, mult, M_size, mult_size, M2); > > M_size[0] *= mult_size[0]; > > M_size[1] *= mult_size[1]; > > > > /* copy M2 over to M */ > > for (j=0; j < M_size[0]*M_size[1]; j++) { > > M[j] = M2[j]; > > } > > } > > > > free(M2); > > > > /* Create C matrix */ > > int index[nvar[0]]; > > int prod; > > int eff_size = 0; > > /*swapped = 0; */ > > mult_size[0] = 1; > > > > C_size[0] = 1; > > for (j = 0; j < nvar[0]; j++) { > > C_size[0] *= (Mar[j] == 1 ? lev[j] : 1); > > } > > > > Cj = malloc(sizeof(int)*C_size[0]); > > Cj2 = malloc(sizeof(int)*C_size[0]); > > int *lev2; > > lev2 = malloc(sizeof(int)*nvar[0]); > > > > /* loop over effects */ > > for (i = 0; i < neff[0]; i++) { > > prod = 1; > > for (j = 0; j < nvar[0]; j++) { > > if (Eff[j + i*nvar[0]]) { > > prod *= lev[j]-1; > > lev2[eff_size] = lev[j]-1; > > index[eff_size] = 0; > > eff_size++; > >} > > } > > > > /* loop over states of effect (excluding corner points) */ > > for (j = 0; j < prod; j++) { > > Cj[0] = 1; > > Cj_size[0] = Cj_size[1] = 1; > > > > k = eff_size; > > /* loop over variables */ > > for (l = nvar[0]-1; l >= 0; l--) { > > /* skip variables not in margin */ > > if (Mar[l] == 0) continue; > > mult_size[1] = lev[l]; > > > > /* kronecker factor depends on whether or not variable is in > > effect */ > > if (Eff[i*nvar[0]+l] == 0) { > > /* for variables not in effect, just repeat matrix */ > > for (m = 0; m < lev[l]; m++) { > > mult[m] = 1; > > } > > } > > else { > > /* otherwise multiply based on state */ > > k--; > > for (m = 0; m < lev[l]; m++) { > > mult[m] = (m ==
Re: [Rd] Inconsistent results from .C()
lev2 is malloc-ed in the code you supplied. The new option Bill refers to is about input variables in .C, i.e. lev and not lev2. Other array overruns can be found by other tools: Bill mentioned valgrind, and AddressSanitizer finds a different set of overruns. See http://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-gctorture-and-memory-access . On 23/05/2013 01:18, William Dunlap wrote: Had you tried using the new-to-3.0.0 options(CBoundCheck=TRUE)? [from the NEWS file] There is a new option, options(CBoundsCheck=), which controls how .C() and .Fortran() pass arguments to compiled code. If true (which can be enabled by setting the environment variable R_C_BOUNDS_CHECK to yes), raw, integer, double and complex arguments are always copied, and checked for writing off either end of the array on return from the compiled code (when a second copy is made). This also checks individual elements of character vectors passed to .C(). valgrind helps find memory misuse also. When you write on memory that you didn't allocate, expect to get mystifying behavior. Nonrepeatable results are a sign of memory misuse. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Robin Evans Sent: Wednesday, May 22, 2013 4:16 PM To: R Devel List Subject: Re: [Rd] Inconsistent results from .C() Update: I did eventually discover an error in the C code which is probably ultimately responsible for the bug (the variable eff_size is allowed to get too large, and overrun the end of lev2[]). However the behaviour of R (or Rstudio) in response to this is still somewhat mystifying to me! Robin On 21 May 2013 14:10, Robin Evans wrote: Sure! C code is below if it helps. The gist is that the function oneMargin forms two matrices M and C, mostly by repeatedly taking Kronecker products. Robin void kronecker (int *A, int *B, int *dima, int *dimb, int *C) { int k = 0; int i1,i2,j1,j2; for (i2 = 0; i2 < dima[1]; i2++) { for (j2 = 0; j2 < dimb[1]; j2++) { for (i1 = 0; i1 < dima[0]; i1++) { for (j1 = 0; j1 < dimb[0]; j1++) { C[k] = A[i1 + dima[0]*i2]*B[j1 + dimb[0]*j2]; k++; } void iterate (int *x, int *lev) { bool ok = FALSE; int i=0; do { if(x[i] < lev[i]-1) { x[i]++; ok = TRUE; } else { x[i] = 0; } i++; } while (!ok); } void oneMargin(int *Mar, int *Eff, int *neff, int *lev, int *nvar, int *M, int *C) { int *M2, *mult, *Cj, *Cj2; int i,j=1,k,l,m; /* determine size of M to output */ for (i = 0; i < nvar[0]; i++) { j *= (Mar[i] == 1 ? lev[i]*lev[i] : lev[i]); } M2 = malloc(sizeof(int)*j); mult = malloc(sizeof(int)*j); M[0] = 1; int M_size[2] = {1,1}, mult_size[2], Cj_size[2], C_size[2] = {0,0}; for (i = nvar[0]-1; i >= 0; i--) { if (Mar[i] == 1) { for (j = 0; j < lev[i]*lev[i]; j++) { mult[j] = ((j % (lev[i]+1)) == 0) ? 1 : 0; } mult_size[0] = mult_size[1] = lev[i]; } else { for (j = 0; j < lev[i]; j++) { mult[j] = 1; } mult_size[0] = 1; mult_size[1] = lev[i]; } kronecker(M, mult, M_size, mult_size, M2); M_size[0] *= mult_size[0]; M_size[1] *= mult_size[1]; /* copy M2 over to M */ for (j=0; j < M_size[0]*M_size[1]; j++) { M[j] = M2[j]; } } free(M2); /* Create C matrix */ int index[nvar[0]]; int prod; int eff_size = 0; /*swapped = 0; */ mult_size[0] = 1; C_size[0] = 1; for (j = 0; j < nvar[0]; j++) { C_size[0] *= (Mar[j] == 1 ? lev[j] : 1); } Cj = malloc(sizeof(int)*C_size[0]); Cj2 = malloc(sizeof(int)*C_size[0]); int *lev2; lev2 = malloc(sizeof(int)*nvar[0]); /* loop over effects */ for (i = 0; i < neff[0]; i++) { prod = 1; for (j = 0; j < nvar[0]; j++) { if (Eff[j + i*nvar[0]]) { prod *= lev[j]-1; lev2[eff_size] = lev[j]-1; index[eff_size] = 0; eff_size++; } } /* loop over states of effect (excluding corner points) */ for (j = 0; j < prod; j++) { Cj[0] = 1; Cj_size[0] = Cj_size[1] = 1; k = eff_size; /* loop over variables */ for (l = nvar[0]-1; l >= 0; l--) { /* skip variables not in margin */ if (Mar[l] == 0) continue; mult_size[1] = lev[l]; /* kronecker factor depends on whether or not variable is in effect */ if (Eff[i*nvar[0]+l] == 0) { /* for variables not in effect, just repeat matrix */ for (m = 0; m < lev[l]; m++) { mult[m] = 1; } } else { /* otherwise multiply based on state */ k--; for (m = 0; m < lev[l]; m++) { mult[m] = (m == index[k]) ? lev[l]