Re: [Rd] .Call and to reclaim the memory by allocVector
Hi Seth,

Thank you for the suggestion. Because I use .Call (which does not copy the value) for both parts of my program, tracemem() shows no extra copy. Anyway, the information shown by gc() is very misleading, as stated by Prof. Ripley, especially after creating and removing a couple of large R datasets and calling gc() a couple of times. As shown by "ps aux", there is no "memory leak" from .Call. It's a big relief to me.

Mysteriously, my program now works when storing the intermediate results as a 660M R object. I can run the same function as often as I want. The maximum space taken by the program has never exceeded 1.8G, as I expected. The excessive memory use from .Call may have disappeared because of a recompile of my C code, a restart of Linux, or a fresh mind after the weekend.

Thank you and Prof. Ripley for the suggestions. They help me to stay focused.

Yongchao

On Sat, 25 Aug 2007, Seth Falcon wrote:

> Hi Yongchao,
>
> Yongchao Ge <[EMAIL PROTECTED]> writes:
>> Why am I storing a large dataset in R? My program consists of two
>> parts. The first part is to get the intermediate results, the computation
>> of which takes a lot of time. The second part contains many
>> different functions to manipulate the intermediate
>> results.
>>
>> My current solution is to save the intermediate results in a temporary file,
>> but my final goal is to save them as an R object. The "memory leak" in
>> .Call stops me from doing this and I'd like to know if I can have a clean
>> solution for the R package I am writing.
>
> There are many examples of packages that use .Call to create large
> objects. I don't think there is a "memory leak".
>
> One thing that may be catching you up is that because of R's
> pass-by-value semantics, you may be ending up with multiple copies of
> the object on the R side during some of your operations. I would
> recommend recompiling with --enable-memory-profiling and using
> tracemem() to see if you can identify places where copies of your
> large object are occurring. You can also take a look at
> Rprof(memory.profile=TRUE).
>
> + seth
>
> --
> Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
> BioC: http://bioconductor.org/
> Blog: http://userprimary.net/user/

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] NA and NaN in function identical
The help page for function identical says:

  'identical' sees 'NaN' as different from 'as.double(NA)', but all
  'NaN's are equal (and all 'NA' of the same type are equal).

However, we have

  x <- NaN
  y <- as.double(NA)
  x # [1] NaN
  y # [1] NA
  identical(x,y) # [1] TRUE

In my opinion, NaN and as.double(NA) should be distinguished as the help page suggests.

Tested under R version 2.5.1 Patched (2007-08-19 r42638) on Linux (CPU Xeon).

Petr Savicky.
Re: [Rd] NA and NaN in function identical
On Wed, 29 Aug 2007, Petr Savicky wrote:

> The help page for function identical says:
>   'identical' sees 'NaN' as different from 'as.double(NA)', but all
>   'NaN's are equal (and all 'NA' of the same type are equal).
> However, we have
>   x <- NaN
>   y <- as.double(NA)
>   x # [1] NaN
>   y # [1] NA
>   identical(x,y) # [1] TRUE
>
> In my opinion, NaN and as.double(NA) should be distinguished as the
> help page suggests.

And sometimes they are:

> identical(y,x)
[1] FALSE

so it is a bug. A quicker version:

identical(NaN, NA_real_) == identical(NA_real_, NaN)

was false. Fixed now, thanks for spotting it.

> Tested under R version 2.5.1 Patched (2007-08-19 r42638) on Linux (CPU Xeon).
>
> Petr Savicky.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
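For readers following the thread at the C level: R stores NA_real_ as an IEEE NaN whose low 32-bit word holds the value 1954, so a plain isnan() test cannot separate NA from NaN and the payload has to be inspected. The sketch below is only an illustration of a symmetric comparison in that spirit; it is not the code of identical() or of the fix mentioned above, and the names make_na, is_na and same_double are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Build a NaN whose low 32 bits are 1954, the payload R uses for NA_real_.
   (Hypothetical helper; R has its own internal routine for this.) */
double make_na(void)
{
    uint64_t bits = ((uint64_t)0x7FF00000u << 32) | 1954u;
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* True only for a NaN that carries the NA payload in its low word. */
int is_na(double x)
{
    uint64_t bits;
    if (!isnan(x))
        return 0;
    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)(bits & 0xFFFFFFFFu) == 1954u;
}

/* Symmetric comparison: all NAs are equal, all other NaNs are equal,
   and an NA never equals a non-NA NaN, whichever argument comes first. */
int same_double(double x, double y)
{
    if (is_na(x) || is_na(y))
        return is_na(x) && is_na(y);
    if (isnan(x) || isnan(y))
        return isnan(x) && isnan(y);
    return x == y;
}
```

Because each branch tests both arguments before deciding, same_double(NAN, make_na()) and same_double(make_na(), NAN) necessarily give the same answer, which is the property the two identical() calls above were expected to have.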
Re: [Rd] R CMD check: Error in function (env) : could not find function "finalize"
Hi,

thanks Seth and others (I've got some offline replies); all feedback has been useful indeed.

The short story is that as the author of R.oo I am actually the bad guy here (but not for long since I'm soon committing a fix for R.oo).

REPRODUCIBLE EXAMPLE #1:
% R --vanilla
> library(R.oo)
R.oo v1.2.8 (2006-06-09) successfully loaded. See ?R.oo for help.
> detach("package:R.oo")
> gc()
Error in function (env) : could not find function "finalize"
Error in function (env) : could not find function "finalize"
Error in function (env) : could not find function "finalize"
Error in function (env) : could not find function "finalize"
Error in function (env) : could not find function "finalize"
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 142543  3.9         35  9.4       35  9.4
Vcells  82660  0.7     786432  6.0   478255  3.7

REPRODUCIBLE EXAMPLE #2:
Here is another example without R.oo illustrating what is going on.
% R --vanilla
> e <- new.env()
> e$foo <- "foo"
> e$foo <- 1
> e <- new.env()
> e$foo <- 1
> reg.finalizer(e, function(env) { print(ls.str(envir=env)) })
> detach("package:utils")
> rm(e)
> gc()
Error in print(ls.str(envir = env)) : could not find function "ls.str"
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 158213  4.3         35  9.4       35  9.4
Vcells  86800  0.7     786432  6.0   478255  3.7

WHY ONLY WHEN RUNNING R CMD CHECK?
So, the problem I had was with 'affxparser' examples failing in R CMD check, but not when I tested them manually. The same thing was happening with the 'pcaMethods' package. The common denominator was that both 'affxparser' and 'pcaMethods' had an R.oo-dependent package in DESCRIPTION/Suggests; 'affxparser' used Suggests: R.utils (which depends on R.oo), and 'pcaMethods' used Suggests: aroma.light (which in turn *suggests* R.utils). To the best of my understanding, when R CMD check runs examples, it will load *all* suggested packages, and when done, detach them. When the garbage collector later runs and cleans out objects, the generic function finalize() in R.oo called by the registered finalize hook is not around anymore. FYI, if you move the R.oo-dependent package from Suggests: to Depends:, there is no longer a problem because then the package is never detached. It all makes sense.

CONCLUSION:
When registering finalizers for objects using reg.finalizer(), there is always the risk that the finalizer code breaks because a package it depends on has been detached.

SOLUTION:
At least make the finalizer hook function robust against these things. For instance, check if required packages are loaded etc., or just add a tryCatch() statement. However, since finalizers are typically used to deallocate resources, much more effort has to be taken to make sure that they still work, which is not easy. For instance, one could make sure to require() the necessary packages in the finalizer, but that has side effects and it is not necessarily sufficient, e.g. you might only load a generic function, but the method for a specific subclass might be in a package out of your control. The same problem goes for explicit namespace calls to generic functions, e.g. R.oo::finalize(). If you have more clever suggestions, please let me know.

SOME MORE DETAILS ON R.OO:
This is what R.oo looks like now:

Object <- function (core = NA) {
  this <- core
  attr(this, ".env") <- new.env()
  class(this) <- "Object"
  attr(this, "...instanciationTime") <- Sys.time()
  reg.finalizer(attr(this, ".env"), function(env) finalize(this))
  this
}

finalize.Object <- function(this, ...) {}

finalize <- function(...) UseMethod("finalize")

As you see, when detaching R.oo, finalize() is no longer around.

Lesson learned!

Cheers

Henrik

On 8/28/07, Seth Falcon <[EMAIL PROTECTED]> wrote:
> Hi Henrik,
>
> "Henrik Bengtsson" <[EMAIL PROTECTED]> writes:
>
> > Hi,
> >
> > does someone else get this error message:
> >
> > Error in function (env) : could not find function "finalize"?
> >
> > I get an error when running examples in R CMD check (v2.6.0; session
> > info below):
> [snip]
> > The error occurs in R CMD check but also when I start a fresh R session
> > and run, in this case, affxparser.Rcheck/affxparser-Ex.R. It always
> > occurs on the same line.
>
> So does options(error=recover) help in determining where the error is
> coming from?
>
> If you can narrow it down, gctorture may help or running the examples
> under valgrind.
>
> + seth
>
> --
> Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
> BioC: http://bioconductor.org/
> Blog: http://userprimary.net/user/
[Rd] Modifying R_CheckStack for a speed increase
Greetings R developers,

R will run a little faster when executing "pure R" code if the function R_CheckStack() is modified.

With the modification, the following code for example runs 15% faster (compared to a virgin R-2.5.1 on my Windows XP machine):

N = 1e7
foo <- function(x)
{
    for (i in 1:N)
        x <- x + 1
    x
}
foo(0)

The crux of the modification is to change the following line in R_CheckStack()

    if(R_CStackLimit != -1 && usage > 0.95 * R_CStackLimit) {...

to

    if(usage > R_CStackLen) { ...

Details and modified sources can be found at ftp://ftp.sonic.net/pub/users/milbo.

Regards,
Stephen

http://milbo.users.sonic.net
Re: [Rd] Modifying R_CheckStack for a speed increase
Alternatively, if you actually wanted to keep the 0.95 you could use

    usage > R_CStackLimit - (R_CStackLimit >> 4)

and probably get close enough to 0.95 (a shift of 4 gives 93.75% of the limit) that it makes no difference, or go with a shift of 5 and get something more like 97%. At any rate, you'd avoid floating point.

On 8/29/07, Stephen Milborrow <[EMAIL PROTECTED]> wrote:
> Greetings R developers,
>
> R will run a little faster when executing "pure R" code if the function
> R_CheckStack() is modified.
>
> With the modification, the following code for example runs 15% faster
> (compared to a virgin R-2.5.1 on my Windows XP machine):
>
> N = 1e7
> foo <- function(x)
> {
>     for (i in 1:N)
>         x <- x + 1
>     x
> }
> foo(0)
>
> The crux of the modification is to change the following line in
> R_CheckStack()
>
>     if(R_CStackLimit != -1 && usage > 0.95 * R_CStackLimit) {...
>
> to
>
>     if(usage > R_CStackLen) { ...
>
> Details and modified sources can be found at
> ftp://ftp.sonic.net/pub/users/milbo.
>
> Regards,
> Stephen
>
> http://milbo.users.sonic.net

--
Byron Ellis ([EMAIL PROTECTED])
"Oook" -- The Librarian
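The shift-based threshold suggested above can be checked with a few lines of standalone C. This is only a sketch of the arithmetic, not R's R_CheckStack(): the function names stack_threshold and usage_exceeds are hypothetical, the 8 MiB limit in the note below is an arbitrary illustration, and the "-1 means no limit" guard from the original line is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* limit minus limit / 2^shift, using integer operations only.
   shift = 4 keeps 15/16 (93.75%) of the limit,
   shift = 5 keeps 31/32 (96.875%). */
uintptr_t stack_threshold(uintptr_t limit, unsigned shift)
{
    assert(shift < sizeof(uintptr_t) * 8);
    return limit - (limit >> shift);
}

/* Integer-only replacement for "usage > 0.95 * limit".
   Assumption: limit is a real limit, not a "no limit" sentinel. */
int usage_exceeds(uintptr_t usage, uintptr_t limit, unsigned shift)
{
    return usage > stack_threshold(limit, shift);
}
```

With a limit of 8388608 bytes (8 MiB), a shift of 4 puts the threshold at 7864320 bytes and a shift of 5 at 8126464 bytes, against 7969177.6 for the floating-point 0.95 factor, so the two shifts bracket the original cutoff.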
Re: [Rd] R CMD check: Error in function (env) : could not find function "finalize"
> To the best of my understanding, when R
> CMD check runs examples, it will load *all* suggested packages, and
> when done, detach them. When the garbage collector later runs and

Not so. R CMD check just runs the examples in a normal session after loading the package being tested. Examples may themselves attach/load suggested packages (and if they are suggested it is likely that either examples or vignettes will do so). After each group of examples (\examples from a single help file) any packages which have been attached in the course of that group are detached.

Looking at pcaMethods, function robustSvd require()s aroma.light, so this will happen in example(robustSvd).

I think a package that sets finalizers probably ought to ensure that they are run in its .Last.lib or similar hook. There is no guarantee that packages will be detached in a particular order, though. (R CMD check does detach them in an ordering determined from the search path, but users may do something different.) If namespaces are involved, similar considerations apply to unloading namespaces (although there are some guarantees on order since you cannot unload a namespace which is imported from).

Beyond that, you may be able to ensure by setting the environment for the finalizer function that what it needs will still be present at finalization.

On Wed, 29 Aug 2007, Henrik Bengtsson wrote:

> [snip]
[Rd] R CMD check recursive copy of tests/
From the NEWS of R v2.6.0 devel:

  o R CMD check now does a recursive copy on the 'tests' directory.

However, R CMD check does not run *.R scripts in such subdirectories (as I thought/hoped for), only those directly under tests/. This may or may not be intentional. If it is, maybe the above should be clarified as:

  o R CMD check now does a recursive copy on the 'tests' directory
    for the purpose of providing data files for input. Test scripts
    still have to be directly under tests/ to be run.

Just a comment

Henrik