On 13-04-19 2:57 PM, Thomas Alexander Gerds wrote:

hmm. I have tested a bit more, and found this situation, which is perhaps
more difficult to solve. even though I delete x, since x is part of the output
of the function, the size of the object is twice as much as it should be:

test <- function(x){
   x <- rnorm(1000000)
   out <- list(x=x)
   rm(x)
   out$f <- as.formula(a~b)
   out
}
v <- test(1)
x <- rnorm(1000000)
save(v,file="~/tmp/v.rda")
save(x,file="~/tmp/x.rda")
system("ls -lah ~/tmp/*.rda")

-rw-rw-r-- 1 tag tag  15M Apr 19 20:52 /home/tag/tmp/v.rda
-rw-rw-r-- 1 tag tag 7,4M Apr 19 20:52 /home/tag/tmp/x.rda

can you solve this as well?

Yes, this is tricky. The problem is that "out" is in the environment of out$f, so you get two copies when you save it. (I think you won't have two copies in memory, because R only makes a copy when it needs to, but I haven't traced this.)
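
A quick way to see this is to list what the formula's environment holds and compare serialized sizes. This is only a minimal sketch: it uses a smaller vector than the question to keep it fast, and serialize() as a stand-in for save(), so the sizes are uncompressed byte counts rather than file sizes.

```r
# Same function as in the question, with a smaller vector.
test <- function(x) {
  x <- rnorm(1000)
  out <- list(x = x)
  rm(x)
  out$f <- as.formula(a ~ b)
  out
}
v <- test(1)

# The formula's environment is the evaluation frame of test(),
# which still holds the local variable "out".
ls(environment(v$f))   # "out"

# Serialization follows that environment, so the numbers are written twice.
size_v <- length(serialize(v, NULL))
size_x <- length(serialize(v$x, NULL))
size_v / size_x        # close to 2
```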

Here are two solutions; both have some problems.

1. Don't put out in the environment:

test <- function(x) {
  x <- rnorm(1000000)
  out <- list(x=x)
  out$f <- a ~ b    # the as.formula() was never needed
  # temporarily create a new environment
  local({
    # get a copy of what you want to keep
    out <- out
    # remove everything that you don't need from the formula's environment
    rm(list=c("x", "out"), envir=environment(out$f))
    # return the local copy
    out
  })
}

I don't like this because it is too tricky, but you could probably wrap the tricky bits into a little function (a variant on return() that cleans out the environment first), so it's probably what I would use if I was desperate to save space in saved copies.
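
One possible shape for such a wrapper is sketched below. The name clean_return() and its keep argument are hypothetical, not part of R. It assumes it is called as the last expression of the function, and that any variable the formula still needs is listed in keep.

```r
# Hypothetical helper: returns `value` after emptying the caller's frame.
clean_return <- function(value, keep = character(), envir = parent.frame()) {
  force(value)                            # evaluate before anything is removed
  drop <- setdiff(ls(envir = envir), keep)
  rm(list = drop, envir = envir)          # clear the caller's local variables
  value
}

test <- function(x) {
  x <- rnorm(1000)
  out <- list(x = x)
  out$f <- a ~ b
  clean_return(out)                       # must be the last expression
}

v <- test(1)
ls(environment(v$f))                      # character(0)
```

The force() call matters: value is a promise that refers to out in the caller's frame, so it has to be evaluated before out is removed from that frame.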

2. Never evaluate the formula in the first place, so it doesn't pick up the environment:

test <- function(x) {
  x <- rnorm(1000000)
  out <- list(x=x)
  out$f <- quote(a ~ b)
  out
}

This is a lot simpler, but it might not work with some modelling functions, which would be confused by receiving the model formula unevaluated. It also has the problems that you get with using .GlobalEnv as the environment of the formula, but maybe to a slightly lesser extent: rather than having what is possibly the wrong environment, it doesn't have one at all.
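
To make the difference concrete, here is a small sketch of what quote() returns and how the formula can be rebuilt later. The variable names are illustrative only.

```r
f_quoted <- quote(a ~ b)

class(f_quoted)         # "call": an unevaluated call to `~`, not a formula
environment(f_quoted)   # NULL: no environment has been captured

# Evaluating the call later builds the formula and attaches the
# environment in which eval() is called.
f <- eval(f_quoted)
class(f)                # "formula"
```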

Duncan Murdoch



thanks!
thomas

Duncan Murdoch <murdoch.dun...@gmail.com> writes:

On 13-04-18 11:39 AM, Thomas Alexander Gerds wrote:
Dear Duncan
thank you for taking the time to answer my questions! It will be
quite some work to delete all the objects generated inside the
function ... but if there is no other way to avoid a large
environment then this is what I will do.

It's not really that hard.  Use names <- ls() in the function to get a
list of all of them; remove the names of variables that might be
needed in the formula (and the name of the formula itself); then use
rm(list=names) to delete everything else just before returning it.
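
As a rough sketch of that recipe (the function and variable names here are made up for illustration; w stands for a local variable the formula needs):

```r
make_formula <- function() {
  big <- rnorm(1e5)      # large temporary, not needed afterwards
  w <- mean(big)         # small value the formula refers to
  f <- y ~ x + w

  names <- ls()                          # all local variables so far
  names <- setdiff(names, c("w", "f"))   # keep what the formula needs, and the formula
  rm(list = c(names, "names"))           # drop the rest, and the helper variable too
  f
}

f <- make_formula()
sort(ls(environment(f)))   # "f" "w"
```

Adding "names" to the list is optional: it was created after the ls() call, so it would otherwise stay behind as a small character vector.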

Duncan Murdoch


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
