Henrik, thanks for your reply. I might have misrepresented my actual code a
bit. It seems that you are suggesting calling rm() on objects I don't use.
In the real code, whose behavior I reported, that is exactly what is being
done, i.e. I use rm(). I also use a small wrapper around load() that lets me
assign loaded data directly to a variable with any name, without remembering
the name of the object under which it was saved, i.e. instead of the standard
load() I use something like this (with error checking in the real code):

ut.load <- function(filename) {
    vars <- load(filename)  # load() returns the names of the restored objects
    get(vars[1])            # return the first (typically only) loaded object
}

In other words, after I call data[[j]] <- ut.load(file[j]), there is no
reference to the intermediate object left to clean up; I am assuming the
garbage collector quickly takes care of it.
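
A minimal sketch of how that assumption could be checked, not the real code: load_one is a hypothetical stand-in for the wrapper that loads into a throwaway environment, and the temporary file and matrix are made up for illustration. It compares the workspace before and after the call to see whether any name is left behind.

```r
# Hypothetical stand-in for the wrapper: load into a private environment
# so that nothing is created in the caller's workspace.
load_one <- function(filename) {
  env <- new.env()
  vars <- load(filename, envir = env)   # names of the restored objects
  stopifnot(length(vars) == 1L)         # sketch assumes one object per file
  get(vars, envir = env)
}

# Illustrative data: save one matrix to a temporary file, then drop it.
tmp <- tempfile(fileext = ".RData")
obj <- matrix(rnorm(1e6), ncol = 100)   # about 8 MB of doubles
save(obj, file = tmp)
rm(obj)

before <- ls()
res <- load_one(tmp)
# Any name other than the result (and this bookkeeping) would be a leftover.
leaked <- setdiff(ls(), c(before, "before", "res"))
print(leaked)
unlink(tmp)
```

An empty result only shows that no named reference remains; when the memory itself is returned still depends on the garbage collector.
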
Just making sure that we are on the same page. I am mostly looking for some
guidance on what to expect in terms of R memory behavior. This particular
task is just an illustration of a typical issue that I have encountered often
lately. Is there a way to diagnose whether a particular task is behaving
normally in terms of memory use? Is there a memory benchmark? Is
there a white paper discussing how memory and copying of objects
actually work in R? Is there a limited chunk of C code that I could read
to try to understand it? I just don't want to read all of the C code.
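
For illustration, a small sketch of the kind of diagnosis meant here, using only base R calls (gc, object.size, tracemem). The vector sizes are arbitrary, and tracemem is guarded because it is only available when R was built with memory profiling, which may not be the case for a custom build.

```r
# Reset the "max used" counters, run a small task, then read the peak back.
invisible(gc(reset = TRUE))

x <- numeric(1e6)     # about 8 MB of doubles
y <- x                # plain assignment; the copy is deferred
y[1] <- 1             # modifying y is what triggers the duplication

peak <- gc()          # matrix with "used" and "max used" columns
print(object.size(x), units = "Mb")
print(peak)

# tracemem() reports duplications as they happen, if this build supports it.
if (capabilities("profmem")) {
  tracemem(x)
  z <- x
  z[1] <- 2           # a tracemem line is printed when x's data is copied
  untracemem(x)
}
```

Comparing the "max used" row for Vcells against the sum of object.size() over the objects that are supposed to be alive gives a rough idea of how much temporary copying a task incurs.
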

Thanks much
Andre



On Wed, Apr 11, 2012 at 9:02 PM, Henrik Bengtsson <h...@biostat.ucsf.edu> wrote:

> Leaving aside what's going on inside abind::abind(), maybe the
> following sheds some light on what is being wasted:
>
> # Preallocate (probably doesn't make a difference because it's a list)
> mat.data <- vector("list", length=length(files));
> for (j in 1:length(files)){
>     vars <- load(file.path(dump.dir, files[j]))
>     mat.data[[j]]<-data;
>      # Not needed anymore/remove everything loaded
>     rm(list=vars);
> }
>
> data <- abind(mat.data, along=2);
> # Not needed anymore
> rm(mat.data);
>
> save(data, file=file.path(dump.dir, filename))
>
> My $.02
> /Henrik
>
> On Wed, Apr 11, 2012 at 3:53 PM, andre zege <andre.z...@gmail.com> wrote:
> > I recently started using R 2.14.0 on a new machine and I am experiencing
> > what seems like unusually greedy memory use. It happens all the time, but
> > to give a specific example, let's say I run the following code
> >
> > --------
> >
> > for(j in 1:length(files)){
> >      load(file.path(dump.dir, files[j]))
> >      mat.data[[j]]<-data
> > }
> > save(abind(mat.data, along=2), file.path(dump.dir, filename))
> >
> > ---------
> >
> > It loads parts of a multidimensional matrix into a list, then binds them
> > along the second dimension and saves the result to disk. The code works,
> > although slowly, but what's strange is the amount of memory it uses.
> > In particular, each chunk of data is between 50M and 100M, and altogether
> > the bound matrix is 1.3G. One would expect that R would use roughly double
> > that memory - to keep mat.data and its bound version separately, or 1G. I
> > could imagine that somehow it could use 3 times the size of the matrix. But
> > in fact it uses more than 5.5 times (almost all of my physical memory) and
> > I think it is swapping a lot to disk. For this particular task, my top
> > output shows R eating more than 7G of memory and using up 11G of virtual
> > memory as well
> > $top
> >
> > PID   USER  PR  NI  VIRT  RES   SHR   S  %CPU  %MEM  TIME+     COMMAND
> > 8823  user  25  0   11g   7.2g  10m   R  99.7  92.9  5:55.05   R
> > 8590  root  15  0   154m  16m   5948  S  0.5   0.2   23:22.40  Xorg
> >
> >
> > I have a strong suspicion that something is off with my R binary; I don't
> > think I have experienced anything like this in a long time. Is this in
> > line with what I am supposed to experience? Are there any ideas for
> > diagnosing what is going on?
> > I would appreciate any suggestions
> >
> > Thanks
> > Andre
> >
> >
> > ==================================
> >
> > Here is what i am running on:
> >
> >
> > CentOS release 5.5 (Final)
> >
> >
> >> sessionInfo()
> > R version 2.14.0 (2011-10-31)
> > Platform: x86_64-unknown-linux-gnu (64-bit)
> >
> > locale:
> > [1] en_US.UTF-8
> >
> > attached base packages:
> > [1] stats     graphics  grDevices datasets  utils     methods   base
> >
> > other attached packages:
> > [1] abind_1.4-0       rJava_0.9-3       R.utils_1.12.1    R.oo_1.9.3
> > R.methodsS3_1.2.2
> >
> > loaded via a namespace (and not attached):
> > [1] codetools_0.2-8 tcltk_2.14.0    tools_2.14.0
> >
> >
> >
> > I configured the R build as follows
> > /configure --prefix=/usr/local/R --enable-byte-compiled-packages=no
> > --with-tcltk --enable-R-shlib=yes
> >
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
