Dear Developers:

I have a question about memory management in R 2.2.1 and am wondering if you 
would be kind enough to help me understand what is going on.
(It has been a few years since I have done software development on Windows, so 
I apologize in advance if these are easy questions.)

-------------
MY SYSTEM
-------------

I am currently using R (version 2.2.1) on a PC running Windows 2000 (Intel 
Pentium M) that has 785,328 KB (a little over 766 MB) of physical RAM.
The R executable resides on the C drive, which is NTFS-formatted, has 15.08 GB 
of free space, and has recently been defragmented.

The defragmentation report for that drive gives:
------------------------------------------------
Volume (C:):
        Volume size     =       35,083 MB
        Cluster size    =       512 bytes
        Used space      =       19,642 MB
        Free space      =       15,440 MB
        Percent free space      =       44 %

Volume fragmentation
        Total fragmentation     =       1 %
        File fragmentation      =       2 %
        Free space fragmentation        =       0 %

File fragmentation
        Total files     =       121,661
        Average file size       =       193 KB
        Total fragmented files  =       64
        Total excess fragments  =       146
        Average fragments per file      =       1.00

Pagefile fragmentation
        Pagefile size   =       768 MB
        Total fragments =       1

Directory fragmentation
        Total directories       =       7,479
        Fragmented directories  =       2
        Excess directory fragments      =       3

Master File Table (MFT) fragmentation
        Total MFT size  =       126 MB
        MFT record count        =       129,640
        Percent MFT in use      =       99 %
        Total MFT fragments     =       4
------------------------------------------------------
-----------
PROBLEM
-----------

I am trying to run an R script which makes use of the MCLUST package.
The script reads in the approximately 17000 data points successfully, but then 
throws an error:
--------------------------------------------------------
Error:  cannot allocate vector of size 1115070Kb
In addition:  Warning messages:
1:  Reached total allocation of # Mb:  see help(memory.size)
2:  Reached total allocation of # Mb:  see help(memory.size)
Execution halted
--------------------------------------------------------
after attempting the line:
summary(EMclust(y),y)
which is computationally intensive (it performs a "deconvolution" of the data 
into a series of Gaussian peaks),

and where # is either 766Mb or 2048Mb (depending on the maximum memory size I set).
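For scale, the failed request alone is over 1 GB in a single contiguous block. 
A rough back-of-the-envelope check (the n x n matrix is only my assumption 
about what EMclust might build internally; I have not checked its source):

```r
## Rough size arithmetic (assumption: EMclust allocates a dense n x n
## double matrix for n = 17000 observations -- not verified against its source).
n <- 17000
n * n * 8 / 2^20        # ~2205 MB for one such matrix (8 bytes per double)
1115070 * 1024 / 2^20   # ~1089 MB: the single contiguous block R asked for
```

Either figure is close to, or beyond, the 2 GB of user address space Windows 
normally gives a process, even before fragmentation of that address space is 
considered.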

The call I make is to Rterm.exe (to try to avoid Windows overhead):
"C:\Program Files\R\R-2.2.1\bin\Rterm.exe" --no-save --no-restore --vanilla 
--silent --max-mem-size=766M < 
"C:\Program Files\R\R-2.2.1\dTest.R"

(I have also tried it with 2048M, but with the same lack of success.)

------------
QUESTIONS  
------------

(1) I had initially thought that Windows 2000 should be able to allocate up to 
about 2 GB of memory.  So why is there a problem allocating a little over 1 GB 
on a defragmented disk with over 15 GB free?  (Is this a pagefile size issue?)

(2) Do you think the origin of the problem is 
    (a) the R environment, or 
    (b) the function in the MCLUST package using an in-memory instead of an 
on-disk approach?

(3)
    (a) If the problem originates in the R environment, would switching to the 
Linux version of R solve the problem? 
    (b) If the problem originates in the function in the MCLUST package, whom 
do I need to contact to get more information about re-writing the source code 
to handle large datasets?
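Regarding 3(b): the maintainer's address is recorded in a package's 
DESCRIPTION file, so (assuming the package is installed under its CRAN name 
"mclust") it can be looked up from within R:

```r
## Look up the maintainer listed in the installed package's DESCRIPTION
## (assumption: the package is installed as "mclust"):
packageDescription("mclust")$Maintainer
```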


Information I have located on overcoming Windows 2000 memory allocation limits 
[http://www.rsinc.com/services/techtip.asp?ttid=3346; 
http://www.petri.co.il/pagefile_optimization.htm] does not seem to help me 
understand this any better.

I had initially upgraded to R version 2.2.1 because I had read 
[https://svn.r-project.org/R/trunk/src/gnuwin32/CHANGES/]:
------------------------------------------------------------------------------------
R 2.2.1
=======
Using the latest binutils allows us to distribute RGui.exe and Rterm.exe
as large-address-aware (see the rw-FAQ Q2.9).

The maximum C stack size for RGui.exe and Rterm.exe has been increased
to 10Mb (from 2Mb); this is comparable with the default on Linux systems
and may allow some larger programs to run without crashes.  ... 
------------------------------------------------------------------------------------
and also from the Windows FAQ 
[http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021]:
------------------------------------------------------------------------------------
2.9 There seems to be a limit on the memory it uses!
Indeed there is. It is set by the command-line flag --max-mem-size (see How do 
I install R for Windows?) and defaults to the smaller of the amount of physical 
RAM in the machine and 1Gb. It can be set to any amount over 16M. (R will not 
run in less.) Be aware though that Windows has (in most versions) a maximum 
amount of user virtual memory of 2Gb. 
Use ?Memory and ?memory.size for information about memory usage. The limit can 
be raised by calling memory.limit within a running R session. 
R can be compiled to use a different memory manager which might be better at 
using large amounts of memory, but is substantially slower (making R several 
times slower on some tasks). 
In this version of R, the executables support up to 3Gb per process under 
suitably enabled versions of Windows (see 
<http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx>). 
------------------------------------------------------------------------------------

Thank you in advance for any help you might be able to provide, 

Karen
---
Karen M. Green, Ph.D.
[EMAIL PROTECTED]
Research Investigator
Drug Design Group
Sanofi Aventis Pharmaceuticals
1580 E. Hanley Blvd.
Tucson, AZ  85737-9525



______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel