Others may have mentioned this, but you might try loading your data into a small database such as MySQL and then pulling smaller portions of it into R via a package like RMySQL or RODBC.

Another approach is to split the data file into smaller pieces outside of R, then read the pieces into R one at a time, computing aggregations (counts and sums of your data fields) as you go. From these you can build an aggregated dataset that is much smaller than the raw data, which you can then graph with ggplot2 or other libraries of your choice.
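
A rough sketch of the read-in-pieces idea, in base R only. Instead of physically splitting the file, it reads a fixed number of lines at a time from an open connection, which amounts to the same thing. The function name aggregate_csv_in_chunks and the chunk_rows argument are made up for illustration, and it assumes every column is numeric and that rows may end in a trailing comma, as in your sample. The demo file holds just the first three rows you posted.

```r
# Hypothetical helper (not from any package): stream a CSV in chunks and
# accumulate per-column counts and sums, so the whole file is never in memory.
# Assumes all columns are numeric and rows may carry a trailing comma.
aggregate_csv_in_chunks <- function(path, chunk_rows = 100000L) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # Read the header once; strip a trailing comma before splitting.
  header_line <- sub(",\\s*$", "", readLines(con, n = 1L))
  header <- strsplit(header_line, ",", fixed = TRUE)[[1]]

  counts <- setNames(numeric(length(header)), header)
  sums   <- counts

  repeat {
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0L) break            # end of file
    lines <- sub(",\\s*$", "", lines)         # drop trailing commas
    lines <- lines[nzchar(lines)]             # drop blank lines
    if (length(lines) == 0L) next

    chunk <- read.csv(text = lines, header = FALSE,
                      col.names = header, colClasses = "numeric")

    # Update the running totals; only these small vectors persist.
    counts <- counts + colSums(!is.na(chunk))
    sums   <- sums + colSums(chunk, na.rm = TRUE)
  }

  data.frame(field = header, n = counts, sum = sums,
             mean = sums / counts, row.names = NULL)
}

# Tiny demo using the first three data rows from the original post.
demo_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Dur,TBC,Fmax,Fmin,Fmean,Fc,S1,Sc,",
  "9.81,0,28.78,24.54,26.49,25.81,48.84,14.78,",
  "4.79,1838.47,37.21,29.41,31.76,29.52,241.77,62.83,",
  "4.21,5.42,28.99,26.23,27.53,27.4,76.03,11.44,"
), demo_path)

agg <- aggregate_csv_in_chunks(demo_path, chunk_rows = 2L)
print(agg)
```

The same loop could accumulate other summaries (for example, counts per bin of Sc and Fc) in place of the column sums; the point is that only the running totals stay in memory between chunks.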

-Avram



On Apr 26, 2009, at 8:20 AM, Neotropical bat risk assessments wrote:


   How do people deal with R and memory issues?
I have tried using gc() to see how much memory is used at each step. Scanned Crawley R-Book and all other R books I have available and the FAQ
   on-line but no help really found.
   Running WinXP Pro (32 bit) with 4 GB RAM.
One SATA drive pair is in RAID 0 configuration with 10000 MB allocated as
   virtual memory.
I do have another machine set up with Ubuntu but it only has 2 GB RAM and
   have not been able to get R installed on that system.
I can run smaller sample data sets w/o problems and everything plots as
   needed.
   However I need to review large data sets.
   Using latest R version 2.9.0 (2009-04-17)
My data is in CSV format with a header row and is a big data set with
   1,200,240 rows!
   E.g. below:
   Dur,TBC,Fmax,Fmin,Fmean,Fc,S1,Sc,
   9.81,0,28.78,24.54,26.49,25.81,48.84,14.78,
   4.79,1838.47,37.21,29.41,31.76,29.52,241.77,62.83,
   4.21,5.42,28.99,26.23,27.53,27.4,76.03,11.44,
   10.69,193.48,30.53,25.4,27.69,25.4,-208.19,26.05,
   15.5,248.18,30.77,24.32,26.57,24.92,-202.76,18.64,
   14.85,217.47,31.25,24.62,26.93,25.56,-88.4,10.32,
   11.86,158.01,33.61,25.24,27.66,25.32,83.32,17.62,
   14.05,229.74,30.65,24.24,26.76,25.24,61.87,14.06,
   8.71,264.02,31.01,25.72,27.56,25.72,253.18,19.2,
   3.91,10.3,25.32,24.02,24.55,24.02,-71.67,16.83,
   16.11,242.21,29.85,24.02,26.07,24.62,79.45,19.11,
   16.81,246.48,28.57,23.05,25.46,23.81,-179.82,15.95,
   16.93,255.09,28.78,23.19,25.75,24.1,-112.21,16.38,
   5.12,107.16,32,29.41,30.46,29.41,134.45,20.88,
   16.7,150.49,27.97,22.92,24.91,23.95,42.96,16.81
   .... etc
   I am getting the following warning/error message:
   Error: cannot allocate vector of size 228.9 Mb
   Complete listing from R console below:
library(batcalls)
   Loading required package: ggplot2
   Loading required package: proto
   Loading required package: grid
   Loading required package: reshape
   Loading required package: plyr
   Attaching package: 'ggplot2'
           The following object(s) are masked from package:grid :
            nullGrob
gc()
            used (Mb) gc trigger (Mb) max used (Mb)
   Ncells 186251  5.0     407500 10.9   350000  9.4
   Vcells  98245  0.8     786432  6.0   358194  2.8
BR <- read.csv ("C:/R-Stats/Bat calls/Reduced bats.csv")
gc()
             used (Mb) gc trigger  (Mb) max used  (Mb)
   Ncells  188034  5.1     667722  17.9   378266  10.2
   Vcells 9733249 74.3   20547202 156.8 20535538 156.7
attach(BR)
library(ggplot2)
library(MASS)
library(batcalls)
BRC<-kde2d(Sc,Fc)
   Error: cannot allocate vector of size 228.9 Mb
gc()
              used  (Mb) gc trigger  (Mb)  max used  (Mb)
   Ncells   198547   5.4     667722  17.9    378266  10.2
   Vcells 19339695 147.6  106768803 814.6 124960863 953.4

   Tnx for any insight,
   Bruce
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
