Hello All,

I have a very large data set (1.1 GB) that I am trying to read into R. The
file is tab-delimited and contains headers; there are over 800 columns and
almost 700,000 rows. I am using the R build for Ubuntu 7.10 (Gutsy Gibbon),
on kernel Linux 2.6.22-14-generic, with 3.1 GB of RAM and an AMD Athlon(tm)
64 Processor 3200+. I installed R following the instructions on CRAN for
Linux (Ubuntu).

I need to be able to read the whole data set into R, but when I try, it
only uses 4.2 GB of swap space (50% of the 8.5 GB currently available) and
won't go any further. I am new to Linux but anxious to learn. Is there a
memory constraint with this build of R? Or is this something that can be
fixed with hardware (such as more RAM)? I thought a 64-bit version of R
would be able to handle data of this magnitude. Is there a different
version of Linux that is better suited to reading in large data sets such
as this one?
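
For reference, this is roughly what I have been running (the file name and
the single column class below are placeholders for my actual data, which I
cannot post):

## sketch of my current attempt -- "bigfile.txt" and the "numeric"
## column class stand in for the real file and its 800+ columns
dat <- read.table("bigfile.txt",
                  header = TRUE,
                  sep = "\t",
                  quote = "",             # no quoted fields in this file
                  comment.char = "",      # don't scan for comment characters
                  nrows = 700000,         # known upper bound on the row count
                  colClasses = "numeric") # declare column types up front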

I know that databases can be used for large data, but I need to run
discriminant analysis or randomForest on all of the variables.
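
For what it's worth, the analysis I have in mind looks roughly like this
('class' is a placeholder for my actual grouping variable; the remaining
columns of 'dat' would be the predictors):

library(MASS)
library(randomForest)

## placeholders: 'dat' is the full data frame, 'class' its grouping column
ld <- lda(class ~ ., data = dat)
rf <- randomForest(class ~ ., data = dat, ntree = 500, importance = TRUE)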

Any of your suggestions would be very much appreciated.

Sincerely,

Randy Griffiths
