Hello All,

I have a very large data set (1.1GB) that I am trying to read into R. The file is tab-delimited and contains headers; there are over 800 columns and almost 700,000 rows. I am running R under Ubuntu 7.10 (Gutsy Gibbon), kernel Linux 2.6.22-14-generic, on an AMD Athlon(tm) 64 3200+ with 3.1GB of RAM. I installed R following the CRAN instructions for Linux (Ubuntu).
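For reference, the read call I am using is roughly the following (the file name below is just a placeholder for the real tab-delimited file), along with a back-of-envelope estimate of the object size assuming every column ends up stored as an 8-byte double:

    ## Roughly what I'm running -- "bigfile.txt" stands in for the real file,
    ## which has a header row, ~800 tab-delimited columns and ~700,000 rows
    dat <- read.table("bigfile.txt", header = TRUE, sep = "\t")

    ## Rough size of one all-numeric copy of the data in memory:
    ## rows * columns * 8 bytes per double, expressed in GiB
    700000 * 800 * 8 / 2^30   # ~4.17 GiB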
I need to be able to read the whole data set into R, but when I try right now, it only fills about 4.2GB of swap space (50% of the 8.5GB currently available) and then goes no further. I am new to Linux, but eager to learn. Is there a memory constraint with this build of R, or is this something that can be fixed with hardware (more RAM, for example)? I thought that a 64-bit version of R would be able to handle data of this magnitude. Is there a different version of Linux that is better suited to reading in large data sets such as this one? I know that databases can be used for large data, but I need to run discriminant analysis or randomForest on all of the variables.

Any suggestions would be very much appreciated.

Sincerely,
Randy Griffiths