Also, use the non-formula interface to the function:

    randomForest(x, y)    # saves some space

rather than the formula interface:

    randomForest(y ~ ., data = something)    # avoid

The formula interface saves a terms object that, although very sparse, takes up a lot of space.

Max
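A minimal sketch of the comparison on a toy data frame (all object names here are hypothetical; the size gap grows with the number of rows and columns):

    library(randomForest)

    set.seed(1)
    dat <- data.frame(y  = rnorm(1000),
                      x1 = rnorm(1000),
                      x2 = rnorm(1000),
                      x3 = rnorm(1000))

    ## Formula interface: the fit carries a terms object around.
    fit_formula <- randomForest(y ~ ., data = dat, ntree = 50)

    ## Non-formula interface: just the forest itself.
    fit_matrix <- randomForest(x = dat[, c("x1", "x2", "x3")],
                               y = dat$y, ntree = 50)

    format(object.size(fit_formula), units = "Kb")
    format(object.size(fit_matrix),  units = "Kb")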
On Wed, Feb 27, 2008 at 12:31 PM, Nagu <[EMAIL PROTECTED]> wrote:
> Thank you Andy.
>
> It is throwing a memory allocation error for me for numerous
> combinations of ntree and nodesize values. I tried memory.limit()
> and memory.size() to use the maximum memory, but the error was
> consistent. One thing I noticed, though, was that I previously had a
> tough time even just loading the dataset. I then used the Rcmdr
> library to load the same data; it was faster than loading with the R
> console, and it didn't throw the memory errors that the console threw
> now and then. I thought this might be a fluke with Rcmdr, so I opened
> it a few more times, and every time Rcmdr loaded the large dataset
> consistently, without any allocation errors. I also tried opening a
> few other programs on the desktop and repeated the process; it loaded
> just fine.
>
> Any ideas on how Rcmdr is loading the file as opposed to the R
> console (I am using read.table())?
>
> Anyway, I thought I'd share this observation with the others. Thank
> you Andy for your ideas. I'll keep tinkering with the parameters.
>
> Thank you,
> Nagu
>
> On Wed, Feb 27, 2008 at 5:24 AM, Liaw, Andy <[EMAIL PROTECTED]> wrote:
> > There are a couple of things you may want to try, if you can load
> > the data into R and still have enough memory to spare:
> >
> > - Run randomForest() with fewer trees, say 10 to start with.
> >
> > - Run randomForest() with nodesize set to something larger than the
> >   default (5 for classification). This puts a limit on the size of
> >   the trees being grown. Try something like 21 and see if that runs,
> >   and adjust accordingly.
> >
> > HTH,
> > Andy
> >
> > > From: Nagu
> > >
> > > Hi,
> > >
> > > I am trying to run randomForest on a dataset of size 500000 x 650,
> > > and R pops up a memory allocation error. Are there any better ways
> > > to deal with large datasets in R? For example, S-PLUS had
> > > something like the bigData library.
> > >
> > > Thank you,
> > > Nagu
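Putting the thread's suggestions together: a minimal sketch, assuming a classification problem with the class label in the first column. The file name, column types, and dimensions below are hypothetical, and supplying colClasses and nrows to read.table() is just one common way to cut its memory overhead (it may or may not be what Rcmdr does differently):

    library(randomForest)

    ## Declaring column classes up front spares read.table() the
    ## re-allocations it performs while guessing types.
    dat <- read.table("bigdata.txt", header = TRUE,
                      colClasses = c("factor", rep("numeric", 649)),
                      nrows = 500000, comment.char = "")

    ## Andy's advice: start with few trees and a larger nodesize
    ## (the default is 5 for classification), then adjust.
    fit <- randomForest(x = dat[, -1], y = dat[, 1],
                        ntree = 10,
                        nodesize = 21)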