Hi Max,

Thank you for the fast response. Here are the versions of the R packages I am using:

    caret    3.13
    caretNWS 0.16
    nws      1.62

And the Python components:

    ActivePython 2.5.1.1
    nws server   1.5.2 for py2.5
    Twisted      2.5.9 for py2.5

The computer I am using has one dual-core Xeon CPU at 1.86 GHz with 4 GB of RAM. R is currently set up to use 2 GB of it (it starts with "C:\Program Files\R\R-2.6.2\bin\Rgui.exe" --max-mem-size=2047M). The OS is Windows Server 2003 R2 with SP2. I am running a single R job/process (Rgui.exe) and almost nothing else on the computer while R is running (no databases, web servers, office apps, etc.).

I really appreciate your help.

Cheers,
Peter
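Max's point below about P jobs replicating the memory needs P times can be sanity-checked with a quick back-of-the-envelope calculation. Here is a minimal sketch in base R using the matrix dimensions quoted in this thread; the numbers are illustrative only, since real usage will be higher once model objects and intermediate copies made during fitting are counted:

## Rough footprint of one in-memory copy of the training matrix
## (dimensions from the thread: ~22000 rows, 347 numeric columns,
## replicated across 5 sleigh workers).
n_obs   <- 22000
n_pred  <- 347
workers <- 5L

bytes_per_copy <- n_obs * n_pred * 8   # 8 bytes per double
mb_per_copy    <- bytes_per_copy / 2^20
cat(sprintf("~%.0f MB per copy, ~%.0f MB across %d workers\n",
            mb_per_copy, mb_per_copy * workers, workers))
# ~58 MB per copy, ~291 MB across 5 workers

object.size() on the actual training matrix gives the first number directly, without any arithmetic.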
>-----Original Message-----
>From: Max Kuhn [mailto:[EMAIL PROTECTED]
>Sent: Monday, March 10, 2008 12:41 PM
>To: Tait, Peter
>Cc: r-help@R-project.org
>Subject: Re: [R] caretNWS and training data set sizes
>
>What version of caret and caretNWS are you using? Also, what version
>of the nws server and twisted are you using? What kind of machine (#
>processors, how much physical memory, etc.)?
>
>I haven't seen any real limitations, with one exception: if you are
>running P jobs on the same machine, you are replicating the memory
>needs P times.
>
>I've been running jobs with 4K to 90K samples and 1200 predictors
>without issues, so I'll need a lot more information to help you.
>
>Max
>
>
>On Mon, Mar 10, 2008 at 12:04 PM, Tait, Peter <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> I am using the caretNWS package to train some supervised regression
>> models (gbm, lasso, random forest, and MARS). The problem I have
>> encountered started when my training data set grew in both the number
>> of predictors and the number of observations.
>>
>> The training data set has 347 numeric columns. The problem I have is
>> that when there are more than 2500 observations, the 5 sleigh objects
>> start but do not use any CPU resources and do not process any data:
>>
>>                           CPU (%)   Memory (KB)
>> N=100
>>   Rgui.exe                    0       91737
>>   5x sleighs (RTerm.exe)    15-25    ~27000
>>
>> N=2500
>>   Rgui.exe                    0      160000
>>   5x sleighs (RTerm.exe)    15-25    ~74000
>>
>> N=5000
>>   Rgui.exe                   50      193000
>>   5x sleighs (RTerm.exe)      0      ~19000
>>
>> A 10% sample of my overall data is ~22000 observations.
>>
>> Can someone give me an idea of the limitations of the nws and
>> caretNWS packages in terms of the number of columns and rows of the
>> training matrices, and whether there are other tuning/training
>> functions that work faster on large datasets?
>>
>> Thanks for your help.
>> Peter
>>
>> > version
>>                _
>> platform       i386-pc-mingw32
>> arch           i386
>> os             mingw32
>> system         i386, mingw32
>> status
>> major          2
>> minor          6.2
>> year           2008
>> month          02
>> day            08
>> svn rev        44383
>> language       R
>> version.string R version 2.6.2 (2008-02-08)
>>
>> > memory.limit()
>> [1] 2047
>
>--
>Max
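On the question above about tuning/training functions that work faster on large datasets: one common approach is to tune on a random subsample with a light resampling scheme, then refit only the chosen model on the full data. Here is a minimal sketch with caret's train(); it is written against the current caret API rather than the 2008-era caret 3.13 in this thread, and the data are synthetic stand-ins:

library(caret)   # also needs the gbm package installed

## Synthetic stand-in for the 347-predictor training data
## (kept small here so the example runs quickly).
set.seed(1)
n      <- 2000
p      <- 20
trainX <- as.data.frame(matrix(rnorm(n * p), n, p))
trainY <- rowSums(trainX[, 1:5]) + rnorm(n)

## Tune on a random subsample with plain 5-fold CV instead of the
## default bootstrap; fewer resamples means fewer model fits per
## candidate parameter setting and a smaller memory footprint.
idx  <- sample(n, 500)
ctrl <- trainControl(method = "cv", number = 5)

fit <- train(x = trainX[idx, ], y = trainY[idx],
             method = "gbm", trControl = ctrl, verbose = FALSE)
print(fit)

Once the tuning grid has been narrowed this way, a single fit on the full ~22000-row sample is far cheaper than resampling over all of it.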