>>>>> William Dunlap <wdun...@tibco.com>
>>>>>     on Mon, 16 Nov 2015 16:01:42 -0800 writes:

> If a quick running time is important and your models involve only
> numeric data with no missing values and you are willing to spend more
> programming time setting things up, the lsfit() function may work
> better for you.

> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com

or even faster is the extra-simple but fast .lm.fit() function (in R >= 3.1.0).

I've written a small demo about it and published it here,
http://rpubs.com/maechler/fast_lm

Martin Maechler, ETH Zurich (and R Core)

> On Mon, Nov 16, 2015 at 3:25 PM, Sasikumar Kandhasamy <ckms...@gmail.com> wrote:
>> Thanks a lot Bill & Bert.
>>
>> Hi Bill,
>>
>> Sorry, I was wrong about the number of records; actually, I am using
>> two-dimensional data of 250K records each. And regarding CPU usage, it
>> was the elapsed time. In fact, I have pinned one core to run R.
>>
>> Thanks & Regards
>> Sasi
>>
>> On Mon, Nov 16, 2015 at 2:04 PM, William Dunlap <wdun...@tibco.com> wrote:
>>>
>>> You cannot do a linear regression with one column of data - there must
>>> be at least one response column and one predictor. By default, lm
>>> throws in a constant term which gives you a second predictor. If your
>>> predictor is categorical, you get a new column for all but the first
>>> unique value in it.
>>>
>>> lm() deals only with double precision data, at 8 bytes/number. Thus
>>> 250k numbers occupy 2 million bytes. Your three columns (in the
>>> non-categorical-predictor case) take up 6 million bytes.
>>>
>>> lm()'s output contains several columns the size of the response
>>> variable: residuals, effects, and fitted.values. It also contains the
>>> QR decomposition of the design matrix (the size of all the predictor
>>> columns together).
>>>
>>> There are also some temporary variables generated in the course of the
>>> computation.
>>>
>>> So your observed 40 MB memory usage seems reasonable.
>>>
>>> Use the object.size() function to see how big objects are and str() to
>>> look at their structure.
>>>
>>> My laptop with a 2.5 GHz Intel i7 processor takes a quarter second to
>>> fit a simple linear model with one numeric predictor and a constant
>>> term. 6 seconds sounds slow. Is that CPU or elapsed time (use
>>> system.time() to see)?
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>> On Mon, Nov 16, 2015 at 12:25 PM, Sasikumar Kandhasamy
>>> <ckms...@gmail.com> wrote:
>>> > Hi All,
>>> >
>>> > I have a couple of clarifications on R run-time performance. I have
>>> > the R-3.2.2 package compiled for MIPS64 and am running it on my Linux
>>> > machine with a mips64 processor (core speed 1.5GHz) and observing the
>>> > following behaviors:
>>> >
>>> > 1. Applying a "linear regression model" (lm) on 1MB of data (contains 1
>>> > column of 250K records) takes ~6 seconds to complete. Any idea whether
>>> > this is expected behavior or not? If not, can you please share
>>> > suggestions or options to improve it, if there are any?
>>> >
>>> > 2. Also, the R process runtime virtual memory increased by 40MB after
>>> > applying the linear model on 1MB of data. Is that also expected
>>> > behavior? If it is expected, can you please share some insight into
>>> > the memory usage?
>>> >
>>> > Thanks in advance.
>>> >
>>> > Regards
>>> > Sasi
>>> >
>>> > [[alternative HTML version deleted]]
>>> >
>>> > ______________________________________________
>>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide
>>> > http://www.R-project.org/posting-guide.html
>>> > and provide commented, minimal, self-contained, reproducible code.
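
A minimal sketch of how the suggestions in this thread could be tried out,
assuming simulated data in place of the original poster's data set (the
names n, x, y and X, and the true coefficients 2 and 3, are made up for
illustration). It times lm(), lsfit() and .lm.fit() with system.time(),
checks that all three give the same coefficients, and uses object.size()
and str() to compare the lm() result with the roughly 2 million bytes of a
single 250K-element double vector. Timings depend on the machine and are
not meant to reproduce the figures quoted above.

```r
# Illustrative data only: 250K observations, one numeric predictor.
set.seed(1)
n <- 250000L
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)

# .lm.fit() does not add a constant term, so build the design matrix
# with an explicit intercept column.
X <- cbind(Intercept = 1, x = x)

# Time the three approaches mentioned in the thread.
t_lm    <- system.time(fit_lm    <- lm(y ~ x))
t_lsfit <- system.time(fit_lsfit <- lsfit(x, y))    # intercept = TRUE by default
t_lmfit <- system.time(fit_lmfit <- .lm.fit(X, y))  # bare-bones, needs R >= 3.1.0

# Elapsed (wall-clock) seconds for each fit.
elapsed <- c(lm      = t_lm[["elapsed"]],
             lsfit   = t_lsfit[["elapsed"]],
             .lm.fit = t_lmfit[["elapsed"]])
print(elapsed)

# The three fits should agree on intercept and slope.
coefs <- rbind(lm      = unname(coef(fit_lm)),
               lsfit   = unname(fit_lsfit$coefficients),
               .lm.fit = as.vector(fit_lmfit$coefficients))
print(coefs)

# Memory: one double vector of length n versus the full lm object,
# which also stores residuals, effects, fitted values, the QR
# decomposition and the model frame.
print(object.size(y), units = "MB")
print(object.size(fit_lm), units = "MB")
str(fit_lm, max.level = 1, give.attr = FALSE)
```

Comparing the "user.self" and "elapsed" entries of each system.time()
result is one way to answer the CPU-versus-elapsed question raised above.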