Hi Tony,

It appears you (and David) where right, and my own problem came from another
issue which wasn't the floating point. (the problem was misspelling of the
colname of the data.frame I used as the newdata in the predict.lm)

Yet I am still happy with my posting, only to get your informed reply -
thank you very much for the interesting info!

Best regards,
Tal






On Mon, Sep 14, 2009 at 8:31 PM, Tony Plate <tpl...@acm.org> wrote:

> Results can be slightly different when matrix algebra routines are called.
>  Here's your example again.  When the prediction is computed directly using
> matrix multiplication, the result is the same as 'predict' produces (at
> least in this case.)
>
>  set.seed(1)
>> n <- 100
>> x <- rnorm(n)
>> y <- rnorm(n)
>> aa <- lm(y ~ x)
>> all.equal(as.numeric(predict(aa, new)), as.numeric(aa$coef[1] + aa$coef[2]
>> * new$x), tol=0)
>>
> [1] "Mean relative difference: 1.840916e-16"
>
>> all.equal(as.numeric(predict(aa, new)), as.numeric(cbind(1, new$x) %*%
>> aa$coef), tol=0)
>>
> [1] TRUE
>
>>
>>
> These types of small differences are often not indicative of lower
> precision in one method, but rather just random floating-point inaccuracies
> that can depend on things like the order numbers are summed in (e.g.,
> ((bigNegNum + bigPosNum) + smallPosNum) will often be slightly different to
> ((bigPosNum + smallPosNum) + bigNegNum).  They can also depend on whether
> intermediate results are kept in CPU registers, which sometimes have higher
> precision than 64 bits.  Usually, they're nothing to worry about, which is
> one of the major reasons that all.equal() has a non-zero default for the
> tol= argument.
>
> -- Tony Plate
>
>
> Tal Galili wrote:
>
>> Hello dear r-help group
>>
>> I am turning for you for help with FAQ number 7.31: "Why doesn't R think
>> these numbers are equal?"
>>
>> http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
>>
>>
>> *My story* is this:
>> I wish to run many lm predictions and need to have them run fast.
>> Using predict.lm is relatively slow, so I tried having it run faster by
>> doing the prediction calculations manually.
>> But doing that gave me problematic results (I won't go into the details of
>> how I found that out).
>>
>> I then discovered that the problem was that the manual calculations I used
>> for the lm predictions yielded different results than that of predict.lm,
>>
>> *here is an example*:
>>
>> predict.lm.diff.from.manual.compute <- function(sample.size = 100)
>> {
>> x <- rnorm(sample.size)
>> y <- x + rnorm(sample.size)
>>  new <- data.frame(x = seq(-3, 3, length.out = sample.size))
>> aa <- lm(y ~ x)
>>
>> predict.lm.result <- sum(predict(aa, new, se.fit = F))
>> manual.lm.compute.result <- sum(aa$coeff[1]+ new * aa$coeff[2])
>>
>> # manual.lm.compute.result == predict.lm.result
>> return(all.equal(manual.lm.compute.result , predict.lm.result, tol=0))
>> }
>>
>> # and here are the results of running the code several times:
>>
>>  predict.lm.diff.from.manual.compute(100)
>>>
>> [1] "Mean relative difference: 1.046407e-15"
>>
>>> predict.lm.diff.from.manual.compute(1000)
>>>
>> [1] "Mean relative difference: 4.113951e-16"
>>
>>> predict.lm.diff.from.manual.compute(10000)
>>>
>> [1] "Mean relative difference: 2.047455e-14"
>>
>>> predict.lm.diff.from.manual.compute(100000)
>>>
>> [1] "Mean relative difference: 1.294251e-14"
>>
>>> predict.lm.diff.from.manual.compute(1000000)
>>>
>> [1] "Mean relative difference: 5.508314e-13"
>>
>>
>>
>> And that leaves me with *the question*:
>> Can I reproduce more accurate results from the manual calculations (as the
>> ones I might have gotten from predict.lm) ?
>> Maybe some parameter to increase the precision of the computation ?
>>
>> Many thanks,
>> Tal
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>


-- 
----------------------------------------------


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to