Hi Tony, It appears you (and David) where right, and my own problem came from another issue which wasn't the floating point. (the problem was misspelling of the colname of the data.frame I used as the newdata in the predict.lm)
Yet I am still happy with my posting, only to get your informed reply - thank you very much for the interesting info! Best regards, Tal On Mon, Sep 14, 2009 at 8:31 PM, Tony Plate <tpl...@acm.org> wrote: > Results can be slightly different when matrix algebra routines are called. > Here's your example again. When the prediction is computed directly using > matrix multiplication, the result is the same as 'predict' produces (at > least in this case.) > > set.seed(1) >> n <- 100 >> x <- rnorm(n) >> y <- rnorm(n) >> aa <- lm(y ~ x) >> all.equal(as.numeric(predict(aa, new)), as.numeric(aa$coef[1] + aa$coef[2] >> * new$x), tol=0) >> > [1] "Mean relative difference: 1.840916e-16" > >> all.equal(as.numeric(predict(aa, new)), as.numeric(cbind(1, new$x) %*% >> aa$coef), tol=0) >> > [1] TRUE > >> >> > These types of small differences are often not indicative of lower > precision in one method, but rather just random floating-point inaccuracies > that can depend on things like the order numbers are summed in (e.g., > ((bigNegNum + bigPosNum) + smallPosNum) will often be slightly different to > ((bigPosNum + smallPosNum) + bigNegNum). They can also depend on whether > intermediate results are kept in CPU registers, which sometimes have higher > precision than 64 bits. Usually, they're nothing to worry about, which is > one of the major reasons that all.equal() has a non-zero default for the > tol= argument. > > -- Tony Plate > > > Tal Galili wrote: > >> Hello dear r-help group >> >> I am turning for you for help with FAQ number 7.31: "Why doesn't R think >> these numbers are equal?" >> >> http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f >> >> >> *My story* is this: >> I wish to run many lm predictions and need to have them run fast. >> Using predict.lm is relatively slow, so I tried having it run faster by >> doing the prediction calculations manually. >> But doing that gave me problematic results (I won't go into the details of >> how I found that out). >> >> I then discovered that the problem was that the manual calculations I used >> for the lm predictions yielded different results than that of predict.lm, >> >> *here is an example*: >> >> predict.lm.diff.from.manual.compute <- function(sample.size = 100) >> { >> x <- rnorm(sample.size) >> y <- x + rnorm(sample.size) >> new <- data.frame(x = seq(-3, 3, length.out = sample.size)) >> aa <- lm(y ~ x) >> >> predict.lm.result <- sum(predict(aa, new, se.fit = F)) >> manual.lm.compute.result <- sum(aa$coeff[1]+ new * aa$coeff[2]) >> >> # manual.lm.compute.result == predict.lm.result >> return(all.equal(manual.lm.compute.result , predict.lm.result, tol=0)) >> } >> >> # and here are the results of running the code several times: >> >> predict.lm.diff.from.manual.compute(100) >>> >> [1] "Mean relative difference: 1.046407e-15" >> >>> predict.lm.diff.from.manual.compute(1000) >>> >> [1] "Mean relative difference: 4.113951e-16" >> >>> predict.lm.diff.from.manual.compute(10000) >>> >> [1] "Mean relative difference: 2.047455e-14" >> >>> predict.lm.diff.from.manual.compute(100000) >>> >> [1] "Mean relative difference: 1.294251e-14" >> >>> predict.lm.diff.from.manual.compute(1000000) >>> >> [1] "Mean relative difference: 5.508314e-13" >> >> >> >> And that leaves me with *the question*: >> Can I reproduce more accurate results from the manual calculations (as the >> ones I might have gotten from predict.lm) ? >> Maybe some parameter to increase the precision of the computation ? >> >> Many thanks, >> Tal >> >> >> >> >> >> >> >> >> >> > -- ---------------------------------------------- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.