On Mon, Apr 1, 2013 at 1:20 PM, Paul Johnson <pauljoh...@gmail.com> wrote: > Here's my little discussion example for a quadratic regression: > > http://pj.freefaculty.org/R/WorkingExamples/regression-quadratic-1.R > > Students press me to know the benefits of poly() over the more obvious > regression formulas. > > I think I understand the theory on why poly() should be more numerically > stable, but I'm having trouble writing down an example that proves the > benefit of this. > > I am beginning to suspect that because R's lm uses a QR decomposition under > the hood, the instability that we used to see when we inverted (X'X) is no > longer so serious as it used to be. When the columns in X are > > x1 x2 x3 x3squared > > then the QR decomposition is doing about the same thing as we would do if > we replaced x3 and x3squared by the poly(x3, 2) results. > > Or not. I'm not arguing that, I'm thinking out loud, hoping you'll correct > me. >
# 1. One benefit is if you fit a higher degree polynomial using poly(x, n) the lower degree coefficients will agree with the fit using a lower n. # For example, compare these two: set.seed(123) y <- rnorm(10); x <- 1:10 lm(y ~ poly(x, 2)) lm(y ~ poly(x, 3)) # however, that is not true if you don't use orthogonal polynomials. Compare these two: lm(y ~ poly(x, 2, raw = TRUE)) lm(y ~ poly(x, 3, raw = TRUE)) # 2. A second advantage is that the t statistics are invariant under shift if you use orthogonal polynomials # compare these two: summary(lm(y ~ poly(x, 2))) summary(lm(y ~ poly(x+1, 2))) # however, that is not true if you don;t use orthogonal polynomials, Compare these two: summary(lm(y ~ poly(x, 2, raw = TRUE))) summary(lm(y ~ poly(x+1, 2, raw = TRUE))) -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.