Peter,

This is exactly the answer I was wanted.

1) I was a little fuzzy on how the qr decomposition was creating the "error"
2) I wanted to know if it was possible to just change a setting to get
around the "error".  Changing the tol in lm makes a lot more sense to me
than changing the global eps.

Also, I love the "<lecture on numerical linear algebra>" tags.  I wish that
conceptual structure was a common convention in human language.

Thanks,

Gene

On Sat, Apr 14, 2012 at 2:45 PM, peter dalgaard <pda...@gmail.com> wrote:

>
> On Apr 14, 2012, at 14:40 , Berend Hasselman wrote:
>
> >
> > On 13-04-2012, at 22:20, Gene Leynes wrote:
> >
> >> I can't figure out why this is returning an NA for the slope in one
> case,
> >> but not in the other.
> >>
> >> I can tell that R thinks the first case is singular, but why isn't the
> >> second?
> >>
> >> ## Define X and Y
> >> ## There are two versions of x
> >> ##     1) "as is"
> >> ##     2) shifted to start at 0
> >> y  = c(58, 57, 57, 279, 252, 851, 45, 87, 47)
> >> x1 = c(1334009411.437, 1334009411.437, 1334009411.437, 1334009469.297,
> >>       1334009469.297, 1334009469.297, 1334009485.697, 1334009485.697,
> >> 1334009485.697)
> >> x2 = x1 - min(x1)
> >>
> >> ## Why doesn't the LM model work for the "as is" x?
> >> lm(y~x1)
> >> lm(y~x2)
> >
> >
> > With your data  the matrix t(X)%*%X is extremely ill-conditioned for the
> case with x1.
> > See
> >
> > http://en.wikipedia.org/wiki/Condition_number
> > http://mathworld.wolfram.com/ConditionNumber.html
> > http://en.wikipedia.org/wiki/Numerical_methods_for_linear_least_squares
> >
> > You can check it with
> >
> > makeXX <- function(x) {
> >    matrix(data=c(x,rep(1,length(x))),nrow=length(x), byrow=FALSE)
> > }
> >
> > X1 <- makeXX(x1)
> > (XTX1 <- t(X1) %*% X1)
> > svd(XTX1)
> >
> > and similar for x2.
>
> <lecture on numerical linear algebra>
>
> lm() is actually a bit smarter than to use the textbook formula
> (X'X)^{-1}X'Y, it is internally based on a QR decomposition which is
> numerically far more stable. What it does is effectively to orthogonalize
> the columns of X successively.
>
> > x <- cbind(1, x1)
> > qr(x)
> $qr
>                            x1
>  [1,] -3.0000000 -4.002028e+09
>  [2,]  0.3333333  9.555777e+01
>  [3,]  0.3333333  3.456548e-01
>  [4,]  0.3333333 -2.598428e-01
>  [5,]  0.3333333 -2.598428e-01
>  [6,]  0.3333333 -2.598428e-01
>  [7,]  0.3333333 -4.314668e-01
>  [8,]  0.3333333 -4.314668e-01
>  [9,]  0.3333333 -4.314668e-01
>
> $rank
> [1] 1
>
> $qraux
> [1] 1.333333 1.345655
>
> $pivot
> [1] 1 2
>
> attr(,"class")
> [1] "qr"
>
> However, notice the rank of 1. That's because it wants to protect the user
> against unwittingly plugging in a singular design matrix, so when the
> algorithm encounters a variable that looks like it is getting reduced to
> zero by after orthogonalization, it basically throws it out.
>
> You can defeat this mechanism by setting the tol= argument sufficiently
> low. In fact, you can do the same with lm itself:
>
> > lm(y ~ x1, tol = 0)
> ...
> (Intercept)           x1
>  -2.474e+09    1.854e+00
>
> Notice that the slope is consistent with the
>
> > lm(y ~ x2)
> ...
> (Intercept)           x2
>    110.884        1.854
>
> In contrast if you try the textbook way:
>
> > solve(crossprod(x), crossprod(x,y))
> Error in solve.default(crossprod(x), crossprod(x, y)) :
>  system is computationally singular: reciprocal condition number =
> 2.67813e-34
>
> Ah, well, let's try and defeat that:
>
> > solve(crossprod(x), crossprod(x,y), tol=0)
>            [,1]
>   -2.959385e+09
> x1  2.218414e+00
>
> Which, as you'll notice is, er, somewhat off the mark.
>
> </lecture on numerical linear algebra>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd....@cbs.dk  Priv: pda...@gmail.com
>
>
>
>
>
>
>
>
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to