Re: [R] functions to calculate t-stats, etc. for lm.fit objects?

Marc Schwartz Wed, 08 Jul 2009 07:53:23 -0700

On Jul 8, 2009, at 8:51 AM, Whit Armstrong wrote:

I'm running a huge number of regressions in a loop, so I tried lm.fit
for a speedup.  However, I would like to be able to calculate the
t-stats for the coefficients.


Does anyone have some functions for calculating the regression summary
stats of an lm.fit object?

Thanks,
Whit

Whit, depending upon just how much time savings you are realizing byusing lm.fit() and not lm(), the approach to your question may vary.


Do you need all of the models, or only a subset?

If the latter, then I would narrow down your model set and re-run themwith lm() so that you can use summary.lm() directly. That would entailless custom coding, which may otherwise offset any time savings fromusing lm.fit()


If the former, then there are two choices as I see them.

The first would be to restructure the object resulting from lm.fit()by adding the elements required to run summary.lm(). However, I wouldthink that this overhead would bring you back to a point where justusing lm() would be a better approach from a time standpoint.

The second would be to cook up a function that only provides thesubset of results that you need from summary.lm() and then use that onthe results of lm.fit(). Here again, there remains the question ofjust how much time are you saving using lm.fit() versus the additionaloverhead of calculating even a subset of the output.

Here is a very simple approach to a function that would get you asubset of the output that you would get using, for example,coef(summary(lm.object)). This is using a selective approach ofcopying and slightly editing code from summary.lm(). Note that thereis other code in summary.lm() to handle weights and such, if yourmodels are more complex. You would need to add that in if that is thecase.

If you need much more summary output than this on each model, then Ithink you would be better off just using lm() and summary.lm().



# Use at your own risk...untested on more complex models  :-)

# 'x' is an lm.fit object

calc.lm.t <- function(x)
{
  Qr <- x$qr
  r <- x$residuals
  p <- x$rank
  p1 <- 1L:p
  rss <- sum(r^2)

  n <- NROW(Qr$qr)
  rdf <- n - p

  resvar <- rss/rdf
  R <- chol2inv(Qr$qr[p1, p1, drop = FALSE])
  se <- sqrt(diag(R) * resvar)

  est <- x$coefficients[Qr$pivot[p1]]
  tval <- est/se

  res <- cbind(est = est, se = se, tval = tval)
  res
}



Here is some simple example data:

set.seed(1)
y <- rnorm(100)
x <- rnorm(100)


# Get the default coefficient output using summary.lm()
> coef(summary(lm(y ~ x)))
                 Estimate Std. Error     t value  Pr(>|t|)
(Intercept)  0.1088521158 0.09034800  1.20480938 0.2311784
x           -0.0009323697 0.09472155 -0.00984327 0.9921663



# Now use calc.lm.t

lmf <- lm.fit(model.matrix(y ~ x), y)

> calc.lm.t(lmf)
                      est         se        tval
(Intercept)  0.1088521158 0.09034800  1.20480938
x           -0.0009323697 0.09472155 -0.00984327

I'll leave it to you to see whether this approach may or may not behelpful from a time perspective.


HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] functions to calculate t-stats, etc. for lm.fit objects?

Reply via email to