Hi, the standard errors of the coefficients in two regressions that I computed by hand and using lm() differ by about 1%. Can somebody help me to identify the source of this difference? The coefficient estimates are the same, but the standard errors differ.
####Simulate data happiness=0 income=0 gender=(rep(c(0,1,1,0),25)) for(i in 1:100){ happiness[i]=1000+i+rnorm(1,0,40) income[i]=2*i+rnorm(1,0,40) } ####Run lm() reg=lm(happiness~income+factor(gender)) summary(reg) ####Compute coefficient estimates "by hand" x=cbind(income,gender) y=happiness z=as.matrix(cbind(rep(1,100),x)) beta=solve(t(z)%*%z)%*%t(z)%*%y ####Compare estimates cbind(reg$coef,beta) ##fine so far, they both look the same reg$coef[1]-beta[1] reg$coef[2]-beta[2] reg$coef[3]-beta[3] ##differences are too small to cause a 1% difference ####Check predicted values estimates=c(beta[2],beta[3]) predicted=estimates%*%t(x) predicted=as.vector(t(as.double(predicted+beta[1]))) cbind(reg$fitted,predicted) ##inspect fitted values as.vector(reg$fitted-predicted) ##differences are marginal #### Compute errors e=NA e2=NA for(i in 1:length(happiness)){ e[i]=y[i]-predicted[i] ##for "hand-computed" regression e2[i]=y[i]-reg$fitted[i] ##for lm() regression } #### Compute standard error of the coefficients sqrt(abs(var(e)*solve(t(z)%*%z))) ##for "hand-computed" regression sqrt(abs(var(e2)*solve(t(z)%*%z))) ##for lm() regression using e2 from above ##Both are the same ####Compare to lm() standard errors of the coefficients again summary(reg) The diagonal elements of the variance/covariance matrices should be the standard errors of the coefficients. Both are identical when computed by hand. However, they differ from the standard errors reported in summary(reg). The difference of 1% seems nonmarginal. Should I have multiplied var(e)*solve(t(z)%*%z) by n and divided by n-1? But even if I do this, I still observe a difference. Can anybody help me out what the source of this difference is? Cheers, Daniel ------------------------- cuncta stricte discussurus ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.