On Tue, 13 Jan 2009, Stefano Leonardi wrote:
> Thanks for the answers.
>
> Still, I am not totally convinced by the interpretation of the intercept as
> the mean of the fitted values for the group belonging to the first level of
> each factor (those having 0 in all columns of matrix.models except the first
> column), because the reasoning seems a little circular to me.
>
> Since the intercept is the expected value for that group and, as Peter points
> out, it is the same value for all observations in the group, it seems clear
> that the intercept is the mean of these values.
>
> It is not completely clear to me why (in some cases, not always) the intercept
> is not equal to the mean of the first group of raw data.

The intercept is an *estimate* of the (population or process) mean of Y at zero
values of everything else. The sample average of the Y values at zero values
of everything else is another *estimate* of the same mean.
They typically aren't the same estimate, because the intercept uses information
from observations with non-zero values of the covariates and the sample average
doesn't. The intercept will be a better estimate if the model fits well, since
extrapolating from non-zero covariate values is then being done correctly, and
potentially a much worse estimate if the model fits poorly, since extrapolation
from non-zero covariate values is then being done incorrectly.
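
A small numerical sketch of the point above, in plain Python. The data are made up for illustration (three observations at each of x = 0, 1, 2) and the helper name fit_line is hypothetical; nothing here comes from the original question. The group means do not lie on a straight line, so the straight-line fit borrows information from x = 1 and x = 2 and its intercept differs from the raw average at x = 0.

```python
# Illustrative data (assumed): three observations at each of x = 0, 1, 2.
x = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y = [1.0, 2.0, 3.0, 2.5, 3.5, 4.5, 3.0, 4.0, 5.0]

def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (intercept, slope)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

intercept, slope = fit_line(x, y)

# Sample average of y using only the observations with x == 0.
y_at_zero = [yi for xi, yi in zip(x, y) if xi == 0]
mean_at_zero = sum(y_at_zero) / len(y_at_zero)

print("intercept:", round(intercept, 4))
print("mean of y at x = 0:", round(mean_at_zero, 4))
```

With these numbers the two estimates are close but not equal. Which one is nearer the truth depends on whether the straight-line model is right, which the data alone do not settle.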
Fitting a saturated model ensures that there is no extrapolation from other
values of the covariates; a saturated model says that every covariate
combination has to be estimated separately. In that case the sample average
and the intercept will be the same.
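
As a rough check of the saturated case, the sketch below treats the covariate as a three-level factor with dummy (treatment) coding and fits by least squares through the normal equations. The data are made up for illustration, and the helpers solve and ols are hypothetical names written out by hand to keep the example self-contained; in practice one would use a regression routine.

```python
# Illustrative data (assumed): three observations at each of x = 0, 1, 2.
x = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y = [1.0, 2.0, 3.0, 2.5, 3.5, 4.5, 3.0, 4.0, 5.0]

def solve(a, b):
    """Solve the square system a * beta = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(design, y):
    """Least squares via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

# Treatment coding with x = 0 as the reference level:
# columns are intercept, indicator of x == 1, indicator of x == 2.
design = [[1.0, float(xi == 1), float(xi == 2)] for xi in x]
beta = ols(design, y)

mean_at_zero = sum(yi for xi, yi in zip(x, y) if xi == 0) / x.count(0)
print("saturated-model intercept:", round(beta[0], 4))
print("mean of y at x = 0:", round(mean_at_zero, 4))
```

Here every level of x gets its own parameter, so nothing is borrowed from the other levels and the intercept agrees with the raw average of the reference group.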
There is sometimes carelessness in writing (and sometimes in reading) in linear
regression books. A natural interpretation of the intercept parameter in a
linear model is the mean of Y when all X are zero, because that is simple and
is a correct description when the model is true. The estimate of the parameter
is thus an estimate of the mean of Y when all X are zero. In another sense
there's nothing special about X being zero. Altering the intercept will change
the predicted value of Y for every X, not just for X=0. Similarly, if you don't
have a saturated model, altering (almost) any value of Y will change the
estimated intercept.
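
The last point can be sketched the same way. With made-up data and a hypothetical helper fit_intercept, changing a single Y value observed at x = 2 moves the intercept of the straight-line fit, even though no observation at x = 0 was touched.

```python
# Illustrative data (assumed): three observations at each of x = 0, 1, 2.
x = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y = [1.0, 2.0, 3.0, 2.5, 3.5, 4.5, 3.0, 4.0, 5.0]

def fit_intercept(x, y):
    """Intercept of the least-squares line y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return ybar - (sxy / sxx) * xbar

def mean_at_zero(x, y):
    """Raw average of y over the observations with x == 0."""
    vals = [yi for xi, yi in zip(x, y) if xi == 0]
    return sum(vals) / len(vals)

before = fit_intercept(x, y)

# Alter one observation at x = 2 (the last one); x = 0 data are unchanged.
y_changed = y[:]
y_changed[-1] = 11.0
after = fit_intercept(x, y_changed)

print("intercept before:", round(before, 4))
print("intercept after: ", round(after, 4))
print("mean at x = 0 (both cases):", mean_at_zero(x, y), mean_at_zero(x, y_changed))
```

The raw average at x = 0 is the same in both cases, while the fitted intercept shifts because the slope changes.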
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlum...@u.washington.edu University of Washington, Seattle
______________________________________________
R-help@r-project.org mailing list