On Tue, 13 Jan 2009, Stefano Leonardi wrote:

Thanks for the answers.
Still, I am not totally convinced by the interpretation of the intercept as the mean of the fitted values for the group belonging to the first level of each factor (the observations with 0 in all columns of matrix.models except the first), because the reasoning seems a little circular to me. Since the intercept is the expected value for that group and, as Peter points out, is the same for all observations in the group, it seems clear that the intercept is the mean of these values.

It is not completely clear to me why (in some cases, not always) the intercept is not equal to the mean of the first group of raw data.



The intercept is an *estimate* of the (population or process) mean of Y at zero 
values of everything else.  The sample average of the Y values at zero values 
of everything else is another *estimate* of the same mean.

They typically aren't the same estimate, because the intercept uses information 
from observations with non-zero values of the covariates and the sample average 
doesn't.  The intercept will be a better estimate if the model fits well, since 
extrapolating from non-zero covariate values is then being done correctly, and 
potentially a much worse estimate if the model fits poorly, since extrapolation 
from non-zero covariate values is then being done incorrectly.
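To make the difference concrete, here is a minimal sketch in plain Python (standard library only, so it runs without any packages). The data are made up for illustration and the helper name ols_simple is hypothetical; it fits a straight line by ordinary least squares and compares the intercept with the sample average of Y at x = 0.

```python
from statistics import mean

def ols_simple(x, y):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Made-up data: two observations at each of x = 0, 1, 2.
# The group means (1.5, 2.5, 6.5) do not lie on a straight line.
x = [0, 0, 1, 1, 2, 2]
y = [1.0, 2.0, 2.0, 3.0, 6.0, 7.0]

intercept, slope = ols_simple(x, y)

# Sample average of Y using only the observations with x == 0.
mean_at_zero = mean([yi for xi, yi in zip(x, y) if xi == 0])

print("intercept from the straight-line fit:", intercept)
print("sample average of Y at x = 0:        ", mean_at_zero)
```

With these numbers the fitted intercept is 1.0 while the sample average at x = 0 is 1.5: the line is pulled by the observations at x = 1 and x = 2, and because the straight line fits these data poorly, the extrapolation back to zero misses the group average.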

Fitting a saturated model ensures that there is no extrapolation from other 
values of the covariates; a saturated model says that every covariate 
combination has to be estimated separately.  In that case the sample average 
and the intercept will be the same.
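As an illustration of the saturated case, the following sketch (plain Python, made-up data, hypothetical helper names solve and ols) treats x as a three-level factor with treatment coding, i.e. an intercept column plus one indicator column for each non-reference level, and solves the normal equations directly.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)

# Made-up data: two observations at each of three levels (0, 1, 2).
x = [0, 0, 1, 1, 2, 2]
y = [1.0, 2.0, 2.0, 3.0, 6.0, 7.0]

# Saturated design: intercept, indicator for level 1, indicator for level 2.
X = [[1.0, float(xi == 1), float(xi == 2)] for xi in x]
coef = ols(X, y)

mean_at_zero = sum(yi for xi, yi in zip(x, y) if xi == 0) / x.count(0)

print("intercept from the saturated fit:", coef[0])
print("sample average of Y at level 0:  ", mean_at_zero)
```

Here the intercept and the sample average of the reference group agree (both 1.5, up to floating-point rounding), and the other two coefficients are the differences between each remaining group mean and the reference group mean.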

There is sometimes carelessness in writing (and sometimes in reading) in linear 
regression books.  A natural interpretation of the intercept parameter in a 
linear model is the mean of Y when all X are zero, because that is simple and 
is a correct description when the model is true.  The estimate of the parameter 
is thus an estimate of the mean of Y when all X are zero. In another sense 
there's nothing special about X being zero. Altering the intercept will change 
the predicted value of Y for every X, not just for X=0. Similarly, if you don't 
have a saturated model, altering (almost) any value of Y will change the 
estimated intercept.
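The last point can be checked numerically. In this sketch (plain Python, made-up data, hypothetical helper name ols_intercept) a single Y value at x = 2 is altered and the straight-line fit is repeated; the observations at x = 0 are left untouched.

```python
from statistics import mean

def ols_intercept(x, y):
    """Intercept of the least-squares line y = a + b*x."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return ybar - (sxy / sxx) * xbar

x = [0, 0, 1, 1, 2, 2]
y_original = [1.0, 2.0, 2.0, 3.0, 6.0, 7.0]

# Alter only the last observation, which sits at x = 2.
y_altered = y_original[:-1] + [10.0]

before = ols_intercept(x, y_original)
after = ols_intercept(x, y_altered)

print("intercept before altering Y at x = 2:", before)
print("intercept after altering Y at x = 2: ", after)
```

The intercept moves from 1.0 to 0.75 even though nothing at x = 0 changed, whereas the sample average of Y at x = 0 stays at 1.5 in both cases.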

     -thomas


Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
