On Jul 18, 2012, at 05:11 , darnold wrote:

> Hi,
>
> I see a lot of folks verify the regression identity SST = SSE + SSR
> numerically, but I cannot seem to find a proof. I wonder if any folks on
> this list could guide me to a mathematical proof of this fact.
Wrong list, isn't it? http://stats.stackexchange.com/ is -----> _that_ way... Anyways:

Any math stats book should have it somewhere inside. There are two basic
approaches, depending on what level of abstraction one expects from students.

First principles: Write out SST = sum(((y - yhat) + (yhat - ybar))^2) and use
the normal equations to show that the sum of cross-product terms is zero. This
is a bit tedious, but straightforward in principle.

Linear algebra: The vector of least squares fitted values is the orthogonal
projection of y onto a subspace of R^N (N = number of observations). Hence the
vector of residuals (y - yhat) is orthogonal to the vector (yhat - ybar), and
the N-dimensional version of the Pythagorean theorem gives

  ||yhat - ybar||^2 + ||y - yhat||^2 == ||y - ybar||^2

since the three vectors involved form a right-angled triangle.
(http://en.wikipedia.org/wiki/Pythagorean_theorem, scroll down to "Inner
product spaces".)

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
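
[For reference, here is the cross-term step of the first-principles argument spelled out. This is a sketch assuming an ordinary least squares fit with an intercept, so the normal equations give sum(e_i) = 0 and sum(e_i * yhat_i) = 0, where e_i = y_i - yhat_i.]

\[
\mathrm{SST} = \sum_i (y_i - \bar y)^2
  = \sum_i \bigl[(y_i - \hat y_i) + (\hat y_i - \bar y)\bigr]^2
  = \underbrace{\sum_i (y_i - \hat y_i)^2}_{\mathrm{SSE}}
  + \underbrace{\sum_i (\hat y_i - \bar y)^2}_{\mathrm{SSR}}
  + 2\sum_i e_i(\hat y_i - \bar y)
\]

and the cross term vanishes because

\[
\sum_i e_i(\hat y_i - \bar y) = \sum_i e_i \hat y_i - \bar y \sum_i e_i = 0 - 0 = 0.
\]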
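
[For completeness, a minimal reproducible check in R on made-up data; an ordinary lm() fit, which includes an intercept by default, exhibits both the numerical identity and the orthogonality that drives the linear-algebra argument.]

set.seed(1)
x <- rnorm(20)
y <- 2 + 3 * x + rnorm(20)
fit <- lm(y ~ x)                     # ordinary least squares, intercept included

yhat <- fitted(fit)
e    <- resid(fit)                   # residuals y - yhat
SST  <- sum((y - mean(y))^2)
SSE  <- sum(e^2)
SSR  <- sum((yhat - mean(y))^2)

all.equal(SST, SSE + SSR)            # TRUE: SST = SSE + SSR
sum(e * (yhat - mean(y)))            # essentially zero: e is orthogonal to (yhat - ybar)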