I am puzzled by a regression result. I have a categorical variable ClassePop33000 which cuts a Population variable into 3 levels. I want to investigate whether that categorical variable has some relation with my dependent variable, so I run:

    lm(Cout.ton ~ ClassePop33000, data=ech2)

    Call:
    lm(formula = Cout.ton ~ ClassePop33000, data = ech2)

    Residuals:
        Min      1Q  Median      3Q     Max
    -182.24  -62.91  -22.76   66.38  277.39

    Coefficients:
                                       Estimate Std. Error t value Pr(>|t|)
    (Intercept)                          231.66      11.50  20.141  < 2e-16 ***
    ClassePop33000[T.[3000,25000)]       -72.91      16.70  -4.366 2.19e-05 ***
    ClassePop33000[T.[25000,10000000)]   -95.17      19.92  -4.777 3.82e-06 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 97.6 on 170 degrees of freedom
    Multiple R-Squared: 0.1502,     Adjusted R-squared: 0.1402
    F-statistic: 15.02 on 2 and 170 DF,  p-value: 9.818e-07

Then I discovered that one can omit the intercept and so get a coefficient for each of the N levels of the categorical variable. So I ran:

    lm(Cout.ton ~ ClassePop33000 + 0, data=ech2)

    Call:
    lm(formula = Cout.ton ~ ClassePop33000 + 0, data = ech2)

    Residuals:
        Min      1Q  Median      3Q     Max
    -182.24  -62.91  -22.76   66.38  277.39

    Coefficients:
                                    Estimate Std. Error t value Pr(>|t|)
    ClassePop33000[1,3000)            231.66      11.50  20.141  < 2e-16 ***
    ClassePop33000[3000,25000)        158.75      12.11  13.114  < 2e-16 ***
    ClassePop33000[25000,10000000)    136.49      16.27   8.391  1.8e-14 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 97.6 on 170 degrees of freedom
    Multiple R-Squared: 0.7922,     Adjusted R-squared: 0.7885
    F-statistic:   216 on 3 and 170 DF,  p-value: < 2.2e-16

I tried the very pedagogical examples at http://www.stat.umn.edu/geyer/5102/examp/dummy.html and plotting the regression lines with abline gives me exactly the same lines whether I fit with or without the intercept. The residuals and the residual standard error are also identical in the two outputs above.

So why do the R-squared values differ (0.1502 versus 0.7922)? At least the p-values are of the same order of magnitude, but I don't understand the drastic difference in R-squared.

Pointers to stats 101, anyone?

TIA
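
In case it helps anyone answering, here is a minimal sketch that reproduces the discrepancy on simulated data. The data, group sizes and names (g, y, fit_int, fit_noint) are made up for illustration and are not ech2. It assumes, as ?summary.lm documents, that R-squared is computed against mean(y) when the model has an intercept and against zero when it does not.

```r
# Simulated data: three groups with made-up sizes, means and spread.
set.seed(1)
g <- factor(rep(c("small", "medium", "large"), times = c(72, 65, 36)),
            levels = c("small", "medium", "large"))
y <- rnorm(length(g), mean = c(230, 160, 135)[as.integer(g)], sd = 100)

fit_int   <- lm(y ~ g)      # with intercept (treatment contrasts)
fit_noint <- lm(y ~ g + 0)  # without intercept (one coefficient per level)

# Both parameterisations give the same fitted values, hence the same
# residual sum of squares.
same_fit <- all.equal(fitted(fit_int), fitted(fit_noint))
rss <- sum(residuals(fit_int)^2)

# Total sum of squares: deviations from mean(y) versus deviations from zero.
tss_centered   <- sum((y - mean(y))^2)
tss_uncentered <- sum(y^2)

r2_int   <- 1 - rss / tss_centered    # baseline: the overall mean
r2_noint <- 1 - rss / tss_uncentered  # baseline: zero

# Compare the hand-computed values with what summary() reports.
print(same_fit)
print(c(by_hand = r2_int,   from_lm = summary(fit_int)$r.squared))
print(c(by_hand = r2_noint, from_lm = summary(fit_noint)$r.squared))
```

If this sketch is right, the two fits are the same model and only the baseline for R-squared (and for the overall F-test) changes: with an intercept the comparison is against "every observation equals the overall mean", without it the comparison is against "every observation equals zero", which is a much easier baseline to beat when the response is far from zero.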